weekend ai reads for 2026-01-30
direct links are available on the web at https://thataithing.beehiiv.com/p/weekend-ai-reads-for-2026-01-30
ABOVE THE FOLD: FORECASTING
It's Over - For the first time, an AI has placed in the top 1% of a major forecasting tournament. Here is how it works. / Will Ward, The Oracle by Polymarket, Substack, archive (11 minute read)
Mantic doesn't work as a single call to a language model. We make many, many different calls during the workflow. You can imagine a factory line with lots of different workers doing different jobs: breaking down the question, doing research, pursuing different lines of inquiry, then bringing it all together into a clean, well-informed prediction.
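To make the "factory line" picture concrete, here is a minimal sketch of a multi-call forecasting workflow. It is not Mantic's implementation: the worker functions, the `stub_model` stand-in for a language-model call, and the plain averaging at the end are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, List

# Hypothetical stand-in for a language-model call; a real system would call an API here.
ModelCall = Callable[[str], str]


@dataclass
class Forecast:
    question: str
    probability: float
    notes: List[str]


def decompose(question: str) -> List[str]:
    """Worker 1: break the question into lines of inquiry (hard-coded for the sketch)."""
    return [f"Base rate for: {question}", f"Recent news on: {question}"]


def research(sub_question: str, call_model: ModelCall) -> str:
    """Worker 2: one separate model call per line of inquiry."""
    return call_model(sub_question)


def estimate(note: str) -> float:
    """Worker 3: turn a research note into a probability clamped to [0, 1].
    The stub model returns a number as text, so parsing is trivial here."""
    return min(1.0, max(0.0, float(note)))


def forecast(question: str, call_model: ModelCall) -> Forecast:
    notes = [research(sq, call_model) for sq in decompose(question)]
    # Final worker: combine the estimates. A plain mean is an assumption of this sketch.
    return Forecast(question, mean(estimate(n) for n in notes), notes)


def stub_model(prompt: str) -> str:
    """Fake model so the sketch runs offline."""
    return "0.6" if prompt.startswith("Base rate") else "0.8"


result = forecast("Will X happen by June?", stub_model)
print(round(result.probability, 2))
```

Each stage is its own function, so a stage can be swapped for a real model call, or more lines of inquiry added, without touching the rest.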
Just got divorced ā why am I seeing wedding content? / Mashable (12 minute read)
Emerging research suggests that algorithmic systems do more than match users with content; they're also shaping people's identities.
Researchers describe this phenomenon as "algorithmic persistence," in which systems continue to serve content tied to a presumed identity long after it is no longer applicable. Klein notes that because recommender systems are optimized for engagement rather than accuracy, they have little incentive to recalibrate unless user behavior changes significantly, something many people don't know how to do, or even realize is necessary.
The quant shop - AI lab convergence / The Financial Times (19 minute read)
In both cases the core technical job is actually identical. You approximate a latent conditional distribution and act on it under constraints. And in both cases you only find out if you're any good out-of-sample - in backtests and live PnL for quants; on held-out benchmarks and real users for LLMs.
What Does ChatGPT Make of Historical Stock Returns? Extrapolation and Miscalibration in LLM Stock Return Forecasts / Purdue University, Emory University, arxiv (46 minute read)
LLM forecasts appear optimistic relative to historical and future realized returns. When prompted for 80% confidence interval predictions, LLM responses are better calibrated than survey evidence but are pessimistic about outliers, leading to skewed forecast distributions. The findings suggest LLMs manifest common behavioral biases when forecasting expected returns but are better at gauging risks than humans.
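For readers who want to see what "calibrated" means for an 80% interval: roughly 80% of realized values should land inside the stated ranges. A minimal sketch of that check follows. The numbers are made up for illustration and are not from the paper.

```python
from typing import List, Tuple


def interval_coverage(intervals: List[Tuple[float, float]], realized: List[float]) -> float:
    """Fraction of realized values that fall inside their forecast interval."""
    if not intervals or len(intervals) != len(realized):
        raise ValueError("need one interval per realized value, and at least one")
    hits = sum(1 for (low, high), value in zip(intervals, realized) if low <= value <= high)
    return hits / len(realized)


# Illustrative 80% intervals for five hypothetical return forecasts, with made-up outcomes.
intervals = [(-0.05, 0.10), (-0.02, 0.08), (0.00, 0.12), (-0.10, 0.05), (-0.03, 0.06)]
realized = [0.04, 0.11, 0.03, -0.02, -0.08]

coverage = interval_coverage(intervals, realized)
print(f"nominal 0.80, observed {coverage:.2f}")
```

Coverage well below the nominal level means the intervals are too narrow (overconfident); coverage well above it means they are too wide.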
How Google Maps quietly allocates survival across London's restaurants - and how I built a dashboard to see through it / Lauren's data Substack, Substack, archive (13 minute read)
QUOTES OF THE WEEK
When so-called Responsible AI or AI ethics is defined in ways that avoid confronting exploitation, war, colonial extraction, gendered and sexual violence, and other systems of oppression, then what are we even trying to do as a community?
I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit
FOR EVERYONE
My Claude Code Psychosis / Jasmine Sun, Substack, archive (13 minute read)
It's addictive to express a vision and see it instantly appear, getting into the build/test/iterate loop at an electrifying rate. There's an apt joke that Claude Code is GPT-4o for nerds: it reflects your desires and makes them real, providing the rush of creation with minimal sweat.
related (1), Your App Subscription Is Now My Weekend Project / Roberto Selbach (3 minute read)
All of these $10/month apps are suddenly a weekend project for me. I'm an engineer, but I have never written a single macOS application. I've never even read Swift code in my life, and yet, I now can get an app up and running in a couple of hours. This is crazy.
related (2), Four Games, One Afternoon / Ivan Designs (7 minute read)
Like My New Blouse? Thanks, It's AI / Wall Street Journal, archive (7 minute read)
A new Alice + Olivia spring collection features long pleated skirts and silk bomber jackets with an ethereal tarot-card print that was made with the help of Leonardo AI and Adobe Firefly. The team fed the AI images of a similar, hand-painted print the brand made in 2017 and asked for a new "goddess, French-inspired" one. Arianna Moultrie, the brand's senior print and concept designer, used Photoshop to fix small mistakes the AI made, like the blurring of a lion's face.
For These Women, Grok's Sexualized Images Are Personal / Rolling Stone (12 minute read)
The realistic nature of the images spooked her. The edits weren't obviously exaggerated or cartoonish. In our call, Mayes repeated, in shock, that the image edits closely resembled her body, down to the dip of her collarbone to the proportions of her chest and waist. "Truth to be told, on social media, I said, 'This is not me,'" she admits. "But, my mind is like, 'This is not too far from my body.'"
it's not just Grok, Dozens of nudify apps found on Google and Apple's app stores - The Tech Transparency Project identified dozens of apps that can create nonconsensual sexualized deepfakes beyond xAI's Grok. / The Verge (8 minute read)
Lights, camera, algorithm: Why Indian cinema is awash with AI / BBC Future (12 minute read)
For Anchalia, however, AI was an enabler. His film's budget was less than 15% of a traditional Bollywood production, with 95% of the 75-minute long movie generated by AI. After the trailer dropped, the computer-generated titular heroine Naisha even landed an endorsement deal with a Hyderabad-based jewellery brand.
try as it might, A.i. will never top this / YouTube
FOUNDATIONS
Skills, Tools and MCPs - What's The Difference? / Artificial Ignorance, Substack, archive (13 minute read)
Think about building a travel planning agent:
The Tools are your flight APIs, hotel APIs, and calendar integrations - the actual actions you can take. These might be simple function calls or full MCP servers that expose complex travel services.
The MCPs provide the infrastructure to access multiple travel services through a consistent interface. Instead of hardcoding each airline's API, you connect to MCP servers that handle the messy details of different booking systems.
The Skill is "Travel Planning" - the expertise about how to actually help someone plan a trip. It knows to ask about preferences first, check multiple options, consider proximity between hotel and activities, handle date conflicts, and maintain context about budget constraints. The Skill leverages Tools through the MCP infrastructure, but provides the strategic knowledge that those layers don't have.
The mental shift is from "how do I prompt this?" to "what capabilities, infrastructure, and expertise does my system architecture need?" In an ideal world, you're not crafting bespoke prompts anymore - you're creating reusable systems.
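The three layers in the travel example can be sketched in a few lines. This is only a role-play of the architecture: `ToolRegistry` stands in for the part MCP plays (one consistent interface to many services) and is not the MCP protocol, and every name, price and function here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Tool: one concrete action the system can take (stubbed, no network calls).
Tool = Callable[[dict], dict]


def search_flights(args: dict) -> dict:
    return {"flights": [{"to": args["city"], "price": 420}]}


def search_hotels(args: dict) -> dict:
    return {"hotels": [{"city": args["city"], "price": 150}]}


@dataclass
class ToolRegistry:
    """Plays the infrastructure role: callers use one interface, whatever sits behind it."""
    tools: Dict[str, Tool] = field(default_factory=dict)

    def register(self, name: str, tool: Tool) -> None:
        self.tools[name] = tool

    def call(self, name: str, args: dict) -> dict:
        return self.tools[name](args)


@dataclass
class TravelPlanningSkill:
    """Skill: the know-how about which tools to use, in what order, under which constraints."""
    registry: ToolRegistry

    def plan(self, city: str, budget: int) -> List[str]:
        flight = self.registry.call("flights", {"city": city})["flights"][0]
        hotel = self.registry.call("hotels", {"city": city})["hotels"][0]
        total = flight["price"] + hotel["price"]
        steps = [f"flight to {city}: {flight['price']}", f"hotel in {city}: {hotel['price']}"]
        # Budget awareness lives in the skill, not in the tools or the registry.
        steps.append("within budget" if total <= budget else "over budget")
        return steps


registry = ToolRegistry()
registry.register("flights", search_flights)
registry.register("hotels", search_hotels)
skill = TravelPlanningSkill(registry)
plan = skill.plan("Lisbon", budget=600)
print(plan)
```

Swapping a stub for a real booking service changes only what gets registered; the skill's planning logic stays the same.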
Clawdbot Showed Me What the Future of Personal AI Assistants Looks Like / MacStories (13 minute read)
we tried it and removed it after a couple of hours; it was scary how much it could do
it has been renamed Moltbot, after Anthropic understandably objected to the name
related (1), A backdoor was the "most downloaded" skill for viral Clawdbot - "Infinite liability surface" / The Stack (9 minute read)
related (2), Hundreds of Clawdbot instances were exposed on the internet. Here's how to not be one of them / JP Caparas, Medium, archive (6 minute read)
How to Become AI-Native: 8 Habits to Break in 2026 - Execution is cheap now. These 8 habits quietly block speed, learning, and real progress in an AI-driven world. / AI Fire (15 minute read)
FOR LEADERS
The Palantirization of everything / Andreessen Horowitz (19 minute read)
inclusion of this is not - in any way - advocating for the existence of Palantir; or Andreessen Horowitz for that matter
The real lesson from Palantir is about product architecture:
- Unified data model and permissioning layer
- Common workflow engine and UI primitives
- Configuration over code wherever possible
Inside KPMG's $450 million COVID boondoggle that's becoming a secret weapon for the AI revolution / Fortune (14 minute read)
Why We've Tried to Replace Developers Every Decade Since 1969 - The Pattern That Frustrates Everyone / Caimito (11 minute read)
The right question isn't "Will this eliminate our need for developers?" The right questions are:
1. Will this help our developers work more effectively on complex problems?
2. Will this enable us to build certain types of solutions faster?
3. Does this reduce time spent on repetitive tasks so developers can focus on unique challenges?
4. Will our team need to learn new skills to use this effectively?
These questions acknowledge that development involves irreducible complexity while remaining open to tools that provide genuine leverage.
FOR EDUCATORS
via kemal, Pedagogy of AI as Normal Technology / Chicago Center for Teaching and Learning, The University of Chicago (11 minute read)
I see AI as similar to a calculator or debuggers; they're just technology. AI is potentially quite a bit more powerful than a lot of the tools I just mentioned, but fundamentally it's not different. So, I have an explicit policy in my syllabus with an eye toward that. I don't just permit my students to use AI; I actively encourage it.
To avoid accusations of AI cheating, college students turn to AI - Students are taking new measures, such as dumbing down their work, spying on themselves and using AI "humanizer" programs, to beat accusations of cheating with artificial intelligence. / NBC News (15 minute read)
A Professor Trusted ChatGPT With 2 Years of Work - Then 1 Click Wiped It All Away - He relied on ChatGPT for course material - until disabling data sharing permanently deleted years of work. / Inc (3 minute read)
FOR TECHNOLOGISTS
Advanced Claude Code techniques / Lenny's Newsletter, Substack, archive (5 minute read)
The "append system prompt" command in Claude Code is severely underused. This powerful command lets you inject context before any user interaction begins. By combining it with file reading commands like cat, you can load entire directories of documentation and diagrams into Claude's context.
related, Claude Code Templates - Ready-to-use configurations for your Claude Code projects
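The excerpt above gathers documentation with cat; the same gathering step can be sketched in Python. This only builds the text you would inject. How it is handed to Claude Code is left out, since the exact command is not spelled out in the excerpt, and the `build_context` helper, the file names and the header format are assumptions.

```python
import tempfile
from pathlib import Path


def build_context(doc_dir: Path, pattern: str = "*.md") -> str:
    """Concatenate matching files into one string, each with a small header.
    Files are sorted so the result is stable between runs."""
    parts = []
    for path in sorted(doc_dir.glob(pattern)):
        parts.append(f"# File: {path.name}\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)


# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "api.md").write_text("API notes", encoding="utf-8")
    (root / "arch.md").write_text("Architecture notes", encoding="utf-8")
    context = build_context(root)

print(context)
```

Labeling each file keeps the model able to tell documents apart once they are merged into a single block of context.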
Beyond Generative: The Rise Of Agentic AI And User-Centric Design / Smashing Magazine (21 minute read)
Claude Code's hooks feature can run scripts when the AI stops generating content. This allows you to automatically check for TypeScript errors, linting issues, or code quality problems and feed those errors back to Claude to fix. You can even set up conditional commits when code passes all checks, eliminating manual steps in your workflow.
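A script that such a hook might run could look like the sketch below: it runs a list of check commands and collects a report line for each failure. The wiring into Claude Code's hook configuration, and how the report gets fed back to the model, is not shown and depends on the tool's own documentation. The two checks are placeholders so the example runs anywhere; a real setup would list a type checker or linter instead.

```python
import subprocess
import sys
from typing import List, Tuple


def run_checks(checks: List[Tuple[str, List[str]]]) -> List[str]:
    """Run each check command and return one report line per failing check."""
    failures = []
    for name, command in checks:
        proc = subprocess.run(command, capture_output=True, text=True)
        if proc.returncode != 0:
            output = (proc.stdout + proc.stderr).strip()
            failures.append(f"[{name}] exit {proc.returncode}: {output}")
    return failures


# Placeholder checks (hypothetical): one that passes, one that fails with a message.
checks = [
    ("passes", [sys.executable, "-c", "print('ok')"]),
    ("fails", [sys.executable, "-c", "import sys; print('type error on line 3'); sys.exit(1)"]),
]

failures = run_checks(checks)
for line in failures:
    print(line)
```

An empty `failures` list is the natural condition for the "conditional commit" idea mentioned in the excerpt.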
related, Why Designers Can No Longer Trust the Design Process / Hatch Conference, YouTube (25 minute video)
In this talk from Hatch Conference, Jenny Wen, Design Lead at Anthropic and former Director of Design at Figma, explains why that model no longer fits the reality of modern design work.
This is not a rejection of research or strategy. It is a call to stop worshipping process artifacts and start trusting designer judgment again.
I Stopped Reading Code. My Code Reviews Got Better. / Source Code, Every (14 minute read)
This is code review done the compound engineering way: Agents review in parallel, findings become decisions, and every correction teaches the system what to catch next time.
I'll show you how I set it up, how it caught a critical bug I would have missed, and how you can start - even without custom tooling.
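As a rough picture of "agents review in parallel," the sketch below runs several reviewers over the same diff at once and merges their findings. It is not the article's setup: the reviewers are plain functions standing in for agents, and their names and rules are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

# Hypothetical reviewer: takes a diff, returns a list of findings.
Reviewer = Callable[[str], List[str]]


def security_reviewer(diff: str) -> List[str]:
    return ["possible hard-coded secret"] if "password =" in diff else []


def style_reviewer(diff: str) -> List[str]:
    too_long = any(len(line) > 100 for line in diff.splitlines())
    return ["line longer than 100 characters"] if too_long else []


def review_in_parallel(diff: str, reviewers: List[Reviewer]) -> List[str]:
    """Run every reviewer concurrently and flatten their findings into one list."""
    with ThreadPoolExecutor(max_workers=len(reviewers)) as pool:
        # map() yields results in the same order as the reviewers list.
        results = pool.map(lambda reviewer: reviewer(diff), reviewers)
        return [finding for findings in results for finding in findings]


diff = 'password = "hunter2"\nprint("hello")'
findings = review_in_parallel(diff, [security_reviewer, style_reviewer])
print(findings)
```

Turning a recurring finding into a new reviewer function is one simple way to picture "every correction teaches the system what to catch next time."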
FOR FUN
The Turing Reel - Runway Research - Can you tell real video from AI? We showed 1,000 people two videos from the same frame - one real, one generated. Less than 10% could tell the difference. Try it yourself.
easier when looking at still images from each video
Isometric NYC / cannoneyed blog (21 minute read)
I find the usual conversations about AI and creativity to be pretty boring - we've been talking about cameras and sampling for years now, and I'm not particularly interested in getting mired down in the muck of the morality and economics of it all. I'm really only interested in one question: What's possible now that was impossible before?
the project; really fun if you're somewhat familiar with New York City, Isometric NYC
What I Learned Making 34 Novels with Claude Sonnet / triptych (Andrew Wooldridge), Write As (11 minute read)
Another theme was that instead of having the good guy win or the bad guy win, the AI would try to seek some third option - a compromise between the two opposing sides ideas. It was very strange to see these things happen time after time. So if you need to have more unique storylines, I suggest you give your prompts advice to avoid those names and situations.
all the books are available for free at his library
So Long Sucker - AI Deception Benchmark - Which AI Lies Best?
A game theory classic designed by John Nash that requires betrayal to win. Now a benchmark for AI deception.
AI-ADJACENT
What Are You Designed to Do? / Psychology Today (12 minute read)
4 Ways to Discover What You're Designed to Do
1. Study your defaults under pressure.
2. Look beyond your job title.
3. Experiment, don't declare.
4. Refine through service.