That AI Thing
weekend ai reads for 2026-01-30

January 30, 2026

direct links are available on the web at https://thataithing.beehiiv.com/p/weekend-ai-reads-for-2026-01-30

📰 ABOVE THE FOLD: FORECASTING

It’s Over — For the first time, an AI has placed in the top 1% of a major forecasting tournament. Here is how it works. / Will Ward, The Oracle by Polymarket, Substack, archive (11 minute read)

Mantic doesn’t work as a single call to a language model. We make many, many different calls during the workflow. You can imagine a factory line with lots of different workers doing different jobs: breaking down the question, doing research, pursuing different lines of inquiry, then bringing it all together into a clean, well-informed prediction.
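
A rough sketch of that "factory line" idea, from us and not from the article: break the question down, research each piece, then combine. The `decompose` and `research` workers here are hypothetical stand-ins with canned numbers (a real system would make model calls at each step), and the plain average is our assumption, not Mantic's actual aggregation method.

```python
from statistics import mean
from typing import Callable, List


def forecast(question: str,
             decompose: Callable[[str], List[str]],
             research: Callable[[str], float]) -> float:
    """Run the pipeline: decompose -> research each part -> aggregate."""
    sub_questions = decompose(question)
    estimates = [research(sq) for sq in sub_questions]
    # Aggregation step (assumption): a simple average of the estimates,
    # clamped so the forecast is never exactly 0 or 1.
    combined = mean(estimates)
    return min(max(combined, 0.01), 0.99)


# Hypothetical stand-in workers with canned outputs, for illustration only.
def toy_decompose(question: str) -> List[str]:
    return [f"{question} | {angle}"
            for angle in ("base rate", "recent news", "expert views")]


def toy_research(sub_question: str) -> float:
    canned = {"base rate": 0.30, "recent news": 0.50, "expert views": 0.40}
    return canned[sub_question.split(" | ")[1]]


result = forecast("Will X happen by June?", toy_decompose, toy_research)
print(round(result, 2))
```

Passing the workers in as functions keeps each "job on the line" swappable, which is the point of the factory metaphor.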

Just got divorced — why am I seeing wedding content? / Mashable (12 minute read)

Emerging research suggests that algorithmic systems do more than match users with content; they're also shaping people's identities.

Researchers describe this phenomenon as “algorithmic persistence,” in which systems continue to serve content tied to a presumed identity long after it is no longer applicable. Klein notes that because recommender systems are optimized for engagement rather than accuracy, they have little incentive to recalibrate unless user behavior changes significantly, something many people don't know how to do, or even realize is necessary.

The quant shop — AI lab convergence / The Financial Times (19 minute read)

In both cases the core technical job is actually identical. You approximate a latent conditional distribution and act on it under constraints. And in both cases you only find out if you’re any good out-of-sample — in backtests and live PnL for quants; on held-out benchmarks and real users for LLMs.

What Does ChatGPT Make of Historical Stock Returns? Extrapolation and Miscalibration in LLM Stock Return Forecasts / Purdue University, Emory University, arxiv (46 minute read)

LLM forecasts appear optimistic relative to historical and future realized returns. When prompted for 80% confidence interval predictions, LLM responses are better calibrated than survey evidence but are pessimistic about outliers, leading to skewed forecast distributions. The findings suggest LLMs manifest common behavioral biases when forecasting expected returns but are better at gauging risks than humans.
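
For anyone unsure what "calibrated" means for an 80% confidence interval: if the intervals are well calibrated, roughly 80% of realized values should land inside them. A minimal sketch of that check follows. The numbers are made up by us for illustration and are not from the paper.

```python
from typing import List, Tuple


def interval_coverage(intervals: List[Tuple[float, float]],
                      realized: List[float]) -> float:
    """Fraction of realized values that fall inside their forecast interval."""
    if len(intervals) != len(realized) or not realized:
        raise ValueError("need one realized value per interval")
    hits = sum(low <= value <= high
               for (low, high), value in zip(intervals, realized))
    return hits / len(realized)


# Made-up 80% intervals for five returns, and made-up realized returns.
intervals = [(-0.05, 0.10), (0.00, 0.12), (-0.10, 0.08),
             (0.02, 0.15), (-0.03, 0.09)]
realized = [0.04, 0.15, -0.02, 0.05, 0.01]

coverage = interval_coverage(intervals, realized)
print(coverage)  # 4 of 5 land inside, so 0.8
```

Coverage well below 0.8 would mean the intervals are too narrow (overconfident); well above 0.8 would mean they are too wide.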

How Google Maps quietly allocates survival across London’s restaurants - and how I built a dashboard to see through it / Lauren’s data Substack, Substack, archive (13 minute read)

📻 QUOTES OF THE WEEK

When so-called Responsible AI or AI ethics is defined in ways that avoid confronting exploitation, war, colonial extraction, gendered and sexual violence, and other systems of oppression, then what are we even trying to do as a community?

Bhaskar Mitra (source)

I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit

Andrej Karpathy (source)

👥 FOR EVERYONE

My Claude Code Psychosis / Jasmine Sun, Substack, archive (13 minute read)

It’s addictive to express a vision and see it instantly appear, getting into the build/test/iterate loop at an electrifying rate. There’s an apt joke that Claude Code is GPT-4o for nerds: it reflects your desires and makes them real, providing the rush of creation with minimal sweat.

related (1), Your App Subscription Is Now My Weekend Project / Roberto Selbach (3 minute read)

All of these $10/month apps are suddenly a weekend project for me. I’m an engineer, but I have never written a single macOS application. I’ve never even read Swift code in my life, and yet, I now can get an app up and running in a couple of hours. This is crazy.

related (2), Four Games, One Afternoon / Ivan Designs (7 minute read)

Like My New Blouse? Thanks, It’s AI / Wall Street Journal, archive (7 minute read)

A new Alice + Olivia spring collection features long pleated skirts and silk bomber jackets with an ethereal tarot-card print that was made with the help of Leonardo AI and Adobe Firefly. The team fed the AI images of a similar, hand-painted print the brand made in 2017 and asked for a new “goddess, French-inspired” one. Arianna Moultrie, the brand’s senior print and concept designer, used Photoshop to fix small mistakes the AI made, like the blurring of a lion’s face.

For These Women, Grok’s Sexualized Images Are Personal / Rolling Stone (12 minute read)

The realistic nature of the images spooked her. The edits weren’t obviously exaggerated or cartoonish. In our call, Mayes repeated, in shock, that the image edits closely resembled her body, down to the dip of her collarbone to the proportions of her chest and waist. “Truth to be told, on social media, I said, ‘This is not me,’” she admits. “But, my mind is like, ‘This is not too far from my body.’”

it’s not just Grok, Dozens of nudify apps found on Google and Apple’s app stores — The Tech Transparency Project identified dozens of apps that can create nonconsensual sexualized deepfakes beyond xAI’s Grok. / The Verge (8 minute read)

Lights, camera, algorithm: Why Indian cinema is awash with AI / BBC Future (12 minute read)

For Anchalia, however, AI was an enabler. His film’s budget was less than 15% of a traditional Bollywood production, with 95% of the 75-minute long movie generated by AI. After the trailer dropped, the computer-generated titular heroine Naisha even landed an endorsement deal with a Hyderabad-based jewellery brand.

try as it might, A.i. will never top this / YouTube

📚 FOUNDATIONS

Skills, Tools and MCPs - What’s The Difference? / Artificial Ignorance, Substack, archive (13 minute read)

Think about building a travel planning agent:

The Tools are your flight APIs, hotel APIs, and calendar integrations - the actual actions you can take. These might be simple function calls or full MCP servers that expose complex travel services.

The MCPs provide the infrastructure to access multiple travel services through a consistent interface. Instead of hardcoding each airline’s API, you connect to MCP servers that handle the messy details of different booking systems.

The Skill is “Travel Planning” - the expertise about how to actually help someone plan a trip. It knows to ask about preferences first, check multiple options, consider proximity between hotel and activities, handle date conflicts, and maintain context about budget constraints. The Skill leverages Tools through the MCP infrastructure, but provides the strategic knowledge that those layers don’t have.

The mental shift is from “how do I prompt this?” to “what capabilities, infrastructure, and expertise does my system architecture need?” In an ideal world, you’re not crafting bespoke prompts anymore - you’re creating reusable systems.
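
The three layers in the travel example can be sketched in a few lines. This is our toy illustration, not code from the article: the tools return canned data, `ToolRegistry` is a hypothetical stand-in for the MCP layer (it is not the real MCP SDK), and `plan_trip` plays the role of the Skill.

```python
from typing import Callable, Dict, List, Optional


# Tools: the concrete actions. Canned data stands in for real travel APIs.
def search_flights(destination: str, max_price: float) -> List[dict]:
    flights = [{"id": "FL-1", "price": 320.0}, {"id": "FL-2", "price": 450.0}]
    return [f for f in flights if f["price"] <= max_price]


def search_hotels(destination: str, max_price: float) -> List[dict]:
    hotels = [{"id": "H-1", "price": 200.0}, {"id": "H-2", "price": 600.0}]
    return [h for h in hotels if h["price"] <= max_price]


class ToolRegistry:
    """One consistent way to reach every tool (toy stand-in for the MCP layer)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., List[dict]]] = {}

    def register(self, name: str, tool: Callable[..., List[dict]]) -> None:
        self._tools[name] = tool

    def call(self, name: str, **kwargs) -> List[dict]:
        return self._tools[name](**kwargs)


def plan_trip(registry: ToolRegistry, destination: str,
              budget: float) -> Optional[dict]:
    """The Skill: knows the order of steps and keeps track of the budget."""
    flights = registry.call("search_flights",
                            destination=destination, max_price=budget)
    if not flights:
        return None
    flight = min(flights, key=lambda f: f["price"])
    remaining = budget - flight["price"]
    hotels = registry.call("search_hotels",
                           destination=destination, max_price=remaining)
    if not hotels:
        return None
    hotel = min(hotels, key=lambda h: h["price"])
    return {"flight": flight["id"], "hotel": hotel["id"],
            "total": flight["price"] + hotel["price"]}


registry = ToolRegistry()
registry.register("search_flights", search_flights)
registry.register("search_hotels", search_hotels)
print(plan_trip(registry, "Lisbon", 600.0))
```

The Skill never touches an airline API directly. It only knows tool names, which is what lets the layer underneath change without rewriting the planning logic.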

Clawdbot Showed Me What the Future of Personal AI Assistants Looks Like / MacStories (13 minute read)

we tried it and removed it after a couple of hours; it was scary how much it could do
it has been renamed Moltbot, after Anthropic understandably objected to the name
related (1), A backdoor was the “most downloaded” skill for viral Clawdbot — “Infinite liability surface” / The Stack (9 minute read)
related (2), Hundreds of Clawdbot instances were exposed on the internet. Here’s how to not be one of them / JP Caparas, Medium, archive (6 minute read)

How to Become AI-Native: 8 Habits to Break in 2026 — Execution is cheap now. These 8 habits quietly block speed, learning, and real progress in an AI-driven world. / AI Fire (15 minute read)

🚀 FOR LEADERS

The Palantirization of everything / Andreessen Horowitz (19 minute read)

inclusion of this is not, in any way, advocating for the existence of Palantir, or Andreessen Horowitz for that matter

The real lesson from Palantir is about product architecture:

- Unified data model and permissioning layer

- Common workflow engine and UI primitives

- Configuration over code wherever possible

Inside KPMG's $450 million COVID boondoggle that’s becoming a secret weapon for the AI revolution / Fortune (14 minute read)

Why We’ve Tried to Replace Developers Every Decade Since 1969 — The Pattern That Frustrates Everyone / Caimito (11 minute read)

The right question isn’t “Will this eliminate our need for developers?” The right questions are:

1. Will this help our developers work more effectively on complex problems?
2. Will this enable us to build certain types of solutions faster?
3. Does this reduce time spent on repetitive tasks so developers can focus on unique challenges?
4. Will our team need to learn new skills to use this effectively?

These questions acknowledge that development involves irreducible complexity while remaining open to tools that provide genuine leverage.

🎓 FOR EDUCATORS

via kemal, Pedagogy of AI as Normal Technology / Chicago Center for Teaching and Learning, The University of Chicago (11 minute read)

I see AI as similar to a calculator or debuggers; they’re just technology. AI is potentially quite a bit more powerful than a lot of the tools I just mentioned, but fundamentally it’s not different. So, I have an explicit policy in my syllabus with an eye toward that. I don’t just permit my students to use AI; I actively encourage it.

To avoid accusations of AI cheating, college students turn to AI — Students are taking new measures, such as dumbing down their work, spying on themselves and using AI “humanizer” programs, to beat accusations of cheating with artificial intelligence. / NBC News (15 minute read)

A Professor Trusted ChatGPT With 2 Years of Work—Then 1 Click Wiped It All Away — He relied on ChatGPT for course material—until disabling data sharing permanently deleted years of work. / Inc (3 minute read)

📊 FOR TECHNOLOGISTS

Advanced Claude Code techniques / Lenny’s Newsletter, Substack, archive (5 minute read)

The “append system prompt” command in Claude Code is severely underused. This powerful command lets you inject context before any user interaction begins. By combining it with file reading commands like cat, you can load entire directories of documentation and diagrams into Claude’s context.
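
The article describes doing this with `cat`. Here is a rough Python equivalent of the "load a directory of docs into one context string" step, as a sketch from us. The function name, the file suffixes, and the `--- path ---` separators are our own choices. We believe the resulting string is what you would hand to Claude Code's `--append-system-prompt` flag, but treat the exact flag name as an assumption and check `claude --help`.

```python
import tempfile
from pathlib import Path
from typing import Tuple


def build_context(doc_dir: str,
                  suffixes: Tuple[str, ...] = (".md", ".txt")) -> str:
    """Concatenate every matching file under doc_dir, labeled by relative path."""
    root = Path(doc_dir)
    parts = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            label = path.relative_to(root).as_posix()
            parts.append(f"--- {label} ---\n{path.read_text(encoding='utf-8')}")
    return "\n\n".join(parts)


# Small demo on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "architecture.md").write_text("Services talk over gRPC.",
                                            encoding="utf-8")
    Path(tmp, "notes.txt").write_text("Deploys happen on Tuesdays.",
                                      encoding="utf-8")
    Path(tmp, "diagram.png").write_bytes(b"\x89PNG")  # skipped: not a text suffix
    demo_context = build_context(tmp)

print(demo_context)
```

Labeling each file with its path costs a few tokens but lets the model tell you which document a claim came from.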

related, Claude Code Templates — Ready-to-use configurations for your Claude Code projects

Beyond Generative: The Rise Of Agentic AI And User-Centric Design / Smashing Magazine (21 minute read)

Claude Code’s hooks feature can run scripts when the AI stops generating content. This allows you to automatically check for TypeScript errors, linting issues, or code quality problems and feed those errors back to Claude to fix. You can even set up conditional commits when code passes all checks, eliminating manual steps in your workflow.
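
A minimal sketch of the kind of script such a hook could run: execute each check, collect the failures, and report them on stderr with a non-zero exit code. The names here are ours. We are also assuming, not quoting from the article, that a hook signals "feed this back to Claude" via exit code 2 plus stderr, so confirm against the hooks documentation before relying on it. In a real project the checks would be commands like `npx tsc --noEmit`; the demo uses tiny Python commands so it runs anywhere.

```python
import subprocess
import sys
from typing import Dict, List, Tuple


def run_checks(commands: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Run each command; return (name, output) for every check that failed."""
    failures = []
    for name, argv in commands.items():
        try:
            result = subprocess.run(argv, capture_output=True, text=True)
        except FileNotFoundError:
            failures.append((name, f"command not found: {argv[0]}"))
            continue
        if result.returncode != 0:
            failures.append((name, (result.stdout + result.stderr).strip()))
    return failures


def main(commands: Dict[str, List[str]]) -> int:
    """Print failures to stderr and return the exit code the hook should use."""
    failures = run_checks(commands)
    if not failures:
        return 0
    for name, output in failures:
        print(f"[{name}] failed:\n{output}", file=sys.stderr)
    return 2  # assumed convention for "blocking, show stderr to the model"


# Demo checks: one passes, one fails with a fake type error.
demo_checks = {
    "passes": [sys.executable, "-c", "pass"],
    "fails": [sys.executable, "-c",
              "import sys; print('TS2322: fake type error'); sys.exit(1)"],
}
exit_code = main(demo_checks)
print(exit_code)  # a real hook script would end with sys.exit(exit_code)
```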

related, Why Designers Can No Longer Trust the Design Process / Hatch Conference, YouTube (25 minute video)

In this talk from Hatch Conference, Jenny Wen, Design Lead at Anthropic and former Director of Design at Figma, explains why that model no longer fits the reality of modern design work.

This is not a rejection of research or strategy. It is a call to stop worshipping process artifacts and start trusting designer judgment again.

I Stopped Reading Code. My Code Reviews Got Better. / Source Code, Every (14 minute read)

This is code review done the compound engineering way: Agents review in parallel, findings become decisions, and every correction teaches the system what to catch next time.

I’ll show you how I set it up, how it caught a critical bug I would have missed, and how you can start—even without custom tooling.

🎉 FOR FUN

The Turing Reel - Runway Research — Can you tell real video from AI? We showed 1,000 people two videos from the same frame - one real, one generated. Less than 10% could tell the difference. Try it yourself.

easier when looking at still images from each video

Isometric NYC / cannoneyed blog (21 minute read)

I find the usual conversations about AI and creativity to be pretty boring - we’ve been talking about cameras and sampling for years now, and I’m not particularly interested in getting mired down in the muck of the morality and economics of it all. I’m really only interested in one question: What’s possible now that was impossible before?

the project; really fun if you’re somewhat familiar with New York City, Isometric NYC

What I Learned Making 34 Novels with Claude Sonnet / triptych (Andrew Wooldridge), Write As (11 minute read)

Another theme was that instead of having the good guy win or the bad guy win, the AI would try to seek some third option – a compromise between the two opposing sides ideas. It was very strange to see these things happen time after time. So if you need to have more unique storylines, I suggest you give your prompts advice to avoid those names and situations.

all the books are available for free at his library

So Long Sucker - AI Deception Benchmark — Which AI Lies Best?

A game theory classic designed by John Nash that requires betrayal to win. Now a benchmark for AI deception.

🧿 AI-ADJACENT

What Are You Designed to Do? / Psychology Today (12 minute read)

4 Ways to Discover What You’re Designed to Do

1. Study your defaults under pressure.
2. Look beyond your job title.
3. Experiment, don’t declare.
4. Refine through service.

⋄