weekend ai reads for 2026-01-30

📰 ABOVE THE FOLD: FORECASTING

It’s Over — For the first time, an AI has placed in the top 1% of a major forecasting tournament. Here is how it works. / Will Ward, The Oracle by Polymarket, Substack, archive (11 minute read)

Mantic doesn’t work as a single call to a language model. We make many, many different calls during the workflow. You can imagine a factory line with lots of different workers doing different jobs: breaking down the question, doing research, pursuing different lines of inquiry, then bringing it all together into a clean, well-informed prediction.

Emerging research suggests that algorithmic systems do more than match users with content; they're also shaping people's identities.

Researchers describe this phenomenon as “algorithmic persistence,” in which systems continue to serve content tied to a presumed identity long after it is no longer applicable. Klein notes that because recommender systems are optimized for engagement rather than accuracy, they have little incentive to recalibrate unless user behavior changes significantly, something many people don't know how to do, or even realize is necessary.

The quant shop — AI lab convergence / The Financial Times (19 minute read)

In both cases the core technical job is actually identical. You approximate a latent conditional distribution and act on it under constraints. And in both cases you only find out if you’re any good out-of-sample — in backtests and live PnL for quants; on held-out benchmarks and real users for LLMs.

LLM forecasts appear optimistic relative to historical and future realized returns. When prompted for 80% confidence interval predictions, LLM responses are better calibrated than survey evidence but are pessimistic about outliers, leading to skewed forecast distributions. The findings suggest LLMs manifest common behavioral biases when forecasting expected returns but are better at gauging risks than humans.
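
One way to read the calibration claim above: a well-calibrated 80% interval should contain the realized value roughly 80% of the time, and misses piling up on one side point to a skewed forecast distribution. The sketch below is only an illustration of that check. The function names and all numbers are invented here and do not come from the quoted study.

```python
from typing import Sequence, Tuple

Interval = Tuple[float, float]


def interval_coverage(intervals: Sequence[Interval], realized: Sequence[float]) -> float:
    """Fraction of realized values that fall inside their forecast interval."""
    if len(intervals) != len(realized) or not intervals:
        raise ValueError("need equally sized, non-empty inputs")
    hits = sum(1 for (lo, hi), x in zip(intervals, realized) if lo <= x <= hi)
    return hits / len(intervals)


def tail_misses(intervals: Sequence[Interval], realized: Sequence[float]) -> Tuple[int, int]:
    """Count realized values below the lower bound and above the upper bound."""
    below = sum(1 for (lo, _), x in zip(intervals, realized) if x < lo)
    above = sum(1 for (_, hi), x in zip(intervals, realized) if x > hi)
    return below, above


# Made-up 80% intervals and outcomes (annual returns in percent), for illustration only.
intervals = [(-5.0, 15.0), (0.0, 12.0), (-10.0, 20.0), (2.0, 10.0), (-3.0, 9.0)]
realized = [8.0, 14.0, -2.0, 5.0, 11.0]

coverage = interval_coverage(intervals, realized)
below, above = tail_misses(intervals, realized)
print(f"coverage={coverage:.2f} (target 0.80), misses below={below}, above={above}")
```

With these toy numbers the intervals cover 60% of outcomes and both misses land above the upper bound, which is the kind of one-sided pattern a skewed forecaster would produce.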

 

📻 QUOTES OF THE WEEK

When so-called Responsible AI or AI ethics is defined in ways that avoid confronting exploitation, war, colonial extraction, gendered and sexual violence, and other systems of oppression, then what are we even trying to do as a community?

Bhaskar Mitra (source)

 

I really am mostly programming in English now, a bit sheepishly telling the LLM what code to write... in words. It hurts the ego a bit

Andrej Karpathy (source)

 

👄 FOR EVERYONE

My Claude Code Psychosis / Jasmine Sun, Substack, archive (13 minute read)

It’s addictive to express a vision and see it instantly appear, getting into the build/test/iterate loop at an electrifying rate. There’s an apt joke that Claude Code is GPT-4o for nerds: it reflects your desires and makes them real, providing the rush of creation with minimal sweat.

All of these $10/month apps are suddenly a weekend project for me. I’m an engineer, but I have never written a single macOS application. I’ve never even read Swift code in my life, and yet, I now can get an app up and running in a couple of hours. This is crazy.

Like My New Blouse? Thanks, It’s AI / Wall Street Journal, archive (7 minute read)

A new Alice + Olivia spring collection features long pleated skirts and silk bomber jackets with an ethereal tarot-card print that was made with the help of Leonardo AI and Adobe Firefly. The team fed the AI images of a similar, hand-painted print the brand made in 2017 and asked for a new “goddess, French-inspired” one. Arianna Moultrie, the brand’s senior print and concept designer, used Photoshop to fix small mistakes the AI made, like the blurring of a lion’s face.

The realistic nature of the images spooked her. The edits weren’t obviously exaggerated or cartoonish. In our call, Mayes repeated, in shock, that the image edits closely resembled her body, down to the dip of her collarbone to the proportions of her chest and waist. “Truth to be told, on social media, I said, ‘This is not me,’” she admits. “But, my mind is like, ‘This is not too far from my body.’”

For Anchalia, however, AI was an enabler. His film’s budget was less than 15% of a traditional Bollywood production, with 95% of the 75-minute long movie generated by AI. After the trailer dropped, the computer-generated titular heroine Naisha even landed an endorsement deal with a Hyderabad-based jewellery brand.

  • try as it might, A.i. will never top this / YouTube

 

📚 FOUNDATIONS

Skills, Tools and MCPs - What’s The Difference? / Artificial Ignorance, Substack, archive (13 minute read)

Think about building a travel planning agent:

The Tools are your flight APIs, hotel APIs, and calendar integrations - the actual actions you can take. These might be simple function calls or full MCP servers that expose complex travel services.

The MCPs provide the infrastructure to access multiple travel services through a consistent interface. Instead of hardcoding each airline’s API, you connect to MCP servers that handle the messy details of different booking systems.

The Skill is “Travel Planning” - the expertise about how to actually help someone plan a trip. It knows to ask about preferences first, check multiple options, consider proximity between hotel and activities, handle date conflicts, and maintain context about budget constraints. The Skill leverages Tools through the MCP infrastructure, but provides the strategic knowledge that those layers don’t have.

The mental shift is from “how do I prompt this?” to “what capabilities, infrastructure, and expertise does my system architecture need?” In an ideal world, you’re not crafting bespoke prompts anymore - you’re creating reusable systems.
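
The three layers in the travel example can be sketched in a few lines. This is only a rough illustration: `ToolRegistry` is a plain in-process stand-in for the uniform interface an MCP layer would provide (it is not the actual MCP protocol), and the tool functions, flight and hotel data, and the cheapest-option strategy are all invented for the sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Tool: one concrete action the agent can take (hypothetical signature).
Tool = Callable[[dict], dict]


@dataclass
class ToolRegistry:
    """Stand-in for the MCP layer: one consistent way to look up and call tools."""
    tools: Dict[str, Tool]

    def call(self, name: str, args: dict) -> dict:
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](args)


def search_flights(args: dict) -> dict:
    # Fake results; a real tool would call a flight API.
    return {"options": [{"flight": "XX100", "price": 420}, {"flight": "XX200", "price": 310}]}


def search_hotels(args: dict) -> dict:
    # Fake results; a real tool would call a hotel API.
    return {"options": [{"hotel": "Central Inn", "price": 150}, {"hotel": "Harbor Stay", "price": 95}]}


@dataclass
class TravelPlanningSkill:
    """The skill holds the strategy: which tools to use, in what order, and why."""
    registry: ToolRegistry

    def plan(self, destination: str, budget: float) -> dict:
        flights = self.registry.call("search_flights", {"to": destination})["options"]
        hotels = self.registry.call("search_hotels", {"city": destination})["options"]
        # Toy strategy: take the cheapest of each, then check the budget constraint.
        flight = min(flights, key=lambda f: f["price"])
        hotel = min(hotels, key=lambda h: h["price"])
        total = flight["price"] + hotel["price"]
        return {"flight": flight["flight"], "hotel": hotel["hotel"],
                "total": total, "within_budget": total <= budget}


registry = ToolRegistry({"search_flights": search_flights, "search_hotels": search_hotels})
plan = TravelPlanningSkill(registry).plan("Lisbon", budget=500)
print(plan)
```

Swapping a booking provider only touches the registry entries, while changes to how a trip should be planned stay inside the skill.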

How to Become AI-Native: 8 Habits to Break in 2026 — Execution is cheap now. These 8 habits quietly block speed, learning, and real progress in an AI-driven world. / AI Fire (15 minute read)

 

🚀 FOR LEADERS

The Palantirization of everything / Andreessen Horowitz (19 minute read)

  • inclusion of this is not — in any way — advocating for the existence of Palantir; or Andreessen Horowitz for that matter

The real lesson from Palantir is about product architecture:

- Unified data model and permissioning layer

- Common workflow engine and UI primitives

- Configuration over code wherever possible

Why We’ve Tried to Replace Developers Every Decade Since 1969 — The Pattern That Frustrates Everyone / Caimito (11 minute read)

The right question isn’t “Will this eliminate our need for developers?” The right questions are:

1. Will this help our developers work more effectively on complex problems?
2. Will this enable us to build certain types of solutions faster?
3. Does this reduce time spent on repetitive tasks so developers can focus on unique challenges?
4. Will our team need to learn new skills to use this effectively?

These questions acknowledge that development involves irreducible complexity while remaining open to tools that provide genuine leverage.

 

🎓 FOR EDUCATORS

via kemal, Pedagogy of AI as Normal Technology / Chicago Center for Teaching and Learning, The University of Chicago (11 minute read)

I see AI as similar to a calculator or debuggers; they’re just technology. AI is potentially quite a bit more powerful than a lot of the tools I just mentioned, but fundamentally it’s not different. So, I have an explicit policy in my syllabus with an eye toward that. I don’t just permit my students to use AI; I actively encourage it.

To avoid accusations of AI cheating, college students turn to AI — Students are taking new measures, such as dumbing down their work, spying on themselves and using AI “humanizer” programs, to beat accusations of cheating with artificial intelligence. / NBC News (15 minute read)

A Professor Trusted ChatGPT With 2 Years of Work—Then 1 Click Wiped It All Away — He relied on ChatGPT for course material—until disabling data sharing permanently deleted years of work. / Inc (3 minute read)

 

📊 FOR TECHNOLOGISTS

Advanced Claude Code techniques / Lenny’s Newsletter, Substack, archive (5 minute read)

The “append system prompt” command in Claude Code is severely underused. This powerful command lets you inject context before any user interaction begins. By combining it with file reading commands like cat, you can load entire directories of documentation and diagrams into Claude’s context.
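
A rough sketch of the idea, in Python rather than shell: gather a directory of docs into one string and hand it to the CLI as extra system-prompt text. It assumes the feature described corresponds to the `claude` CLI's `--append-system-prompt` flag used with `-p` (print mode); check the CLI reference for your installed version. The helper names `build_context` and `build_command`, the `*.md` pattern, and the `# File:` separator are hypothetical choices made for this example.

```python
from pathlib import Path
from typing import List


def build_context(doc_dir: str, pattern: str = "*.md") -> str:
    """Concatenate matching files into one string, each preceded by its file name."""
    parts = []
    for path in sorted(Path(doc_dir).glob(pattern)):
        text = path.read_text(encoding="utf-8")
        parts.append(f"# File: {path.name}\n{text}")
    return "\n\n".join(parts)


def build_command(context: str, prompt: str) -> List[str]:
    """Build the argument list; assumes a `claude` binary with these flags on PATH."""
    return ["claude", "--append-system-prompt", context, "-p", prompt]
```

To use it, pass the list to `subprocess.run(build_command(build_context("docs"), "Summarize the architecture"))`. Passing arguments as a list avoids shell quoting problems, though very large doc sets can still exceed the operating system's argument length limit.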

Claude Code’s hooks feature can run scripts when the AI stops generating content. This allows you to automatically check for TypeScript errors, linting issues, or code quality problems and feed those errors back to Claude to fix. You can even set up conditional commits when code passes all checks, eliminating manual steps in your workflow.
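
The kind of script such a hook might call can be sketched as follows. It runs a list of checks, collects the output of any that fail, and returns exit code 2 with the report on stderr. Treating exit code 2 as the signal that routes stderr back to the model is an assumption to verify against the Claude Code hooks documentation, and the commands in `CHECKS` are hypothetical placeholders for a project's own tooling.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple


def run_check(cmd: Sequence[str]) -> Tuple[bool, str]:
    """Run one check command; return (passed, combined stdout and stderr)."""
    try:
        result = subprocess.run(cmd, capture_output=True, text=True)
    except FileNotFoundError:
        return False, f"command not found: {cmd[0]}"
    return result.returncode == 0, (result.stdout + result.stderr).strip()


def run_all(checks: Sequence[Sequence[str]]) -> Tuple[int, str]:
    """Return (exit_code, report): 0 with an empty report if every check passes."""
    failures: List[str] = []
    for cmd in checks:
        passed, output = run_check(cmd)
        if not passed:
            failures.append(f"$ {' '.join(cmd)}\n{output}")
    if failures:
        # Assumed convention: exit code 2 asks the agent to read stderr and fix it.
        return 2, "\n\n".join(failures)
    return 0, ""


# Hypothetical project checks; replace with your own type checker and linter.
CHECKS = [["npx", "tsc", "--noEmit"], ["npx", "eslint", "."]]


def main() -> int:
    code, report = run_all(CHECKS)
    if report:
        print(report, file=sys.stderr)
    return code
```

Saved as a standalone script, it would end with `sys.exit(main())`. A conditional commit could be added as a further step that runs only when `run_all` returns 0.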

In this talk from Hatch Conference, Jenny Wen, Design Lead at Anthropic and former Director of Design at Figma, explains why that model no longer fits the reality of modern design work.

This is not a rejection of research or strategy. It is a call to stop worshipping process artifacts and start trusting designer judgment again.

I Stopped Reading Code. My Code Reviews Got Better. / Source Code, Every (14 minute read)

This is code review done the compound engineering way: Agents review in parallel, findings become decisions, and every correction teaches the system what to catch next time.

I’ll show you how I set it up, how it caught a critical bug I would have missed, and how you can start—even without custom tooling.

 

🎉 FOR FUN

The Turing Reel - Runway Research — Can you tell real video from AI? We showed 1,000 people two videos from the same frame - one real, one generated. Less than 10% could tell the difference. Try it yourself.

  • easier when looking at still images from each video

Isometric NYC / cannoneyed blog (21 minute read)

I find the usual conversations about AI and creativity to be pretty boring - we’ve been talking about cameras and sampling for years now, and I’m not particularly interested in getting mired down in the muck of the morality and economics of it all. I’m really only interested in one question: What’s possible now that was impossible before?

  • the project; really fun if you’re somewhat familiar with New York City, Isometric NYC

What I Learned Making 34 Novels with Claude Sonnet / triptych (Andrew Wooldridge), Write As (11 minute read)

Another theme was that instead of having the good guy win or the bad guy win, the AI would try to seek some third option – a compromise between the two opposing sides ideas. It was very strange to see these things happen time after time. So if you need to have more unique storylines, I suggest you give your prompts advice to avoid those names and situations.

So Long Sucker - AI Deception Benchmark — Which AI Lies Best?

A game theory classic designed by John Nash that requires betrayal to win. Now a benchmark for AI deception.

 

🧿 AI-ADJACENT

What Are You Designed to Do? / Psychology Today (12 minute read)

4 Ways to Discover What You’re Designed to Do

1. Study your defaults under pressure.
2. Look beyond your job title.
3. Experiment, don’t declare.
4. Refine through service.

 

⋄