weekend ai reads for 2025-05-09

📰 ABOVE THE FOLD: RUN AMOK

Having read his chat logs, she only found that the AI was “talking to him as if he is the next messiah.” The replies to her story were full of similar anecdotes about loved ones suddenly falling down rabbit holes of spiritual mania, supernatural delusion, and arcane prophecy — all of it fueled by AI. Some came to believe they had been chosen for a sacred mission of revelation, others that they had conjured true sentience from the software.

The Meta AI site’s feed proves no one knows what to do with AI yet — So far, browsing the feed feels like a confirmation of AI skeptics’ biggest criticisms. / The Verge (7 minute read)

The best-performing model was Anthropic’s Claude 3.5 Sonnet, which struggled to finish just 24 percent of the jobs assigned to it. The study’s authors note that even this meager performance is prohibitively expensive, averaging nearly 30 steps and a cost of over $6 per task.

 

📻 QUOTES OF THE WEEK

All the time we’re agonizing about our choices, we’re avoiding the hard work of really considering how we sorted things in the first place.

Seth Godin (source)

 

If there is a difference, it lies between the text that seeks to produce a new reader and the text that tries to fulfill the wishes of the readers already to be found in the street. In the latter case we have the book written, constructed, according to an effective, mass-production formula; the author carries out a kind of market analysis and adapts his work to its results. Even from a distance, it is clear that he is working by a formula; you have only to analyze the various novels he has written and you note that in all of them, after changing names, places, distinguishing features, he has told the same story —the one that the public was already asking of him.

Umberto Eco (source, via jim)

 

👥 FOR EVERYONE

Nonetheless, the answer to the question “What customer needs requires an AI solution?” still isn’t always “yes.” Large language models (LLMs) can still be prohibitively expensive for some, and as with all ML models, LLMs are not always accurate. There will always be use cases where leveraging an ML implementation is not the right path forward.

  • genuinely useful decision tree

John Deere transforms agriculture with AI / OpenAI blog (9 minute read)

EY AI Sentiment Index 2025 (German) [PDF] / E&Y (6 minute read)

  • 24 percent of U.S. respondents say they review what AI creates for them; only Sweden and France are lower; the global average is 31 percent

  • 14 percent of U.S. respondents edit what AI generates for them; the global average is 19 percent

  • possibly related, AI use damages professional reputation, study suggests / Ars Technica (6 minute read)

The state asked for a 9.5-year sentence, and the judge ended up giving Horcasitas 10.5 years for manslaughter, after being so moved by the powerful video, family says. The judge even referred to the video in his closing sentencing statements.

 

📚 FOUNDATIONS

Vibe coding MenuGen / Andrej Karpathy blog (13 minute read)

But in addition to the utility of the app, MenuGen was interesting to me as an exploration of vibe coding apps and how feasible it is today. As such, I did not write any code directly; 100% of the code was written by Cursor+Claude and I basically don't really know how MenuGen works in the conventional sense that I am used to.

What Is Hugging Face and How To Use It / Angel Poon, YouTube (8 minute video)

Google/Gemini

Fastest top-tier models. Huge context window. Good prices.

Issue 1: Setup

The setup is SO bad. The main API doesn't follow the spec. They have an openai-compat mode but it sucks.

Vertex is SO bad. It's the only major AI platform that doesn't let you use traditional API keys. It's built to be plugged into the rest of the GCP ecosystem. Sadly, that ecosystem is really rough and most AI apps are not part of it.

As such, we use AI Studio. Much better! But also much more limited.

  • interesting and accurate throughout

 

🚀 FOR LEADERS

Bullish: More Vertical & Multi-Product / OnlyCFO’s Newsletter, Substack archive (8 minute read)

Having the best individual features is not what makes compound software special. Rather the additional product-market fit that it unlocks by having multiple products/features across the company. This is where the magic is created.

New Legal Battleground: AI & IP — EmTech AI 2025 / MIT Technology Review (35 minute video)

  • skip ahead to 2:15:00 for this part of the talk

  • Amir Ghavi, Partner at Paul Hastings, speaks more cogently and practically about AI & IP than anyone we’ve heard

Despite finding widespread and often employer-encouraged adoption of these tools, the study concluded that “AI chatbots have had no significant impact on earnings or recorded hours in any occupation” during the period studied. The confidence intervals in their statistical analysis ruled out average effects larger than 1 percent.

 

🎓 FOR EDUCATORS

Everyone Is Cheating Their Way Through College / New York Magazine (34 minute read)

That future may arrive sooner than expected when you consider what a short window college really is. Already, roughly half of all undergrads have never experienced college without easy access to generative AI.

Much like pencil and calculators, Ingram acknowledged AI is “here to stay,” but said it can only reach its full potential under the guidance of trained educators who know how best to integrate the technology into their classrooms.

Uncovering the Hidden Curriculum in Generative AI: A Reflective Technology Audit for Teacher Educators [PDF] / Melissa Warr & Marie K. Heath, Google Drive (16 minute read)

We highlight how AI, trained on biased human data, can perpetuate societal inequities and discriminatory practices despite appearing objective. We present a technology audit that examines how LLMs score and provide feedback on student writing samples paired with student descriptions. Findings reveal that LLMs exhibit implicit biases, such as assigning lower scores when students are said to attend an “inner-city school” or prefer rap music.

 

📊 FOR TECHNOLOGISTS

NeMo helps enterprise AI developers easily curate data at scale, customize large language models (LLMs) with popular fine-tuning techniques, consistently evaluate models on industry and custom benchmarks, and guardrail them for appropriate and grounded outputs.

  • useful mental model for managing enterprise data

Why Ford decided to merge its next-gen architecture with its current platform — The automaker’s software chief Doug Field explains why the company cancelled its ‘FNV4’ project, and why a domain-style system may work better for Ford’s gas and hybrid vehicles. / The Verge (11 minute read)

Directories of MCP Servers

Dummy’s Guide to Modern Samplers / Rentry blog (51 minute read)

During inference, the user will provide the LLM with a text, and the LLM, based on the probabilities it’s learned through training, will decide what token comes next. However, it will not decide just one token: it will take into consideration every possible token that exists in its vocabulary, assigns a probability score to each, and (depending on your sampler) will only output the most probable token, i.e. the one with the highest score. This would make for a rather boring output (unless you need determinism), so this is where Sampling comes in.
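
For readers who want to see the difference the excerpt describes, here is a minimal sketch, not taken from the guide itself. It contrasts always picking the top-scoring token (greedy) with drawing a token in proportion to its probability (temperature sampling). The four-word vocabulary, the scores, and the function names are all made up for illustration; a real model scores tens of thousands of tokens.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(vocab, logits):
    """Deterministic: always return the highest-scoring token."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return vocab[best]

def sample_pick(vocab, logits, temperature=1.0, rng=random):
    """Stochastic: draw one token in proportion to its probability."""
    probs = softmax(logits, temperature)
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical vocabulary and scores, for illustration only.
VOCAB = ["cat", "dog", "fish", "bird"]
LOGITS = [2.0, 1.5, 0.3, -1.0]

if __name__ == "__main__":
    print("greedy: ", greedy_pick(VOCAB, LOGITS))
    print("sampled:", [sample_pick(VOCAB, LOGITS, temperature=1.0) for _ in range(5)])
```

Greedy picking returns the same token every time for the same scores, which is the "boring but deterministic" case in the excerpt. Raising the temperature flattens the probabilities so that lower-ranked tokens get drawn more often.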

 

🎉 FOR FUN

Smart PDFs — Summarize PDFs in seconds

Butterflies / Matter and Space, Vimeo (6 minute video)

Imagine a future where everyone is a lifelong learner, supported in their wellbeing, and empowered to not just advance in their careers but to flourish as human beings.

In the age of slop, craft is rebellion / Working Theorys, Substack archive (19 minute read)

I think what’s happened with AI is, instead of the gains being that I’m faster, it’s that the scope of my projects has gotten bigger.

  • with Neal Agarwal, of neal.fun fame (also a great click)

 

🧿 AI-ADJACENT

The evolution of these metrics is important to understanding the all-consuming nature of surveillance capitalism — even a scathing rant about surveillance capitalism becomes fodder for the machine, as you can clearly see with the ads on this page. When deep-pocketed advertisers are involved, positive and negative traits alike become dollar signs; search terms to be probed, analyzed, and used for profit.

  • avaricious snakes … allegedly

 

⋄