That AI Thing
weekend ai reads for 2025-02-07

February 07, 2025

behind the scenes: one way in which we avoid accidentally ending up on that guy’s website is a bookmarklet that converts links on a webpage from that site into “xcancel.com” links; you can install the bookmarklet at this page, if you want
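for the curious, here is a minimal sketch of the link-rewriting idea, written in Python for readability rather than the bookmarklet's JavaScript. this is not the bookmarklet's actual code: the host list and the `to_mirror` name are our own assumptions for illustration, and we assume the site in question is the one xcancel.com mirrors.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the hosts worth rewriting; the real bookmarklet may use a different list.
SOURCE_HOSTS = {"x.com", "www.x.com", "twitter.com", "www.twitter.com", "mobile.twitter.com"}
MIRROR_HOST = "xcancel.com"


def to_mirror(url: str) -> str:
    """Swap the host for MIRROR_HOST when the URL points at a source host.

    Path, query string, and fragment are kept; any other URL is returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname and parts.hostname.lower() in SOURCE_HOSTS:
        return urlunsplit((parts.scheme, MIRROR_HOST, parts.path, parts.query, parts.fragment))
    return url


if __name__ == "__main__":
    print(to_mirror("https://x.com/user/status/1?s=20"))
```

a bookmarklet would run the same kind of host swap over every link on the current page.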

📰 ABOVE THE FOLD: THIS WEEK’S MODEL NEWS

OpenAI launches ‘deep research’ tool that it says can match research analyst | OpenAI / The Guardian (5 minute read)

point, The End of Search, The Beginning of Research / One Useful Thing (Ethan Mollick), Substack archive (11 minute read)

It wove together difficult and contradictory concepts, found some novel connections I wouldn’t expect, cited only high-quality sources, and was full of accurate quotations. I cannot guarantee everything is correct (though I did not see any errors) but I would have been satisfied to see something like it from a beginning PhD student

counterpoint, Deep Research, Deep [BS], and the potential (model) collapse of science / Marcus on AI (Gary Marcus), Substack archive (7 minute read)

Unfortunately, virtually everything that Deep Research produces will pass a LGTM test (“it ‘looks good to me’”); people will assume it’s legit, even when it isn’t. Few people will fact-check its output carefully, since it will all superficially look ok.

Gemini 2.0 model updates: 2.0 Flash, Flash-Lite, Pro Experimental / Google blog (5 minute read)

Mistral Small 3 / Mistral blog (5 minute read)

Mistral Small 3 has far fewer layers than competing models, substantially reducing the time per forward pass. At over 81% accuracy on MMLU and 150 tokens/s latency, Mistral Small is currently the most efficient model of its category.

S1: The $6 R1 Competitor? / Tim Kellogg (6 minute read)

After sifting their dataset of 56K examples down to just the best 1K, they found that the core 1K is all that’s needed to achieve o1-preview performance on a 32B model. Adding data didn’t raise performance at all.

the examples are apparently very science-y but matching o1-preview with 32B is impressive

Run DeepSeek-R1 Dynamic 1.58-bit / Unsloth blog (12 minute read)

The 1.58bit quantization should fit in 160GB of VRAM for fast inference (2x H100 80GB), with it attaining around 140 tokens per second for throughput and 14 tokens/s for single user inference. You don't need VRAM (GPU) to run 1.58bit R1, just 20GB of RAM (CPU) will work however it may be slow.

you can probably run R1 on your (above average) laptop
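a rough back-of-the-envelope check on the 160GB figure in the quote above: weights-only size is parameters × bits per weight ÷ 8. the 671B parameter count is our assumption (R1's published total, not stated in the quote), and a dynamic quant mixes precisions across layers, so the real file size will differ somewhat from this flat estimate.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Weights-only size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 671e9  # assumption: R1's published total parameter count, not from the quote
VRAM_GB = 2 * 80  # two H100 80GB cards, as in the quote

size_gb = weight_footprint_gb(N_PARAMS, 1.58)
headroom_gb = VRAM_GB - size_gb  # what is left for activations and the KV cache

if __name__ == "__main__":
    print(f"~{size_gb:.0f} GB of weights, ~{headroom_gb:.0f} GB of headroom in {VRAM_GB} GB")
```

under these assumptions the weights leave some room to spare on two cards, which is consistent with the quote's claim.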

📻 QUOTES OF THE WEEK

It’s not enough for a few individuals to be ahead of the curve in adopting the new skills. Bessen explains that “what matters to a mill, an industry, and to society generally is not how long it takes to train an individual worker but what it takes to create a stable, trained workforce”. Today, every company that is going to be touched by this revolution (which is to say, every company) needs to put its shoulder to the wheel. We need an AI-literate workforce. What is programming, after all, but the way that humans get computers to do our bidding? The fact that “programming” is getting closer and closer to human language, that our machines can understand us rather than us having to speak to them in their native tongue of 0s and 1s, or some specialized programming language pidgin, should be cause for celebration.

Tim O’Reilly (source)

👥 FOR EVERYONE

Inside a network of AI-generated newsletters targeting “small town America” — Good Daily, which operates in 47 states and 355 towns and cities across the U.S., is run by one person. / Nieman Journalism Lab (16 minute read)

These automated agents “read the news” in every town where Good Daily operates, curate the most relevant stories, summarize them, edit and approve the copy, format it into a newsletter, and publish. “Local news should be local. The problem is, at this point, there are economic challenges keeping that from happening. Smaller communities rarely can support enough staff to run a traditional news organization,” said Henderson, who currently runs Good Daily from New York City. “I see technology, and LLMs specifically, as our best shot to fix this.”

“Just give me the f***ing links!”—Cursing disables Google’s AI overviews / Ars Technica (5 minute read)

For instance, when searching for “how do you turn off [adjective] Google AI results,” a variety of curse word adjectives reliably disabled the AI Overviews, while adjectives like “dumb” or “lousy” did not. Inserting curse words randomly at any point in the search query seems to have a similar effect.

How Indigenous engineers are using AI to preserve their culture / NBC News (8 minute read)

Indigenous people make up less than 0.005% of the tech workforce in the U.S., hold only 0.4% of bachelor’s degrees in computer science every year and have one board member at the top 200 tech companies. In 2022, Native-founded companies only received a mere 0.02% of total venture capital funding.

Bill Gates on AI and Innovation / Harvard Magazine (6 minute read)

For instance, Gates noted that one of the “sins of computing” is its ability to reinforce false narratives. AI, he acknowledged, could either exacerbate or mitigate this problem, depending on how it is deployed.

AI-Generated Slop Is Already In Your Public Library / 404 Media (11 minute read)

Edith Wheeler Memorial Library in Monroe, CT, told me. “If you're going to say, ‘we have 15,000 ebooks on our platform,’ and 5,000 of those are low quality, AI generated or stuff that's just put on there without any kind of like oversight or selection criteria being followed, what are you actually offering to us?”

📚 FOUNDATIONS

AI is Creating a Generation of Illiterate Programmers / Namanyay’s blog (5 minute read)

Previously, every error message used to teach me something. Now? The solution appears magically, and I learn nothing. The dopamine hit of instant answers has replaced the satisfaction of genuine understanding.

Get exactly what you want from ChatGPT: AI prompt engineer best advice — ‘Think about giving instructions to a child’ / CNBC (7 minute read)

Generating image descriptions and alt-text with AI / Dries Buytaert (18 minute read)

LLMs come in many forms, but for this project, I focused on image-to-text and multi-modal models. Both types of models can analyze images and generate text, either by describing images or answering questions about them.

…

In short, the vision encoder sees the image, while the language encoder describes it.

🚀 FOR LEADERS

AI in the workplace: A report for 2025 / McKinsey Digital (43 minute read)

C-suite leaders participating in our survey are more than twice as likely to say employee readiness is a barrier to adoption as they are to blame their own role. But as previously noted, employees indicate that they are quite ready.

GenAI and the future enterprise / Deloitte Insights (34 minute read)

When it comes to generative AI strategies, the stakes are high—and so is the uncertainty. These three-year scenarios can help organizations plan their gen AI strategies.

useful throughout

Data readiness in the age of generative AI — Six data essentials to stay ahead of the pack [PDF] / Accenture (14 minute read)

we’re suckers for anything that promotes the importance of data, if you hadn’t noticed

🎓 FOR EDUCATORS

What exactly is an AI-Empowered University System? — And why would you empower your university with OpenAI? / AI Log, Substack archive (8 minute read)

related, OpenAI and the CSU system bring AI to 500,000 students & faculty / OpenAI blog (5 minute read)
we think the pricing is public information, but we’re not sure; either way, the price per student is lower than you think

Nvidia CEO Jensen Huang says everyone should get an AI tutor / Fortune (8 minute read)

“It actually empowers me, and gives me the confidence to go tackle more and more ambitious things,” he continued. “It’s going to empower you; it’s going to make you feel confident. I feel more empowered today, more confident to learn something today.”

That Other Guy’s DOGE Is Running Highly Sensitive Government Data Through AI: Report / Gizmodo (8 minute read)

[His] team of Stasi programmers at DOGE are feeding highly sensitive data from the Department of Education through artificial intelligence tools, according to an extremely predictable, yet still shocking, report from the Washington Post.

📊 FOR TECHNOLOGISTS

o3-mini is really good at writing internal documentation / Simon Willison (3 minute read)

and we’re not
also works for adding comments to code

How I use LLMs as a staff engineer / Sean Goedecke (6 minute read)

AI From Prototype to Production — Your step-by-step guide to scaling generative AI [PDF] / Google (6 minute read)

Explore, Curate and Vector Search Any Hugging Face Dataset with Nomic Atlas / Hugging Face (6 minute read)

Nomic’s official Hugging Face connector lets you import, explore, and curate any of these datasets in Hugging Face with only a few clicks. This makes it easy for anyone to see what's in these datasets, to create embeddings from them, and to search and organize these massive and important datasets in new ways.

Atlas has a free tier

Constitutional Classifiers: Defending against universal jailbreaks / Anthropic (9 minute read)

The demo will be live from Feb 3, 2025 to Feb 10, 2025. It includes a feedback form where you can contact us to report any successful jailbreaks as well as information on our Responsible Disclosure Policy, which we ask that participants follow. We’ll announce any successes and the general results of the demo in an update to this post.

🎉 FOR FUN

cellm: Use LLMs in Excel formulas / getcellm, GitHub (11 minute read)

For example, you can write =PROMPT(A1, “Extract all person names mentioned in the text.”) in a cell’s formula and drag the cell to apply the prompt to many rows. Cellm is useful when you want to use AI for repetitive tasks that would normally require copy-pasting data in and out of a chat window many times.

Codename Goose — Your on-machine AI agent, automating engineering tasks seamlessly.

open source, free
a bit janky at times but fun to test with

The Beatles won a Grammy last night, thanks to AI / TechCrunch (3 minute read)

Pickle — Your AI body double in zoom calls

Pickle captures your best look with AI, creating a digital double that's always Zoom-ready.

Earkick — Your Free Personal AI Therapist

Measure & improve your mental health in real time with your personal AI chat bot. No sign up. Available 24/7. Daily insights just for you!

it’s no Virginia Satir, but it’s an interesting way to spend 10 minutes

BookRead — Read Books Effortlessly

Recap your previous readings effortlessly and pick up right where you left off.

🧿 AI-ADJACENT

Almost one in 10 people use the same four-digit PIN / ABC News (8 minute read)

1234
good visualizations showing how most people try to avoid additional mental load

A New VR Game Puts You in the Middle of Real English Premier League Plays — The new game from Rezzil called Premier League Player uses spatial data captured in real soccer games to recreate critical plays in virtual reality. / Wired archive (18 minute read)

Nobody Cares / Grant Slatton blog (8 minute read)

I want to live in a community where everyone cares.

⋄