That AI Thing
Posts
weekend ai reads for 2024-12-06

weekend ai reads for 2024-12-06

December 06, 2024

📰 ABOVE THE FOLD: THEY MADE A MOVIE ABOUT WHY THIS IS A BAD IDEA

Astronauts on Long Missions Will Need Personal AI Assistants / Universe Today (7 minute read)

‎2001: A Space Odyssey (1968) written by Stanley Kubrick and Arthur C. Clarke, based on Clarke’s short story “The Sentinel”, directed by Stanley Kubrick

Meta Military Chatbot Gives “Worthless” Guidance for Airstrike Scenario — The marketing of a new military tech tool powered by Meta’s artificial intelligence is “irresponsible” and “clumsy,” experts said. / The Intercept (11 minute read)

‎Terminator 2: Judgment Day (1991) written by James Cameron and William Wisher, directed by James Cameron

AI in policing: The end of crime? — How is AI used in policing and does this result in a more just society? / Rose Tinted Tech, Substack (sorry) (16 minute read)

‎Minority Report (2002) written by Scott Frank and Jon Cohen, based on the novella by Philip K. Dick, directed by Steven Spielberg

AI is pushing the boundaries of modern manufacturing / Volvo Group (6 minute read)

In the dynamic environment of our truck production, an invention from Volvo Group researchers has been introduced setting a new standard for human and robot collaboration. The new automated transporters are a unique AI-powered solution with remarkably user-friendly design.

‎I, Robot (2004) written by Jeff Vintar and Akiva Goldsman, based on the book by Isaac Asimov, directed by Alex Proyas

The confusing reality of AI friends — Millions of people are turning to AI for companionship. They are finding the experience surprisingly meaningful, unexpectedly heartbreaking, and profoundly confusing, leaving them to wonder, ‘Is this real? And does that matter?’ / The Verge (46 minute read)

‎Her (2013) written and directed by Spike Jonze

📻 QUOTES OF THE WEEK

If tech workers are doing measurable work, they’re doing easy work. They’re doing repetitious work. This work isn’t the most valuable work, and you should minimize it when possible.

Dave Anderson (source)

People have too inflated sense of what it means to “ask an AI” about something. The AI are language models trained basically by imitation on data from human labelers. Instead of the mysticism of “asking an AI”, think of it more as “asking the average data labeler” on the internet.

Andrej Karpathy (source)

👥 FOR EVERYONE

via mark, Getting Work Done with GenAI: Just do the opposite — 10 contrarian rules that may actually work / Darren Oberst, Medium (17 minute read)

Use Small Models, Not Large Ones. The basic equation in favor of small model use is a statement in the form of 80% of the value/accuracy at 1% of the cost.

First impressions of the new Amazon Nova LLMs (via a new llm-bedrock plugin) / Simon Willison (12 minute read)

official release, Introducing Amazon Nova: Frontier intelligence and industry leading price performance / AWS News Blog (13 minute read)
technical report, The Amazon Nova family of models: Technical report and model card / Amazon Science (10 minute read)

Fugatto, World’s Most Flexible Sound Machine, Debuts — Using text and audio as inputs, a new generative AI model from NVIDIA can create any combination of music, voices and sounds. / Nvidia Blog (28 minute read)

Github page with many more examples

Marc Benioff says AI’s future is agents, not chatbots — The Salesforce CEO said technology is “hitting the upper limits of the LLMs” that power AI chatbots / Quartz (4 minute read)

related, Microsoft quietly assembles the largest AI agent ecosystem—and no one else is close / Venture Beat (9 minute read)

This is where employees create Copilot agents to share their documents or presentations with their team or other partners, so that others can interact with the content and ask questions about it.

Quantization matters / Aider blog (3 minute read)

Open source models are often available at dozens of different quantizations. Most seem to only modestly decrease code editing skill, but stronger quantizations do have a real impact.

Does Prompt Formatting Have Any Impact on LLM Performance? / Microsoft & MIT, arXiv (41 minute read)

Experiments show that GPT-3.5-turbo’s performance varies by up to 40% in a code translation task depending on the prompt template, while larger models like GPT-4 are more robust to these variations. Our analysis highlights the need to reconsider the use of fixed prompt templates, as different formats can significantly affect model performance.

Our research into Large Language Models (LLMs), GPT-based models in particular, reveals that prompt formatting preferences vary by model. As demonstrated in Figure 5, GPT-3.5-turbo prefers JSON, whereas GPT-4 favors Markdown.

📚 FOUNDATIONS

The Modern Artificial Intelligence Primer / Booz Allen Hamilton (49 minute read)

surprisingly readable primer; PDF is at link

LLMOps Database — A curated knowledge base of real-world LLMOps implementations, with detailed summaries and technical notes. / ZenML

sort by technology, industry, etc.
submit your case study

AI Agents: A New Architecture for Enterprise Automation / Menlo Ventures (12 minute read)

🚀 FOR LEADERS

AI Slowdown Is Everyone Else’s Opportunity — Businesses will benefit from some much-needed breathing space to figure out how to deliver that all-important return on investment. / Bloomberg (7 minute read)

if you feel like you’re behind, maybe this is a time to catch-up
want help? we know people that can help.

AI-Native Applications: A Framework for Evaluating the Future of Enterprise Software / Sapphire Ventures (28 minute read)

2024: The State of Generative AI in the Enterprise — The enterprise AI landscape is being rewritten in real time. As pilots give way to production, we surveyed 600 U.S. enterprise IT decision-makers to reveal the emerging winners and losers. / Menlo Ventures (17 minute read)

Data-Informed, NOT Data-Driven / Ant Murphy (14 minute read)

Riddled with confirmation bias, nothing has really changed under the surface. We’re still running on opinions, but now we have pretty numbers on a slide deck to rationalize them.

This is why we advocate for being data-informed rather than data-driven.

Use data to inform decisions. Don’t blindly take the results at face-value.

🎓 FOR EDUCATORS

Generative AI Use Cases in Education / Edtech Insiders

This page can be used to search and filter across the entire AI Tool database. Use it to find specific tools or tools that meet your areas of interest.

Universities Are Woefully Under-Resourced For AI Research. They’re Fighting To Change That. — Stanford University has 300 GPUs. Microsoft will have 1.8 million. Here’s what’s needed to make academia relevant in AI research. / Big Technology, Substack (sorry) (9 minute read)

New AI tool boosts student performance by nearly 50% / The Educator Australia (7 minute read)

Students recorded an average 47% improvement in final response quality, while 69% with low-scoring responses demonstrated deeper understanding by their final attempt.

A staggering 87% of students reengaged with the AI tool to improve low-scoring responses, with the tool’s “learning loop” encouraging students to refine their responses.

the company is Education Perfect

📊 FOR TECHNOLOGISTS

aisuite — Simple, unified interface to multiple Generative AI providers / andrewyng, GitHub (5 minute read)

It is a thin wrapper around python client libraries, and allows creators to seamlessly swap out and test responses from different LLM providers without changing their code.

technical debt is going to slowly decline as a major factor prior to major technical decisions, for reasons like AI Suite and MCP (below)

Introducing the Model Context Protocol / Anthropic (4 minute read)

Today, we're open-sourcing the Model Context Protocol (MCP), a new standard for connecting AI assistants to the systems where data lives, including content repositories, business tools, and development environments.

related (possibly) Building LLMs is probably not going be a brilliant business / Cal Paterson (11 minute read)

Data as an assembly line (w/ Cedric Chin) / The Analytics Engineering Podcast (51 minute podcast)

Cedric Chin runs Commoncog—a publication about accelerating business expertise. He joins Tristan to talk about the analytics development lifecycle, how organizations value (or misvalue) data, and why “data teams are not some IT helpdesk to be ignored.”

How to Improve Your AI Agent: A Guide for Founders / Product Hunt blog (7 minute read)

🎉 FOR FUN

GenChess / Google Labs

design a chess set with prompts

Someone Just Tricked AI Agent Into Sending Them ETH / U Today (8 minute read)

Some of the strategies used by the players involved convincing the AI that there was a serious vulnerability or gaslighting it about how transferring the funds would not break any rules.

World Labs’ AI can generate interactive 3D scenes from a single photo / Tech Crunch (6 minute read)

Fei-Fei Li’s startup, in case you’d forgotten
related, Google DeepMind unveils Genie 2, an AI that Generates Playable 3D Worlds / Maginative (5 minute read)
one of the main challenges in today’s AI-generated videos is creating realistic interactions between objects and characters; by incorporating genuine 3D models, this company’s approach could potentially lead to more convincing AI-generated videos

Invisible text that AI chatbots understand and humans can’t? Yep, it’s a thing. / Ars Technica (19 minute read)

We’ll talk about prompt injection more later in this post. For now, the point is that Rehberger’s inclusion of ASCII smuggling allowed his POCs to stow the confidential data in an invisible string appended to the URL.

Flight Deals Hunter / iMean AI

kayak (or booking), but with an ai bot

🧿 AI-ADJACENT

Pete & Bas - T-Pain / The Dor Brothers, YouTube (2 minute video)

The Dor Brothers do fun AI music videos
Pete & Bas’s original video is good too, even without Macron riding a kids’ bicycle; they probably look different to what you expect

⋄