weekend ai reads for 2025-01-10

📰 ABOVE THE FOLD: SPORTS

  • global football, that is

Using what it claims is the largest video database of global youth football – with players logged from 28 countries – the company says it can now determine which young players most fit the description of current or recent top stars as defined by one of eight archetypes. These include the ideal “box-to-box midfielder”, “modern No 9”, “playmaking No 10” and “inverted wing-back”.

  • American football, that is

Wimbledon ditches human line judges for electronic line calling — The grass-court Grand Slam announced it will switch to electronic-line calling next year. / The Verge (6 minute read)

 

📻 QUOTES OF THE WEEK

Because one agent is just software, two agents are an undebuggable mess.

Andriy Burkov (source)

We’re not adding any more software engineers next year because we have increased the productivity this year with Agentforce and with other AI technology that we’re using for engineering teams by more than 30%, to the point where our engineering velocity is incredible. I can’t believe what we’re achieving in engineering.

Marc Benioff (source)

 

👄 FOR EVERYONE

This means that those with significant capital when labour-replacing AI started have a permanent advantage. They will wield more power than the rich of today—not necessarily over people, to the extent that liberal institutions remain strong, but at least over physical and intellectual achievements. Upstarts will not defeat them, since capital now trivially converts into superhuman labour in any field.

  • the message, as always: being poor is worse than not being poor

Once It Has Been Trained, Who Will Own My Digital Twin? / The Scholarly Kitchen (12 minute read)

  • spoiler: probably not you

  • a slightly less-than-usual curmudgeonly contextualizing of the o3 news

The AI Reporter That Took My Old Job Just Got Fired — A local newspaper in Hawaii experimented with AI-generated presenters to engage and boost its readership. After two months, the bots have been shelved. / Wired (8 minute read)

In one particularly stilted exchange about the pumpkin giveaway, Rose asked James, “And how have these free pumpkins impacted the community?” to which James responded, “The free pumpkins have brought joy to many.”

Mechanized minds: AI’s hidden impact on human thought — While we’re busy wondering whether machines will ever become conscious, we rarely stop to ask: What happens to us? / Big Think (18 minute read)

Instead of asking whether machines will ever become conscious, we might ask whether humans can become conscious enough to outgrow the “artificial intelligence” both inside them and in the machines around them.

 

📚 FOUNDATIONS

Something’s Coming — This post is meant to be an explainer for friends and readers who haven’t been paying close attention to what’s been happening in AI. / John August (12 minute read)

How to Write AI Art Prompts? (Examples + Templates) / Hypotenuse AI (16 minute read)

  • the new models, especially the Fluxes, are even more funner when you prompt them well

  • an easy prompt to copy-paste into your LLM

  • even works with Gemma and Mistral

 

🚀 FOR LEADERS

Integrating AI Agents into Companies / Austin Vernon’s Blog (7 minute read)

  • valuable throughout

1. Massively increase the use of wikis and other written content.

…

2. Move from reviews to standardized pre-approvals and surveillance.

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks / Carnegie Mellon University, Duke University, and independent researchers, arXiv (58 minute read)

We test baseline agents powered by both closed API-based and open-weights language models (LMs), and find that with the most competitive agent, 24% of the tasks can be completed autonomously. This paints a nuanced picture on task automation with LM agents -- in a setting simulating a real workplace, a good portion of simpler tasks could be solved autonomously, but more difficult long-horizon tasks are still beyond the reach of current systems.

While we are committed to making these types of investments, we recognize that we cannot do everything ourselves. We need to tap into the platforms and expertise that technology companies can provide, as demonstrated by our recently announced collaboration between Tempus and Medtronic Structural Heart.

 

🎓 FOR EDUCATORS

What counts as cheating is determined, ultimately, by institutions and examiners. Many universities are already adapting their approach to assessment, penning “AI-positive” policies.

via george, Interactionalism: Re-Designing Higher Learning for the Large Language Agent Era / Mihnea C. Moldoveanu, George Siemens, arXiv (32 minute read)

We introduce Interactionalism as a new set of guiding principles and heuristics for the design and architecture of learning now available due to Generative AI (GenAI) platforms. Specifically, we articulate interactional intelligence as a net new skill set that is increasingly important when core cognitive tasks are automatable and augmentable by GenAI functions.

Educating Our Kids on AI | Free Live Event — January 23, 2025, 3-4 p.m. ET / Section School

Featuring Garrett Smiley, CEO & Co-founder of Sora Schools, John Danner, Co-founder of Project Read AI and Spark Space, and Ted Dintersmith, Founder of WhatSchoolCouldBe.org

  • not sure if this is “educating our kids about A.i.” or “educating our kids using A.i.”; let us know what you find out

 

📊 FOR TECHNOLOGISTS

How to Build a Truly Useful AI Product — Generative AI breaks the old startup playbook / Thesis, Every (12 minute read)

How I program with LLMs / David Crawshaw (23 minute read)

Agents / Chip Huyen (41 minute read)

This section will start with an overview of agents and then continue with two aspects that determine the capabilities of an agent: tools and planning. Agents, with their new modes of operations, have new modes of failure. This section will end with a discussion on how to evaluate agents to catch these failures.

 

🎉 FOR FUN

Evaluating Large Language Models’ Capability to Launch Fully Automated Spear Phishing Campaigns: Validated on Human Subjects / Harvard Kennedy School, Avant Research Group, and independent researchers, arXiv (71 minute read)

A control group of arbitrary phishing emails, which received a click-through rate (recipient pressed a link in the email) of 12%, emails generated by human experts (54% click-through), fully AI-automated emails 54% (click-through), and AI emails utilizing a human-in-the-loop (56% click-through). Thus, the AI-automated attacks performed on par with human experts and 350% better than the control group.

  • not sure whether this is a stronger signal about AI’s strengths or the test group’s weaknesses

  • the lag between this and when spam filters catch up is going to be frustrating

  • it won’t

  • lots of interesting examples at the link

  • the last one, “A young woman doing a complex floor gymnastics routine at the Olympics, featuring running and flips.” is our new go-to GIF response for every situation

My Stupid Friend / Chrome Web Store

Reclaims the internet by replacing all instances of ā€˜ChatGPT’ with ā€˜my stupid friend.’

 

🧿 AI-ADJACENT

The 7 Coolest Mathematical Discoveries of 2024 / Scientific American (6 minute read)

They used information theory to find patterns in his music that help explain how Bach conveyed messages—including musical, mathematical and emotional information—through his works.

  • there is such a thing as model-overfitting

 

⋄