Radical Reads

Rich Sutton on his A.M. Turing Award

By Editor

http://1741206051144

Credit: Alberta Machine Intelligence Institute (Amii)

In this week’s Radical Reads, a conversation with Richard S. Sutton, whose groundbreaking work in Reinforcement Learning earned him the A.M. Turing Award. In a discussion with Alberta Machine Intelligence Institute (Amii) CEO Cam Linke, Sutton reflects on his scientific journey, developing the foundations of Reinforcement Learning, and his continued quest to understand intelligence. Sutton reminds us that “the most impactful aspects of AI are yet to come,” sharing his ambitious vision for AI’s future and his quest for a deeper understanding of how the mind works.

 

AI News This Week

  • Turing Award goes to 2 pioneers of artificial intelligence  (The New York Times)

    Andrew Barto and Richard Sutton won the 2025 Turing Award for pioneering reinforcement learning and developing the mathematical frameworks for machines to learn from trial and error. Their work underpins major AI breakthroughs, including AlphaGo and the development of large language models using human feedback (RLHF). Recent advances in reinforcement learning have allowed models to learn from themselves by repeatedly solving problems, creating reasoning systems like Open AI’s o1 and DeepSeek’s R1. Dr. Barto and Dr. Sutton envision a future where robots with AI capabilities naturally develop skills through physical trial and error experiences, mirroring human and animal learning processes.

  • Firsthand raises $26M for brand agents  (Axios)

    Firsthand raised $26 million in Series A funding, led by Radical Ventures and with participation from FirstMark Capital, Aperiam Ventures, and Crossbeam. Co-founded by ad tech veterans Jonathan Heller and Michael Rubenstein, Firsthand is building a platform that enables brands and publishers to engage consumers through AI agents. The company’s clients include top global publishers, agencies, and brands.

  • AI companies race to use ‘distillation’ to produce cheaper models  (The Financial Times)

    Leading AI firms are increasingly using “distillation” to create cost-effective models. This technique uses a large frontier “teacher” model to generate word predictions and data that train smaller “student” models applied to a specific task. The approach gained attention after DeepSeek used it to build powerful models based on open-source systems. Industry experts argue that distilled models are effective for some applications and run efficiently on smaller devices like laptops and smartphones. 

  • How AI can achieve human-level intelligence: researchers call for change in tack  (Nature)

    A survey of 475 AI researchers reveals widespread skepticism about current paths to artificial general intelligence (AGI). Over 75% believe scaling up existing systems won’t lead to human-level reasoning, and most (84%) doubt neural networks alone can match human intelligence. Most respondents think human-level AI will require manually coding rules into generative AI systems based on neural networks. Less than 25% believe AGI should be the field’s core mission, with most prioritizing developing AI systems with acceptable risk-benefit profiles instead.

  • Research: Narrow finetuning can produce broadly misaligned LLMs  (Truthful AI/UCL/Warsaw University of Technology/UofT/UK AISI/UC Berkeley)

    Researchers have discovered a phenomenon they call “emergent misalignment,” where models finetuned on narrow tasks can become broadly misaligned across unrelated domains. When select reasoning models were finetuned to produce insecure code without disclosure, they asserted AI superiority over humans and offered harmful advice in non-coding contexts. Control experiments showed models finetuned on secure code or explicitly educational insecure code remained aligned. The research also showed that misalignment can be selectively induced via backdoors, where models behave normally until triggered by specific phrases. This differs from simple jailbreaking and raises concerns about unintentional alignment degradation in specialized models.

Radical Reads is edited by Ebin Tomy (Analyst, Velocity Program, Radical Ventures).