This week we share another excerpt from the 2024 Radical AI Founders Masterclass, featuring a fireside chat between Sara Hooker, Head of Cohere For AI and VP of Research at Cohere, and Molly Welch, Partner at Radical Ventures.
Sara led Cohere’s work on Aya Expanse, a family of multilingual language models supporting 23 languages. The models set new performance standards, outperforming larger models such as Meta’s Llama and Alphabet’s Gemma. This advancement in multilingual AI is crucial for democratizing access to AI technology across different languages and cultures.
In this conversation, Sara shares practical advice for researchers looking to build commercial AI products. The following excerpt has been edited for length and clarity.
Molly Welch
Are there key lessons from building Aya that you think could be applied more broadly to building AI products at scale?
Sara Hooker
Choose a problem that is not saturated, but one that if you make progress on it, has a massive impact. This was Aya for us. One of the things I am most proud of is that when we started Aya, none of the frontier models reported on what languages they covered. Now this is industry standard. Language is ultimately very human, and it’s been very special to see how people all over the world have connected with our products.
Molly
What should those with a research background consider when stepping out of the lab to build commercial AI products?
Sara
The gap between research and productization has never been smaller. If you care about working on open problems at the frontier, it has never been more aligned with having a real impact. This special moment also means we must think more carefully about how our work is used, considering the implications of overnight adoption.
This industry moves very quickly, but that actually makes things easier, especially when you are working with significant research or compute resources. However, you must plan ahead.
You need to be dynamic enough to have flexibility with your ideas, but you can’t lose your sense of what you think is important. In a startup context, you also need to have a goal, a metric, and an understanding of where you are going.
Molly
Are there practical tips for someone emerging from the lab and entering a startup context?
Sara
Rather than traditional KPIs, you want to measure progress and iteration. When you are trying to make an impact in the real world, you need something tied to the real world with quick feedback loops. You must have the sensibility to say, “I just don’t think this is what we want to aim for.” You must have a way to measure and iterate.
Watch the full conversation between Molly Welch and Sara Hooker here.
AI News This Week
- What Donald Trump’s win means for AI (Time)
The next Trump administration’s AI policy is expected to focus on preserving U.S. leadership in AI development, particularly over China. The Trump White House is expected to roll back existing regulatory frameworks, including Biden’s AI Executive Order, emphasizing deregulation to boost infrastructure, data centers, and chip production while maintaining or tightening export controls on advanced technology to China. Despite campaign rhetoric, Trump is unlikely to repeal the CHIPS Act, which incentivizes semiconductor manufacturing on U.S. soil. His policy appears to be influenced by competing voices within his coalition, with figures like Vice President-elect J.D. Vance pushing for deregulation, while others, including Elon Musk, call for AI safety measures.
- Google DeepMind’s Demis Hassabis on his Nobel Prize: ‘It feels like a watershed moment for AI’ (Financial Times)
Google DeepMind CEO Demis Hassabis, recent winner of the Nobel Prize in Chemistry for AlphaFold’s protein structure predictions, sees this as a “watershed moment for AI” in scientific discovery. In an interview, Hassabis outlined ambitious plans beyond AlphaFold, including developing a virtual cell, advancing drug discovery, and solving major mathematical conjectures. Hassabis emphasized the importance of scientific understanding in AI development, particularly as we approach artificial general intelligence (AGI), which he believes could arrive within 5 to 20 years. He advocates for increased focus on understanding AI systems’ limitations and establishing proper controls.
- A battle is raging over the definition of open-source AI (The Economist)
Open-source software has been vital to modern technology, but its application to AI faces significant challenges. The Open Source Initiative (OSI) argues that current practices from some model developers fall short of being truly open-source. For example, Meta’s Llama 3 is labeled “open-source,” but it includes usage restrictions and does not provide all the necessary code and data for building the model from scratch. Unlike traditional open-source software, which allows free access and modification of code, AI models pose unique challenges. The OSI has proposed new criteria for open-source AI, emphasizing four key freedoms: use, study, modify, and share. This debate carries regulatory implications, as governments weigh different requirements for open-source versus proprietary AI models.
- AI’s $1.3 trillion future increasingly hinges on Taiwan (Bloomberg)
Taiwan has established itself as the world’s premier manufacturing base for AI hardware, becoming a “one-stop shop” for global tech leaders seeking to build their AI infrastructure. Beyond TSMC, which manufactures most of the world’s high-performance AI chips, the island also hosts a comprehensive ecosystem of manufacturers specializing in servers, power systems, and cooling solutions — all critical components for AI development. While tensions with China remain the economy’s greatest existential risk, Taiwan is poised to benefit from an AI market projected to reach $1.3 trillion by 2032.
- Research: Collaborative search to adapt LLM experts via swarm intelligence (University of Washington/Google Cloud AI Research/Google DeepMind)
Researchers have developed Model Swarms, a novel approach for adapting large language models (LLMs) through collective behavior inspired by swarm intelligence. Unlike existing methods that rely on model merging or extensive training data, Model Swarms treats each LLM as a “particle” that can move through weight space to optimize specific objectives. The system achieved up to 20% improvement over baseline approaches across various tasks, working effectively with as few as 200 examples. The method proved successful in four key areas: single-task optimization, multi-task domain adaptation, reward model optimization, and human interest alignment. Model Swarms offers a more flexible approach to model adaptation since it does not require assumptions about specific experts or how they should be combined.
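For readers curious what “treating each LLM as a particle” looks like in practice, the core mechanic is analogous to classic particle swarm optimization: each expert’s flattened weight vector keeps some momentum while being pulled toward its own best-scoring position and the swarm’s best so far. The sketch below illustrates that dynamic in plain NumPy; the names (`expert_weights`, `utility_fn`) and the exact update rule are illustrative assumptions rather than the paper’s implementation, with the utility function standing in for a score on a small validation set (on the order of the ~200 examples mentioned above).

```python
import numpy as np

def swarm_search(expert_weights, utility_fn, steps=50,
                 inertia=0.4, c_personal=1.5, c_global=1.5, seed=0):
    """Minimal particle-swarm-style search over model weight vectors.

    expert_weights: list of 1-D numpy arrays, one flattened weight vector
                    per starting expert (the "particles").
    utility_fn:     callable mapping a weight vector to a scalar score,
                    e.g. accuracy on a small held-out set.
    """
    rng = np.random.default_rng(seed)
    positions = [w.copy() for w in expert_weights]
    velocities = [np.zeros_like(w) for w in expert_weights]

    # Track each particle's personal best and the swarm's global best.
    personal_best = [w.copy() for w in positions]
    personal_score = [utility_fn(w) for w in positions]
    g_idx = int(np.argmax(personal_score))
    global_best, global_score = personal_best[g_idx].copy(), personal_score[g_idx]

    for _ in range(steps):
        for i, w in enumerate(positions):
            r1, r2 = rng.random(), rng.random()
            # Keep some momentum, then pull toward the personal and
            # global bests (a standard PSO-style velocity update).
            velocities[i] = (inertia * velocities[i]
                             + c_personal * r1 * (personal_best[i] - w)
                             + c_global * r2 * (global_best - w))
            positions[i] = w + velocities[i]

            score = utility_fn(positions[i])
            if score > personal_score[i]:
                personal_best[i], personal_score[i] = positions[i].copy(), score
                if score > global_score:
                    global_best, global_score = positions[i].copy(), score

    return global_best, global_score
```

Read this way, “not requiring assumptions about specific experts or how they should be combined” corresponds to the fact that the search only ever queries the utility function: it never needs to know which expert a particle started from or to prescribe a merging recipe.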
Radical Reads is edited by Leah Morris (Senior Director, Velocity Program, Radical Ventures).