Radical Reads

Cracking the AI Chip Shortage – Strategies and Opportunities

By David Katz, Partner

Radical Ventures Partner David Katz was featured in a Financial Times article on the state of the AI chip market and the dynamics facing challengers. This week, David shares more on how the current AI chip market is impacting startups.

Underpinning recent advances in AI are high-performance processor chips. The global supply of these chips is heavily constrained, driving up costs and favouring global hyperscalers with their own dedicated supplies, well-funded startups with strategic value to the cloud providers, and large enterprise incumbents with the purchasing power to secure access to large amounts of compute power.

Access to tens of thousands of advanced graphics chips is crucial for companies training large AI models that can generate original text and analysis.

Without them, work on generative AI models such as large language models (LLMs) runs much more slowly, and companies lose their competitive edge in this fast-moving space. Nvidia’s advanced graphics chips excel at performing many computations simultaneously, which is crucial for AI work. UBS analysts estimate that an earlier version of ChatGPT required about 10,000 graphics chips. Chip costs vary, but some retailers sell Nvidia’s advanced AI chips for around US$33,000, and they can command higher prices in secondary markets amid the high demand.

While Nvidia dominates this market, startups are seizing the opportunity to innovate with solutions that tackle the shortage. Radical Ventures portfolio company Untether AI has developed more power-efficient AI inference chips, while CentML is creating “virtual chips” by optimizing the existing scarce inventory of high-performance and legacy chips for AI training and inference workloads. As AI continues to shape the future of software, all companies must carefully consider their compute strategy, factoring in price, performance, chip availability, infrastructure expertise, strategic cloud relationships, and commercial approach.

AI News This Week

  • Transformers: the Google scientists who pioneered an AI revolution  (Financial Times)

    The transformer, introduced in the groundbreaking 2017 paper “Attention Is All You Need,” is the pivotal technology behind today’s generative AI companies and the development of large language models. It builds on decades of prior research and was developed collaboratively by researchers across disparate groups at Google. Despite its success, the departure of these authors from Google highlights the challenges large organizations face when it comes to innovation. Aidan Gomez, a co-author of the transformer paper and a co-founder of Radical portfolio company Cohere, recognized that the potential of large language models was not being fully realized and set out to create Cohere with the vision of broadening access to this technology for all businesses.

  • Cohere is teaming up with McKinsey to bring AI to enterprise clients  (VentureBeat)

    Radical Ventures portfolio company Cohere and McKinsey have teamed up to help organizations integrate generative AI into their operations, redefine business processes, train and upskill their workforces, and use this emerging technology to tackle some of the toughest challenges enterprises currently face. The collaboration will be spearheaded on the McKinsey side by QuantumBlack, McKinsey’s AI division. Cohere and McKinsey intend to offer secure, enterprise-grade generative AI solutions tailored to McKinsey clients’ needs — including cloud and on-premises AI software that will safeguard a client’s data.

  • Policy: Voluntary commitments from tech giants to manage the risks posed by AI  (The White House)

    Amazon, Google, Meta, and Microsoft are among the companies that voluntarily committed to White House-prompted safeguards for their AI technology. These commitments promote transparent, safe, and secure AI development through eight measures such as letting independent experts test models, investing in cybersecurity, flagging societal risks, and watermarking AI-generated audio and visual content. This pledge will be in effect until Congress enacts official AI regulations.

  • Is GPT-4 getting worse over time?  (Arvind Narayanan and Sayash Kapoor)

    A new paper is being widely interpreted as showing that GPT-4 has gotten worse since its release. In reality, that reading is a vast oversimplification of the study’s findings. Much of the performance degradation reported in the paper may come down to the choice of evaluation data (prime numbers). Still, behaviour drift should be on LLM users’ radar: code deployed on top of a model can break if the model underneath changes its behaviour, which is why it is worth re-running a fixed evaluation set whenever the model is updated (a minimal sketch follows). The capabilities still exist, but may require new prompting strategies to elicit the same results.
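
    A minimal sketch of such a drift check, assuming a hypothetical call_model(prompt) wrapper around whichever LLM API is in use and an illustrative, made-up evaluation set; the idea is simply to re-run a small, fixed set of prompts and compare pass rates across model updates.

      # Hypothetical drift regression check; call_model is a placeholder to be
      # replaced with a real API call pinned to an explicit model version.
      def call_model(prompt: str) -> str:
          return "yes"  # stand-in response so the sketch runs end to end

      EVAL_SET = [
          {"prompt": "Is 17077 prime? Answer yes or no.", "expected": "yes"},
          {"prompt": "Is 20000 prime? Answer yes or no.", "expected": "no"},
      ]

      def pass_rate(eval_set) -> float:
          """Fraction of fixed eval prompts whose answers still match."""
          passed = sum(
              call_model(case["prompt"]).strip().lower() == case["expected"]
              for case in eval_set
          )
          return passed / len(eval_set)

      print(f"Regression pass rate: {pass_rate(EVAL_SET):.0%}")
      # Track this number over time; a sudden drop signals behaviour drift.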

  • Research Spotlight – Lost in the middle: how language models use long contexts  (Stanford University/University of California, Berkeley)

    A large part of building AI systems is debugging, and understanding how language models use their input data helps. Modern language models can process very long prompts, but little is known about how well they actually use the information in those longer contexts. A recent study finds that models frequently struggle when key information sits in the middle of a long input context, and that performance declines further as the context grows. The researchers propose new evaluation protocols for future long-context models. The takeaway is that performance can often be improved through more deliberate placement and prompting, even when a model is nominally designed to handle a given prompt style or length (a brief probing sketch follows).
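
    A minimal sketch of how one might probe this effect on a task of one’s own, assuming a hypothetical ask_model(prompt) wrapper; the same key passage is inserted at the start, middle, or end of a stack of made-up filler passages, and answers are compared across positions.

      # Hypothetical long-context probe: place the key passage at different
      # positions among distractor passages and see whether the answer survives.
      def build_prompt(key_passage: str, distractors: list[str], position: str) -> str:
          docs = list(distractors)
          index = {"start": 0, "middle": len(docs) // 2, "end": len(docs)}[position]
          docs.insert(index, key_passage)
          context = "\n\n".join(f"Document {i + 1}: {d}" for i, d in enumerate(docs))
          return context + "\n\nQuestion: What is the access code? Answer with the code only."

      distractors = [f"Filler passage {i} about an unrelated topic." for i in range(20)]
      key_passage = "The access code is 7341."

      for position in ("start", "middle", "end"):
          prompt = build_prompt(key_passage, distractors, position)
          # answer = ask_model(prompt)  # placeholder; compare answers by position
          print(position, "prompt length:", len(prompt), "characters")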

Radical Reads is edited by Leah Morris (Senior Director, Velocity Program, Radical Ventures).