This week, we share one of our favourite AI features from the past year: a piece by our friend Stephen Marche in The New Yorker that delves into the surprising origins and impact of the transformer architecture. Aidan Gomez, now the co-founder and CEO of Cohere, a Radical Ventures portfolio company, was a college intern at Google when he co-authored the groundbreaking paper “Attention Is All You Need,” which introduced the transformer. The architecture has since become the backbone of modern linguistic AI (and of generative AI more generally), revolutionizing how machines process and generate language.
In the spring of 2017, in a room on the second floor of Google’s Building 1965, a college intern named Aidan Gomez stretched out, exhausted. It was three in the morning, and Gomez and Ashish Vaswani, a scientist focused on natural language processing, were working on their team’s contribution to the Neural Information Processing Systems conference, the biggest annual meeting in the field of artificial intelligence. Along with the rest of their eight-person group at Google, they had been pushing flat out for twelve weeks, sometimes sleeping in the office, on couches by a curtain that had a neuron-like pattern. They were nearing the finish line, but Gomez didn’t have the energy to go out to a bar and celebrate. He couldn’t have even if he’d wanted to: he was only twenty, too young to drink in the United States.
“This is going to be a huge deal,” Vaswani said.
“It’s just machine translation,” Gomez said, referring to the subfield of A.I.-driven translation software, at which their paper was aimed. “Isn’t this just what research is?”
“No, this is bigger,” Vaswani replied.
Today, Gomez, now in his late twenties, is the C.E.O. of Cohere, an artificial intelligence company valued at five and a half billion dollars. The transformer—the “T” in ChatGPT—sits at the core of what may be the most revolutionary technology of the twenty-first century. PricewaterhouseCoopers has estimated that A.I. could add $15.7 trillion to global G.D.P. by 2030—a substantial share of it from transformer-based applications. That figure only gestures toward some huge but unknown impact. Other consequences seem even more murkily vast: some tech prophets propose apocalyptic scenarios that could have been taken straight from the movies. What is certain, right now, is that linguistic A.I. is changing the relationship between human beings and language. In an age of machine-generated text, terms like “writing,” “understanding,” “meaning,” and “thinking” need to be reconsidered.
If transformer-based A.I. were more familiar and complicated—if, say, it involved many components analogous to the systems and subsystems in our own brains—then the richness of its behavior might be less surprising. As it is, however, it generates nonhuman language in a way that challenges our intuitions and vocabularies. If you ask a large language model to write a sentence “silkily and smoothly,” it will produce a silky and smooth piece of writing; it registers what “silkily” and “smoothly” mean, and can both define and perform them. A neural network that can write about Japanese punk bands must on some level “understand” that a band can break up and re-form under a different name; similarly, it must grasp the nuances of an Australian sitcom in order to make one up. But this is a different kind of “understanding” from the kind we know.
The researchers behind the transformer have different ways of reckoning with its capabilities. “I think that even talking about ‘understanding’ is something we are not prepared to do,” Vaswani told me. “We have only started to define what it means to understand these models.”
Read the full article here.