Radical Blog

The AI revolution is coming to robots

Humanoid robots developed by the US company Figure use OpenAI programming for language and vision. Credit: AP Photo/Jae C. Hong/Alamy

This week we share excerpts from a Nature feature on the potential of AI to transform robotics. The article explores how breakthroughs in AI will spur advances in the physical domain, and it includes Covariant, a Radical Ventures portfolio company.

For a generation of scientists raised watching Star Wars, there’s a disappointing lack of C-3PO-like droids wandering around our cities and homes. Where are the humanoid robots fuelled with common sense that can help around the house and workplace?

Rapid advances in artificial intelligence (AI) might be set to fill that hole. “I wouldn’t be surprised if we are the last generation for which those sci-fi scenes are not a reality,” says Alexander Khazatsky, a machine-learning and robotics researcher at Stanford University in California.

For most AI researchers branching into robotics, the goal is to create something much more autonomous and adaptable across a wider range of circumstances. This might start with robot arms that can ‘pick and place’ any factory product, but evolve into humanoid robots that provide company and support for older people, for example. “There are so many applications,” says Sidopoulos. 

The collaborators’ theory is that learning about the physical world in one robot body should help an AI to operate another — in the same way that learning in English can help a language model to generate Chinese, because the underlying concepts about the world that the words describe are the same. This seems to work. The collaboration’s resulting foundation model, called RT-X, which was released in October 2023, performed better on real-world tasks than did models the researchers trained on one robot architecture.

Many researchers say that having this kind of diversity is essential. “We believe that a true robotics foundation model should not be tied to only one embodiment,” says Peter Chen, an AI researcher and co-founder of Covariant, an AI firm in Emeryville, California.

Covariant is also working hard on scaling up robot data. The company, which was set up in part by former OpenAI researchers, began collecting data in 2018 from 30 variations of robot arms in warehouses across the world, which all run using Covariant software. Covariant’s Robotics Foundation Model 1 (RFM-1) goes beyond collecting video data to encompass sensor readings, such as how much weight was lifted or force applied. This kind of data should help a robot to perform tasks such as manipulating a squishy object, says Gopalakrishnan — in theory, helping a robot to know, for example, how not to bruise a banana.

Covariant has built up a proprietary database that includes hundreds of billions of ‘tokens’ — units of real-world robotic information — which Chen says is roughly on a par with the scale of data that trained GPT-3. “We have way more real-world data than other people, because that’s what we have been focused on,” Chen says. RFM-1 is poised to roll out soon, says Chen, and should allow operators of robots running Covariant’s software to type or speak general instructions, such as “pick up apples from the bin”.

What the future holds depends on who you ask. Brooks says that robots will continue to improve and find new applications, but their eventual use “is nowhere near as sexy” as humanoids replacing human labour. Others think that developing a functional and safe humanoid robot capable of cooking dinner, running errands and folding the laundry is possible, although it could cost hundreds of millions of dollars. “I’m sure someone will do it,” says Khazatsky. “It’ll just be a lot of money, and time.”

Read the full article here.

AI News This Week

  • Google’s AI search leaves publishers scrambling  (New York Times)

    Google’s introduction of AI Overviews, which provide content summaries directly in search results, has sparked controversy and concern among publishers. These summaries may decrease site visits, threatening traditional news revenue models. Given Google’s critical role in driving web traffic, completely withdrawing from the platform could be counterproductive. In response, publishers are adopting direct engagement strategies, such as text messaging and exclusive content, to offset potential traffic losses.

  • The way whales communicate is closer to human language than we realized  (MIT Technology Review)

    Research led by MIT’s CSAIL and published in Nature Communications reveals that sperm whales’ communication is more complex than previously known. Analyzing over 8,700 codas from about 60 whales using AI, the team discovered patterns resembling elements of human language. This study highlights the nuanced and structured nature of whale codas, suggesting deeper insights into their social interactions and paving the way for further decoding of animal communications. 

  • Tech giants form industry group for AI chip development  (TechCrunch)

    In response to the dominance of Nvidia’s NVLink, tech giants including Intel, Google, Microsoft, Meta, AMD, Hewlett Packard Enterprise, Cisco, and Broadcom have formed the Ultra Accelerator Link (UALink) Promoter Group. NVLink feeds large datasets into models faster and rapidly exchanges data between GPUs. Until this week’s announcement, NVLink was the industry standard for high-speed, multi-GPU communication. The companies behind UALink aim to connect AI accelerators, the specialized chips used in data centers, through an open protocol that is not exclusive to Nvidia. UALink 1.0 will allow up to 1,024 of these accelerators to work together more efficiently, which speeds up the flow of data and reduces delays.

  • China’s $47B semiconductor fund puts chip sovereignty front and center  (TechCrunch)

    China has launched a $47.5 billion semiconductor fund, its largest yet, to achieve chip self-sufficiency and reduce reliance on foreign technology. Known as Big Fund III, it supports large-scale wafer manufacturing and High Bandwidth Memory (HBM) chips for AI, 5G, and IoT. The fund surpasses previous investments and reflects China’s prioritization of chip sovereignty amid global semiconductor tensions, particularly those involving Taiwan and its dominant chipmaker, TSMC.

  • Research: The Platonic Representation Hypothesis  (MIT)

    Recent research from MIT supports the idea that as AI systems grow in size, they develop increasingly similar ways of understanding the world, a concept known as the ‘Platonic Representation Hypothesis.’ This hypothesis suggests that larger AI systems converge in their representations of reality, leading to a more unified understanding. The study compared 78 different vision models and found that more capable models align more closely, particularly across different tasks and modalities. The findings indicate that as AI models become more generalized and capable, they perform better and produce more accurate and consistent representations of the world.

Radical Reads is edited by Ebin Tomy (Analyst, Radical Ventures)