The researchers taught the robot, called Mobile ALOHA (an acronym for “a low-cost open-source hardware teleoperation system for bimanual operation”), seven different tasks requiring a variety of mobility and dexterity skills, such as rinsing a pan or giving someone a high five.
To teach the robot how to cook shrimp, for example, the researchers remotely operated it 20 times to get the shrimp into the pan, flip it, and then serve it. They did it slightly differently each time so the robot learned different ways to do the same task, says Zipeng Fu, a PhD student at Stanford, who co-led the project.
The robot was then trained on these demonstrations, as well as on other human-operated demonstrations for tasks that have nothing to do with cooking shrimp, such as tearing off a paper towel or a piece of tape. Those older demonstrations were collected by an earlier ALOHA robot without wheels, says Chelsea Finn, an assistant professor at Stanford University, who was an advisor for the project. This “co-training” approach, in which new and old data are combined, helped Mobile ALOHA learn new jobs relatively quickly, compared with the usual approach of training AI systems on thousands if not millions of examples. From this old data, the robot was able to learn new skills that had nothing to do with the task at hand, says Finn.
While these sorts of household tasks are easy for humans (at least when we’re in the mood for them), they are still very hard for robots. Robots struggle to grip, grab, and manipulate objects, because they lack the precision, coordination, and understanding of the surrounding environment that humans naturally have. However, recent efforts to apply AI techniques to robotics have shown a lot of promise in unlocking new capabilities. For example, Google’s RT-2 system combines a language-vision model with a robot, which allows humans to give it verbal commands.
“One of the things that’s really exciting is that this recipe of imitation learning is very generic. It’s very simple. It’s very scalable,” says Finn. Collecting more data for robots to try to imitate could allow them to handle even more kitchen-based tasks, she adds.
“Mobile ALOHA has demonstrated something unique: relatively cheap robot hardware can solve really complex problems,” says Lerrel Pinto, an associate professor of computer science at NYU, who was not involved in the research.
Mobile ALOHA shows that robot hardware is already very capable, and underscores that AI is the missing piece in making robots that are more useful, adds Deepak Pathak, an assistant professor at Carnegie Mellon University, who was also not part of the research team.
Pinto says the model also shows that robotics training data can be transferable: training a robot on one task can improve its performance on others. “This is a strongly desirable property, as when data increases, even if it is not necessarily for a task you care about, it can improve the performance of your robot,” he says.
Next, the Stanford team is going to train the robot on more data to do even harder tasks, such as picking up and folding crumpled laundry, says Tony Z. Zhao, a PhD student at Stanford who was part of the team. Laundry has traditionally been very hard for robots, because the objects are bunched up in shapes they struggle to understand. But Zhao says their technique will help the machines tackle tasks that people previously thought were impossible.