Thursday, May 23, 2024

What is Eric Jang up to? All Roads Lead to Robotics

Eric Jang (1X Technologies, formerly Halodi Robotics) published a blog post on Mar. 3 entitled All Roads Lead to Robotics. He links to a video (the embed is not preserved here) and comments on it:

Because we take an end-to-end neural network approach to autonomy, our capability scaling is no longer constrained by how fast we can write code. All of the capabilities in this video involved no coding, it was just learned from data collected and trained on by our Android Operations team.

He also remarks:

1X is the first robotics company (to my knowledge) to have our data collectors train the capabilities themselves. This really decreases the time-to-a-good-model, because the people collecting data can get very fast feedback on how good their data is and how much data they actually need to solve the robotic task. I predict this will become a widespread paradigm in how robot data is collected in the future.

The main section of his post is entitled "All AI Software Converges to Robotics Software." Why might that be so?

ML deployed in a pure software environment is easier because the world of bits is predictable. You can move some bits from A to B and trust that they show up at their destination with perfect integrity. You can make an API call to some server over the Internet and assume that it will just work. Even if it fails, the set of failure modes are known ahead of time so you can handle all of them.

In robotics, all of the information outside of the robot is unknown. Your future sensor observations, given your actions, are unknown. You also don’t know where you are, where anything else is, what will happen if you make contact with something, whether the light turned on after you flipped the switch, or whether you even flipped the switch at all. Even trivial things like telling the difference between riding an elevator down vs. being hoisted up in a gantry is hard, as the forces experienced by the inertial measurement unit (IMU) sensor look similar in both scenarios. A little bit of ignorance propagates very quickly, and soon your robot ends up on the floor having a seizure because it thinks that it still has a chance at maintaining balance.

As our AI software systems start to touch the real world, like doing customer support or ordering your Uber for you, they will run into many of the same engineering challenges that robotics faces today; the longer a program interacts with a source of entropy, the less formal guarantees we can make about the correctness of our program’s behavior. Even if you are not building a physical robot, your codebase ends up looking a lot like a modern robotics software stack. I spend an unreasonable amount of my time implementing more scalable data loaders and logging infrastructure, and making sure that when I log data, I can re-order all of them into a temporally causal sequence for a transformer. Sound familiar? [...]
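To make the logging remark concrete, here is a minimal sketch of re-ordering logged data into a temporally causal sequence. It is an illustration under stated assumptions, not Jang's or 1X's actual infrastructure: the `LogEvent` record, the stream names, and both helper functions are hypothetical, and it assumes each stream is already sorted by a shared clock.

```python
import heapq
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical log record; the field names are assumptions for illustration."""
    timestamp: float  # seconds on a clock shared by all streams (assumed)
    source: str       # e.g. "camera", "imu", "action"
    payload: Any


def merge_streams(*streams: Iterable[LogEvent]) -> List[LogEvent]:
    """Merge per-source logs, each already sorted by time, into one ordered list.

    heapq.merge is stable across its inputs, so events with equal timestamps
    keep the order in which their streams were passed in.
    """
    return list(heapq.merge(*streams, key=lambda e: e.timestamp))


def causal_prefix(events: List[LogEvent], t: float) -> List[LogEvent]:
    """Return only the events logged at or before time t.

    A model predicting the action at time t should see nothing from the future.
    """
    return [e for e in events if e.timestamp <= t]


if __name__ == "__main__":
    camera = [LogEvent(0.00, "camera", "frame0"), LogEvent(0.10, "camera", "frame1")]
    imu = [LogEvent(0.05, "imu", (0.0, 0.0, 9.8)), LogEvent(0.15, "imu", (0.0, 0.1, 9.7))]
    actions = [LogEvent(0.07, "action", "grasp")]

    sequence = merge_streams(camera, imu, actions)
    for event in sequence:
        print(f"{event.timestamp:.2f} {event.source}")

    # Context available when predicting the action logged at t = 0.07
    print([e.source for e in causal_prefix(sequence, 0.07)])
```

The sketch assumes clean, synchronized timestamps. In a real system, clock skew between sensors and late-arriving records are where most of the effort goes, which is the kind of infrastructure work the passage describes.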

If you accept the premise that the engineering and infrastructure problems in LLMs are the same as those in robotics, then we should expect that disembodied AGI and robotic AGI happen at roughly the same time. The hardware is ready and all of the pieces are already there in the form of research papers published over the last 10 years.

I've been having similar thoughts, though not with respect to robotics. Rather, I've been thinking about the role of symbolic computing in robust and flexible systems. Given that the world is full of so-called edge cases, at least some of them very important and fruitful, the problems of LLMs will not be solved through add-ons that provide various symbolic capacities, no matter how clever. In the end, it is going to be necessary to re-construct the LLM with symbolic means, and that will prove to be an unending task, as latent space is ever-evolving.

There's more in the post, but this remark stood out to me:

Any startup that raised 10-100M USD to train their own big neural network from scratch in the last 2 years ended up paying an enormous capex cost for something that basically every AI startup gets for free today. [...] As such, I think the vast majority of successful startups will be the ones that can nimbly ride the tide of open-source weights.
