Isn’t this showing that LLMs can write code to control robots, not that they can actually directly control them? If I’m reading the hand tracking example right, the LLM is not actually in the control loop. Is this wrong?
Yeah, the mechanism by which the LLMs control the robot is writing code. I suppose they could also issue joint sequences directly, but they're already so good at writing code that I figured they might as well do that. If they 'wanted' to, they could write code containing an explicit joint sequence that they calculate in-context, though that seems more difficult.

So they can go 'slow': take a camera image, command the robot, repeat. Or they can write code that runs in a loop closer to the robot. Either way works. I thought the latter was somehow more impressive, and that's what you see in the hand-tracking example.
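
To make the two modes concrete, here's a toy sketch. It is not the project's actual code: `SimRobot`, `StubLLM`, and every method name are made up for illustration, the "camera image" is just a signed offset, and the "LLM" is a stub that returns a simple proportional controller.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SimRobot:
    """Toy 1-D robot (hypothetical): position shifts by each command."""
    position: float = 0.0

    def observe(self, target: float) -> float:
        # Stand-in for a camera image: signed offset from robot to target.
        return target - self.position

    def move(self, delta: float) -> None:
        self.position += delta


class StubLLM:
    """Hypothetical stand-in for a model; counts how often it is queried."""

    def __init__(self) -> None:
        self.calls = 0

    def next_command(self, offset: float) -> float:
        # 'Slow' mode: one model query per control step.
        self.calls += 1
        return 0.5 * offset

    def write_controller(self) -> Callable[[float], float]:
        # 'Fast' mode: one model query that returns code (here, a function).
        self.calls += 1
        return lambda offset: 0.5 * offset


def llm_in_the_loop(robot: SimRobot, llm: StubLLM, target: float, steps: int) -> None:
    # Observe, ask the model, act, repeat.
    for _ in range(steps):
        robot.move(llm.next_command(robot.observe(target)))


def llm_writes_the_loop(robot: SimRobot, llm: StubLLM, target: float, steps: int) -> None:
    # Ask the model once for a controller, then run it with no further queries.
    controller = llm.write_controller()
    for _ in range(steps):
        robot.move(controller(robot.observe(target)))
```

In this toy both versions drive the robot to the same place. The difference is that the first one queries the model on every step, while the second queries it once and then the generated controller runs on its own.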

Hm, the latter is certainly more practical, but feels less interesting to me because we already know that LLMs can write code. To me, exploring the limits of the LLM-in-the-loop approach would be more interesting. Cool project regardless, thanks for sharing!