Elon thinks it would be too expensive to write code for every task you might ask one of these to do; they want it to be fully autonomous.
Their engineers aren't behind keyboards typing C++; they're wearing VR headsets and feeding the data to an LLM, although even that is probably too specific for Elon's long-term plans. Obviously he doesn't want people to have to repeat actions hundreds of times before the dumb robots figure it out, especially for "simple" tasks like serving drinks at press events.