Coding can be treated as a low-stakes, closed-loop system (a retry only costs some time and money) in a way that most other tasks cannot.
If an agent screws up booking your flight or hotel room, how does it verify that? And even if it does verify it, there is an actual cost to changes and cancellations.
Same with agentic e-commerce: there are lots of ways to screw that up, and it just seems ripe for fraud and for being picked off by bad actors.
Unfortunately, travel keeps getting less flexible, with worse cancellation policies.
I can STILL replicate this behavior in Google AI summaries 10% of the time:
"is <SOMEPLANT> ok for cats"
to which it replies: "Yes, <SOMEPLANT LONG SCIENTIFIC NAME VERBOSE PHRASING> is toxic for cats"
The other one going around this weekend: "how long hot dogs on grill"
Summary: "The hot dogs on your grill are likely around 5-6 inches long .. "
So scale this category of error to unsupervised agents with access to your credit card.