It’s really wild watching LLMs construct those calls. They batch so many different checks and stuff into a single tool call, delimit them with markers, etc.
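For anyone who hasn't seen one of these calls, here is a made-up illustration of the pattern in Python: several named checks joined into one command line, each preceded by an `echo` marker so the combined output can be split apart again. The `=== name ===` marker format and the helper names `batch` and `split_sections` are assumptions for this sketch, not what any particular model or harness actually emits.

```python
import shlex


def batch(checks):
    """Join named shell commands into one command line, each preceded by a marker."""
    parts = []
    for name, cmd in checks.items():
        # The echo prints a delimiter line such as "=== status ===".
        parts.append("echo " + shlex.quote("=== " + name + " ==="))
        parts.append(cmd)
    return "; ".join(parts)


def split_sections(output):
    """Split the combined output back into {name: text} using the marker lines."""
    sections = {}
    current = None
    for line in output.splitlines():
        if line.startswith("=== ") and line.endswith(" ==="):
            current = line[4:-4]
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines) for name, lines in sections.items()}
```

Calling `batch({"status": "git status --short", "branch": "git branch --show-current"})` yields a single string that runs both commands in one tool call, and `split_sections` recovers the per-check output afterwards.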

The crazy thing to me is that this kind of “composition of small tools to create something bigger” is the biggest vindication of the Unix philosophy I can think of.

I have to wonder how much of that behavior was trained into the model and how much is the secret herbs and spices they toss into the harnesses' own system prompts.

Totally breaks the permission model in Claude Code.
Personally, I really dislike it when agents generate super long composed shell commands, because they are really hard to audit. I'd whitelist ffmpeg, but if the agent makes a mistake in some super long chained git command, the consequences can be pretty scary.
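A rough sketch of what auditing a chained command piece by piece could look like, assuming Python 3.8+ and the standard `shlex` module. The `ALLOWED` set and the helper names `split_commands` and `unapproved` are hypothetical, and this is not how Claude Code or any other harness checks permissions. It is also not a real shell parser: command substitution, redirections, subshells, `sh -c "..."`, and quoted operators are not handled, so treat it as an illustration of the idea only.

```python
import shlex

# Hypothetical allowlist of programs approved to run without review.
ALLOWED = {"ffmpeg", "ls", "grep"}

# Control operators that separate one command from the next.
SEPARATORS = {"&&", "||", ";", "|", "&"}


def split_commands(command_line):
    """Split a chained command line into a list of argv-style segments."""
    lexer = shlex.shlex(command_line, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    segments, current = [], []
    for token in lexer:
        if token in SEPARATORS:
            if current:
                segments.append(current)
            current = []
        else:
            current.append(token)
    if current:
        segments.append(current)
    return segments


def unapproved(command_line):
    """Return the program name of every segment that is not on the allowlist."""
    return [seg[0] for seg in split_commands(command_line) if seg[0] not in ALLOWED]
```

With this, `unapproved("ffmpeg -i in.mp4 out.webm && git push --force; ls | grep webm")` flags only `git`, so the one risky segment stands out instead of being buried in the middle of the chain.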