I highly doubt that your tool is like this:
> git branch -vv | grep ': gone]'| grep -v "*" | awk '{ print $1; }' | xargs -r git branch -d
Or:
> ffmpeg -i main_course.mp4 -i reaction_cam.mov \
>   -filter_complex \
>   "[1:v]scale=480:270[pip_scaled]; \
>    [0:v][pip_scaled]overlay=W-w-20:20[pip_video]; \
>    [pip_video]drawtext=text='LIVE RECORDING':fontcolor=white:fontsize=24:box=1:boxcolor=black@0.6:x=30:y=30[final_video]; \
>    [0:a][1:a]amix=inputs=2:duration=first:dropout_transition=2[final_audio]" \
>   -map "[final_video]" -map "[final_audio]" \
>   -c:v libx264 -crf 21 -preset fast \
>   -c:a aac -b:a 192k \
>   output_production.mp4
LLMs generate these for breakfast.
The crazy thing to me is that this kind of “composition of small tools to create something bigger” is the biggest vindication of the Unix philosophy I can think of.
I have to wonder how much of that behavior was trained into the model and how much of it is the secret herbs and spices they toss into the harness's own system prompts.
There are workarounds, though. I am creating what I call knowledge triggers for Pi, similar to Claude's "PreToolUse", so having the agent use oak all the time is not an issue in my opinion.
The challenge for oak is: why? I actually want to slow agents down so I can ensure they are doing the right thing, and the massive bottleneck is the LLMs themselves, so speed measured in milliseconds or even seconds will not concern many.
I thought oak was more about knowing how to inject context into the prompt based on code stored in oak, for example. Faster operations can help, but the use case is limited. The missing piece for better, more correct code is context at the right time.
There's a limit to how many simultaneous instructions an agent can follow (the exact number depends on the specific model, so instructions that are fine for one model may overwhelm another). If this keeps happening, consider trimming your instructions or, even better, solving it at the harness level (like intercepting and rewriting ripgrep calls to use your thing, as rtk [0] does in agents that support this).
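To make the "intercept and rewrite" idea concrete, here is a minimal sketch of the rewriting step only. It assumes the harness hands you the command string before running it; `mysearch` is a hypothetical stand-in for your tool, and this is not how rtk or any particular harness actually implements it.

```shell
# Hypothetical replacement tool; substitute whatever your harness should call.
REPLACEMENT="mysearch"

# Rewrite a command line that starts with `rg` so it invokes the replacement.
# Any other command (including ones that merely begin with "rg", like `rgb`)
# passes through unchanged.
rewrite_command() {
  case "$1" in
    rg)     printf '%s\n' "$REPLACEMENT" ;;
    "rg "*) printf '%s %s\n' "$REPLACEMENT" "${1#rg }" ;;
    *)      printf '%s\n' "$1" ;;
  esac
}
```

The point is that the agent keeps issuing the command it already knows, and the swap happens outside the prompt, so it costs zero instructions. A real interceptor would also need to handle pipelines and quoting, which this sketch ignores.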
Overall, never rely on an agent for an instruction that must be followed at all times. For example, doing things in a git hook beats asking the agent to run a multi-command workflow every time it commits.
Is this state of things forever? I don't think so. Very soon models will become so much better that this will be a non-problem.
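As a rough sketch of the git hook approach, the helper below runs a list of check commands and stops at the first failure. The name `run_checks` and the example check commands in the note are placeholders, not anything from a specific project.

```shell
#!/bin/sh
# Sketch of a helper for .git/hooks/pre-commit.

# Run each argument as a shell command; stop at the first one that fails.
run_checks() {
  for check in "$@"; do
    if ! sh -c "$check"; then
      echo "pre-commit: check failed: $check" >&2
      return 1
    fi
  done
  return 0
}
```

Saved in an executable `.git/hooks/pre-commit` with a final line such as `run_checks "make lint" "make test" || exit 1` (both `make` targets are hypothetical), this runs on every commit no matter who or what issued it, and a nonzero exit status makes git abort the commit. The agent never has to remember the workflow.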