
The usual problem is that companies write an MCP server with 50 different tools, each with its own schema, description, etc. Say each tool definition is 150 tokens: that's 150 * 50, or 7500 tokens, dumped into the beginning of every session. Compare that to a text file with command-line tool examples that gets loaded on demand: you still get close to the same amount of context, but you control which tool definitions you pull in.
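To make the arithmetic concrete, here's a rough sketch. The 50 tools and 150 tokens per tool are the numbers from above; the "3 tools loaded on demand" figure is just an assumption for illustration, not a measurement.

```python
def definition_tokens(num_tools: int, tokens_per_tool: int) -> int:
    """Context spent on tool definitions alone."""
    return num_tools * tokens_per_tool

# Everything loaded at session start (numbers from the comment above).
upfront = definition_tokens(50, 150)

# Assumption: on-demand loading pulls in only 3 definitions for a task.
on_demand = definition_tokens(3, 150)

print(upfront)    # 7500
print(on_demand)  # 450
```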

The other thing is the agent gets the entire MCP API response dumped into context as a tool response in JSON, which can be a lot. Compare that to shell commands where agents often `head` or `tail` or `grep` the response (which I kinda hate, but it does save tokens).
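Here's a small sketch of what that `grep`/`head` style trimming buys you. The response shape (1000 JSON records with a `status` field) is made up for the example, and `filter_lines` is a hypothetical helper, not part of any MCP client.

```python
import json

def filter_lines(text: str, needle: str, limit: int) -> list[str]:
    """Rough equivalent of `grep needle | head -n limit`."""
    matches = [line for line in text.splitlines() if needle in line]
    return matches[:limit]

# Hypothetical tool response: 1000 records, only a few of them interesting.
records = [{"id": i, "status": "error" if i % 100 == 0 else "ok"} for i in range(1000)]
full = "\n".join(json.dumps(r) for r in records)

# Keep at most 5 lines that mention an error status.
trimmed = "\n".join(filter_lines(full, '"status": "error"', 5))

# Character counts as a crude stand-in for token counts.
print(len(full), len(trimmed))
```

The catch is the same one as with real shell filters: if the needle is slightly wrong, the agent silently sees nothing.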

It also depends on whether the agent loads tool definitions on demand (most modern agents do), and on how many tools your MCP server has. If it only has 2 tools and the responses aren't big, it's really not that much context.

The other thing that doesn't get talked about is the non-determinism of shell one-liners. With shell tool calls the AI can mess up commands, options, or arguments. It can filter output incorrectly, miss output, or miss the return status, which leads to re-running calls, polluting context, and worse results. Compare that to MCP calls, which are more likely to succeed because they have a schema, well-defined errors, etc. Do you want less token use or more reliable results?
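A toy illustration of why a schema helps: a bad call gets rejected with a specific error before anything runs. Real MCP tools describe their inputs with JSON Schema; the dict-of-types `SCHEMA` and the `validate_args` helper here are simplified, hypothetical stand-ins.

```python
# Hypothetical tool schema: argument name -> expected Python type.
SCHEMA = {"query": str, "limit": int}

def validate_args(args: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the call is well-formed."""
    errors = []
    for name, expected in schema.items():
        if name not in args:
            errors.append(f"missing argument: {name}")
        elif not isinstance(args[name], expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(args[name]).__name__}"
            )
    for name in args:
        if name not in schema:
            errors.append(f"unknown argument: {name}")
    return errors

print(validate_args({"query": "open issues", "limit": 10}, SCHEMA))
print(validate_args({"query": "open issues", "limt": "10"}, SCHEMA))
```

A mistyped shell flag, by contrast, may just produce different output with exit status 0.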

The thing is, you don't have to pick a side. I personally use both MCPs and CLIs at different times in different ways. Often I'll have the AI write a small script to do many calls (sometimes with tools, sometimes with libraries) which saves tokens, allows me to review, and is more deterministic.
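Roughly what I mean by a small script: it makes all the calls itself and hands back only a summary, so the individual responses never enter context. `fake_status` is a hypothetical stand-in for whatever tool or library call you'd actually make.

```python
from collections import Counter
from typing import Callable, Iterable

def summarize_calls(ids: Iterable[int], call: Callable[[int], str]) -> dict[str, int]:
    """Run `call` once per id and return only the count of each result."""
    return dict(Counter(call(i) for i in ids))

def fake_status(item_id: int) -> str:
    # Hypothetical stand-in for a real tool or library call.
    return "error" if item_id % 10 == 0 else "ok"

# 100 calls, one short line of output for the agent to read.
print(summarize_calls(range(1, 101), fake_status))
```

Because it's a file, you can read it before it runs, and running it twice does the same thing both times.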

by willio581 hours ago|


Thanks for the answer! I do see both sides