
If input tokens dominate the cost to that extent, major gains are possible from making better use of caching. You could ask the model to do a one-time "compaction" step that includes a dump of the relevant portions of the code, then use that as the cached prefix for a large number of "swarm" subagent calls.
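A rough sketch of why the ordering matters for that shared prefix, assuming a provider that can reuse only the longest common prefix between requests. The names here (`compacted`, `tasks`, `shared_prefix_len`) are made up for illustration, and list items stand in for token runs; real caches work on tokens and have their own minimum sizes and expiry rules.

```python
def shared_prefix_len(a, b):
    """Count the leading items two requests have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical output of the one-time compaction step.
compacted = ["system prompt", "code dump: module A", "code dump: module B"]

# Hypothetical per-subagent tasks (the part that varies).
tasks = ["task: rename foo", "task: add logging", "task: fix null check"]


def reusable_items(requests):
    """For each later request, how many leading items match the first request."""
    first = requests[0]
    return [shared_prefix_len(first, r) for r in requests[1:]]


# Stable material first, varying task last: the whole dump is reusable.
prefix_first = [compacted + [t] for t in tasks]

# Varying task first: the prefix differs at item 0, so nothing is reusable.
task_first = [[t] + compacted for t in tasks]

print(reusable_items(prefix_first))
print(reusable_items(task_first))
```

With the dump placed first, every subagent call after the first shares all three prefix items; with the task placed first, none of them do.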

by kolinko 9 hours ago


Did you experiment with giving the agent better tools to navigate and document the codebase? ASTs, language servers and so on?

A million tokens (not cached) sounds like a lot.

by bob1029 9 hours ago


The target codebase is very large. A million tokens is a drop in the proverbial bucket.

I still don't understand how caching helps me very much. I must be misunderstanding it, because I thought the user's prompt (which is the biggest variable) necessarily sits before all of these token-intensive tool calls. How can we cache the reading of the codebase if the prefix is always moving?

by Phemist 8 hours ago


If an agent makes a tool call, the LLM provider receives the full context again once the result of the tool call becomes available, in order to decide the next move. Everything up to the point where the tool call was made will no longer change and could thus, in theory, be cached. If the agent makes a ton of tool calls, it should hit the cache once for every one of them.

A new instruction from the user is appended at the end if it is given in the same conversation. It therefore only influences the cacheability of the original agent prompt, not that of subsequent tool calls.
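To make that concrete, here is a toy simulation of an append-only conversation, assuming the provider can reuse whatever prefix matches the previous request. The message strings and the `simulate` helper are illustrative only, and messages stand in for token runs; this is not any provider's actual cache implementation.

```python
def common_prefix_len(a, b):
    """Count the leading messages two requests have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def simulate(requests):
    """Return (reused, new) message counts for each request in order."""
    previous = []
    stats = []
    for req in requests:
        hit = common_prefix_len(previous, req)
        stats.append((hit, len(req) - hit))
        previous = req
    return stats


# The agent's history only ever grows at the end.
history = ["system prompt", "user: fix the bug"]
requests = [list(history)]

# Each tool call sends the full history back to the model.
for i in range(3):
    history += [f"tool_call {i}", f"tool_result {i}"]
    requests.append(list(history))

# A follow-up instruction in the same conversation is appended, not prepended.
history.append("user: now add a test")
requests.append(list(history))

stats = simulate(requests)
for reused, new in stats:
    print(f"reused={reused} new={new}")
```

Only the first request is entirely new. Every later request, including the one carrying the follow-up instruction, reuses everything that came before it and adds only its own tail.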

by uxhacker 7 hours ago


Often it seems to me that using MA is like letting a million monkeys loose.

Has AI forgotten about high-level design? Surely all it needs to know is what the methods, objects or functions in the codebase actually do, plus the actual code it is meant to be fixing?

I wonder if half the issue is that the LLM tries to change too much?

by willtemperley 2 hours ago


[dead]

by frumiousirc 8 hours ago


> The target codebase is very large.

But does every prompt need the entire codebase?

by amazingamazing 7 hours ago


How could it not? Can you ever guarantee accurate answers about a book you haven't entirely read?