For a language like C++ where modules are split into definition (.h) and implementation (.cpp) parts, one choice of prefix would be all the header files for the project (which aren't likely to change much).
More generally, the idea would be to have an agent that treats cached-prefix reuse as its primary context management goal.
Another possibility, to support caching of files that have since changed, would be for the agent to build the context as a fixed prefix reflecting some or all of the codebase in its start-of-session state, then append any later changes after that prefix, with appropriate prompting to only use the latest definition of a function.
e.g.
Say file A initially contains functions X, Y and Z, so the prompt prefix is built to include X Y Z. If the user then modifies Y -> Y', just append Y' to the context so that the cached prefix is unchanged, giving X Y Z Y'.
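A rough sketch of that X Y Z Y' scheme in Python, just to make it concrete. Everything here is hypothetical (the class name `AppendOnlyContext`, the `### name` markers, the wording of the instruction line), and it assumes the serving side caches on an exact prefix match, so the only hard requirement is that the start-of-session text is never edited.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyContext:
    """Prompt context whose start-of-session prefix is never rewritten."""

    prefix: str  # frozen snapshot, e.g. the text for X Y Z
    updates: list[str] = field(default_factory=list)  # appended changes, e.g. Y'

    @classmethod
    def from_snapshot(cls, functions: dict[str, str]) -> "AppendOnlyContext":
        # Hypothetical instruction telling the model how to resolve duplicates.
        header = (
            "If a function is defined more than once below, "
            "use only the latest definition.\n"
        )
        body = "".join(f"### {name}\n{src}\n" for name, src in functions.items())
        return cls(prefix=header + body)

    def record_change(self, name: str, new_src: str) -> None:
        # Never touch self.prefix; changed definitions only go on the end.
        self.updates.append(f"### {name} (updated)\n{new_src}\n")

    def render(self) -> str:
        return self.prefix + "".join(self.updates)


ctx = AppendOnlyContext.from_snapshot(
    {
        "X": "def X(): return 1",
        "Y": "def Y(): return 2",
        "Z": "def Z(): return 3",
    }
)
before = ctx.render()                        # X Y Z
ctx.record_change("Y", "def Y(): return 20")
after = ctx.render()                         # X Y Z Y'
```

The new prompt starts with the old one byte for byte, so only the appended Y' needs fresh prompt processing. The obvious cost is that stale definitions stay in the context, so at some point you would probably want to rebuild the prefix and pay for one full re-cache.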
This is also a problem for all of the Mac local LLMs. Macs are a great way to get a lot of high-bandwidth memory, but their compute is very far behind current-gen dedicated GPUs. Some of the expensive Mac Studio setups allow you to run very large models with usable tokens/s, but you can be waiting a long time for prompt processing to finish before the model gets to the point of generating those tokens.