Very interesting you bring this up. It was quite a big point of discussion whilst Jamie and I were building.
One of the big issues we faced with LLMs is that their attention gets diluted when you have a long chat history. This means that for large amounts of context, they often can't pick out the details your prompt relates to. I'm sure you've noticed this once your chat gets very long.
Instead of trying to develop an automatic system to decide what context your prompt should use (i.e. which branch you're on), we opted to make organising your tree a very deliberate action. This gives you a lot more control over what the model sees, and ultimately how good the responses are. As a bonus, if a model is playing up, you can go in and change the context it has by moving a node or two about.
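To make that concrete, here's a rough sketch of the idea - the Node, context_for, and move_node names are just mine for illustration, not how we actually built it. The point is that the model only ever sees the chain of messages from the root down to the node you've selected, and re-parenting a node is how you deliberately change that context:

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    role: str                      # "user" or "assistant"
    content: str
    parent: "Node | None" = None
    children: list["Node"] = field(default_factory=list)

    def add_child(self, role: str, content: str) -> "Node":
        child = Node(role, content, parent=self)
        self.children.append(child)
        return child


def context_for(node: Node) -> list[dict]:
    """Walk from the selected node back to the root and return the messages
    in chronological order. Only this branch is sent to the model, so
    siblings on other branches can't dilute its attention."""
    path = []
    current: Node | None = node
    while current is not None:
        path.append({"role": current.role, "content": current.content})
        current = current.parent
    return list(reversed(path))


def move_node(node: Node, new_parent: Node) -> None:
    """Re-parent a node - the 'moving a node or two about' step that
    deliberately changes what context the model sees."""
    if node.parent is not None:
        node.parent.children.remove(node)
    node.parent = new_parent
    new_parent.children.append(node)
```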
Really good point though, and thanks for asking about it. I'd love to hear if you have any thoughts on ways you could get around it automatically.
I think in general these things should not be conflated into one and the same artifact - a personal memory device and a tool for LLM context management. Right now it seems to double as both, and the main problem is that it puts the burden on me to manage my memory device, which I think should be automatic. I don't have perfect thoughts on it, so I'll leave it at this; it's a work in progress.
Is the expectation that you will be running many branches of context at the same time?
Completely subjectively, for me it's both. I have several ChatGPT tabs where it is instructed not to respond, or to briefly summarise. The system works both ways imho.