I think my biggest hangup is that some models don't have big enough context windows. My personal sweet spot with Opus is at least 400k to 600k tokens, so a local model that can reach that, or go a bit above it to around 700k for some buffer, would be perfect.

I've also debated using a frontier model for planning only, then feeding the plan to smaller offline models.
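
Roughly what I have in mind for the planner/executor split, as a minimal sketch. Everything here is an assumption for illustration: `plan_then_execute`, `fake_planner`, and `fake_executor` are hypothetical names, and the two "models" are stand-in functions rather than real API calls. In practice you would swap in whatever frontier and local model clients you use.

```python
from typing import Callable, List


def plan_then_execute(
    task: str,
    planner: Callable[[str], str],
    executor: Callable[[str], str],
) -> List[str]:
    """Call the planner once, then hand each plan step to the executor."""
    # One call to the (hypothetical) frontier model to produce the plan.
    plan = planner(f"Break this task into short numbered steps:\n{task}")
    # Treat every non-empty line of the plan as one step.
    steps = [line.strip() for line in plan.splitlines() if line.strip()]

    results = []
    for step in steps:
        # The executor only sees the task and a single step,
        # so its prompt stays small.
        results.append(executor(f"Task: {task}\nStep: {step}"))
    return results


# Stand-in models so the sketch runs without any network or GPU.
def fake_planner(prompt: str) -> str:
    return "1. read the code\n2. write the fix"


def fake_executor(prompt: str) -> str:
    # Echo back the last line of the prompt, which is the step.
    return "done: " + prompt.splitlines()[-1]


if __name__ == "__main__":
    for result in plan_then_execute("fix bug", fake_planner, fake_executor):
        print(result)
```

The point of the split is that only the planning call needs the big context window. Each executor call gets one step at a time, which is the part a smaller offline model might be able to handle.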