I think so. The last few months have shown us that it isn't necessarily the models themselves that produce good results, but the tooling / harness around them. Codex, Opus, GLM 5, Kimi 2.5, etc. each have their quirks, but put them in a harness like opencode and give the model the right amount of context, and they'll all perform well and reliably get you a correct answer.
So in my opinion, in a scenario like this where token output is near instant but you're running a lower-tier model, good tooling can close the gap with a frontier cloud model.