Can't you just force it to do structured output via constrained generation?
Yes, I did end up figuring out a clean way to allow normal reasoning inside <think> and then force JSON _after_ the closing </think>. Example here: https://gist.github.com/noperator/6c711ab19027ea8056442df839...
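
For anyone who wants the shape of the idea without opening the gist, here is a rough sketch. It is not the code from the gist. It assumes a hypothetical harness that can ask, before each sampling step, which tokens are allowed; `split_phases`, `allowed_tokens`, and the `json_ok` callback are made-up names, and `json_ok` stands in for whatever grammar or JSON-prefix check a real setup would use.

```python
from typing import Callable, List, Optional, Tuple

THINK_CLOSE = "</think>"


def split_phases(text: str) -> Tuple[str, Optional[str]]:
    """Split generated text into (reasoning, json_prefix).

    json_prefix is None while the model is still inside <think>.
    """
    head, sep, tail = text.partition(THINK_CLOSE)
    if not sep:
        return text, None
    # Ignore whitespace between </think> and the start of the JSON.
    return head, tail.lstrip()


def allowed_tokens(
    text: str,
    vocab: List[str],
    json_ok: Callable[[str], bool],  # hypothetical: "is this a valid JSON prefix?"
) -> List[str]:
    """Return the tokens the sampler may pick next."""
    _, json_prefix = split_phases(text)
    if json_prefix is None:
        # Phase 1: still reasoning, so nothing is constrained.
        return list(vocab)
    # Phase 2: keep only tokens that leave the output a valid JSON prefix.
    return [tok for tok in vocab if json_ok(json_prefix + tok)]


# Toy demo: pretend the only acceptable output is this exact object.
demo_vocab = ["{", "hello", '"answer"', "</think>", "}"]
demo_json_ok = lambda s: '{"answer": 1}'.startswith(s)
```

With these toy values, `allowed_tokens("<think>hmm", demo_vocab, demo_json_ok)` returns the whole vocabulary, while the same call on `"<think>hmm</think>\n"` only lets `{` through.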
> but I'm working around that in my harness.

How?

Maybe by limiting the logits to what is syntactically valid? E.g. after {"hello" the next token has to be whitespace or a colon. The logits for every other token get masked out before sampling.
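
A toy version of that masking step, as a guess at what such a harness might do rather than a description of the parent's actual code. The function names and the five-token vocabulary are invented for illustration; a real tokenizer has multi-character tokens and a real implementation would track the full JSON grammar, not just this one rule.

```python
import math
from typing import List, Set


def allowed_after_key(vocab: List[str]) -> Set[int]:
    """Ids of tokens that may follow a complete object key such as {"hello".

    A token qualifies if it is pure whitespace, or a colon with optional
    surrounding whitespace.
    """
    return {
        i for i, tok in enumerate(vocab)
        if tok != "" and tok.strip() in ("", ":")
    }


def mask_logits(logits: List[float], allowed_ids: Set[int]) -> List[float]:
    """Set every disallowed logit to -inf so it can never be sampled."""
    return [x if i in allowed_ids else -math.inf for i, x in enumerate(logits)]


# Toy demo: the model "prefers" '}' (highest logit), which would be invalid here.
vocab = ['"', ":", " ", "}", "world"]
logits = [2.0, 0.5, 0.1, 3.0, 1.0]

masked = mask_logits(logits, allowed_after_key(vocab))
best = max(range(len(masked)), key=masked.__getitem__)  # greedy pick
next_token = vocab[best]
```

Here only the colon and the space survive the mask, so greedy decoding picks the colon even though the raw logits favoured `}`.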