upvote
This is a good pattern because it would allow all the models to "think" a bit before giving an answer even if they don't have reasoning or thinking turn on. Just make sure you have the reasoning output before the final answer. A mistake I see all the time is having the answer outputted first then the explanation after which leaves more room for models to rationalize bad answers.

Good pattern: {"explanation": <short explanation for your answer>, "answer": <your final answer: true|false|i don't know>}

Bad pattern: {"answer": <your answer here>, "explanation": <short explanation for your answer>}

reply
Good point. Processing the substance of the answer might be too labor-consuming (1,000 claims x 5 models), but "thinking out loud" might improve the quality of the answers indeed. And we can still force/ask them to respond with a clear verdict at the end of their reasoning, as per the chosen rubric.
reply
If you have the model use a tool you can define the schema as a free text rationale field followed by one in the set of possible answers, so everything is nicely formatted as a JSON.
reply
Some models struggle combining JSON schema and web search capabilities.
reply