A lot of models (including Opus) keep insisting in their reasoning traces that going first can be a bad idea for control decks, etc., which I find pretty interesting - my understanding is that the consensus among pros is closer to "you should go first 99.999% of the time", but the models seem to want there to be more nuance. Beyond that, most of the really interesting blunders that I've dug into have turned out to be problems with the tooling (either actual bugs, or MCP tools whose affordances are a poor fit for how LLMs assume they work). I'm hoping that I'm close to the end of those and am gonna start getting to the real limitations of the models soon.