I copied and pasted a comment with faulty (self-defeating) logic directly from HN and asked a bunch of models available to me (Gemini and Claude) if they could spot the issue. I figured it would be a nice test of reasoning since an actual human missed it. The only one that found the logic error without help was Claude 4.6 Opus Extended Thinking. The others at best raised relevant counterpoints to the supporting argument but couldn't identify the central issue. Claude's answer seemed miles ahead. I wonder if SotA advancements will continue to distinguish themselves.
And midwits here saying "yeah bro they have some MUCH better model internally that they just don't release to the public", imagine being that dense. Those people probably went all in on NFTs too and told others "you just don't get it bro".