Only coherent move at this point: hit the minus button immediately. There's never anything about the model in the thread other than Simon's post.
> you still see improvements
This is expected if they are training their models on it, right?
> objectively-bad results
Keen to learn when this has been the case, i.e. across version increments in major models.
I've been enjoying seeing how the quality of individual models differs based on the amount of reasoning effort you give them. If they were baking in a good pelican you wouldn't expect them to differ so much.
(Google Gemini are the only lab that have very clearly paid attention to the quality of SVG animals-riding-vehicles, see their announcement for Gemini 3.1: https://twitter.com/JeffDean/status/2024525132266688757 )
When it started, comparing the progress between models was mildly interesting but everyone (including Simon) acknowledges it certainly leaked into the training data long ago.
That reply never failed to come; it's basically a meme at this point.
Clearly at this point they are part of the training data.
They even all look sort of the same-ish. Daytime, colors,...
I know because I too had this initial take; however, upon analysis, it is not sound.
I agree as well that he writes many interesting things.
Fun at first, seems disingenuous now. A site funnel.
Well done, Anthropic.