You can't tell someone to "get a life" while taking the effort to create a burner account for the sole purpose of insulting someone.
reply
I don't really consider that a great benchmark anyway. We really need better, objective ones instead of these, which are mostly performative, cheatable, and also available in the training set.
reply
Simon's pelicans are an institution. Are you trying to get banned? Lmao.
reply
deleted
reply
I think it's a clever thing he did to basically guarantee he continues to get major traffic to his blog here every time a model is released, especially since he's taking sponsorships with a static banner at the top of every page now. I think he's trying to go the Daring Fireball route.
reply
To me it's as if crypto bros had been allowed to shill their DAOs and tokens during the crypto/NFT phase.

He is the only person not getting rate-limited for shilling AI all the time.

reply
Pointing out how much the models still suck at drawing pelicans is a funny way to shill them.
reply
Tbf the first line of your first comment is:

  > Pelican for Fable 5 on default settings is a clear improvement on Opus 4.8
And the comment doesn't contain any actual criticism (your blog post might, but I'm just referring to what was posted on HN, which is a bit booster-y on its own).
reply
The entire pelican benchmark is a joke. The joke is that, for all of the billions of dollars poured into these things and the claims of PhD-level intelligence, they still draw pelicans not much better than a five-year-old would.

I don't spell that joke out in every comment I post here because that wouldn't be very funny.

reply