Your posting of the pelican benchmark is honestly the biggest reason I check the Hacker News comments on big new model announcements.
reply
All hail the pelican king!
reply
He is the JerryRigEverything of pelicans.
reply
Do you think it's just part of their training set now?
reply
It's time to do "frog on a skateboard" now.
reply
If it's part of their training set why do the 2B and 4B models produce such terrible SVGs?
reply
We were promised full SVG zoos, Simon. I want to see SVG pangolins please
reply
Because it is in their training set but it's unrealistic to expect a 2B or 4B model to be able to perfectly reproduce everything it's seen before.

The training no doubt contributed to their ability to (very) loosely approximate an SVG of a pelican on a bicycle, though.

Frankly I'm impressed

reply
Because generating a nice-looking SVG requires handling code, shapes, long context, and reasoning, and at 2B you will most likely break the file's syntax 9 times out of 10 if you train for that. Or you will need to go for simpler pelicans. It might not be worth fine-tuning a 2B model for this, but on their top-tier open model it is definitely worth it. Even without targeting it directly, just crawling GitHub would mean training on your pelicans.
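
For anyone who wants to measure how often a small model "breaks the syntax", here is a minimal sketch of a well-formedness check using Python's standard XML parser. The helper name `is_well_formed_svg` is hypothetical, and this only checks that the output parses as XML with an `svg` root element; it says nothing about whether the drawing looks like a pelican.

```python
import xml.etree.ElementTree as ET


def is_well_formed_svg(text: str) -> bool:
    """Return True if `text` parses as XML and its root element is <svg>.

    Hypothetical helper for illustration: it checks syntax only,
    not whether the SVG renders anything sensible.
    """
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Truncated output, unclosed tags, stray characters, and so on.
        return False
    # A namespaced root tag looks like "{http://www.w3.org/2000/svg}svg",
    # so strip the namespace part before comparing.
    local_name = root.tag.rsplit("}", 1)[-1]
    return local_name == "svg"


if __name__ == "__main__":
    complete = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="1"/></svg>'
    truncated = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="1"'
    print(is_well_formed_svg(complete), is_well_formed_svg(truncated))
```

Running this over a batch of model outputs and counting the failures would give a rough syntax-breakage rate per model size.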
reply
Seems very likely, even if Google has behaved ethically.

Simon and YC/HN have published and boosted these gradual improvements and evaluations for quite some time now.

There is a https://simonwillison.net/robots.txt but it allows pretty much everything, AI-wise.
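
If you want to check what a robots.txt permits for a given crawler, a rough sketch with Python's standard `urllib.robotparser` follows. The rules, the `ExampleAIBot` user agent, and the example.com URLs are made up for illustration; they are not the contents of the file linked above.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules.

    Hypothetical helper: it parses robots.txt text passed in as a string
    instead of fetching it over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration only: one crawler is blocked entirely,
# everyone else is kept out of /private/ and allowed elsewhere.
SAMPLE_RULES = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for agent in ("ExampleAIBot", "SomeOtherBot"):
        print(agent, allowed(SAMPLE_RULES, agent, "https://example.com/post"))
```

Keep in mind that robots.txt is advisory: it states what the site owner asks of crawlers, not what any crawler actually does.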

reply
Do you have a single gallery page where we can see all the pelicans together? I'm thinking of something similar to

https://clocks.brianmoore.com/

but static.

reply
Absolutely hilarious that Qwen 3.5 had a far better clock than Opus 4.6 each time I looked.
reply
Not exactly what you asked for but try https://pelicans.borg.games/
reply
What sorcery is this? https://static.simonwillison.net/static/2024/recraft-ai-peli...

I tried their model and asked for a few different SVGs of pelicans. It is INSANE.

reply
AFAIK that model is pretty old, and it was explicitly trained for SVG generation. For other models the capability of generating SVGs of real stuff is accidental. Same as GPT-5.x and Sonnet 4.5+ being able to generate MIDI music.
reply
Is it a fine-tune of some open source model?
reply
Uh, the GPT-5 clock is... interesting, to say the least.
reply
Love your work, thank you!
reply
I'd recommend using the instruction-tuned variants; the pelicans would probably look a lot better.
reply
Mind if I ask what your laptop is and how it's configured, hardware-wise?
reply
128GB M5, but the largest of these models still only use about 20GB of RAM so I'd expect them to work OK on 32GB and up.
reply