I don’t care how practical it may or may not be; this is my new favorite LLM benchmark.
I couldn't find an about page or similar?
Here's the public sample: https://github.com/T3-Content/skatebench/blob/main/bench/tes...

I don't think there's a good description anywhere. https://youtube.com/@t3dotgg talks about it from time to time.

o3-pro is better than 5.2 pro! And GPT 5 high is best. Really quite interesting.