Look at the ARC site. The scores of these models are plotted against their "cost per task". All of these huge jumps come with massive increases in cost per task, including for Gemini 3.1 Pro, where it increased by 4.2x.
ARC 2 was made specifically to artificially lower contemporary LLM scores, so any kind of model improvement will have an outsized effect.

Also, people use "saturated" too liberally. IMO only the top-left corner, at 1 cent per task, is saturated, since there are billions of people who would prefer to solve ARC 1 tasks at 52 cents per task. On ARC 2, a human would make thousands of dollars a day with 99.99% accuracy.
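
A back-of-the-envelope sketch of that arithmetic, for anyone who wants to plug in their own numbers. Only the 52 cents per task figure comes from the comment above. The 2 minutes per task, the 8-hour day, and the $1,000 target are made-up assumptions, not measured values.

```python
# Rough earnings math for a human solving tasks at a given per-task price.
# All inputs except the 0.52 USD price are hypothetical assumptions.

def tasks_per_day(minutes_per_task: float, hours_per_day: float = 8.0) -> float:
    """Number of tasks one person finishes in a working day."""
    return hours_per_day * 60.0 / minutes_per_task


def daily_earnings(price_per_task_usd: float, minutes_per_task: float,
                   hours_per_day: float = 8.0) -> float:
    """Daily income if every task pays the same price."""
    return price_per_task_usd * tasks_per_day(minutes_per_task, hours_per_day)


def price_needed(target_usd_per_day: float, minutes_per_task: float,
                 hours_per_day: float = 8.0) -> float:
    """Per-task price required to reach a target daily income."""
    return target_usd_per_day / tasks_per_day(minutes_per_task, hours_per_day)


if __name__ == "__main__":
    # 52 cents per task (from the thread), assumed 2 minutes per task.
    print(f"ARC 1 at $0.52/task: ${daily_earnings(0.52, 2.0):.2f} per day")
    # Price per task that would pay an assumed $1,000 per day at the same pace.
    print(f"Price for $1000/day: ${price_needed(1000.0, 2.0):.2f} per task")
```

Under these assumptions, "thousands of dollars a day" only works out if the per-task price is several dollars or more, so the claim depends heavily on how fast a person actually solves the tasks.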

How much do I get if I solve this? :D

https://arcprize.org/play

You are saying something interesting but too esoteric. Can you explain for beginners?