If you read the charter of the eval (or any eval, really), this statement is pretty silly.
The whole point of each eval version is to identify a chunk of challenges that humans do well on but AI can't yet handle. When AI gets to ~80%, you move on to the next chunk. When you run out of challenges, you have AGI.