And so on and so forth. Again, I'm not saying this is impossible. I am saying that even if you tried it, got the money, built the test, got the human-subjects clearance, and ignored that at least one more frontier model would come out while you did all that, you can count on HN anklebiting your "rigorous" study anyway. And HN would probably be right about a lot of the issues, because it would take several iterations to build a reasonable protocol... at which point it would quite possibly be obsoleted by progress again.
[1] https://blog.neurips.cc/2025/09/30/reflecting-on-the-2025-re...
Far more opportunities open up when the world's intellectuals have the raw weights and can fine-tune, splice, distill, and reapply them.
Imagine having raw, unfettered access to Fable. It can be refit to structural biology. It can be fine-tuned on the repo for smaller context requirements. It can be run cheaper and air-gapped.
The world wants this.
I think we are already leaving the mainframe era of AI and entering the PC era. If there weren’t a RAM shortage and we all had 2TB of RAM and GPUs, we would all have large local models or personal APIs serving our teams.
That’s why all the labs are moving to the app layer and away from being the API for intelligence, as they originally were.
That said, maybe we just disagree on how to drive change, and that’s fine. I’ll leave it.