Meaningful in the sense that it could find security vulnerabilities in browsers and kernels that >99% of engineers couldn't find.
I'm talking about output quality compared to parameter size.
Mythos is not 4 orders of magnitude larger than Opus - it's quite possible (likely, even) that no LLM ever reaches that size, and its output is only barely better...
> Mythos is not 4 orders of magnitude larger than Opus
Again, can you define this? What would 4 orders of magnitude better look like?