This might not be what we are facing here, but seeing how little moat anyone in AI has, I just can't discount the risk. And then, instead of the consumers of today getting a great deal, we zoom out and see that 5x more was spent developing the tech than was needed, and that's not all that great economically as a whole. It's not as if the weights from a 3-year-old model are useful capital to be reused later, the way the dot-com boom left us with far more fiber than was needed, fiber that could be bought and lit up profitably later.
If Sonnet 4.6 is actually "good enough" in some respects, maybe the models will just get cheaper along one branch, while they get better on a different branch.
But LLMs, and AI-related tooling, seem to really buck that trend: they're obsoleted almost as soon as they're released.
Before ChatGPT was even released, Google had an internal-only chat-tuned LLM. It went "viral" because one of its testers thought it was sentient, which caused a whole media circus. This is partially why Google was so ill-equipped to even start competing - they had fresh wounds from that circus.
My pet theory, though, is that this news is what inspired OpenAI to chat-tune GPT-3, which was a pretty cool text-generation model but not a chat model. So it may have been a necessary step in getting chat LLMs out of Mountain View and into the real world.
https://www.scientificamerican.com/article/google-engineer-c...
https://www.theguardian.com/technology/2022/jul/23/google-fi...
Where would we be if patents never existed?
That was also brilliant marketing.
It was kinda like having muskets against indigenous tribes in the 1400s-1500s vs. a machine gun against a modern city today. The machine gun is objectively better but has not kept pace with the increase in defensive capability of a modern city with a modern police force.
> Nearly a year ago we wrote in the OpenAI Charter: “we expect that safety and security concerns will reduce our traditional publishing in the future, while increasing the importance of sharing safety, policy, and standards research,” and we see this current work as potentially representing the early beginnings of such concerns, which we expect may grow over time. This decision, as well as our discussion of it, is an experiment: while we are not sure that it is the right decision today, we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas. -- https://openai.com/index/better-language-models/
Then over the next few months they released increasingly large models, with the full model made public in November 2019 (https://openai.com/index/gpt-2-1-5b-release/), well before ChatGPT.
I wouldn't call it rewriting history to say they initially considered GPT-2 too dangerous to be released. If they'd applied this approach to subsequent models rather than making them available via ChatGPT and an API, it's conceivable that LLMs would be 3-5 years behind where they currently are in the development cycle.
> Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT‑2 along with sampling code.
"Too dangerous to release" is accurate. There's no rewriting of history.
It's quite depressing.
There's a world of difference between what's happening and what RAM prices would look like if OAI and others were just bidding on modules as they were produced.
> You will need one cup King Arthur All Purpose white flour, one large brown Eggland’s Best egg (a good source of Omega-3 and healthy cholesterol), one cup of water (be sure to use your Pyrex brand measuring cup), half a cup of Toll House Milk Chocolate Chips…
> Combine the sugar and egg in your 3 quart KitchenAid Mixer and mix until…
All of this will contain links and AdSense-looking ads. For $200/month they will limit it to in-house ads for their $500/month model.
LLM providers don't, really. As far as I can tell, their moat is the ability to train a model and possession of the hardware to run it. Also, open-weight models provide a floor for model training. I think their big bet is that gathering user data from interactions with the LLM will be so valuable that it results in substantially better models, but I'm not sure that's the case.
That level of fierce internal competition is a massive reason why they are beating us so badly on cost-effectiveness and innovation.
It took a lot of work for environmentalists to get some regulation in place in the US, Canada, and the EU. China will get to that eventually.