It doesn't stop there though. OpenAI is currently mired in a capital crunch. Their last round just about sucked all the dry powder out of the private markets. Folks are now starting to ask difficult questions about their burn rate and revenue. It increasingly looks like they might not follow through on the purchase order that kick-started this whole panic over RAM.
Soo ... how sure are we that the memory makers themselves are not going to be the ones holding the bag?
(Well that and collusion)
You get market signals that the demand is there, you acquire the necessary capital, you spend 5 years building capacity, but guess what, 5 other market players did the same thing. So now you are doomed, because the market is flooded and you have low cash flow, since you need to drop prices to compete for pennies.
Now you cannot find capital, you don't invest, but guess what, neither did your competitors. So now demand is higher than supply. Your price per unit sold skyrockets, but you don't have enough capacity!
Rinse and repeat.
Capitalists claim that this is optimal.
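The dynamic is easy to sketch as a toy cobweb model, the same shape economists use for the pork cycle. Everything below is invented for illustration, nothing is calibrated to actual DRAM economics:

    # Suppliers size next period's capacity from today's price, but the
    # new capacity only lands after the build delay, so the market
    # perpetually over- and under-shoots.
    def price(supply):
        # linear inverse demand: flooded market -> cheap, starved -> expensive
        return max(200 - supply, 1)

    supply = 80  # start in undersupply, i.e. high prices
    for period in range(8):
        p = price(supply)
        print(f"period {period}: supply={supply:5.1f} price={p:5.1f}")
        # everyone expands while prices are high; the fabs arrive next
        # period, after the price has already moved
        supply = 20 + 1.0 * p

With the supply response at 1.0 the oscillation never damps; push it above 1 and every cycle is worse than the last.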
If anything, it shows it's possible for you to arbitrage this and in doing so help "smooth out the cycle."
It's more optimal than planned economies until we have AI planned economies with realtime feedback, I guess.
Consumers get cheap goods during the oversupply, the most inefficient companies get eliminated during the bust, and consolidation leads to economies of scale.
There is an alternative where legislation dampens this behavior, but short-term profits would be lower. Hence the hawks don’t like it.
Because that does not happen exactly as you say for all players. The demand signals will be processed and long-term risk is balanced against short-term gain in a distributed fashion, so not everyone will do the same.
This view isn't correctly updated post-Claude Code and Codex. There will clearly be sufficient demand.
If they could make this stuff and sell it to regular people a decade ago for very palatable prices, why do they come up with the idea that this is the technology of the gods, unaffordable by mere mortals?
Heck, I have a phone with a 16-bit memory bus, for instance. The high(ish) clock rate only slightly makes up the difference.
But with general prices on all components going up, it might not be such a big factor any more.
HBM might make sense for higher-end products, which can free up space for the lower end that will never use the tech.
5090 has 1.8 TB/s?
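Checking the arithmetic (assuming the publicly quoted 512-bit bus and 28 Gbps GDDR7 pins; treat this as back-of-envelope):

    def bandwidth_gbs(bus_bits, gbps_per_pin):
        # bytes/s = (pins * bits per second per pin) / 8
        return bus_bits * gbps_per_pin / 8

    print(bandwidth_gbs(512, 28))   # 1792 GB/s, i.e. ~1.8 TB/s
    print(bandwidth_gbs(16, 8.5))   # ~17 GB/s for a 16-bit LPDDR5X phone bus

Which is also why the narrow phone bus mentioned upthread can't clock its way out of the gap.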
That is, memory capacity is reserved for datacenters yet to be built, but this will do weird things if said datacenter construction is postponed or cancelled altogether.
Does the Netherlands host a large proportion of global datacenters?
In most other places the percentage is significantly lower than that, and then you can easily add more of the cheap-but-intermittent stuff, because a cloudy day only requires you to make up a 10% shortfall instead of a 50% one, which existing hydro or natural gas plants can handle without new storage when there are more of them to begin with.
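The arithmetic behind that, with invented shares (assume solar drops to ~20% of nominal on a cloudy day):

    for solar_share in (0.5, 0.1):
        gap = solar_share * (1 - 0.2)   # fraction of demand to cover elsewhere
        print(f"solar at {solar_share:.0%} of the grid -> cover {gap:.0%} of demand")

At a 50% solar share the rest of the grid has to cover ~40% of demand; at a 10% share it's ~8%, which existing dispatchable plants can absorb.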
In every country? Citation needed.
The memory makers specifically did not scale up capacity to avoid being left holding the bag.
OpenAI (or whoever) crashes and can't pay for the order leaving the memory makers in a tough spot.
The Fiji XT architecture had 512 GB/s on a 4096-bit HBM bus back in 2015.
The Vega architecture did have 400 GB/s or so in 2017, which was a bit of a downgrade.
Very few applications other than GPUs need HBM.
The real issue is everyone wanting to upgrade to HBM, DDR5, and NVMe Gen 5 at the same time.
I hope they do, they did not have to agree to sell so much RAM to one customer. They’ve been caught colluding and price fixing more than once, I hope they take it in the shorts and new competitors arise or they go bankrupt and new management takes over the existing plants.
Don’t put all your eggs in one basket, as the old saying goes.
We aren't. The remaining memory manufacturers fear getting caught in a "pork cycle" yet again - that is why there's only the three large ones left anyway.
China has memory makers who are creeping up through the stages of production maturity, and once they hit parity there's no going back.
If the existing makers can't meet supply and Chinese exports get their foot in the door, they may find they never get ahead again due to volume: that domestic market is huge, so the Chinese makers have scale, and the gaming market isn't going to care, because it will take anything at the moment. That's all you'll need for enterprise to say "are we really afraid of memory in this business?"
Wasn't the problem here that OpenAI was negotiating with Samsung and SK Hynix at the same time without the other one knowing about it? People only realized the implications when they announced both deals at once.
They act as a de-facto monopoly and milk us. Why is this allowed?
Nobody is "allowing" this. It's a natural property of being both advanced technology and a commodity at the same time.
Recently they had a second price fixing lawsuit thrown out (in the US).
Now with the state of things I'm sure another lawsuit will arrive and be thrown out, because the government will do anything to keep the AI bubble rolling, and a price-fixing suit will be a threat to national security, somehow. Obviously that's speculation and opinion, but to be clear: people are allowing it. There are, and even more so were, things that could be done.
It started with Reagan, and even parties on the “left” in the West believe in it, with very few exceptions.
The thing that enables this is pretty obvious. The population is divided into two camps, the first of which holds the heuristic that regulations are "communism and totalitarianism" and this camp is used to prevent e.g. antitrust rules/enforcement. The second camp holds the heuristic that companies need to be aggressively "regulated" and this camp is used to create/sustain rules making it harder to enter the market.
The problem is that ordinary people don't have the resources to dive into the details of any given proposal but the companies do. So what we need is a simple heuristic for ordinary people to distinguish them: Make the majority of "regulations" apply only to companies with more than 20% market share. No one is allowed to dump industrial waste in the river but only dominant companies have bureaucratic reporting requirements etc. Allow private lawsuits against dominant companies for certain offenses but only government-initiated prosecutions against smaller ones, the latter preventing incumbents from miring new challengers in litigation and requiring proof beyond a reasonable doubt.
This even makes logical sense, because most of the rules are attempts to mitigate an uncompetitive market, so applying them to new entrants or markets with >5 competitors is more likely to be deleterious, i.e. drive further consolidation. Whereas if the market is already consolidated then the thicket of rules constrains the incumbents from abusing their dominance in the uncompetitive market while encouraging new entrants who are below the threshold.
Oh no!
If they quickly add enough capacity to meet current demand, then if demand crashes they are still holding billions of dollars in loans used to build capacity for demand that no longer exists, and they go bankrupt.
The biggest problem is predicting future demand, because it often declines quickly rather than gradually.
There’s virtually infinite capital: if needed, more can be reallocated from the federal government (funded with debt), from public companies (funded with people’s retirement funds), from people’s pockets via wealth redistribution upwards, from offshore investment.
They will be allowed to strangle any part of the supply chain they want.
Another point: I often see the money argument - like country X has more money, so they can afford to do more and better R&D and make more stuff.
This stuff comes out of factories that need to be built, machinery that needs to be procured, and engineers who need to be trained and hired.
[1] https://www.tomshardware.com/tech-industry/semiconductors/ym...
> more can be reallocated from the federal government (funded with debt)
While this is the most reliable funding, it's still not very accessible. OpenAI is a money pit, and their demands are growing quickly. The US government has started a bunch of very expensive spending programs. If OpenAI were to require yearly bundles of its recent "$120B" deal, that's 6% of the US' discretionary budget, or 12.5% of the non-military discretionary budget (and the military is going to ask for a lot more money this year). Even the idea of just issuing more debt is dubious, because they're going to want to do that to pay for the wars that are rapidly spiralling out of control.
None of this is saying that the US government can't or wouldn't pay for it, but it's non-trivial, and it's unclear how much Altman can threaten the US government with "give me a trillion dollars or the economy explodes" without consequences.
Further deficit spending isn't without its risks for the US government either. Interest rates are already creeping up, and a careless explosion of the deficit may well trigger a debt crisis.
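Back-of-envelope on the 6% / 12.5% above (the budget totals are the rough figures those percentages imply, not official numbers):

    deal = 120e9                # one yearly "$120B"-scale bundle
    discretionary = 2.0e12      # ~US discretionary budget, rough
    non_military = 0.96e12      # ~non-defense discretionary slice, rough
    print(f"{deal / discretionary:.1%} of discretionary")  # -> 6.0%
    print(f"{deal / non_military:.1%} of non-military")    # -> 12.5%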
> from public companies (funded with people’s retirement funds)
This would come at great cost. OpenAI would need to open up about its financial performance to go public itself. With its CFO being put on what is effectively administrative leave for pushing against going public, we can assume the financials are so catastrophic an IPO might bomb and take the company down with it. Nobody's going to invest privately in a company that has no public takers.
Getting money through other companies is also running into limits. Big Tech has deep pockets but they've already started slowing down, switching to debt to finance AI investment, and similarly are increasingly pressured by their own shareholders to show results.
> from people’s pockets via wealth redistribution upwards
The practical mechanism of this is "AI companies raise their prices". That might also just crash the bubble if demand evaporates. For all the hype, the productivity benefit hasn't really shown up in economy-wide aggregates. The moment AI becomes "expensive", all the casual users will drop it. And the non-casual users are likely to follow. The idea of "AI tokens" as a job perk is cute, but exceedingly few are going to accept lower salary in order to use AI at their job.
There's simply not much money to take out of people's pockets these days, with how high cost of living has gotten.
> from offshore investment.
This is a pretty good source of money. The wealthy Arabian oil states have very deep slush funds, extensively investing in AI to get ties to US businesses and in the hope of diversifying their resource economies.
...
...
"Was". Was a good source of money.
Even if gaming is and will remain very popular for years, it, and the desire to upgrade gaming rigs, is still a discretionary activity with more price elasticity of demand than corporate uses for RAM at the dawn of the AI age. Gamers live on the margin of this market, where low prices will stimulate upgrades and high prices will lead to holding out. The complaints about price are real, but that segment of the market is some combination of less large and less important.
And hopefully kill Electron.
I have never seen the point of spinning up a 300+ MB app just to display something that ought to need only 500 KB to paint onto the screen.
I don’t see how design workflows matter in the conversation about cross-platform vs native and RAM efficiency since designers can always write their mockups in HTML/CSS/JS in isolation whenever they like and with any tool of their choice. You could even use purely GUI-based approaches like Figma or Sketch or any photo/vector editor, just tapping buttons and not writing a single line of web frontend code.
It's bad enough having to run one bloated browser; now we have to run multiples?
This is not the right path.
Now that everyone who can't be bothered vibe-codes, and Electron apps are the over-evangelized norm, people will probably not even worry about writing JS, and Electron will be here to stay. The only way out is to evangelize something else.
Like how half the websites have giant in-your-face cookie banners and half have minimalist banners. The experience will still suck for the end user, because the dev doesn't care and neither do the business leaders.
If a JS dev really wanted to, it wouldn't be a huge uphill climb to code a C app, because the syntax and concepts are similar enough.
This comment makes no sense.
About the only thing they share is curly braces.
Pressure to optimize more often means setting aside work to bring the program nearer to its algorithmic bounds, rather than doing whatever was quickest to implement and not caring about any of it. Given the same amount of time, replacing bloated abstractions with something more lightweight usually nets more memory gains overall than trying to tune something heavy to use less RAM at the expense of more CPU.
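A toy illustration of the "replace the bloated abstraction" point, in Python (exact sizes vary by runtime; the ratio is the point):

    import sys
    from array import array

    N = 100_000

    # Heavyweight: one dict per sample, each with its own hash-table
    # overhead plus boxed float objects.
    heavy = [{"x": float(i), "y": float(i)} for i in range(N)]
    heavy_bytes = sys.getsizeof(heavy) + sum(sys.getsizeof(d) for d in heavy)

    # Lightweight: two flat arrays of C doubles holding the same data.
    xs = array("d", (float(i) for i in range(N)))
    ys = array("d", (float(i) for i in range(N)))
    light_bytes = sys.getsizeof(xs) + sys.getsizeof(ys)

    # heavy_bytes even undercounts (it skips the boxed floats) and is
    # still several times larger than the flat layout.
    print(heavy_bytes // 1024, "KiB vs", light_bytes // 1024, "KiB")

Same data, no CPU traded away, most of the memory back.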
There's a ton of software out there where optimisation of both memory and CPU has been pushed to the side, because development hours are more costly than a bit of extra resource usage.
Given that TurboQuant results in a 6x reduction in memory usage for KV caches and up to 8x boost in speed, this optimization is already showing up in llama.cpp, enabling significantly bigger contexts without having to run a smaller model to fit it all in memory.
Some people thought it might significantly improve the RAM situation, though I remain a bit skeptical - the demand is probably still larger than the reduction TurboQuant brings.
> Given that TurboQuant results in a 6x reduction in memory usage for KV caches
It all depends on the baseline. The "6x" comes from a comparison against a BF16 KV cache, not a state-of-the-art 8- or 4-bit KV cache scheme.
Current "TurboQuant" implementations are about 3.8X-4.9X on compression (w/ the higher end taking some significant hits of GSM8K performance) and with about 80-100% baseline speed (no improvement, regression): https://github.com/vllm-project/vllm/pull/38479
For those not paying attention, it's probably worth sending this and the ongoing discussions for vLLM https://github.com/vllm-project/vllm/issues/38171 and llama.cpp through your summarizer of choice - TurboQuant is fine, but not a magic bullet. Personally, I've been experimenting with DMS, and I think it has a lot more promise and can be stacked with various quantization schemes.
The biggest savings in KV cache, though, come from improved model architecture. Gemma 4's SWA/global hybrid saves up to 10x on KV cache, MLA/DSA (the latter of which helps solve global-attention compute) does as well, and using linear or SSM layers saves even more.
None of these reduce memory demand (Jevons paradox, etc.), though. Looking at my coding tools, I'm using about 10-15B cached tokens/mo currently (was 5-8B a couple of months ago), and while I think I'm probably above average on the curve, I don't consider myself to be doing anything especially crazy. This year, between mainstream developers and more and more agents, I don't think there's really any limit to the number of tokens that people will want to consume.
For example, Gemma 4 32B, which you can run on an off-the-shelf laptop, is at around the same intelligence level as, or even higher than, the SOTA models from 2 years ago (e.g. gpt-4o). Probably by the time memory prices come down we will have something as smart as Opus 4.7 that can be run locally.
Bigger models of course have more embedded knowledge, but just knowing that they should make a tool call to do a web search can bypass a lot of that.
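The laptop claim is mostly weights arithmetic (the quantization level is the assumption here, and the KV cache still comes on top):

    params = 32e9                     # a 32B-parameter model
    for bits in (16, 8, 4):
        gb = params * bits / 8 / 1e9  # weight bytes only
        print(f"{bits}-bit weights: ~{gb:.0f} GB")
    # 16-bit: ~64 GB (no laptop), 8-bit: ~32 GB, 4-bit: ~16 GB, which
    # fits a well-specced laptop with some headroom left for context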
That is the sad reality of the future of memory.
Given the current tech, I also doubt there will be practical uses, and I hope we’ll see the opposite of what I wrote. But given the current industry, I fully trust them to somehow fill their hardware.
Market history shows us that when the cost of something goes down, we do more with the same amount, not the same thing with less. But I deeply hope to be wrong here and that the memory market will relax.
I hate to mention the Jevons paradox, as it has become a cliche by now, but this is a textbook case of it.
[0] https://techwireasia.com/2026/04/chinese-memory-chips-ymtc-c...
>CXMT still trails Samsung, SK Hynix, and Micron by approximately three years in advanced DRAM node development, and yield rates on new production lines remain the variable that determines whether capacity targets translate into reliable supply. Liu notes that lines launched in the second half of 2026 are unlikely to change the global supply-demand balance until 2027.
The Verge article talks about demand exceeding supply in 2028. Your article suggests it'll take until 2029 before Chinese production catches up to current technology.
It'll help drive prices down in five years, but Chinese memory production won't be ready and efficient enough to prevent the shortages from continuing to grow.
But software optimisation helps all hardware, and that doesn't drive sales.
Linux, however, they don't have to worry about that. Maybe it is finally the era of Haiku OS, as the ghost of BeOS rises!
Assuming China takes TSMC in one piece (unlikely without internal sabotage in the best case scenario), it would still probably take years before it produces another high end GPU or CPU.
We would probably be stuck with the existing inventory of equipment for a long time…
The risk with China taking over Taiwan is that they mostly expedite their own production research by a couple of years.
Have you seen how many states and countries look enviously at Silicon Valley’s tech companies, China’s manufacturing dominance, or London’s financial sector and try to replicate them?
Turns out it’s way harder than you’d expect.
Hell, Intel can’t match TSMC despite decades of expertise, much greater fame, and regulators happy to change the law and hand out tens of billions in subsidies.
Anyone trying to spin up a competitor to TSMC would have to first overcome a significant financial hurdle: the capital investment to build all the industrial equipment needed for fabrication.
Then they'd have to convince institutions to choose them over TSMC when they're unproven, and likely objectively worse than TSMC, given that they would not have its decades of experience and process optimization.
This would be mitigated somewhat if our institutions had common-sense rules in place requiring multiple vendors for every part of their supply chain—note, not just "multiple bids, leading to picking a single vendor" but "multiple vendors actively supplying them at all times". But our system prioritizes efficiency over resiliency.
A wealthy nation-state with a sufficiently motivated voter base could certainly build up a meaningful competitor to TSMC over the course of, say, a decade or two (or three...). But it would require sustained investment at all levels—and not just investment in the simple financial sense; it requires people investing their time in education and research. Dedicating their lives to making the best chips in the world. And the only reason that would work is that it defies our system, and chooses to invest in plants that won't be finished for years, and then pay for chips that they know are inferior in quality, because they're our chips, and paying for them when they're lower quality is the only way to get them to be the best chips in the world.
They have the other system.
> A wealthy nation-state with a sufficiently motivated voter base could certainly build up a meaningful competitor to TSMC over the course of, say, a decade or two (or three...).
Demand increased, everyone built new fabs, then prices dropped and they couldn't pay off their investments. Many went out of business. It happened in the 80s, it happened in the 90s, it happened in the 2000s.
Now there's only three manufacturers left, and they know very well that demand for their product tends to be cyclical.
I've been in the industry for 30 years, and I've worked at companies with fabs where demand was so high that customers would only get 30% of what they ordered. Then just 2 years later our fab was running at only 50% capacity and losing money. It takes about $20 billion and 3-4 years to build a modern new fab. If you think that AI is a bubble, then do you want to be left with a shiny new factory and no products to sell because demand has collapsed?
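The asymmetry in rough numbers (only the ~$20B build cost comes from above; everything else is invented):

    capex = 20e9              # new fab build cost
    life = 7                  # assumed depreciation window, years
    fixed = capex / life      # ~$2.9B/yr owed whether wafers ship or not

    full_load_revenue = 5e9   # hypothetical revenue at 100% utilization
    for util in (1.0, 0.5):
        revenue = full_load_revenue * util
        print(f"utilization {util:.0%}: ${revenue/1e9:.1f}B revenue "
              f"vs ${fixed/1e9:.1f}B/yr fixed")
    # at 100% the fab clears its fixed cost; at 50% it loses money every year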
The lawsuits in the past prove that it's not just a de-facto monopoly but an actual one.
I don't want to pay more because of AI companies driving the price up. That is milking.
Think I will scrap my PC and sell its parts.
I wonder if there are any niche companies out there building decent rigs with DDR3 and 5th/6th-generation Intel CPUs; the parts are cheap, and it might be a business opportunity?
Another thing I've been thinking about is what happens when the next generation of Nvidia chips comes out? I suspect Nvidia is going to delay this to milk the current demand, but at some point you'll be able to buy something better than the H100 or B200 or whatever the current state of the art is, for half the price. And what's that going to do to the trillions in AI DC investment?
I'm interested in when the next bump in DRAM chip density is coming. That's going to change things, although it seems like much of production has moved from consumer DRAM chips to HBM chips, so maybe that won't help at all.
I do think that companies will start seeing little or no return from the billions spent on AI, and that's going to be a problem. I also think that the hundreds of billions in capital expenditure at OpenAI are going to come crashing down, as there just isn't any even theoretical future revenue that can pay for all that.
They'll just spend whatever they were planning to spend and get more performance.
We have a RAM shortage now; we will have very cheap RAM tomorrow. It's not like production is bottlenecked by raw materials. Chip companies just need to assess whether the demand from AI companies will last, in which case it's better to scale up, or whether they should wait it out instead of oversupplying and cutting into their profits.
There are two RAM suppliers...