They are probably doing something like placing the original user prompt in the model's environment and giving the model special tools, plus iterative execution, so it can process the entire context over multiple invocations.
I think the Recursive Language Model paper has a very good take on how this might work. I've seen really good results in my local experiments with this concept:
https://arxiv.org/abs/2512.24601
With proper symbolic stack frames you get exponential scaling: the context you can cover grows exponentially with recursion depth, while each frame stays small. Handling a gigabyte of context is feasible, provided the workload fits a depth-first search pattern.
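To make the idea concrete, here's a minimal sketch of DFS-style recursive decomposition. Everything here is hypothetical, not the paper's actual method: `call_model` is a stand-in for any LLM invocation API, and `CHUNK_LIMIT` is an arbitrary per-call context budget. Each recursive call acts as one "stack frame" that sees only the query plus a slice of the context.

```python
def call_model(prompt: str) -> str:
    # Placeholder: in practice this would invoke a real model.
    return f"summary({len(prompt)} chars)"

CHUNK_LIMIT = 4000  # hypothetical max characters a single model call may see

def recurse(query: str, context: str) -> str:
    """Depth-first: answer `query` over `context`, splitting when too large.

    Each recursive call is one 'stack frame' holding only the query plus
    a slice of the context, so no single invocation sees the whole input.
    """
    if len(context) <= CHUNK_LIMIT:
        return call_model(f"{query}\n---\n{context}")
    mid = len(context) // 2
    left = recurse(query, context[:mid])   # descend the left half first
    right = recurse(query, context[mid:])  # then the right half
    # Merge the two partial answers with one more (small) call.
    return call_model(f"{query}\n---\n{left}\n{right}")
```

With branching factor 2 and leaf size `CHUNK_LIMIT`, an input of size N needs on the order of N / CHUNK_LIMIT leaf calls but only O(log N) recursion depth, which is why even very large inputs stay within per-call limits.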