Did they solve the "lost in the middle" problem? Proof will be in the pudding, I suppose. But that number alone isn't all that meaningful for many (most?) practical uses. Claude 4.5 often starts reverting bug fixes ~50k tokens back, which isn't a context window length problem.
Things fall apart much sooner than the context window length for all of my use cases (which are more reasoning related). What is a good use case? Do those use cases require strong verification to combat the "lost in the middle" problems?