So now the question is whether the capabilities of other models are worth their far cheaper token prices.
Plus, are we at all confident Opus or GPT 5.5 aren't about to get shut off?
You mean GPT 5.5 xhigh and Claude Opus 4.8 max? At least the benchmarks / public evals / rankings show that some of the new coding models (e.g. Qwen 3.7 Max & MiMo v2.5 Pro) are at Opus 4.7 & GPT 5.4 level, but 3x to 5x cheaper: https://artificialanalysis.ai/leaderboards/models / https://gertlabs.com/rankings Personally, over the past month or so, I haven't missed GPT 5.4 / Opus 4.7 after moving to Qwen 3.7 / MiMo 2.5 / Kimi 2.6 et al.
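Comparing models at their vendor-specific effort settings is apples to oranges, since "xhigh" on one model and "max" on another don't correspond to the same thinking budget. Ideally the evals would use a thinking token cap or something, so we can compare per-token performance. But eval is hard enough as it is.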
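Roughly what I have in mind, as a sketch only: score each model after the fact under a shared thinking-token cap, counting any task that went over the cap as a failure. That is a crude stand-in for actually enforcing the cap at run time (it assumes an over-budget run would not have finished), and `TaskResult`, both function names, and the numbers in `demo` are made up for illustration, not taken from any real eval.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    passed: bool           # did the model solve the task?
    thinking_tokens: int   # reasoning tokens it spent on that task


def score_under_cap(results, cap):
    """Pass rate when any task that used more than `cap` thinking tokens counts as failed."""
    if not results:
        return 0.0
    solved = sum(1 for r in results if r.passed and r.thinking_tokens <= cap)
    return solved / len(results)


def passes_per_1k_tokens(results, cap):
    """Solved tasks per 1,000 thinking tokens, with per-task spend clipped at `cap`."""
    spent = sum(min(r.thinking_tokens, cap) for r in results)
    solved = sum(1 for r in results if r.passed and r.thinking_tokens <= cap)
    return 1000 * solved / spent if spent else 0.0


# Placeholder data, purely illustrative.
demo = [
    TaskResult(True, 500),
    TaskResult(True, 3000),   # solved, but over a 2000-token cap
    TaskResult(False, 1000),
    TaskResult(True, 1500),
]

if __name__ == "__main__":
    print(score_under_cap(demo, cap=2000))
    print(passes_per_1k_tokens(demo, cap=2000))
```

Run the same cap across every model and the effort labels drop out of the comparison; sweeping the cap gives a score-vs-budget curve instead of a single number.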
I honestly think that DeepSeek is as good as, and sometimes even better than, the competition.