There is no reason to benchmark against Opus 4.5 when Opus 4.6 has been out so long, other than to be misleading.
reply
I can see reasons, among them that 4.5 was the established baseline while they were preparing this version. "So long" is merely 2 months, and Qwen 3.5 itself was released less than 2 months ago. They were likely already finalizing 3.6 before the official 3.5 launch, around the time 4.6 came out.

In any case, Claude fanboyism aside, having other players inch closer to similar performance is always useful. Even if they are "6 months behind", as the pace slows down this guarantees that there's no huge moat: they'll eventually either get to where the SOTA is, or the difference won't be that big.

I'd rather put fewer eggs in 2-3 big player baskets.

reply
And it seems they've decided to go closed-source for their largest, best models.
reply
3.5-Plus was also only available via API. I don’t know what the long-term business model for open weights is. I hope there is one, but it seems foolish to assume that companies will be willing to spend millions of dollars of compute on an asset worth zero in perpetuity.
reply
The business case is to salt the earth for new competitors, coupled with marketing.
reply
They've always had closed-source variants:

- Qwen3.5-Plus

- Qwen3-Max

- Qwen2.5-Max

etc. Nothing really changed so far.

reply
They always did that. Did they say anywhere they'd open all their models? They still have a business.