4.6 was released at the beginning of February, so if the Chinese models only "tickle its tail," that means they're >5 months behind.
reply
That comparison is also misleading because Opus 4.6 was probably not Anthropic's frontier model.

We got the first news about Mythos in March, so it is likely that it was already close to ready by the time Opus 4.6 was released.

So the actual gap is the time elapsed between March (or April for the official announcement) and whenever Chinese models can match Mythos.

reply
Post-training a model that size takes months, though it "works" before that. It is a big, chunky model before it's released to the world and probably does some amazing things, sometimes... but it wasn't done (else why wouldn't they release it and soundly trounce their competitors?). I would assume that Chinese AI companies have a pipeline as well, and that what we see is a couple of months, maybe a few, behind their newest model. Like, the new base model is cooked, but they're still plating it for service.

Why would Anthropic get the benefit of pre-release models counting toward their lead, if nobody else gets to count their pre-release models?

reply
> The leading Chinese models are only a few months behind now
reply
I hear that often, but what does that even mean? I am a great proponent of open-weights models. I do believe they are the only reason we have not stagnated into collusion to halt (public) model releases.

But at exactly which point in time is z.ai being compared to claude.ai? Consistently being "6 months behind" in an exponentially accelerating evolution means the gap is growing exponentially wider, not constant.
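
To make that arithmetic concrete, here is a minimal sketch. It assumes capability can be reduced to a single score that doubles at a fixed interval; the growth model, the 6-month doubling time, and the function names are all hypothetical, not measurements of any real model. Under those assumptions a fixed time lag gives an absolute gap that grows exponentially, while the leader-to-follower ratio stays constant.

```python
def capability(t_months: float, doubling_months: float = 6.0) -> float:
    """Hypothetical capability score that doubles every `doubling_months`."""
    return 2.0 ** (t_months / doubling_months)


def gap(t_months: float, lag_months: float = 6.0,
        doubling_months: float = 6.0) -> tuple[float, float]:
    """Return (absolute gap, leader/follower ratio) for a fixed time lag."""
    leader = capability(t_months, doubling_months)
    # The follower is wherever the leader was `lag_months` ago.
    follower = capability(t_months - lag_months, doubling_months)
    return leader - follower, leader / follower


if __name__ == "__main__":
    for t in (0, 6, 12, 18, 24):
        absolute, ratio = gap(t)
        print(f"t={t:2d} months  absolute gap={absolute:6.2f}  ratio={ratio:.2f}")
```

Whether "the gap" means the absolute difference or the ratio is exactly the ambiguity here: the first grows, the second does not.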

reply
"an exponentially accelerating evolution"

Oh? Exponentially accelerating, huh? That's quite a surprise, to me.

reply
What range of numbers do you believe "a few" represents?
reply
Opinions vary, but:

A couple: usually 2, though not always

A few: 3, 4, 5

Several: 4, 5, 6, or 7.

reply
> A couple: usually 2, though not always

I had to explain this to my German friend. In my understanding this isn't about the actual number, it's about the certainty. If it's absolutely and definitely two, then I say two. If I'm uncertain but it's probably two, or if a non-integer, somewhere around two, then I say couple.

And few is more likely to be 3 than 5, because 5 is getting close to a "half-dozen or so", or (as you say) several.

Many is very context-sensitive, as the meme has it.

So I would agree that the open models are a few months behind, definitely more than a couple of months behind, possibly several months behind, maybe a half-dozen months or so behind, but not many months behind.

reply
In the UK, as far as I can tell, a couple is 2. Not around 2. Not maybe 3 or 4. Always 2.

3 or 4 would likely be a few, or some. 1 is, well, one.

reply
Several and a few are the same number, they only differ rhetorically.
reply
I think several is used by most speakers for larger quantities than few. It has the connotation of being larger, and that changes usage.
reply
Certainly below 6!
reply
What's the leading Claude Code competitor model over in China?
reply
So I keep hearing.
reply
Another day, more cope on this subject from many posters on here...
reply
This is nonsense.

The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

China has no flywheel for long-form agentic traces like Claude Code and its telemetry over its userbase (no one uses the Chinese harnesses yet). Most Chinese models are forced to price themselves significantly below cost to compete with the huge demand for bootleg claude tokens, because they're that much worse.

reply
> is estimated at 10 months by Anthropic themselves, and it's growing.

How is this different than any business with something to lose saying a competitor isn't as good? Not saying it's false, but it would seem to me that it's more important how customers feel about the issue.

reply
Didn't Elon Musk say the same or even worse about BYD? He isn't laughing anymore, though.
reply
Ah, well, if Anthropic says their competitors are ten months behind...

I don't know what I was thinking.

reply
Here in Australia the sudden withdrawal of Fable made all of us think hard about models and harnesses.

In the last few weeks, I've heard half a dozen people talk about how a less advanced model coupled with a better harness outperforms a smarter model.

If the USA wanted to shoot its AI industry in the foot it achieved its goal.

reply
Which products are you now using?
reply
> The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

There's a lot of subjectivity in determining this, but I'm 100% sure that 10 months is wrong.

I don't know whether the gap is currently growing, but I'm not sure it matters. There are thresholds where models reach certain levels of usefulness. Opus 4.8, for example, is at a level where I can give it relatively vague input, and it can go for half an hour on its own and produce a high-quality PR.

If GLM reaches that level of capability and can do that task more cheaply than Anthropic's model, I will use GLM for that task, because that's a specific type of task I use models for. It doesn't really matter whether Anthropic also has a better model, because what does "better" mean in this context? It's a clearly defined task, and Opus 4.8 already does it at a very high level of quality.

reply
If Anthropic themselves say competition is 10 months behind, it's probably 5 or less.

And you seem to think "no one uses" DeepSeek's v4, z.AI's GLM 5.2 or Xiaomi's MiMo 2.5 from their official APIs, when they probably dwarf Anthropic's usage and are widening the gap by conquering a chunk of the Western market too.

I know it's hard for some to comprehend that there's an entire Eastern hemisphere on the globe with billions of people, so it's worth reminding. Some even seem to think the world is basically Silicon Valley.

reply
Because Claude subscription tokens are cheaper than DeepSeek and friends. There is a whole industry of people reselling Claude subscriptions in China.

Can you comprehend that Anthropic is winning because it is both cheap (subscriptions) and better (SOTA)? People are cheering Chinese providers when in reality they would rug-pull open weights the moment they are competitive.

Chinese models are trash; that's why they are giving them away for free.

For individuals and small companies, subscriptions are the best deal; for big companies, Chinese models are a big no unless they can host them.

reply
No, Claude subscription tokens are not cheaper than the Deepseek API. You are dead wrong on that.
reply
Not sure why you're being downvoted for being objectively correct.

HN is full of contrarians and folks who don't know what they're talking about in regards to AI.

reply
> The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

#1 I've had use cases where it was clearly obvious the Chinese models were behind.

#2 I've also had use cases where I couldn't tell a difference at 1/20th of the price.

The problem is that #1 is the use case where the American frontier is gated behind saboteur classifiers, and it is a tiny minority anyway. The vast majority of work is #2.

The gap doesn't matter anymore.

reply