4.6 was released at the beginning of February, so if the Chinese models only "tickle its tail," that means they're >5 months behind.

by felipeerias 13 hours ago

That comparison is also misleading because Opus 4.6 was probably not Anthropic's frontier model.

We got the first news about Mythos in March, so it is likely that it was already close to ready by the time Opus 4.6 was released.

So the actual gap is the time elapsed between March (or April for the official announcement) and whenever Chinese models can match Mythos.

by SwellJoe 13 hours ago

Post-training a model that size takes months, though it "works" before that. It is a big chunky model before it's released to the world and probably does some amazing things, sometimes... but it wasn't done (else why wouldn't they release it and soundly trounce their competitors?). I would assume that Chinese AI companies have a pipeline too, and that what we see is a couple/few months behind their newest model as well. Like, the new base model is cooked, but they're still plating it for service.

Why would Anthropic get the benefit of pre-release models counting toward their lead, if nobody else gets to count their pre-release models?

by trvz 16 hours ago

> The leading Chinese models are only a few months behind now

by PeterStuer 12 hours ago

I hear that often, but what does that even mean? I am a great proponent of open weights models. I do believe they are the only reason we have not stagnated into a collusion of halting (public) model releases.

But exactly which point in time is z.ai being compared to claude.ai? Consistently being "6 months behind" in an exponentially accelerating evolution means the gap is growing exponentially wider, not staying constant.
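A rough sketch of the arithmetic behind that claim, assuming plain exponential growth (the function name, the growth rate, and the 6-month lag are illustrative assumptions, not measurements of any real model): with a fixed time lag, the absolute gap grows with the curve, while the ratio between leader and follower stays the same.

```python
import math

def lagged_gap(t, lag, rate, c0=1.0):
    """Return (absolute_gap, ratio) for a follower trailing a leader by `lag`.

    Hypothetical model: capability grows as c0 * exp(rate * t).
    The follower is simply the leader's curve shifted by `lag`.
    """
    leader = c0 * math.exp(rate * t)
    follower = c0 * math.exp(rate * (t - lag))
    return leader - follower, leader / follower

# Illustrative numbers only: 6-month lag, 10% growth per month.
for months in (0, 12, 24):
    gap, ratio = lagged_gap(months, lag=6, rate=0.1)
    print(f"t={months:>2} months: absolute gap={gap:.2f}, ratio={ratio:.2f}")
```

Under these assumptions both readings hold at once: the gap in absolute terms widens exponentially, yet the follower always reaches the leader's current level after the same fixed delay.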

by SwellJoe 9 hours ago

"an exponentially accelerating evolution"

Oh? Exponentially accelerating, huh? That's quite a surprise, to me.

by SwellJoe 15 hours ago

What range of numbers do you believe "a few" represents?

by mlyle 14 hours ago

Opinions vary, but:

A couple: usually 2, though not always

A few: 3, 4, 5

Several: 4, 5, 6, or 7.

by marcus_holmes 13 hours ago

> A couple: usually 2, though not always

I had to explain this to my German friend. In my understanding this isn't about the actual number, it's about the certainty. If it's absolutely and definitely two, then I say two. If I'm uncertain but it's probably two, or if a non-integer, somewhere around two, then I say couple.

And few is more likely to be 3 than 5, because 5 is getting close to a "half-dozen or so", or (as you say) several.

Many is very context-sensitive, as the meme has it.

So I would agree that the open models are a few months behind, definitely more than a couple of months behind, possibly several months behind, maybe a half-dozen months or so behind, but not many months behind.

by cassianoleal 12 hours ago

In the UK, as far as I can tell, a couple are 2. Not around 2. Not maybe 3 or 4. Always 2.

3 or 4 would likely be a few, or some. 1 is, well, one.

by jonathrg 11 hours ago

Several and a few are the same number, they only differ rhetorically.

by mlyle 3 hours ago

I think several is used by most speakers for larger quantities than few. It has the connotation of being larger, and that changes usage.

by rafram 7 hours ago

Certainly below 6!

by pelagicAustral 10 hours ago

What's the leading Claude Code competitor model over in China?

by 6 hours ago

deleted

by Sammi 11 hours ago

So I keep hearing.

by dansquizsoft 15 hours ago

Another day, more cope on this subject from many posters on here...

by Der_Einzige 14 hours ago

This is nonsense.

The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

China has no flywheel for long-form agentic traces like Claude Code and its telemetry over its userbase (no one uses the Chinese harnesses yet). Most Chinese models are forced to price themselves significantly below cost to compete with the huge demand for bootleg claude tokens, because they're that much worse.

by brailsafe 14 hours ago

> is estimated at 10 months by Anthropic themselves, and it's growing.

How is this different than any business with something to lose saying a competitor isn't as good? Not saying it's false, but it would seem to me that it's more important how customers feel about the issue.

by brazukadev 7 hours ago

Didn't Elon Musk say the same or even worse about BYD? He isn't laughing anymore tho.

by SwellJoe 12 hours ago

Ah, well, if Anthropic says their competitors are ten months behind...

I don't know what I was thinking.

by marcus_holmes 13 hours ago

Here in Australia the sudden withdrawal of Fable made all of us think hard about models and harnesses.

In the last few weeks I've heard half a dozen people talk about how a less advanced model coupled with a better harness outperforms a smarter model.

If the USA wanted to shoot its AI industry in the foot, it achieved its goal.

by mmsimanga 3 hours ago

Which products are you now using?

by InsideOutSanta 11 hours ago

> The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

There's a lot of subjectivity in determining this, but I'm 100% sure that 10 months is wrong.

I don't know whether the gap is currently growing, but I'm not sure it matters. There are thresholds where models reach certain levels of usefulness. Opus 4.8, for example, is at a level where I can give it relatively vague input, and it can go for half an hour on its own and produce a high-quality PR.

If GLM reaches that level of capability and can do that task more cheaply than Anthropic's model, I will use GLM for that task, because that's a specific type of task I use models for. It doesn't really matter whether Anthropic also has a better model, because what does "better" mean in this context? It's a clearly defined task, and Opus 4.8 already does it at a very high level of quality.

by bel814 hours ago|

If Anthropic themselves say competition is 10 months behind, it's probably 5 or less.

And you seem to think "no one uses" DeepSeek's v4, z.AI's GLM 5.2 or Xiaomi's MiMo 2.5 from their official APIs, when they probably dwarf Anthropic's usage and are widening the gap by conquering a chunk of the Western market too.

I know it's hard for some to comprehend that there's an entire Eastern hemisphere on the globe with billions of people, so it's worth reminding. Some even seem to think the world is basically Silicon Valley.

by Chyzwar 11 hours ago

Because Claude subscription tokens are cheaper than DeepSeek and friends. There is a whole industry of people reselling Claude subscriptions in China.

Can you comprehend that Anthropic is winning because it is both cheap (subscriptions) and better (SOTA)? People are cheering Chinese providers when in reality they would rug-pull open weights the moment they are competitive.

Chinese models are trash; that's why they are giving them away for free.

For individuals and small companies, subscriptions are the best deal; for big companies, Chinese models are a big no unless they can host them.

by slopinthebag 32 minutes ago

No, Claude subscription tokens are not cheaper than the DeepSeek API. You are dead wrong on that.

by Der_Einzige 6 hours ago

Not sure why you're being downvoted for being objectively correct.

HN is full of contrarians and folks who don't know what they're talking about in regards to AI.

by gck1 10 hours ago

> The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

#1 I've had use cases where it was clearly obvious the Chinese models were behind.

#2 I've also had use cases where I couldn't tell a difference at 1/20th of the price.

The problem is that #1 is the use case where the American frontier is gated behind saboteur classifiers, and it is a tiny minority anyway. The vast majority of work is #2.

The gap doesn't matter anymore.