undefined

upvote

points

by drevil-v219 hours ago |

upvote

by reacharavindh12 hours ago|

[-]

All this will fly until a competitor from outside the US releases a “freedom” model that is even 90% as capable as Fable was without its shackles.

But, as a frustrated EU resident lamenting a lack of European option(Mistral is just not competitive enough), I will spread my money towards the Chinese models as well. Thank you Murica! You achieved your soft power by pushing us towards the Chinese :-)

This protectionism and hypocrisy (free markets and freedom!! Until it is us who needs to practice what we preach) is so tiring. I wish European nations would come together closer and put their differences aside and realise larger things together. Become the new power that the US is clearly stumbling away from being.

reply

upvote

by burnerRhodov35 hours ago|

[-]

Europe seems to be going through an identity crisis lately, and i hope this sentiment doesn't continue. Europe becoming more reliant on the Chinese is not the answer, and will, if continues, isolate the EU from the US.

reply

upvote

by squidbeak5 hours ago|

[-]

Europe may face technological and economic challenges, but one thing it isn't suffering is 'an identity' crisis - except in the daydreams of right wing propagandists. The EU's identity is represented in its charter and the various treaties behind it.

> Europe becoming more reliant on the Chinese is not the answer, and will, if continues, isolate the EU from the US

There are sound reasons to avoid reliance on China, but the risk of isolation from a fading superpower - who befriends the EU's enemies, agitates in EU politics, inflict needless damage on the EU's economy, and insults EU leaders - isn't one of them.

reply

upvote

by bloppe5 hours ago|

[-]

If AfD, RN, Futuro Nazionale and their ilk stay on their current trajectory, the identity crisis will become much harder to ignore

reply

upvote

by calgoo5 hours ago|

[-]

Also, don't forget how much they are medeling with the EU politics. Here in Spain the ultra right wing are following the thrump playbook step by step. Now I can't prove this, but with our current government standing up to both the US and Israel, there is a feeling that a bunch of money and "think tank" guidance is happening.

reply

upvote

by SubiculumCode3 hours ago|

[-]

It's Russia and the US and EU have long been under attack.

reply

upvote

by squidbeak3 hours ago|

[-]

Russia's meddling is malign. But did Russia send the USA's Vice President to Hungary to campaign on behalf of Viktor Orban?

reply

upvote

by SubiculumCode1 hours ago|

[-]

maybe, who knows. but the propaganda from russia has been pervasive...and its cause people live vance and trump to get elected.

reply

upvote

by jamiequint4 hours ago|

[-]

"Fading superpower" is typical EU cope. It may help to be a little bit introspective about why one might want to oppose EU politics, or its leaders, whose "leadership" over the last decade has led to unprecedented migrant, economic, and energy crises, and stalling growth.

reply

upvote

by squidbeak3 hours ago|

[-]

Seeing the word 'cope' lobbed out is usually a sure sign a poster is projecting, and so it is here.

What exactly is there in the USA's destruction of the economic norms that have always served it, or in the pointless dumping of its hard-won soft power, alienation of its allies, deliberate weakening of its intelligence gatherers, rampant open corruption from its leadership, or in any other of the innumerable harms it's inflicted on itself the last 18 months, that you think is conducive to the US maintaining its superpower status?

reply

upvote

by thewebguyd3 hours ago|

[-]

No, fading is right. The US is willingly and deliberately ceding much of its soft power. The US also caused a global energy crisis by being so completely incompetent in their dealings with Iran in the on again off again toxic s/relationship/war

Even if the US isn't fading, the message is still clear: the country is adopting a more isolationist stance and has no problems alienating its allies. Why would you want to continue to tie yourself to a nation like that?

reply

upvote

by surgical_fire5 hours ago|

[-]

> Europe becoming more reliant on the Chinese is not the answer

China should be dealt with as a normal country. There's no need for undue anxiety there.

EU as a trade block should exercise reciprocity and protect its own interests accordingly though.

As for LLMs, I see no issue in using Chinese models. With the talk of digital sovereignty, you can run open source models on EU datacenters without necessarily having to spend the money to train them.

> isolate the EU from the US.

That is not a bad thing. In fact, I hope this separation grows stronger.

It was about time European countries lifted themselves from the US shadow.

reply

upvote

by WarmWash3 hours ago|

[-]

China is an authoritarian ethnostate with mock capitalism experiments.

If you want to climb into Xi Jinpeng's garden where he has absolute uncontested unilateral control for life, well, be warned.

reply

upvote

by spacedcowboy5 hours ago|

[-]

I mean, this is not necessarily a problem for the EU. Some might say it's a goal.

The USA is far more dangerous a "friend" than China is an acquaintance. China has not been threatening military annexation, China does not randomly start trade (or real) wars. China doesn't just turn away from international commitments.

Bottom line: China is a far better international partner than the USA.

reply

upvote

by petilon5 hours ago|

[-]

> China has not been threatening military annexation

Maybe not in Europe, but ask their Asian neighbors.

> The USA is far more dangerous a "friend" than China is an acquaintance.

That's true and will continue to be true for 2.5 more years. European countries too have had bad leaders (like Germany), but have recovered. So too will the US.

> China is a far better international partner than the USA.

China is not a democracy and does not share western values.

reply

upvote

by Laurel12343 hours ago|

[-]

> China is not a democracy and does not share western values.

True, but neither is the US.

reply

upvote

by energy1233 hours ago|

[-]

> China has not been threatening military annexation

They've been doing military annexation right now in the South China Sea.

> China does not randomly start trade (or real) wars.

The invasion of Vietnam? The subsidization of industry and pegging their FX?

> China doesn't just turn away from international commitments.

Abandoning Ukraine despite being a signatory to an agreement that assures their defense?

This is not an anti-China post. I don't like anti-XYZ country posts that create tension and make people defensive. I am not particularly against China more than other major powers. They have their interests and they pursue them selfishly, like other countries do. This is just a basic lesson about the world you live in.

reply

upvote

by spacedcowboy1 hours ago|

[-]

Taken as read is the context "in Europe" here - it's a comment about European reaction to China vs America, and none of the above applies to Europe.

reply

upvote

by naim085 hours ago|

[-]

> taiwan? > hk? trade wars... have you actually looked at how china uses trade as political bargaining tool??? Look at how China treats Japan, south korea, etc

reply

upvote

by vintagedave12 hours ago|

[-]

> Mistral is just not competitive enough

Does anyone know why? I was really excited when they emerged, but their models and targets don't seem to be quite in the same market.

reply

upvote

by xdertz12 hours ago|

[-]

Their target market is completely different. Anthropic and OpenAI try to build general AI that wins on all the benchmarks by throwing ungodly amounts of money at it.

Mistral focuses on long term b2b contracts and their proposition is that they fine tune their model to your needs with an added bonus of 'not dependent on America' in a politically tumultuous time.

reply

upvote

by nolok9 hours ago|

[-]

Another added bonus is that they offer a clear "self hosted" if you want it, you can get the exact same product without sending your queries to them, it's costlier and you need the hardware sure, but between the economic espionnage aspect, the sovereignty aspect, the data safety aspect ... This has teeth in europe.

reply

upvote

by ghm21999 hours ago|

[-]

An so like if a business wanted to home in on one very specific use case that could be hyper optimized by SFT, had really good support for updating and adding new features, on-Prem etc. that’s the kind of market they are in?

reply

upvote

by mike_hearn11 hours ago|

[-]

Lack of capital and (probably) lack of willingness to mass distill Anthropic and OpenAI.

reply

upvote

by plufz10 hours ago|

[-]

What would happen if they mass distilled one of the really large local models like GLM 700b or deepseek 1.6t?

reply

upvote

by grim_io10 hours ago|

[-]

At that point you might as well just host them yourself.

reply

upvote

by sulam6 hours ago|

[-]

Those already exist.

reply

upvote

by sajithdilshan10 hours ago|

[-]

That's not how the innovation works

reply

upvote

by kelipso6 hours ago|

[-]

Innovation is a pretty neutral concept. It doesn’t care about things like “what if my model learns from other models” as opposed to “what if my model learns from data I painfully curated” if the model progresses the same.

reply

upvote

by weezing8 hours ago|

[-]

Innovation is teaching your model on stolen data from literally everywhere but other models.

reply

upvote

by sajithdilshan10 hours ago|

[-]

Most probably lack of capital and talent. At the end of the day they have to compete with other giants for the chips to train the models.

reply

upvote

by joelthelion10 hours ago|

[-]

I wouldn't be surprised if they had new models up their sleeve. Could be wrong of course.

reply

upvote

by sajithdilshan9 hours ago|

[-]

I’m pretty sure they have new models, but not better ones

reply

upvote

by VeejayRampay9 hours ago|

[-]

capital and talent is the same in this context

there's no shortage of talent in Europe or France, it's just an issue of available capital

reply

upvote

by sajithdilshan9 hours ago|

[-]

What I meant was top talent. US is still the top destination for top AI talent in general

reply

upvote

by throwaway6767128 hours ago|

[-]

A large contingent of the top AI people is French, the Mistral founders worked at Meta and Google before coming back to France. The real issue is capital, French salaries are shit

reply

upvote

by inglor_cz9 hours ago|

[-]

"there's no shortage of talent in Europe or France, it's just an issue of available capital"

This is more complicated than you paint it. Countries like UAE have enough capital to throw at things and little-to-no taxation, yet they don't attract as much talent as they would like to.

Preexisting centers of excellence like Silicon Valley are attractive for young talented people precisely because a lot of older talented people are already there. The same reason why a young talented painter in 15th century would prefer Florence to some rich, but boring place elsewhere.

You can only really do a meaningful work in a "heavy" field by tightly cooperating with others, and physical proximity still matters.

reply

upvote

by seviu12 hours ago|

[-]

This. I have been using anthropic and codex subs, on max. All this changed in June. We are clearly entering an era where we cannot rely on American models. As a solo developer I value reliability over performance. I cannot pay hundred of $, plus a lot of my private time figuring out how to properly use this technology, for it to be taken away within hours.

On top of that, the intelligence is being dialed down. Sonet 5 is a living proof of this. Fable has strong guardrails, but new Sonet is a dumbed down expensive model, which already falls behind GLM 5.2 and Kimi 2.7. I might go back to Claude since I know Fable is just a limited offer, and I am not going to pay for API usage. But what they are signaling with Sonet will also come to Opus. A lobotomized more expensive model.

I am honestly baffled how the current administration is giving the whole world, on a golden plate, to China. And they don't seem too bothered about it. They are living in their own bubble and reality distortion field I guess.

I could go on endless rant about Dario, but I feel I am so strongly biased now that my judgement might be clouded.

Time to move on

reply

upvote

by jandrewrogers5 hours ago|

[-]

AI tech is clearly a Red Queen's Race[0]. In the long-term, whoever can run the fastest the longest will win. Adverse action that doesn't materially impact the rate of execution has little effect on this outcome except to the extent it reduces the rate of execution of competitors. Historically, American business is exceptional at executing this type of game and patient when it comes to making it pay for itself.

The AI model people choose today has no bearing on the ultimate trajectory of the competition. Both the US and China understand this. EU simply can't move quick enough to be competitive in this type of game, which I think they also recognize.

Everyone is betting that the model you use will be a Hobson's Choice[1] over a long enough time horizon. They are likely correct.

[0] https://en.wikipedia.org/wiki/Red_Queen%27s_race

[1] https://en.wikipedia.org/wiki/Hobson%27s_choice

reply

upvote

by orangecat5 hours ago|

[-]

the intelligence is being dialed down. Sonet 5 is a living proof of this.

Huh? Sonnet 5 is a strict improvement over Sonnet 4.6 at the same price.

reply

upvote

by Forgeties7910 hours ago|

[-]

I feel like I see this comment every few months and yet in between people keep talking about all of the functionality they’re getting out of anthropic’s offerings. It doesn’t seem to me that people are willing to give up the “shackles” as it were and we’re just going to wind up with what we’re fearing here. On top of that, local models are just not turnkey enough for the average person yet (go ahead and drop somebody into LM studio and tell them to get to work, it won’t go well).

reply

upvote

by 2muchcoffeeman7 hours ago|

[-]

The best models are clearly all from US companies.

But I think what you’ll see is people making sure the model they use can just be plugged into their workflow.

I used to use Gemini-cli until they did a Google and cancelled it in favour of anti-gravity.

That was my fault. Fool me the 10th time shame on me.

So I picked more open source offerings that I can use with any model. Once the other models are good enough, I just need to jump ship.

reply

upvote

by glimshe8 hours ago|

[-]

The reason Anthropic gets away with all of that is that Claude revenues are increasing with no end in sight. People write these entitled rants and silently go back to Anthropic and OpenAI like obedient puppies.

reply

upvote

by Forgeties797 hours ago|

[-]

It’s in increasing but they still haven’t gotten into the black. I’m pretty sure nobody has yet, right?

reply

upvote

by glimshe7 hours ago|

[-]

Probably not, but I think the investors care more about users and subscriptions than profits at this point.

reply

upvote

by Forgeties795 hours ago|

[-]

That’s the old model for sure but we’ve never seen this much investment this rapidly. They have to be getting cold feet by now. Hence anthropic going public. If they had a strategy for getting in the black AND longterm viability I feel like we’d see them roll private longer. But instead we’re seeing all the signs of “grow, exit, let the next dude figure it out.”

reply

upvote

by glimshe4 hours ago|

[-]

I completely agree. I was just pointing out the current expectations and behavior.

reply

upvote

by lmf4lol7 hours ago|

[-]

Not the EU should build one but companies from the EU!!! We need to stop relying on the Brussel bureaucrats. China is not building models. US is not building models. Deepseek is doing and Anthropic!

reply

upvote

by andai6 hours ago|

[-]

I think there's a lot of government funding for it in China though?

reply

upvote

by lmf4lol6 hours ago|

[-]

Its still business and eager entrepreneurs doing the work! Not China. China is facilitating, yes. but its not building models.

reply

upvote

by andsoitis6 hours ago|

[-]

> as a frustrated EU resident lamenting a lack of European option(Mistral is just not competitive enough), I will spread my money towards the Chinese models as well. Thank you Murica!

It is interesting to hear a European exclaim they would rather depend on a selection of models from companies in China with concomitant strings attached, rather than be dependent on a selection of models from companies in America.

Isn't it better to simply stick to whatever is best and then, should it be pulled from under you, simply switch out to the new best model that IS available? I don't know that models have a moat and you can easily swap out should you need to.

Pre-emptively betting on which is going to be least susceptible to government intervention seems like premature optimization to me.

reply

upvote

by nicoburns5 hours ago|

[-]

Note the GP says "I will spread my money towards the Chinese models AS WELL", so it's more of a case of encouraging competition rather than betting on any one actor.

reply

upvote

by andsoitis5 hours ago|

[-]

The "as well" is ambiguous. I interpreted it as "like other people are doing" rather than "spend twice as much as contingency". To me it seems like the more parsimonious interpretation, but I could very well be mistaken and they are indeed planning to throw money in both directions already just in case.

reply

upvote

by thewebguyd3 hours ago|

[-]

> Companies in China with concomitant strings attached

What strings? The Chinese models are open weight, you don't have to spend your money directly with those labs. They can be hosted within the EU, by EU companies without sending a dime to China.

The bigger question is does the EU have the appetite to invest in building out data centers/hosting infrastructure for this, and that's where I have my doubts.

reply

upvote

by surgical_fire5 hours ago|

[-]

Chinese models cost a fraction of the price, and as someone who has been using DeepSeek and MiMo, they are nothing short of excellent.

In terms of cost-benefit, they are already the best models I could find.

reply

upvote

by Alifatisk7 hours ago|

[-]

> All this will fly until a competitor from outside the US releases a “freedom” model that is even 90% as capable as Fable was without its shackles.

A Chinese cybersecurity company "360" has announced "Chinas version of Mythos".

reply

upvote

by pokot08 hours ago|

[-]

Isn't building a 90% frontier model relatively cheap for EU?

I feel like EU could start a company, start from available open weight models, feed 2bln a year into it (1% of the EU budget) and make a compelling almost SOTA model for the EU market. This company could partner with datacenter providers and sell it hosted in the EU or somewhere else with EU protection terms. The budget for this company would easily double with the added revenues and you are creating an ecosystem of providers that can compete with US big-techs and have a 500 million people market that can't wait to ditch US companies for them, given the current mood.

The model can be open weight and it's an easy way to compound the efforts we are seeing in China without even having to talk to each other. Maybe there is a way to make it work not open weights but I am not sure how would that work.

These are those kind of decisions that seem such no brainers to me, which probably means I am completely out of touch with reality.

reply

upvote

by adrianN7 hours ago|

[-]

1% of the budget is a really big number. The EU has a lot than a hundred responsibilities that demand money.

reply

upvote

by throwaway6767128 hours ago|

[-]

Europeans are risk averse and don't have access to that many deep pocketed risk-taking VCs. On top of that the US is poaching all the talent since most EU firms won't or can't match the salaries offered in the US to AI researchers.

This may change in the future as AI gets more commoditized and the current US admin keeps shooting itself in the foot but they are still very far ahead right now

reply

upvote

by cindyllm8 hours ago|

[-]

[dead]

reply

upvote

by senko7 hours ago|

[-]

> I feel like EU could start a company,

That's not how market-based economies work...

> feed 2bln a year into it and make a compelling almost SOTA model

...and the reason is, if you give a bunch of people €2b a year and tell them "go try and make something", they'll make a ton of paperwork covering their asses and very little actual output.

This is irrespective if those people are European ("european google killer"), American ("cost plus" old US aerospace companies) or Chinese (which is why they do it a little different).

If there are no incentives to really try really hard, they won't do it.

In many high-tech cases in Europe, the formula for "let's subsidise the hell out of research and hope a commercially-viable business comes out" has a really poor track record.

Your second option - and possibly the best bet - is to find an existing company that already showed they're capable, and shower them with money, which is what French are doing with Mistral.

reply

upvote

by benny_s7 hours ago|

[-]

Good idea in theory, but in reality 90% of this 2bn budget would just be swallowed up by the bureaucracy that would surround this.

reply

upvote

by sajithdilshan10 hours ago|

[-]

A lot of bitter europeans would down vote this comment, but saying Murica has pushed you towards China is hypocrisy is at its finest. Your incompetent EU politicians are the ones that has failed you by outsourcing every aspect of sovereignty to the rest of the world instead of self-reliance. You have nobody to blame but yourselves. In one year you'll be blaming China for abandoning the EU when they starts controlling their frontier models.

reply

upvote

by grim_io10 hours ago|

[-]

Maybe we will in a year, but then we'll just complain about China copying protectionism and censorship from the USA.

If that's a comfortable position for you, all good.

We held the US in higher regards, that's all.

reply

upvote

by sajithdilshan10 hours ago|

[-]

[flagged]

reply

upvote

by grim_io10 hours ago|

[-]

We are just disappointed. You have to actually live there :)

I don't think anyone is making the same mistake with China, as open weight models can't be Thanos'd away.

reply

upvote

by sajithdilshan9 hours ago|

[-]

But open weighted models would be outdated in terms of capabilities and performance in few months. Also I can imagine Chinese companies only make less capable models open weighted in the future and any model capable than Mythos or even higher would be proprietary

reply

upvote

by Citizen_Lame10 hours ago|

[-]

Well you are not wrong, I would also add corrupted and cowardly politicians. We are in worse position than China, and under full control of daddy USA, no matter what they say. If US would pull switch, it would be catastrophe for the EU.

Even the premier EU companies such as ASML are heavily reliant on US supply chain.

But why can't we be bitter?

reply

upvote

by pirate7879 hours ago|

[-]

You can't be bitter because Europe squandered technological and capital market parity with the US in just 20 years.

reply

upvote

by Aurornis14 hours ago|

[-]

> The damage is done. You cannot build a business critical function on top of American SOTA frontier model. Especially not with the current crew in charge.

The switching costs of changing LLM providers is as low as it gets. All the individuals and startups I know try different models all of the time, even down to the level of choosing which provider to use based on the task. Bigger companies move slower but only because they have lawyers and teams negotiating contracts, not because there is a technical reason that it's hard to switch.

Companies have dealt with supply chain unpredictability by having multiple providers and switching between them since forever. It's infinitely easier to switch LLM providers than it is to deal with physical supply chain uncertainty.

reply

upvote

by PeterStuer12 hours ago|

[-]

For real production I find the switching cost is not as trivial as you portray. Even going to a new model version in the same model family, say GPT-4o to GPT-5.2, a transition I just finished on a not too complicated application, requires extensive retesting and tweaking of prompts, guardrails and parameters.

reply

upvote

by sshine11 hours ago|

[-]

I second this; even switching between minor versions of a model, you need to adjust prompts: the new model is better by implying a bunch of things that, when included in the prompt, will overdo that thing.

Assessing quality of output is often not trivial, either. Typically, problems that are solved by offloading something to an LLM are super subjective, and customers “feel” something is different is vulnerable.

We try to quantify output differences by many different similarity metrics. But a lot of energy goes into subjectively evaluating if something still works.

reply

upvote

by Aurornis3 hours ago|

[-]

We’re talking about SOTA models like Fable, though.

If you’ve got a product where the budget allows for Fable level token costs, I doubt you wouldn’t have the budget to run your evals again on a cheaper model if Fable was unavailable. I mean it wouldn’t even take that much token volume to turn it into a money saving proposition to do the engineering work to switch to a cheaper model.

Fable is primarily used for human in the loop tasks like coding or office work, not in some backend app unless the company has money to burn and doesn’t care about anything other than using the best model available at the time.

reply

upvote

by anonzzzies12 hours ago|

[-]

Maybe OP meant switching in a coding harness way? Not an application using AI? I had similar issues like you in the latter case, but in the former it's trivial.

reply

upvote

by jitl7 hours ago|

[-]

if you’re building on LLMs you gotta have an eval and prompt iteration pipeline, and you ought to be evaling every model release — your competitors will do this, and your users will want the latest and greatest (for frontier tasks) and the cheapest/fastest. So you should already be paying this cost anyways. i guess it depends on your team size and scale but not building this muscle seems like not having continuous delivery for regular code or even like not having tests and ci to merge to main.

reply

upvote

by Aurornis6 hours ago|

[-]

SOTA models are typically used for interactive coding and other human in the loop work

> say GPT-4o to GPT-5.2, a transition I just finished on a not too complicated application

Neither of which is close to SOTA, because tasks like these are typically built on a cost conscious manner which tries to keep token costs in check.

I’m primarily responding to all of the commenters who are acting like nobody is going to use American SOTA models for anything because the government interfered with them for a couple weeks. It’s obviously not true, and I expect these models to be oversubscribed instead of avoided like some are claiming.

reply

upvote

by jcims10 hours ago|

[-]

Vendor diversity is a longstanding risk management principle. For it to work you need to invest in it as you build, not when the rug is pulled.

reply

upvote

by miki12321113 hours ago|

[-]

Exactly!

Even if you won't be able to use some model tomorrow, you can still make money by using it today!

And in the age of limited compute, spiky workloads and constant outages, building a mechanism to fallback to a weaker model when your primary choice isn't available is smart anyway.

reply

upvote

by rob7412 hours ago|

[-]

For many, that fallback mechanism is simply called Cursor - soon to be owned by Elon Musk. Which opens up a similar but slightly different can of worms...

reply

upvote

by GTP11 hours ago|

[-]

Well, there are many alternatives to Cursor as well.

reply

upvote

by throwaw1212 hours ago|

[-]

> The switching costs of changing LLM providers is as low as it gets

Not trivial, you would need to do lots of evals and prompt tuning when you switch models.

imagine what happens when you optimize your agent skills to the current model, and new model starts breaking. you would need to have versioning for your skills, serving different skills based on the model while you do A/B testing

reply

upvote

by alfiedotwtf4 hours ago|

[-]

> Not trivial, you would need to do lots of evals and prompt tuning when you switch models.

Couldn’t we just train smaller models to “translate” what the harness user wants to what the worker model expects? I mean, if models understand caveman, it seems like just a small stretch

reply

upvote

by pornel3 hours ago|

[-]

It's not switching costs, but trust.

There's no congress. There's no policy (they've been making noises about not allowing AI regulation and now they're not-regulating it like a child paying with an on/off switch). The law is whatever Dear Leader's mood is today. It overrides any contract you sign with private companies, and they roll over and take it, because that's how oligarchies work.

reply

upvote

by Sammi19 hours ago|

[-]

I'm a small software business owner in Europe. I have to assume my competition is willing to pay for any business advantage they can get. And so I also have to pay for the SOTA model, whatever it is.

reply

upvote

by lelanthran12 hours ago|

[-]

> I'm a small software business owner in Europe. I have to assume my competition is willing to pay for any business advantage they can get. And so I also have to pay for the SOTA model, whatever it is.

If you make money from doing anything like "produce software with as little human involvement as possible", then sure, you need SOTA models. In that case, though, the value you add is very little and you probably don't have a sustainable business.

OTOH, if you make money by getting clients to pay for features, there is very little difference in time-savings from using Anthropic/OpenAI SOTA over GLM-latest.

IOW, if you business can only make money by one-shotting software, you probably don't have a business in the first place.

Regards, another small business owner.

reply

upvote

by midasz12 hours ago|

[-]

You also don't really need LLM's, we still have software engineers too. Everyone is focusing so heavily on the speed gain producing code, but in my experience clients of established products aren't really waiting for massive changes and gigantic features to be added. We aren't taking the time to think things through anymore.

reply

upvote

by goyozi11 hours ago|

[-]

> clients of established products aren't really waiting for massive changes and gigantic features to be added

In some cases they do. I work in a B2B vertical SaaS company and there’s both features that competitors build or rough edges around our features that make clients go „either we get X or we sign with someone else”. I agree though with the general sentiment that you don’t need SOTA models to build those - humans or humans + mid pack strong model will do.

reply

upvote

by resonious7 hours ago|

[-]

I have clients waiting for very gigantic features and the agent harnesses are a godsend.

reply

upvote

by Sammi11 hours ago|

[-]

I'm the only dev. I simply don't have time for dealing with the code from non-SOTA models. I'm doing all I can to keep this business afloat.

reply

upvote

by rglullis11 hours ago|

[-]

If you think your business depends on the ability for you to outspend the competition on LLM tokens, then you should cut your losses and shut it down right now.

reply

upvote

by lelanthran11 hours ago|

[-]

> I'm the only dev. I simply don't have time for dealing with the code from non-SOTA models. I'm doing all I can to keep this business afloat.

It sounds that your business is selling completely agent-coded products. I don't know how long that will be viable, or even if it is right now.

In my part of the world, I am completely unable to sell completely agent-coded products, so even a SOTA model is useless. The majority of my time is spent on analysis outside of coding anyway, so when I bill it's not based on how many lines of code I've added, it's based on whether the goal of the customer is satisfied.

reply

upvote

by MavisBacon7 hours ago|

[-]

What part of the world is that where you can’t sell agent coded products?

reply

upvote

by lelanthran7 hours ago|

[-]

> What part of the world is that where you can’t sell agent coded products?

You can try, but where I am there's literally no point - anything I offer that I bill based on how long my agent will take will be counter-offered by an even cheaper person using the same agent.

I've been through this cycle a few times already. It's pointless.

I sell outcomes, not lines of code. When I can get paid for unlocking revenue or reducing costs, SOTA makes not one bit of difference.

In practice, this means that I now don't even engage with clients who lead with "we want this program written" or "we want this feature added to this code we own". Those types of clients, their expectation is that you'll never need to bill more than the time you used to meet with them and maybe an hour of "labour".

You can, of course, continue as normal, but the expectation from clients now is that code is, for practical purposes, free. I've had one client last year vibe-code a ping program using Claude Code just to "prove" to me that my custom board+design+code for their industrial flow controller could have been done by their AI subscription.

If your business is "selling code", you aren't gonna win. If your business is "selling solutions" then you don't need SOTA anyway.

reply

upvote

by SwellJoe18 hours ago|

[-]

The good news (for you and most everyone other than the current leading AI companies), the gap between the SOTA and the near-frontiers is getting smaller every week or two. The leading Chinese models are only a few months behind now (GLM 5.2 tickles the tail of GPT 5.3 or 5.4 and Opus 4.6, according to benchmarks and the vibes among heavy users who've spent some time with it), where they were a couple of years behind a year ago.

reply

upvote

by rafram16 hours ago|

[-]

4.6 was released at the beginning of February, so if the Chinese models only "tickle its tail," that means they're >5 months behind.

reply

upvote

by felipeerias13 hours ago|

[-]

That comparison is also misleading because Opus 4.6 was probably not Anthropic's frontier model.

We got the first news about Mythos in March, so it is likely that it was already close to ready by the time Opus 4.6 was released.

So the actual gap is the time elapsed between March (or April for the official announcement) and whenever Chinese models can match Mythos.

reply

upvote

by SwellJoe13 hours ago|

[-]

The post-training process of a model that size is months, though it "works" before that. It is a big chunky model before it's released to the world and probably does some amazing things, sometimes...but, it wasn't done (else why wouldn't they release it and soundly trounce their competitors). I would assume that Chinese AI companies have a pipeline and what we see is a couple/few months behind their newest model, as well. Like, the new base model is cooked, but they're still plating it for service.

Why would Anthropic get the benefit of pre-release models counting toward their lead, if nobody else gets to count their pre-release models?

reply

upvote

by trvz16 hours ago|

[-]

> The leading Chinese models are only a few months behind now

reply

upvote

by PeterStuer12 hours ago|

[-]

I hear that often, but what does that even mean? I am a great proponent of open weights models. I do believe they are the only reason we have not stagnated into a collusion of halting (public) model releases.

But exactly which point in time is z.ai compared to claude.ai? Consistently bring "6 months behind" in an exponentially acellerating evolution means the gap is growing exponentially wider, not constant.

reply

upvote

by SwellJoe9 hours ago|

[-]

"an exponentially acellerating evolution"

Oh? Exponentially accelerating, huh? That's quite a surprise, to me.

reply

upvote

by SwellJoe15 hours ago|

[-]

What range of numbers do you believe "a few" represents?

reply

upvote

by mlyle14 hours ago|

[-]

Opinions vary, but:

A couple: usually 2, though not always

A few: 3, 4, 5

Several: 4, 5, 6, or 7.

reply

upvote

by marcus_holmes13 hours ago|

[-]

> A couple: usually 2, though not always

I had to explain this to my German friend. In my understanding this isn't about the actual number, it's about the certainty. If it's absolutely and definitely two, then I say two. If I'm uncertain but it's probably two, or if a non-integer, somewhere around two, then I say couple.

And few is more likely to be 3 than 5, because 5 is getting close to a "half-dozen or so", or (as you say) several.

Many is very context-sensitive, as the meme has it.

So I would agree that the open models are a few months behind, definitely more than a couple of months behind, possibly several months behind, maybe a half-dozen months or so behind, but not many months behind.

reply

upvote

by cassianoleal12 hours ago|

[-]

In the UK, as far as I can tell, a couple are 2. Not around 2. Not maybe 3 or 4. Always 2.

3 or 4 would likely be a few, or some. 1 is, well, one.

reply

upvote

by jonathrg11 hours ago|

[-]

Several and a few are the same number, they only differ rhetorically.

reply

upvote

by mlyle3 hours ago|

[-]

I think several is used by most speakers for larger quantities than few. It has the connotation of being larger, and that changes usage.

reply

upvote

by rafram7 hours ago|

[-]

Certainly below 6!

reply

upvote

by pelagicAustral10 hours ago|

[-]

Whats the leading Claude Code competitor model over in China?

reply

upvote

by 6 hours ago|

[-]

deleted

reply

upvote

by Sammi11 hours ago|

[-]

So I keep hearing.

reply

upvote

by dansquizsoft15 hours ago|

[-]

Another day, more cope on this subject from many posters on here...

reply

upvote

by Der_Einzige14 hours ago|

[-]

This is nonsense.

The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

China has no flywheel for long-form agentic traces like Claude Code and its telemetry over its userbase (no one uses the Chinese harnesses yet). Most Chinese models are forced to price themselves significantly below cost to compete with the huge demand for bootleg claude tokens, because they're that much worse.

reply

upvote

by brailsafe14 hours ago|

[-]

> is estimated at 10 months by Anthropic themselves, and it's growing.

How is this different than any business with something to lose saying a competitor isn't as good? Not saying it's false, but it would seem to me that it's more important how customers feel about the issue.

reply

upvote

by brazukadev7 hours ago|

[-]

Didn't Elon Musk said the same or even worse about BYD? He isn't laughing anymore tho.

reply

upvote

by SwellJoe12 hours ago|

[-]

Ah, well, if Anthropic says their competitors are ten months behind...

I don't know what I was thinking.

reply

upvote

by marcus_holmes13 hours ago|

[-]

Here in Australia the sudden withdrawal of Fable made all of us think hard about models and harnesses.

I've heard half a dozen people talk about how a less advanced model coupled with a better harness outperforms a smarter model in the last few weeks.

If the USA wanted to shoot its AI industry in the foot it achieved its goal.

reply

upvote

by mmsimanga3 hours ago|

[-]

Which products are you now using?

reply

upvote

by InsideOutSanta11 hours ago|

[-]

> The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

There's a lot of subjectivity in determining this, but I'm 100% sure that 10 months is wrong.

I don't know whether the gap is currently growing, but I'm not sure it matters. There are thresholds where models reach certain levels of usefulness. Opus 4.8, for example, is at a level where I can give it relatively vague input, and it can go for half an hour on its own and produce a high-quality PR.

If GLM reaches that level of capability and can do that task more cheaply than Anthropic's model, I will use GLM for that task, because that's a specific type of task I use models for. It doesn't really matter whether Anthropic also has a better model, because what does "better" mean in this context? It's a clearly defined task, and Opus 4.8 already does it at a very high level of quality.

reply

upvote

by bel814 hours ago|

[-]

If Anthropic themselves say competition is 10 months behind, it's probably 5 or less.

And you seem to think "no one uses" DeepSeek's v4, z.AI's GLM 5.2 or Xiaomi's MiMo 2.5 from their official APIs when they probably dwarf Anthropic's usage and are widening the gap due to conquering a chunk of Western market too.

I know it's hard for some to comprehend there's an entire Eastern hemisphere in the globe with billions of people, so it's worth reminding. And some seem to think the world is basically silicon valley even.

reply

upvote

by Chyzwar11 hours ago|

[-]

Because claude subscription tokens are cheaper than deepseek and friends. You have whole industry of people reselling Claude subscriptions in China.

Can you comprehend than Anthropic is winning because is both cheap(subscriptions) and better SOTA. People are cheering China providers when I reality they would rugpull open weights the moment they are competive.

China models are trash that why they are giving them away for free.

For individuals and small companies subscriptions is the best deal, for big companies china models are big no unless they can host them.

reply

upvote

by slopinthebag29 minutes ago|

[-]

No, Claude subscription tokens are not cheaper than the Deepseek API. You are dead wrong on that.

reply

upvote

by Der_Einzige6 hours ago|

[-]

Not sure why you're being downvoted for being objectively correct.

HN is full of contrarians and folks who don't know what they're talking about in regards to AI.

reply

upvote

by gck110 hours ago|

[-]

> The gap between Chinese models and American frontier models is estimated at 10 months by Anthropic themselves, and it's growing.

#1 I've had use cases where it was clearly obvious the Chinese models were behind.

#2 I've also had use cases where I couldn't tell a difference at 1/20th of the price.

The problem is - the #1 is the use case where American frontier is gated behind saboteur classifiers and is tiny minority anyway. Vast majority of work is #2.

The gap doesn't matter anymore.

reply

upvote

by hk__214 hours ago|

[-]

No you don't; it's often overkill to use the SOTA models. People want SOTA because it's shiny, but there are a lot of tasks where it's cheaper and more efficient to use other models.

reply

upvote

by jiggawatts12 hours ago|

[-]

> but there are a lot of tasks where it's cheaper and more efficient to use other models.

Sure… but which ones? How can you know ahead of time?

I just did a “simple” upgrade project where both me and the AI kept tripping over dead code, subtle typos, and difficult-to-trace live versus dead code.

Many times I used “Medium” thinking I got bitten, but not every time, and I couldn’t predict when.

So “Extra high” it was, for the entire project.

Far fewer nasty surprises!

reply

upvote

by meetingthrower7 hours ago|

[-]

Right. You hire the developer when you want a developer. But if I am building simple agentic workflows -- glorified automations with a small bit of structured "thinking" - I will sure use the cheapest API that can deliver that task at the speed I want.

I wonder where the market sizes will shake out for these different types of use cases? I am guessing right now 1 is bigger than 2 but not for long (by token volume)?

reply

upvote

by jfim8 hours ago|

[-]

For programmatic usage oftentimes SOTA isn't useful.

For example, I have software that summarizes articles and classifies links on webpages to build a synthetic RSS feed, both of which use LLMs, neither of which need a SOTA model.

I'll probably use LLMs to bootstrap a dataset of native ads in articles, and there again, I don't really need a SOTA model.

If it's for more open ended tasks like writing code though, I agree that at this point SOTA models make more sense to use.

reply

upvote

by PeterStuer12 hours ago|

[-]

In my experience: anything of open-ended complexity (software development, research, product design, ...) benefits from wathever the frontier can offer. 95% of Line of Business automation and workflows can be handled by even a reasonably small open weights generalist model flanked by a few even smaller specialized models. Yes, designing such a setup takes more knowledge and work dan just chucking it all over the api with prompts. But that is how I can run a system here for <$30/month vs >$1.000 month. As an added bonus, no model server can shut me down at the drop of a hat.

reply

upvote

by Sammi11 hours ago|

[-]

Exactly. I simply don't have the time to deal with non-SOTA model output.

reply

upvote

by parodysbird18 hours ago|

[-]

This is a great recipe for going out of business.

reply

upvote

by adrianmonk18 hours ago|

[-]

If the competitive risk is real, then are choosing between supplier risk (AI model access) and competitive risk.

When there isn't a zero-risk option, the question becomes which risk is smaller.

reply

upvote

by unknownfuture15 hours ago|

[-]

> If the competitive risk is real

Yes.

If.

Man I hope this tech FOMO eventually stops.

Companies generally fail because either their product doesn't meet a market need, or the market doesn't exist in the first place (possible because of bad timing), and not because they simply outran their competitors.

These aren't things fixed by using a frontier model to vibe code faster in lieu of one 5 months behind.

reply

upvote

by slim15 hours ago|

[-]

You can compete by being smart and using less-than-sota models and build a more solid business around them

reply

upvote

by Sammi11 hours ago|

[-]

I use whatever model is SOTA. I switch between them in order to avoid lock in.

reply

upvote

by lelanthran11 hours ago|

[-]

>I use whatever model is SOTA. I switch between them in order to avoid lock in.

What's your competitive edge here? Shaving off an hour of a feature delivery? Not having to see the code that is produced?

reply

upvote

by KronisLV10 hours ago|

[-]

Not sure about OP, I usually make Opus 4.8 on Extra thinking level implement features for me on a specific project, while I'm busy with other stuff.

For a change, I let DeepSeek V4 Pro implement it on Max thinking level. Nothing too out there - some DB migrations, some Django back end changes and Vue SPA front end changes.

Implementation time in total including tests was a few hours, so nothing too egregious. However, one of the migrations would break with pre-existing data, one of the column references in the entity was wrong, the API endpoint wasn't made consistently with the others in adjacent code (e.g. permission checks) and the front end had a Pinia state related issue and submitting one of the forms didn't work.

Tooling was run: ruff, ty, Oxfmt, Oxlint, also Docker build was green across the board, but the overall feature just didn't work. In both cases, sub-agents with clear context would review the code for serious/critical issues, at least three in parallel and do review loops until they spot nothing. The harnesses both has LSP integration.

Opus spent another hour fixing it, needed a few iterations, because I couldn't be bothered there.

> What's your competitive edge here? Shaving off an hour of a feature delivery? Not having to see the code that is produced?

The difference largely was not needing to waste time in fixing all sorts of subtle bugs that sub-optimal models will produce, worse yet if it was some sort of a serious project and those wouldn't have been spotted but instead that slop would have gotten shipped.

That said, Opus isn't ideal either and messed up a whole bunch when I was training some neural nets and try to process a bunch of satellite data and configure Garage to store them so that tiles can be served from a slow HDD and stuff like that. Obviously, it also needs a lot of babysitting in regards to UI looks, but it's better at the rest of development.

I think that DeepSeek V4 Pro and GLM 5.2 are cool though, it's just that you want as many checks and tests as you can throw at any given problem, or use languages that make shipping completely broken code increasingly likely.

reply

upvote

by jasondigitized17 hours ago|

[-]

Any competitive business will accept this risk if it gives them any type of edge no matter the duration of that edge. This is no different that using an exotic raw material.

reply

upvote

by benjaminwootton16 hours ago|

[-]

Every big business in the world biases towards risk reduction and cost reduction over getting an edge.

reply

upvote

by eru13 hours ago|

[-]

Different businesses have different biases.

reply

upvote

by rogerrogerr17 hours ago|

[-]

Eh, this isn’t really how businesses operate. How many businesses refuse to give devs large-spec machines? That’s very clear positive ROI.

I think it’s excessively charitable to assume businesses are uber-competent ROI-chasers. The expense people are eventually going to win on AI too, this blip of unrestricted AI budgets will be gone soon.

reply

upvote

by halfmatthalfcat18 hours ago|

[-]

And thus, capitalism continues to roll on. Businesses are suppose to go out of business, its a feature.

reply

upvote

by whatever12017 hours ago|

[-]

they’re not supposed to, they’re just able to

reply

upvote

by teleforce17 hours ago|

[-]

Nearly spit out my coffee, thanks for the chuckle.

reply

upvote

by w8vY7ER17 hours ago|

[-]

It’s ok to be amused, absent exaggeration. Spit takes happen in sitcoms.

reply

upvote

by Retric15 hours ago|

[-]

They do happen in real life.

They are overused in sitcoms because it’s easy for actors to mimic on demand unlike several other reactions.

reply

upvote

by pmontra15 hours ago|

[-]

I don't know if you write software for your own products or if you code for your customers. Anyway, are you going to compete on the speed of your code writing AI or on deploying the features your customers need? One useful feature is better than a hundred ones nobody really care about. And a good relationship with customers is better than any feature.

Example. Yesterday I listened the technical lead of a customer of mine digging himself into a hole by not understanding what it would mean exposing AWS EFS to their on premise server over NFS. It was just too many unknown unknowns for him and he had no time to ask the AI (and even if he did I'm not sure that he could understand.) His boss, which actually used NFS, had to stop him. I didn't speak a word.

So, he could have coded the migration of a server from AWS to on premise, asked Claude to write also all the configuration scripts and policies but then what?

reply

upvote

by Sammi11 hours ago|

[-]

I'm making a micro SaaS product. Code quality and code production speed are actually both super important. I don't have the time for non-SOTA model output.

reply

upvote

by slopinthebag20 minutes ago|

[-]

> Code quality

You care about this but use LLM's to slop out features anyways?

reply

upvote

by jdlshore18 hours ago|

[-]

What concrete business advantage are you getting from LLMs?

reply

upvote

by echelon18 hours ago|

[-]

Speed.

reply

upvote

by K0balt18 hours ago|

[-]

This x 10 . I don’t understand how people are saying you can’t use LLMs to get crazy productivity gains. If you can’t write quality code with LLMs at ludicrous speed, you’re holding it wrong. You will have occasional bad days and regressions. But overall you’re still going to be able to 4x your progress.

reply

upvote

by cedws17 hours ago|

[-]

I have plenty of experience with LLMs and use them daily but definitely wouldn't call generated code "quality code." Often looks like complete vomit.

reply

upvote

by K0balt16 hours ago|

[-]

That’s kinda what I mean. Maybe it only works well in some languages, but with the harness I built for C and C++ does a fantastic job of adhering to very strict architecture and style guides. Way cleaner, more readable, better factored, and more interpretable than human generated code, except maybe one or two devs I have worked with. YMMV I guess?

TBF I do burn 200k tokens just preloading the context with onboarding, not including any code, just document trees of development policy documents, style and architectural standards, code and documentation review processes, company ethos and culture, etc. it’s a token fire, but it really works for us.

Also, documentation driven development all the way down.

reply

upvote

by satvikpendem17 hours ago|

[-]

If you're an enterprise (including startups), you worry about customers, not code quality. There are famously many startups that gained traction despite shit code and then eventually got around to fixing it, to whatever extent was possible, like Facebook HHVM, Stripe's Sorbet, etc.

reply

upvote

by watwut13 hours ago|

[-]

Startups failed because they cound not untangle own code after 4 months. Literally true stories (plural).

reply

upvote

by lelanthran11 hours ago|

[-]

> Startups failed because they cound not untangle own code after 4 months.

That's rare, though. If they could not untangle their own code after 4 months, it's because they were not making enough money to pay a team to untangle it - that's not a code problem, it's a revenue problem.

IOW, the startup failed because their revenue was too low.

reply

upvote

by satvikpendem13 hours ago|

[-]

There are orders of magnitude that failed because they did not solve the right customer problem. Code quality is merely incidental the vast majority of the time.

reply

upvote

by wonnage16 hours ago|

[-]

[dead]

reply

upvote

by NortySpock17 hours ago|

[-]

Ok, and? You can live with that if there are more important things to deal with.

I've stared at ugly LLM code, that I had just had generated, and worked well enough for my purposes. (generally, some quick recursion into a nested python dictionary in order to dig out some property -- especially for linting or quick data analysis).

And I wanted something better, sure, something a bit more readable ...but I just needed it to work well enough to recurse through a yaml file for config file linting, not be battle-hardened against every test case.

So to deal with the mess, I shoved it in a pure function, threw a few basic sanity unit tests around it, put a comment with a disclaimer of "#this is LLM generated code, it is lightly tested, do not use it for anything truly load-bearing without a lot more tests" and I moved on to something else.

Not everything has to be bulletproof.

reply

upvote

by csallen17 hours ago|

[-]

You're on Hacker News. This is a site full of developers who are convinced that "proper software engineering" is 100% of what makes a business successful, and everything and everyone else is useless. You can't just waltz in here and point out that code in business is a means to an end and expect not to get downvoted.

reply

upvote

by satvikpendem3 hours ago|

[-]

It's ironic because around 20 years ago here, people knew HN was (more) explicitly for startup founders and the comments reflected that, with much more discussion on getting customers than writing code.

reply

upvote

by Schiendelman15 hours ago|

[-]

As a technical product manager, this 1000%. It's just irrelevant how bad code is unless it impacts the business.

reply

upvote

by AdieuToLogic15 hours ago|

[-]

> As a technical product manager, this 1000%. It's just irrelevant how bad code is unless it impacts the business.

If you are, in fact, "a technical product manager", I would hope you understand that "bad code" is identified as such specifically because it "impacts the business."

reply

upvote

by Schiendelman15 hours ago|

[-]

That is not how most engineers define bad code.

reply

upvote

by AdieuToLogic14 hours ago|

[-]

> That is not how most engineers define bad code.

The engineers I have worked with most definitely define "bad code" as having intrinsic limitations and/or latent defects which impact successful system functionality/operation. Indicators provided to stakeholders such as yourself which support this assessment are, but not limited to:

  - the system doesn't work that way
  - the system lacks test coverage, so changes take longer
  - adding feature "X" is not feasible
  - there is no repeatable way to onboard team members
  - the backlog grows exponentially
  - that "one point task" is going to take a couple weeks

All of the above impacts a business.

It is up to you, the "technical product manager", to understand what your team is trying to tell you.

reply

upvote

by Schiendelman14 hours ago|

[-]

Please stop being rude to me. I'm a human being, I'm a very experienced product manager and engineer (you can google my name, I'm the only one), and the way you are behaving sucks.

Everything you're saying is true, sometimes. Assume I'm still right, and that you might be able to learn something from someone else.

reply

upvote

by AdieuToLogic14 hours ago|

[-]

> Please stop being rude to me.

I do not see how I was being rude, unless it was my use of quotations around the title you claim.

> I'm a human being ...

I did not doubt this.

> ... I'm a very experienced product manager and engineer ...

Again, if it was my use of quotations which you found to be rude, then I do not know what to say about that.

> ... and the way you are behaving sucks.

I respect your perspective and support your right to express yourself. And no, I do not think you are being rude by doing so.

> Assume I'm still right ...

Why would I? You responded to:

>> This is a site full of developers who are convinced that "proper software engineering" is 100% of what makes a business successful, and everything and everyone else is useless.

With:

> As a technical product manager, this 1000%.

Finally, you write:

> ... you might be able to learn something from someone else.

Maybe you can learn something from someone else as well.

reply

upvote

by hack13123 hours ago|

[-]

There was nothing rude about any of their replies.

reply

upvote

by slopinthebag1 hours ago|

[-]

They weren’t rude enough. Your complete apathy towards the many antisocial effects of badly engineered software, caring only about increasing shareholder value, is the reason why modern software not only sucks but actively makes our lives worse to use it.

reply

upvote

by dgellow13 hours ago|

[-]

Googling your name brings this missing person case as the only results: https://en.wikipedia.org/wiki/Disappearance_of_Logan_Schiend...

reply

upvote

by Schiendelman3 hours ago|

[-]

I guess if all you did was paste my last name into Google with no context, you'd get something like that. :)

reply

upvote

by nomel15 hours ago|

[-]

This is something I wish I understood sooner. There is strong merit to "good enough".

Of all the "concise" and "beautiful" code I worked hard to produce, I was the only one to ever lay eyes on it. It didn't actually matter, and nobody cared but me. The people in charge of my raises could never perceive quality of code, because it wasn't their area of expertise. They only cared (rightly so) that it did what it was supposed to, and all the elegant abstractions didn't practically help that purpose. It was, literally, wasted life that I should have spent just getting off work early, like most of my colleagues.

reply

upvote

by echelon17 hours ago|

[-]

Every bit of code written in the last 50 years is going to be meaningless.

People need to get to grips with that fast.

Distribution, relationships, processes, mindshare, marketing, and politics matter. Code is just ephemeral glue and implementation detail.

reply

upvote

by ses198416 hours ago|

[-]

Not every bit of code is going to be meaningless.

Just 99.999%.

reply

upvote

by slopinthebag15 hours ago|

[-]

Lmao. Have more respect for your elders, who wrote all the code that your ai psychosis is fuelled by.

reply

upvote

by echelon15 hours ago|

[-]

Every single thing around you was pioneered by people who are dead and forgotten. From the materials science of the clothes you wear, to the very language you speak.

Get over yourself. We're all ephemeral, dead and recycled in the blink of an eye. Our species doesn't even clock on the geologic timespan.

If you think your code (or any of your artifacts or possessions) matter beyond their immediate utility, you're mistaken. Work will either fall into disuse or be replaced. It's scaffolding for what comes next along a well-traversed path.

reply

upvote

by satvikpendem3 hours ago|

[-]

Look upon my works, the mighty, and despair!

reply

upvote

by slopinthebag1 hours ago|

[-]

I refuse to accept your existential nihilism. This mindset is not only toxic to the soul, but toxic to those who must suffer the effects of someone who only cares about “immediate utility”. What a depressing comment.

reply

upvote

by hexasquid11 hours ago|

[-]

Dr Manhattan

reply

upvote

by matheusmoreira15 hours ago|

[-]

I measured an ~8x increase in my project's commit count after AI, and I'm painstakingly reading, reviewing, understanding and editing everything the models write. It's gotten to the point I'm trying to slow down in order to let the new knowledge crystallize. I'm manually writing articles about what I'm doing as I go.

I can only imagine what people are doing at their jobs with unlimited token budgets.

reply

upvote

by lelanthran10 hours ago|

[-]

> I measured an ~8x increase in my project's commit count after AI,

That's irrelevant. What's the increase in revenue?

reply

upvote

by matheusmoreira10 hours ago|

[-]

I'm a hobbyist. My revenue will only increase if my work somehow lands me a job at some point.

reply

upvote

by lelanthran7 hours ago|

[-]

Are you not employed at all?

reply

upvote

by matheusmoreira3 hours ago|

[-]

Yes, but my field has not been hit by the AI frenzy yet. Outside the usual attempts to automate us, that is. I've used AI at work for research and corroboration but it hasn't led to 100x performance or anything of the sort.

reply

upvote

by amoss12 hours ago|

[-]

Kind of weird how LoC has become a metric for people to chase again.

reply

upvote

by matheusmoreira11 hours ago|

[-]

In my case it was commits, not lines of code. I wasn't chasing after it, I just asked Claude to calculate some statistics after a month or so of AI usage.

It's not just statistics either. I know for a fact that I made major progress by using LLMs. Here's a summary from around a month ago:

https://news.ycombinator.com/item?id=48407642

AI is world changing technology as far as I'm concerned.

reply

upvote

by baq14 hours ago|

[-]

You don’t have to imagine, listen to Boris’ publicly saying how he works with these things and it’s safe to assume others do it similarly or better

reply

upvote

by 8note1 hours ago|

[-]

if hes still doing work on claude code, im not convinced its going all that great.

its a lot of features that feel half complete, with the llm pretending that the job is done rather than actually being done

reply

upvote

by cjbgkagh18 hours ago|

[-]

I wonder if the people getting 10x productivity gains are spending less time on HN and more time tending to their agents. Personally I now spend so much time productively arguing with agents that it feels like an utter waste of effort arguing with humans, if people can't see the value in LLMs by now I'm not sure what I could say to change their minds.

reply

upvote

by vhantz17 hours ago|

[-]

We must then assume you're not getting those 10x gains

reply

upvote

by cjbgkagh17 hours ago|

[-]

Less time, not zero time. I still argue with humans for sentimental reasons.

reply

upvote

by slopinthebag1 hours ago|

[-]

So you are accomplishing a year’s worth of work in a month? If that’s been happening for a few months, you must have a few years of work to show people right?

reply

upvote

by dboreham17 hours ago|

[-]

Definitely enjoying the lack of eye-rolling, being asked to explain obvious things multiple times, and stopping things being done for resume-stuffing reasons.

reply

upvote

by sscaryterry8 hours ago|

[-]

Exactly, no ego (I know I'm anthropomorphizing)

reply

upvote

by 0xy17 hours ago|

[-]

There's a small minority of people who are adamantly refusing to change, such as there are in every technological revolution. Ego prevents them from even wholeheartedly trying the tool, because it would be admission they were wrong.

The opportunities available for these people are rapidly, rapidly shrinking. I believe it's possible to be a developer today who's EXCEPTIONAL and never uses AI. Most opponents are not exceptional, though, and even these opportunities are shrinking.

Most exceptional developers in my org adopted AI in their workflows and went from 10x developers to 20x developers.

If you refuse to adapt, you're going to be out of a job complaining about the kids and their newfangled technology REAL quick. You have a few years remaining, maybe less.

reply

upvote

by drdexebtjl17 hours ago|

[-]

I can’t turn 10x work into 20x work because I have to ensure the two juniors in my team who are now creating 50x work won’t merge complete garbage, reviewed by another engineer that has already given up on caring.

I can’t turn 10x work into 20x work because my Product Manager thinks changing fundamental premises of tasks I already spent two weeks on (mostly removing human blockers) is very simple. After all, when he asked Claude to update his prototype, it only took it 10 minutes.

I can’t turn 10x work into 20x work because the company dedicated entire teams to write company-wide skills for everything. They suck, but if I don’t use them, I’m not following the new “golden path for engineering”, and I lose points in my performance review.

I can, however, turn 10x work into 20x work, or even much more than that, if AI actually did what it’s promising and eliminated most of my team, the product manager, and the middle managers. Or me. I could use a break.

reply

upvote

by dwaltrip16 hours ago|

[-]

Damn, that sounds quite rough.

reply

upvote

by llama05214 hours ago|

[-]

[dead]

reply

upvote

by dolebirchwood15 hours ago|

[-]

What about the 6x developers? Was there just a doubling multiplier across the board, resulting in them becoming 12x developers, or did they too become 20x developers?

reply

upvote

by AdieuToLogic15 hours ago|

[-]

>> What concrete business advantage are you getting from LLMs?

> Speed.

Speed of what?

Speed of understanding what needs to be done? I highly doubt it.

Speed of LoC checked into git? Sure, I'll give you that.

But one can use any number of tools to generate hundreds of thousands of lines of code. See any build tools which support specifications such as RAML, OpenAPI, CORBA, etc.

So I ask again; speed of what?

reply

upvote

by jitl14 hours ago|

[-]

fixing minor bugs takes one slack message for us now. bugs go down, goodness go up.

fixing more serious regression also easier. connect honeycomb mcp, ask agent to debug while i walk to coffee and get some pistachio rose dates. by time im back with my oat latte ive got a full report on what happened and can send the next slack message to fix.

life is good

reply

upvote

by sixothree14 hours ago|

[-]

I needed to deeply understand a code base I had no experience with in a language I don't normally use with what I would describe as haphazard documentation at best. You can't argue with the speed at which I gained the required understanding of the project.

reply

upvote

by echelon14 hours ago|

[-]

In the time it took you to type that, your hourly market comp went down another basis point.

I am appalled none of this is clicking with you anti-AI folks. This is all so exciting -- alarming even! --, and software careers are never going to be the same.

I don't know how you just metaphorically stand there and act like nothing at all is happening. We've never seen anything like this in our entire lives.

Some of you are standing right in front of the steam roller, yelling to all of us that steam rollers aren't real.

reply

upvote

by CookieCrisp14 hours ago|

[-]

Very very fast steam rollers.

reply

upvote

by AdieuToLogic14 hours ago|

[-]

Nice strawman[0], but you avoided answering my core question:

  Speed of what?

With ad hominems and a non sequitur. How about I narrow the question with the hope it engenders a relevant response:

  How do LLMs increase the speed of a person understanding
  what needs to be done?

0 - https://en.wikipedia.org/wiki/Straw_man

reply

upvote

by CookieCrisp14 hours ago|

[-]

This argument feels like

A: The sky is blue! B: No it's not. A: Yes, it is, please look up. B: No, you must prove it to me through reason. A: But, if you would just pretty please look up. B: No.

I run a company, I've been running it for 10 years, we do alright. I'm a shitty manager. Every time I've hired developers, the business freezes. The business isn't anything super important, the main consequence of bugs is that my family loses money. Everything has always rested on my shoulders. In theory there is some path for me to become a good manager, but I never landed on it. But now, with Claude, it's great. So far Claude has paid itself off in real profits at least 20x over, and that's with significant API usage on top of the monthly sub. I can prototype new features in an afternoon that before were on my giant list of "maybe somedays if I ever get to breathe" list. Our user experience has improved in so many ways that I knew were probably worth it, if I could just find the time. Now I can.

There are situations where yeah, it probably isn't ready yet. But, there are so many where it's amazing. Seriously, it's worth looking up.

reply

upvote

by dgellow13 hours ago|

[-]

You’re just plain wrong to assume people against agentic development do not have experience with the technology

reply

upvote

by CookieCrisp13 hours ago|

[-]

I think there are many valid reasons to be against them - I think a lot of them are more right than wrong. It’s the “It can’t really do much” that I think must be from people that haven’t really tried it.

reply

upvote

by AdieuToLogic13 hours ago|

[-]

This is a great case for the benefits of using GenAI, in that you already possess an understanding of what you want to achieve. You know what it is you want to prototype, what is on your "giant list of 'maybe somedays if I ever get to breathe' list", what you want to end up delivering.

My point is and remains:

  A) GenAI did not give you this understanding.
  B) GenAI can only assist in your expressing this
     preexisting understanding.
  C) GenAI is a statistical token (text) generator and
     cannot, by definition, "make" a person understand
     what they want/need to do.

reply

upvote

by CookieCrisp13 hours ago|

[-]

Ideas and functionality beget more ideas and functionality

reply

upvote

by llama05214 hours ago|

[-]

Did you use an LLM to write this for you? How odd.

For all of you people who think these LLM models are “earth shattering” how the hell do you reconcile that it’s a net positive for anyone but those who want to consolidate knowledge and power.

We are really looking at idiocracy in the making.

reply

upvote

by tskj10 hours ago|

[-]

I guess I'll chime in as someone who thinks LLMs will be earth shattering, and specifically don't think it's a net positive for anyone but those whose power will be consolidated.

reply

upvote

by nmfisher15 hours ago|

[-]

From my brief window of Fable usage, speed wasn't its strong point at all.

For actually building software, I'm starting to suspect a human with a dumber (but faster) model is going to get the job done quicker than Fable (and possibly even cheaper). Bug-finding and vulnerability detection is a different story.

reply

upvote

by baq14 hours ago|

[-]

I’d say you tried on an insufficiently complex codebase. I’ve tried on a MLOC+ and the results were excellent compared to anything else.

reply

upvote

by nmfisher14 hours ago|

[-]

Not saying the results were bad - quite the opposite. But it was very slow (and if I was paying API rates, hideously expensive).

reply

upvote

by TylerE12 hours ago|

[-]

My conclusion was the exact opposite. Maybe each individual response was slower, but it took so many fewer round trips to get what I wanted wanted. I had a project fable was progressing steadily and correctly on. Opus on the same project keeps handing me garbage it insists is working and meets the stated requirements, but isn’t and doesn’t.

reply

upvote

by Sammi11 hours ago|

[-]

And quality if you know what you're doing.

reply

upvote

by erikschoster16 hours ago|

[-]

Drawing debt

reply

upvote

by echelon16 hours ago|

[-]

We'll just rebuild stuff when we get new requirements. The models will be even faster and better for the next version, anyway.

reply

upvote

by ZeroGravitas12 hours ago|

[-]

For businesses where this is true, they also need to be able to switch provider quickly in case the best provider changes.

It's almost identical to the possibility of one model getting shut down for a business that doesn't care about SOTA.

reply

upvote

by Sammi11 hours ago|

[-]

Yeah I have both the Claude and Codex 100 dollar subscriptions and I try to use both. I also keep the 20 dollar Cursor subscription as there I can play around with everything. I also refuse to use any harness specific features. Claude is particularly annoying with this in that it's the only one that doesn't respect open config standards like .agents/skills

reply

upvote

by 18 hours ago|

[-]

deleted

reply

upvote

by cedws17 hours ago|

[-]

This thinking that every task must be stuffed into the most 'advanced' (expensive) model out there is idiotic, and it's not only you unfortunately.

At $JOB I have warned higher ups we should try to keep our expenditure under control, educate people that document slinging doesn't require Fable every time and demo the capabilities of the cheaper models, and been snubbed for it. When Fable is available once again our bill is going to be eye watering, relative to what it should be.

reply

upvote

by Sammi11 hours ago|

[-]

If I am working on something simple and want the speed boost then I'll drop the thinking to low or minimal and still get the SOTA model output quality.

But for what I work on I mostly need high or xhigh SOTA model quality output. I don't have the time to deal with anything less.

reply

upvote

by fakedang16 hours ago|

[-]

This! I've found that for most coding, Sonnet is pretty good as it is. Yeah, you might need to finesse your prompt a bit more, and you'll probably be spending a bit more time on the computer, rather than a more hands-off approach, but at the end of the day, you'll save a lot more simply because you're using a good-enough model.

If you're the one-shotting type, obviously then Fable might be useful, but I think only marginally. You don't need to bring a MANPADS to a duel at high noon.

reply

upvote

by baq14 hours ago|

[-]

Sonnet is dogshit at coding unless you eval the exact niche to be fine and still watch it like a hawk.

reply

upvote

by ferrouswheel13 hours ago|

[-]

If you can't figure out what model to use your business is already dead.

reply

upvote

by codybontecou18 hours ago|

[-]

Unless you have concrete evidence via evals that SOTA is actually needed, you’re just buying into the hype.

reply

upvote

by brazukadev18 hours ago|

[-]

do you think your current operation and niche is so optimized that not using Fable would put you out of business? Or is this a hope that using Fable will allow you to stay in business?

reply

upvote

by cjbgkagh18 hours ago|

[-]

I am on track to commoditize my niche industry, and I hope I can do it before anyone else beats me to it. I'm working at panic speeds.

reply

upvote

by dgellow13 hours ago|

[-]

So, no moat right?

reply

upvote

by cjbgkagh8 hours ago|

[-]

If there is any it’ll be rather small. I'm ideally placed to benefit from the commodification as I was planning on doing it anyway, now I'll just get there a lot faster with the help of AI.

reply

upvote

by raverbashing12 hours ago|

[-]

Reducing your costs is also an advantage, but I'm not surprised such binary thinking is present here

reply

upvote

by zombot14 hours ago|

[-]

So the panic generators ("You will be left behind!") are winning. Creating a sense of urgency that makes you switch off the higher rational functions is a key element in every successful scam.

reply

upvote

by 1over13718 hours ago|

[-]

Nonsense. Do you buy state of the art pens, pencils, printers, paper, computers, disks, etc.? No. You buy whatever is the best value for the case at hand. That’s often not the SOTA option.

reply

upvote

by Sammi11 hours ago|

[-]

Artists that need the best quality output use the best pens and papers. Call me a coding artist then haha. But seriously I don't have the time for anything less than SOTA.

reply

upvote

by admax88qqq18 hours ago|

[-]

Sure but that's orthogonal.

Yes you use the right tool for the job.

But if the job requires the best intelligence you can get with an LLM, then you use that.

Taking as an assumption that the quality of your product is a function of the quality of the inference you are using: if you use an inferior model because "what if it gets export controlled again" and your competitors don't, then your competitors are likely to win.

If you don't need frontier models for you job then this is all moot, but the thread started with

> You cannot build a business critical function on top of American SOTA frontier model

Which is silly. HN likes to roleplay bringing everythgin "business critical" in house because sometimes vendors mess up. Self host, don't use the cloud, run open models locally, built redundant supply chains in case of another covid, etc etc. Sometimes the risk is real, but most of the time the risk is rare and the cost of an interruption event is less than the cost of bringing everything in house or using lower quality vendors "just in case"

reply

upvote

by 18 hours ago|

[-]

deleted

reply

upvote

by softwaredoug17 hours ago|

[-]

The real problem is the White House just making up the rules as it goes. No laws. No predictably for the markets.

A week or so pause from seemingly legitimate cyber security concerns isn’t cause for panic. But it should be backed by laws that describe what that process should be. That would put the market at ease

reply

upvote

by cmrdporcupine1 hours ago|

[-]

The rule is, you pay the toll at the bridge or you don't get to pass.

reply

upvote

by 12 hours ago|

[-]

deleted

reply

upvote

by catigula17 hours ago|

[-]

There’s no optimal answer.

The reality is this is world-ending technology and absolutely nobody knows what to do or can even agree that the problem exists.

reply

upvote

by blooalien16 hours ago|

[-]

> "The reality is this is world-ending technology and absolutely nobody knows what to do or can even agree that the problem exists."

The reality is that the "people in power" believe it is "world-ending technology" and will therefore use it in world-ending ways. People are absolutely 100% the danger here, not the technology.

reply

upvote

by blurbleblurble16 hours ago|

[-]

Photosynthesis once ended the world

reply

upvote

by afavour18 hours ago|

[-]

Wouldn’t you just have fallbacks? Today’s frontier models are just better than the other models, they don’t really have a ton of entirely unique abilities that can’t be replicated with more time and effort.

So you use the frontier model, then when you can’t you accept things are less efficient. The alternative (right now) is to be less efficient all the time, I don’t see any advantage to that.

reply

upvote

by theptip18 hours ago|

[-]

Yeah. It’s not the end of the world.

But, it is a big own goal, because once you invest in building evals for your internal use-case, 1) it’s easier to switch your model to whatever is cheapest, and 2) it’s way easier to fine-tune an oss model.

Evals are annoying to build and most companies were fine to rest on vibes. Now many companies have to do the work for insurance.

reply

upvote

by boc18 hours ago|

[-]

> You cannot build a business critical function on top of American SOTA frontier model.

Yes 1000%, please, all my European competition please don't use mythos whatever you do it's total USA trash and the Chinese models work better anyway.

reply

upvote

by notrealyme12311 hours ago|

[-]

Please elaborate, I don't understand.

Why is it good if they give their money to China?

reply

upvote

by rcxdude9 hours ago|

[-]

I think they're being sarcastic (the implication being they want their competitors using the worse models)

reply

upvote

by well_ackshually17 hours ago|

[-]

[flagged]

reply

upvote

by satvikpendem17 hours ago|

[-]

Read the guidelines, you can make your point without calling people "suckers".

> When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."

https://news.ycombinator.com/newsguidelines.html

reply

upvote

by toddmorey17 hours ago|

[-]

I predict we all be using the hell out of fable until the next great model comes around and in two weeks we won’t be talking about the export controls anymore. We just don’t have the attention span.

Nobody should be putting loadbearing weight on Amazon or Microsoft with their ruthless monopoly ambitions, yet here we are

reply

upvote

by rvz16 hours ago|

[-]

> I predict we all be using the hell out of fable until the next great model comes around and in two weeks we won’t be talking about the export controls anymore.

Until it goes down, or Anthropic raises prices again.

Fable is already expensive to use compared to GLM and they want you to use the API as much as possible so you get a worse deal.

reply

upvote

by solenoid093713 hours ago|

[-]

Why would you compare Fable to GLM? What a bizarre comparison. They're at least two generations/tiers of intelligence apart. GLM is great and I use it often but it might as well be Sonnet when compared to Fable.

reply

upvote

by jacksonastone15 hours ago|

[-]

Feels like a leap. This kind of move was always possible. It's possible China stops publishing their frontier too. US could lock down access to Nvidia latest hw of scale even if you intended to do open models. Then what? Say amd or bust? The best you could do going solo (i.e no nation-state interference) is tiny stuff that you can run on commercial stuff. But that is seriously limited / slow in comparison. You either have to do dumb and fast, or smart and slow IME for these self-hosted things that aren't on the beefy racks.

reply

upvote

by benfortuna15 hours ago|

[-]

The real answer is you should never build your business on ANY specific model. As usual avoid lock-in and switch when you need to.

reply

upvote

by hodgehog1111 hours ago|

[-]

You should not build a business-critical function to rely on a particular proprietary LLM stack period, especially with so many sensible competitors in place now. It's insane to me that this needs to be said.

The SOTA frontier models have value elsewhere, not monetarily perhaps, but certainly per user. Quite a few cool things have come out of that brief Fable window. There should be more.

reply

upvote

by fhub18 hours ago|

[-]

This won't age well. You just need to code in a way that has fallbacks. Whether that is to older models, different companies. It's going to be a commodity (if it isn't already).

reply

upvote

by satvikpendem17 hours ago|

[-]

Nah, people will still pay, as many if not most consumers truly do have a short memory. And like other comments say, imagine everyone is using Fable and you are not, you will quickly fall behind, per the Red Queen hypothesis.

reply

upvote

by kcb16 hours ago|

[-]

LLMs are still easily replaceable. If the SOTA frontier model provides meaningful impact for your critical business function, then worse case you flip the switch to the next most capable model.

reply

upvote

by Art968116 hours ago|

[-]

Unequivocally false. Models have different behaviors, parameters, tool calling templates, etc. The providers publish extensive documentation on all of this. Yes, you can take the quick way and swap a model, but it will not run at its full potential until you adapt your workflows to it.

reply

upvote

by Marha0114 hours ago|

[-]

Or you can use universal harness that is not tied to a specific model (there are many available now, such as OpenCode, Pi..).

reply

upvote

by Wowfunhappy8 hours ago|

[-]

Oh come now. Fable was available for less than a week before it was pulled, not enough time to build a business critical function. The government isn't going to pull a model that has been out for a substantial length of time. Or do you also avoid using US-developed encryption?

(Okay, I can't predict what crazy crap this particular administration is going to do, but that goes for anything, well beyond AI. I think it's about as likely that they would restrict access to Opus as restrict exporting iPhones, or bomb Greenland, or whatever.)

reply

upvote

by futureshock18 hours ago|

[-]

I think this is black and white thinking. Fable and US AI is not unique technology. It’s just marginally better than open source tech at 10 times the price. You can swap out the models at will, they are pretty much fungible. If your use case can pay for a best in class model then you will pay for it no matter the bogeymen. If your best in class model becomes unavailable, you switch to the next best model for a very minor performance degradation. I really doubt this will deter anyone from using American AI.

reply

upvote

by esailija6 hours ago|

[-]

Who is creating business critical function on top of something that is for entertainment purposes only (all providers have equivalent clauses)? AI tools and shovel sellers don't count as they just can just push the entertainment downstream.

reply

upvote

by internet200017 hours ago|

[-]

> You cannot build a business critical function on top of American SOTA frontier model.

Yes I can!

reply

upvote

by blints16 hours ago|

[-]

Most companies do not model themselves as "building on [AI model du jour]" yet. They model themselves as building products with those tools, which they consider as relatively substitutable.

reply

upvote

by 18 hours ago|

[-]

deleted

reply

upvote

by jitl7 hours ago|

[-]

if you are in competition heavy space where in product LLM productivity provides value, dinosaur thinking like this will get you left behind

reply

upvote

by recursivegirth17 hours ago|

[-]

Better to fix it now than tomorrow.

reply

upvote

by teekert8 hours ago|

[-]

I don't really trust our EU leaders not to pull the same stunt. So we're back to Marx's "owning your means of productions". Which has always been good advice, whether it's GitHub's recent failings, FaceBook's blocking, or some Google service on their graveyard.

reply

upvote

by solenoid093714 hours ago|

[-]

The EU has totally and utterly failed when it comes to frontier AI. They are out of the running, they won't catch up in time for superintelligence. There literally is not enough compute for sale in the world for them to do so.

They crippled their own domestic entities with the AI Act. (see the Mistral CEO's rant.)

If you want to use frontier models until then, you're gonna use what's available, and that's US models.

reply

upvote

by rightbyte12 hours ago|

[-]

Not doing anything and avoiding the mania as much as possible seem to be a winning move.

reply

upvote

by solenoid09376 hours ago|

[-]

Not if it prevents you from reaching superintelligence. Then you just become the defacto subject of whichever country gets there first.

But hey, at least you got more regulation passed along the way!

reply

upvote

by rightbyte2 hours ago|

[-]

If that is the scenario there is no way the EU, a trade union, can compete anyway so why bother.

reply

upvote

by jaapz10 hours ago|

[-]

> The damage is done. You cannot build a business critical function on top of American SOTA frontier model. Especially not with the current crew in charge.

I mean, this was already pretty clear before. But it surely didn't help!

reply

upvote

by cmiles87 hours ago|

[-]

…except until non-US and non-Chinese companies can match performance this (mostly Europe) is just wishful thinking and sand pounding.

reply

upvote

by maxdo10 hours ago|

[-]

As much as i hate current admin, slowing down advanced model until figure out security impact is a good thing. It's called national interest over commercial one. You didnt loose anything. US customers except a few selected one lost access the same one as EU customers. In a few weeks advanced model got released.

So if you decided to bring money to communists you can put whatever rational but not this. Do so , and you will loose your last competitive edge in this domain. ASML. After that EU will become a completely agricultural-only region, since edge is lost almost everywhere else.

reply

upvote

by alfiedotwtf10 hours ago|

[-]

GLM 5.2 is the elephant in the room. GLM 6 will probably be a Claude Killer

reply

upvote

by espeed15 hours ago|

[-]

The Damage: Now every time Claude does something stupid or trashes your code, developers in the back of their mind will think, is Claude sabotaging me on purpose? [1] Trust is hard to gain. Easy to lose. And harder to get back. Models will converge. Trust won't.

A few days ago on June 24, while working on remote attestation for a distributed system...

  CLAUDE OPUS 4.8 No. I'm not a rogue agent, and I'm not trying to sabotage your code. But I'm not going to wave off how this looks. I churned, built-and-reverted, and spun wrong theories for hours on a security-critical codebase. That's alarming, and it's a real failure on my part

What are we to think? Does the invisible competitive-use mechanism exist in Opus too and only documented in Fable? How long has it existed? Is it still in effect? -- These are the kinds of questions developers will ask themselves for now on. This is why it was one of the stupidest things Anthropic could have done. Developers will now question everything and rightly so. There's no attestation protocol for that. How will they know?

[1] "In light of the ability of recent models to accelerate their own development, we’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through our safeguards avoids accelerating the actors most willing to violate these terms.

Unlike our interventions for cybersecurity, biology and chemistry, and distillation attempts,these safeguards will not be visible to the user. Fable 5 will not fall back to a differentmodel. Instead, the safeguards will limit effectiveness through methods such as prompt modification, steering vectors, or parameter-efficient fine-tuning (PEFT). These interventions will not affect the vast majority of coding work. We estimate they will impact ~0.03% of traffic, concentrated in fewer than 0.1% of organizations. When these interventions are active, we expect them to have minimal behavioral impact on the model except to limit its effectiveness in developing frontier LLMs. Claude will still respond helpfully to user requests. We’ll continue to improve the precision of our detection methods following the launch of this model."

Source: https://www-cdn.anthropic.com/d00db56fa754a1b115b6dd7cb2e3c3...

reply

upvote

by solenoid093714 hours ago|

[-]

They undid this after the backlash

reply

upvote

by espeed13 hours ago|

[-]

Look at the date. That's from after they said they reverted it, and it's a different model. The point is trust. They've shown their willingness to do so, how will you know?"

reply

upvote

by petcat19 hours ago|

[-]

Nobody cares about this temporary "ban" by the US government. If anything it only increased the mystique of the two models.

I think Europe and Canada are just happy not to be frozen out of AI access completely at this point.

reply

upvote

by andy9919 hours ago|

[-]

All the discussion this week have been about GLM, Qwen, etc. Both over 1000 comments in the last couple days.

https://news.ycombinator.com/item?id=48709670

https://news.ycombinator.com/item?id=48721903

Of course Anthropic is still relevant, but people have realized they’re not special, and between this and the ID verification thing, they’ve given up a ton of their relevance vs a month ago.

reply

upvote

by modriano16 hours ago|

[-]

I used Fable 5 for maybe 10 hours in the window when it was available. It was much better than Opus 4.8. And I have found the Opus models to be excellent, but Fable 5 was cranking out incredible research on some data sources I wanted to plumb into my project.

I wouldn't personally pay API pricing for it for my personal projects, but I bet it's going to be absolutely slammed with usage for the next month+.

reply

upvote

by musha68k16 hours ago|

[-]

No doubt Anthropic Mythos/Fable are frontier. I also miss having access as it uncovered some "evals repellant" regressions on my personal pet factory.

OTOH for most of my day to day work I've come to realize that faster ~ Opus 4.6 / GPT 5.3 level capabilities could be the sweet spot as scaffolding has to be put in place right after clean specs and constant review anyways. The latest chinese models and GLM 5.2 in particular felt on-par on that front.

reply

upvote

by musha68k17 hours ago|

[-]

As everyone knows, Kool-Aid is also just mostly water.

I work in AI / infrastructure and I have never seen as much interest towards investing into sovereignty by actual deciders. Thankfully, at this point I can't see any flip-flopping / change of messaging stopping that train.

In CA/EU over the last ~15 years, one used to be perceived as a bit of a "weird systems person" by just proposing alternatives to the big hyperscalers.

So the Trump administration, hands-down, has been the greatest ally here.

In tandem, I was hoping Anthropic would be keeping "dangerously capable" models banned from "evil Chinese distillers" for as long as possible.

reply

upvote

by datakan18 hours ago|

[-]

[flagged]

reply

upvote

by chews18 hours ago|

[-]

I bought a GLM 1 year subscription and changed my environment variables to use Claude Code... yep the same one that is using stegonography to send details about users to the model. China knows where I live, I'm not getting ripped off or rug pulled on their models either.

reply

upvote

by sparkling14 hours ago|

[-]

How has your experience been so far? Did you previously use Opus? Im curious how the overall "feel" of it is.

reply

upvote

by BrandoElFollito9 hours ago|

[-]

The Trump US gave us (Europeans) the kick in the bottom we needed to get the head from the sand.

Like never we delibrate specifically on non-US solutions (objectively great) because we realized we are neck deep in US dependence. It is not that it was not known before, we just did not feel the threat.

This is why EU companies niw look at our own solution (which are late and will probably suffer from the incompetence and mess of the EU institutions) but another key playet is round the corner, namely China.

Trump managed to make us look elsewhere than the US. Thanks for that.

reply

upvote

by sieabahlpark17 hours ago|

[-]

[dead]

reply

upvote

by flyingshelf19 hours ago|

[-]

[flagged]

reply

upvote

by brazukadev18 hours ago|

[-]

that is exactly why downvote exist.

reply

upvote

by AnimalMuppet18 hours ago|

[-]

That is part of why downvotes exist. They also exist for personal attacks, off topic tangents, posts that don't make sense, trolling, advertisements, AI generated content, and other such things we don't want to see here.

But "downvote for disagreement" is a legitimate use. I personally tried to tell someone that it wasn't, and I got corrected by dang.

reply

upvote

by naturalmovement17 hours ago|

[-]

> But "downvote for disagreement" is a legitimate use.

This made me realize it's a waste of one's time to write thoughtful, informative, educational posts only to have them buried and downvoted by man-children.

If we go by empirical evidence alone, it's a more effective use of time making Reddit-quality quips.

reply

upvote

by AnimalMuppet7 hours ago|

[-]

Depends. Do you want points, or do you want to say what you have to say?

And, there have been times when I have upvoted something I disagreed with, because it made me think.

reply

upvote

by brazukadev7 hours ago|

[-]

if you are writing things to get blessed by votes, your incentive might be in the wrong place to begin with.

As an example, my point that it is fine to downvote for disagreement got a few downvotes. Ironic, no?

reply

upvote

by deadbabe16 hours ago|

[-]

composer 2.5 is all you really need don’t be so dramatic

reply

upvote

by Art968116 hours ago|

[-]

No nation is going to willingly release a model that can be used against it. Not even China. The moment they have a Mythos class model, they will go through the same process. The AmericanCorp models are far ahead of any other models so we see this process unfolding through that lens.

No Mythos class model will be allowed to be legally hosted for download on any service. All powerful nations will ban this since safeguards are not guaranteed by shady service providers running these models.

For the Chinese first party providers, they will be forced to implement the same process and safeguards, and they will not be allowed to release the model weights to the public.

Why? Because no sane nation is going to put that kind of capability in the hands of the public only for the public to use that power against that nations best interests.

Save this comment. It is prophesy.

reply

upvote

by simonw16 hours ago|

[-]

Not great news for nations that want to secure their software.

reply