Competition is bad? Who cares - let the big players subsidize and compete with each other. That's what we want. We want strong models at a low price, and we'll hype up whoever is doing it.

Simultaneously, we also hype up the open models that are catching up: the ones that are significantly cheaper, and that also put pressure on the big players and keep them in check.

People aren't falling for PR; people are encouraging the PR to put pressure on the competition. It's not that hard.

reply
Interesting to see your observation, because I have observed the opposite: posts that share big news about open-weight local models have many upvoted comments arguing local models shouldn’t be taken seriously and promoting the SOTA commercial models as the only viable options for serious developers.

Both here and on AI tech subreddits (ones that aren’t specifically about local or FOSS models), this dynamic seems to hold, to the degree that I’ve suspected astroturfing.

So it’s refreshing to see that maybe that’s just a coincidence or confirmation bias on my end.

reply
Both can be true at the same time. For almost all use cases, I currently wouldn't waste my time with open models, but they're crucial from a data privacy and competitive perspective, and I can't wait for them to catch up enough to be as useful as the current frontier models.
reply
I've found Qwen3 to be very usable on my local machine (a Framework Desktop with 128 GB of RAM). I doubt it could handle the complex tasks I throw at Claude Opus at work, but it's more than capable of doing a surprising number of tasks, with good performance.
reply
What tasks do you use Qwen3 for? Coding? Are you running it on CPU or GPU? What GPU does that Framework have?

Thanks!

reply
I have an Asus GX10 that I run Qwen3.5 122B A10B on, and I use it for coding through the Pi coding agent (and my own); I have to put more work in to ensure that the model verifies what it does, but if you do so it's quite capable.

It makes using my Claude Pro sub actually feasible: I write a plan with it, pick that up with my local model and implement it, and now I'm not running out of tokens haha.

Is it worth it from a unit economics POV? Probably not, but I bought this thing to learn how to deploy and serve models with vLLM and SGLang, and to learn how to fine-tune and train models with the 128 GB of memory it gets to work with. Adding up two 40 GB vectors in CUDA was quite fun :)
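
For the curious, here's roughly what that exercise looks like. This is a minimal sketch, not my exact code, and it assumes cudaMallocManaged so the driver can back the 40 GB buffers with the box's 128 GB unified memory pool rather than discrete VRAM:

  #include <cuda_runtime.h>
  #include <cstdio>

  // c[i] = a[i] + b[i], one element per thread
  __global__ void add(const float *a, const float *b, float *c, size_t n) {
      size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x;  // 64-bit index: n exceeds 2^32
      if (i < n) c[i] = a[i] + b[i];
  }

  int main() {
      const size_t n = 10ULL << 30;  // 10Gi floats ~= 40 GB per vector
      float *a, *b, *c;
      // Managed memory pages between system RAM and the GPU on demand,
      // so three 40 GB buffers fit in the 128 GB unified pool
      cudaMallocManaged(&a, n * sizeof(float));
      cudaMallocManaged(&b, n * sizeof(float));
      cudaMallocManaged(&c, n * sizeof(float));
      for (size_t i = 0; i < n; i++) { a[i] = 1.0f; b[i] = 2.0f; }

      const int threads = 256;
      const unsigned blocks = (unsigned)((n + (size_t)threads - 1) / threads);
      add<<<blocks, threads>>>(a, b, c, n);
      cudaDeviceSynchronize();

      printf("c[0] = %f, c[n-1] = %f\n", c[0], c[n - 1]);  // expect 3.0 for both
      cudaFree(a); cudaFree(b); cudaFree(c);
  }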

I also use Z.ai's Lite plan for the moment for GLM-5.1, which is very capable in my experience.

I was using Alibaba's Lite Coding Plan... but they killed it entirely after two months haha, too cheap obviously. Or all the *claw users killed it.

reply
GLM 5.1 is extremely good, and ridiculously cheap on their coding plan. It's far better than Sonnet, and a fifth of the cost at API rates. I don't know if the American providers can compete long-term; what good is it to be more innovative if it only buys them a six-month lead and they can't build the data center capacity fast enough for demand? Chinese providers have a huge advantage in electrical grid capacity.
reply
True, but Z.ai also just silently raised the price, and the entire Chinese frontier set is having to turn a profit now... hence Alibaba killing the Lite plan and not letting people sign up for their Pro one either, and why MiniMax has their non-commercial license, etc. etc.

So I agree with you, it's better than Sonnet but way cheaper. I do wonder how long that will last, though.

reply
Z.ai does really well at the carwash question!
reply
Thank you. I've been using Ollama for a much more modest local inference system. I'll research some of the things you've mentioned.
reply
[dead]
reply
The Framework Desktop has a Ryzen 395 chip that can allocate memory to either the CPU or the GPU. I've been able to allocate 100+ GB to the GPU, so even big models can run there.

Most recently I used it to develop a script to help me manage email. The implementation included interacting with my provider over JMAP, taking various actions, and implementing an automated unsubscribe flow. It was greenfield, and quite trivial compared to the codebases I normally interact with, but it was definitely useful.

reply
That's great. Ostensibly my system could also allocate some of its 32 GB of system memory to augment the 12 GB of VRAM, but I've not been able to get it to load models over 20B. I should spend some more time on it.
reply
I'm just waiting till I can afford a GPU again
reply
I've invested significant time into getting open models to work, and investigating what works well.

The TL;DR is that unless you're doing it as a hobby, or you're working in an environment where none of the data privacy options supported by Anthropic/OpenAI (including running on Azure/Bedrock with ZDR) work for you, it's not worth it.

The best open models are around the Sonnet 4.6 level. That's excellent, but the level of task you can give to GPT 5.4 or Opus 4.6 is just so much higher that it doesn't compare (and Opus 4.7 seems noticeably better in my few hours of testing too).

I have my own benchmarks, but I like this much under-publicized OpenHands page: https://index.openhands.dev/home

It shows that for every task they test, closed models do the best. The closest an open model gets is MiniMax 2.7 on issue resolution, where it's ~1% worse than the leaders.

That matches my experience - fine for small problems, but well behind as the task gets bigger.

reply
deleted
reply
> Interesting to see your observation, because I have observed the opposite: posts that share big news about open-weight local models have many upvoted comments arguing local models shouldn’t be taken seriously and promoting the SOTA commercial models as the only viable options for serious developers.

When I argue this, my point is that FOSS shouldn't target the desktop with open weights - it should target H200s. Really big parameter models with big VRAM requirements.

Those can always be distilled down, but you can't really go the other way.

reply
I agree, but I’d like to add that people are definitely falling for PR. People are always falling for PR, or no one would bother with PR.
reply
This assumes people are in touch with reality and aren't just motivated by vibes and insta-reactions on social media
reply
> Competition is bad? Who cares - let the big players subsidize and compete with each other.

Subsidizing is the opposite of competing. It's literally the practice of underpricing your product to box out competition. If everyone were competing on a level playing field, they would all price their products above cost.

All these tech oligarch asshat companies need to be regulated to hell and back.

reply
The moat was already too large for smaller players. Let them subsidize. Take from investors and give to us, buying me time to beef up my local stack to run local models.

For many things you already need to go local, and in the future, if you want any privacy, you'll need to go local.

reply
Excellent point, but I still think the oligarchs have gotten a little monopoly-happy.
reply
What's the alternative, move to North Korea?
reply
Well, that's a great big wtf out of left field.
reply
deleted
reply
Big players subsidizing is what kills medium and small players which then kills competition. What follows is monopoly.

Big players operating at a loss to distort the market is not a good thing overall.

reply
The medium and small players are literally just distilling the larger models.

It's not the smaller players spending billions on training data.

reply
No, the medium and small players are the Mistrals, DeepSeeks, and H Companies of the world, with their own models using quirky optimisation techniques to be able to compete.
reply
It's hilarious how much this post reads as drafted by an LLM. The emdash, the "it's not X, it's Y" framing, incredible.
reply
I wrote my post myself.
reply
Dogfooding by the slop factory. The artificial centipede.
reply
Call it falling for it, but here are my two experiences, with both applications open ($20/month plan for both):

  - Claude: Good for ~20 minutes of work once every 4 hours
  - Codex: Good for however long I want to use it.  
Claude nerfed their product to the point that it's not usable for me, so I use something else.
reply
Since we’re sharing anecdata: I also have the $20/month plan for Codex, and I hit the five-hour limit after about an hour of work every single time I open it. I use it for personal side projects, primarily in the evening after the kids are in bed, so my strategy is to launch it about 4 pm and send a simple prompt to prime the five-hour window to end at 9 pm, start working about 8 pm, and then I can use up the existing five-hour window and the next one by about 10 pm.
reply
What kind of side projects do you need to run these models for that many hours? I haven't experimented with Opus to that extent; I mostly supervise it and/or prompt it every 5-10 minutes to fix something up.
reply
I've done a variety of things with it:

- sysadmin tasks for my home server, which runs Home Assistant, Plex, and Minecraft servers. Being able to tell it "Set up a Minecraft Fabric server with this list of mods" is pretty nice, and it's fairly competent at putting together Home Assistant dashboards and automations (make sure you have backups of anything it's allowed to touch, though - it may delete stuff without warning).

- Several small web apps primarily for my own use.

- Currently working on an opinionated desktop writing app for my own use.

reply
I'm on the 100 USD plan with Anthropic; I hit the 5-hour limits about 75% of the time during working hours, but almost never the weekly ones - by the time they're reset I've usually used up between 50% and 75% of the quota. There are periods of more intense usage ofc, but this is the approx. situation I'm in (also it doesn't work on tasks while I'm asleep, because I occasionally like having a look at WIP stuff and intervening if needed).

The Anthropic 20 USD plan would more or less be a non-starter for agentic development, at least for the projects that I work on, even while only working on a single codebase or task at a time (I usually do 1-3 at a time).

I would be absolutely bankrupt if I had to pay per-token. That said, I do mostly just throw Opus at everything (though it sometimes picks Sonnet/Haiku for sub-agents for specific tasks, which is okay), so probably not a 100% optimal approach, but I've wasted too much time and effort in the past on sub-optimal (non-SOTA) models anyway. I wonder which is closer to the actual cost and how much subsidizing there is going on.

reply
The $200 OpenAI plan feels like it has 10x the limit of the $100 Claude plan.

But Opus is both smarter and faster than GPT, so I can get a lot more done within the Claude limits.

reply
for now... right now you are getting 2x usage as a promo
reply
Concur, re the ratio of weekly vs hourly limits: I hit the hourly one much more often than weekly.
reply
Wow, the $20 Claude plan sounds awful. I use Claude at work, which has metered billing, and I have to be careful not to hit my four-figure max cap.

For me, $20 a month is more than I want to spend, so I just use the free tiers. If I use AI in an app or site, I use older models, mostly ChatGPT 3.5. The challenge is more fun, and it means I can do more, like make more API calls - 100x more.

reply
I use the $20 plan for my side projects, and in the beginning I was hitting limits very fast, but after creating proper .md files and running /clear, it seems to work fine for my use. I am really curious how people are using the $100-$200 plans. Maybe I am not utilizing it to its full capacity?
reply
[dead]
reply
There's a systematic marketing campaign from OAI on Reddit and HN - there's been a huge uptick of "Codex is better than Claude Code" comments and posts this last week, perfectly timed with Claude Code's increased limits.
reply
Go to /r/codex and see how pissed off people are by the new Codex Plus plan 5-hour limits (they're a sliver of what they were a week ago). Whatever OpenAI is doing to market on Reddit isn't working.
reply
I'm not sure what changed or what the complaint is ... But personally, I have still never hit the rate limit on the $20/mo ChatGPT Plus plan, while I was constantly getting kicked off the Claude Pro plan until I got fed up and cancelled a few months ago.
reply
I can get about 20~40 minutes out of my 5-hour limit using Codex 5.4 medium to, say, write a patch script in TypeScript for a Firebase + BigQuery app. That includes about 10 minutes of first writing a planning.md doc with 5.2 High.

A couple weeks ago I'd get roughly 2~3 hours. And a month before that I couldn't break the 5-hour limit.

reply
They were running a 2x rate limit promo last month.
reply
To be fair, GPT 5.4 is mostly a better model than Opus 4.6 in terms of quality of work. The tradeoff is that it's less autonomous and takes longer to complete equivalent tasks.
reply
Thing is, Codex 5.3 is a better and more consistent model than anything Anthropic have come out with. It can deal with larger codebases, has compaction that works, and has much less of a tendency to resort to sycophantic hallucination as it runs out of ideas. I also appreciate their approach to third party harnesses like opencode, which is obviously the complete opposite to Anthropic and their scramble to keep their crumbling garden walls upright.

Which makes it even more of a shame that Sam Altman is such a psychopathic jackass.

reply
So Anthropic degraded their product. OAI updated their product to meet or exceed Anthropic's old product.

This is normal behavior and not a cause for such a hyperbolic response.

reply
There is good competition and bad competition.

Pricing your product unsustainably vs a competitor to gain market share is regarded as "bad competition" and has historically been seen as anticompetitive.

It does not benefit the consumer in the long run, because the goal is to use your increased funding or cash reserve to wipe your competition out of the market, decreasing competition in the long term.

Then, once your competition is gone, and you've entrenched yourself, you do a rug pull.

reply
You're right, but for now it doesn't matter: if both competitors are running on infinite VC money, we as consumers benefit from it. It only matters if they cause negative externalities in the meantime.
reply
These are the benefits of competition in action.
reply
To be clear, unsustainably hemorrhaging money to gain marketshare over a competitor is generally considered an anticompetitive practice.
reply
What if both competitors are doing it?
reply
It’s also THE playbook of Silicon Valley.
reply
Also why there’s so much enthusiasm for it on HN
reply
This is true. But Anthropic did us dirty most recently, and so it’s their turn on the pitchfork. Sam will do us too. Just not yet.
reply
It's one of the things I really dislike about providers hyping "inference time scaling" as a concept. Apart from being a blatant misnomer (there's nothing scalable about it), it's so transparently a dial they can manipulate to shape perception. If they want a model to seem more intelligent than it really is, they just dial up the "thinking" and burn tokens. Then once you have people fooled, you can dial it down again. Everyone will assume it's their own fault that their AI suddenly isn't working properly. And since it's almost entirely unmeasurable, you can do it selectively for any given product you want to pitch, for any period of time you like, and then pull the rug.

We need to force them back into being providers of commodity services and put an end to this assumption that they can mold things in real time.

reply
I have a feeling that Codex is also getting lower limits. Got this email just now. Basically they're copying Claude's $100 tier.

> To help you go further with Codex, we’re introducing a new €114 Pro tier designed for longer, high-intensity sessions.

> At launch, this new tier includes a limited-time Codex usage boost, with up to 10x more Codex usage than Plus (typically 5x).

> As the Codex promotion on Plus winds down today, we’re rebalancing Plus usage to support more sessions across the week, rather than longer high-intensity sessions on a single day.

reply
They didn't just lower limits; they keep messing with people's local settings, and I wish it would be called out drastically more, because it could cause serious issues. A coding agent's settings are a contract, even the default ones. If they worked for me for 9 months, you shouldn't just force new defaults on me without warning; Claude can and will goof up hard if misconfigured.
reply
Thinking in counterfactuals, how would the hype around Codex be different if it were organic and because they had built a genuinely good product? Asking as someone who genuinely loves Codex and has been in the OpenAI camp for months after buying a Claude Max plan from November to February.
reply
I haven’t noticed much hype around Codex. I have both and use Claude for broad work off my phone and Codex on my computer to clean up the mess. Crank reasoning to the highest setting for each. Claude is extremely unreliable for me, and Codex feels like more of a real tool. I’d say Codex has a bit of a learning curve. Nothing much has changed for me in the past month or two (whenever GPT 5.4 came out).
reply
deleted
reply
It's quite likely that OpenAI is running a significant PR campaign to compensate for the bad rep they earned by stepping in to meet the demands of the Trump administration, after Anthropic refused to assist the administration with mass domestic surveillance and development of lethal autonomous weapons. Presumably OpenAI didn't buy the podcast TBPN just because they like the guys.

https://paulgraham.com/submarine.html

reply
Anthropic don't seem to know how to look after and keep customers.
reply
Everyone seems to unconditionally love Anthropic, but OpenAI has always had the best models… it just requires a bit more effort on the part of the user to actually leverage them.
reply
> because Anthropic lowered rate limits for individuals due to compute constraints

It's because they don't support OpenCode.

reply
I really hate this kind of behavior. Yeah, Anthropic may do some bad things, I don't know, but we can all see that Anthropic is always one step ahead of OpenAI. And just because Anthropic lowered rate limits for some people, people now start saying that Codex is way better than Claude Code / Claude Desktop.
reply
Codex is much worse than Anthropic's models. My experience is that I burn 10x the tokens using Codex compared to Sonnet 4.6.
reply
There was brief consternation when OpenAI swooped in to snatch up those DoD contracts but then the next model released and all is forgiven.
reply
Anthropic coming out to say they won't surveil Americans wasn't actually a positive for me. It meant they're okay with surveilling the rest of the world, which in turn signaled "fuck you, you're inferior, deal with it" to me (as someone from the aforementioned rest of the world).

When OpenAI snatched those contracts, it made me think no worse of OpenAI. The surveillance was already factored into how I saw them (both).

reply
And hopefully Anthropic has extra capacity then and I can return there.
reply
Uber, but AI!
reply
No, it’s because Anthropic can’t message anything to its customers without lying.
reply
Not only that, but Anthropic is now forcing users to give their biometric information to Palantir.

They're doing a slow rollout

reply
OAI already requires this. They both require identity verification in some cases.
reply