Which of course causes some unfairness on both ends. Nobody here can compete with me. I often use leftover tokens on local client projects, which, despite the lower pay, still pays off because they now take hours, not days or weeks, to complete. And nobody in the local clients' talent pool can compete with me unless they charge about half the market rate.
Take away my $500 monthly grant and I’d be more or less screwed. Better open models will start to reduce this advantage. It’s not like I positioned myself here on purpose, but it’s definitely a “right place, right time” situation.
This depends a lot on how you work, and how much of the architectural thinking you do yourself.
People seem to lose sight of the fact that a flash model today is as powerful as a frontier model from a year ago. If you were happy with GPT 4.x, you should be ecstatic that equivalent power is now basically free...
Mind if I ask you for a few vibe coding tips? I failed to solve your gh puzzle in the profile though.
I've been fooling around with DeepSeek 4 agentically. It's probably not as good as the Anthropic offerings, but even those seem to be roiled by politics and strife, and DeepSeek 4 is very good IMHO. I'll try out GLM later.
I'm in Australia. The government has set up a "return and earn" scheme to keep aluminium cans, plastic bottles and paper drink cartons out of the waste stream. A laudable project. The money you make from returned drink containers is pretty low, AU$0.10 per container. I've participated to get the rubbish out of natural water streams and to make a nano amount of money on the side.
When I looked at the costs of an app I was getting DeepSeek to help me with, I realised that the several hours I'd spent learning and building had cost something like 8 recycled containers. In my head, after doing some DeepSeek stuff, I calculate a "cans per app" metric for myself for fun. I may even set up a simple graph to view my costs that way.
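For anyone who wants to play with the "cans per app" idea, here is a minimal Python sketch. It assumes the bill is already in Australian dollars and uses the AU$0.10 refund mentioned above; the function name and the AU$0.80 example bill (8 containers' worth) are just illustrations, not anything from a real billing API.

```python
# Refund per returned container under the scheme described above (AU$0.10).
REFUND_PER_CONTAINER_AUD = 0.10


def cans_per_app(api_cost_aud: float,
                 refund_aud: float = REFUND_PER_CONTAINER_AUD) -> float:
    """Return how many returned containers would cover an API bill.

    api_cost_aud is assumed to be already converted to AUD.
    """
    if refund_aud <= 0:
        raise ValueError("refund must be positive")
    return api_cost_aud / refund_aud


if __name__ == "__main__":
    # Hypothetical bill: AU$0.80 of tokens for one small app.
    print(f"{cans_per_app(0.80):.1f} cans per app")
```

Feeding it a list of per-project bills would give the data points for the graph.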
I kind of hope the Anthropics of the world get enough price competition from sources like DeepSeek and GLM to drop their prices significantly. Time will tell.
I'm using the Chinese DeepSeek provider, so everything done there could potentially be taken and used by the CCP... But this is hobbyist learning.
There is probably a market for DeepSeek/GLM served from servers outside the CCP's reach. I might even look into how hard that would be to set up here.
I also hope that inference-focused hardware will come to the fore, reducing energy use and cost. Realistically this will take time though, on the order of years.
Here in Oz, we have community batteries that community members can charge and later draw from. Their electricity prices are competitive. I wonder if someone could set up something like a community battery to run data centres... That way reasonable environmental consideration could be given to inference power generation... This might not work in a market like the US or Europe, but small market size might be an advantage... Who knows.
Please do. There is definitely a market for DeepSeek / GLM hosted from non-China servers. There are over 20 providers for GLM 5.2 on OpenRouter alone... and they're all either Singapore (home of Z.AI / GLM), China, or US. There is nothing yet listed on OpenRouter from Europe (Inceptron still only has GLM 5.1). And of course, there is absolutely nothing hosted in Australia.
We're in a particularly dire situation in Australia. We're about to be cut off from Claude Fable and premium American models. The European Mistral models are garbage, at least in comparison to US models. Our only hope is going to be Chinese models (GLM 5.2 is good), and we're not even hosting them in Australia.
By the way, if you haven't tried an Anthropic model, it's worth spending at least $20 one month to give Opus 4.8 a try. I only got one night of access to Fable before I was cut off, but one single evening of Fable provided plans that I've been working through for about a week afterwards with Opus 4.8... and that was only Fable, not even Mythos. That's the kind of intelligence lead Australia is about to be cut off from.
(And kudos on the Containers For Change, that's something I do as well - mostly as an exercise incentive to walk to the local recycling machine, because the money certainly doesn't compensate for the time spent on the recycling.)
So two European providers at least
(Speaking as a not-so-proud Australian.)
Jeremy Howard was recommending fireworks.ai as a host if you want to go direct. Or there's Cloudflare.
For subscription alternatives people here on HN seem to mention Open Code Go a lot too https://opencode.ai/go
As opposed to Anthropic or OpenAI where everything done could potentially be taken and used by the US government.
Also, replace "could potentially" with "will definitely" in both cases, there's no conspiracy here.
We're stuck between two bad positions, so just use the one that's best for you, and wait for a better solution to arrive.
Why don't you exclusively host and use the open-weight western models, even if right now they don't perform as well?
I'd like to know the psychology behind this, because your actions feel contradictory to me.
An NYC dev and a dev in India have the same AI costs, so based on the ratio of token cost to salary, being in NYC becomes less of a comparative disadvantage.
Now combine that with the fact that AI makes generating code a smaller share of the job, and getting and refining requirements a larger share, and you have a decent shift.
Has a very race-to-the-bottom feel to it.
Though in the grand scheme of it, $200/mo probably isn’t the real price either. Also looking at it not just in a vacuum - paying for a product that can change what you get from under you doesn’t seem great anyway.
At least with a locally-hosted model you know what you’re getting.
The LLM in a box is something you can buy today, but it 1. doesn’t serve over USB by default, 2. costs $100k for hardware (not counting electricity) at 100 tps, and 3. can’t be bought from AliExpress.
Better to put that $100k in t-bills and just buy tokens even at api prices.
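A rough way to sanity-check the T-bill argument, as a Python sketch. The 4% annual yield is purely an assumption for illustration (plug in whatever the current rate is), and it ignores taxes, electricity, and any resale value of the hardware. The $100k figure is the one quoted above; the $200/mo comparison point is the subscription price mentioned elsewhere in the thread.

```python
def monthly_tbill_income(principal_usd: float, annual_yield: float) -> float:
    """Simple (non-compounded) monthly interest on a principal."""
    return principal_usd * annual_yield / 12


HARDWARE_COST_USD = 100_000   # figure quoted in the comment above
ASSUMED_ANNUAL_YIELD = 0.04   # ASSUMPTION: illustrative rate, not a quote
SUBSCRIPTION_USD = 200        # the $200/mo tier discussed in the thread

if __name__ == "__main__":
    income = monthly_tbill_income(HARDWARE_COST_USD, ASSUMED_ANNUAL_YIELD)
    print(f"monthly interest: ${income:.2f}")
    print(f"covers the subscription: {income >= SUBSCRIPTION_USD}")
```

Under that assumed rate the interest alone would be more than the subscription, with the principal left untouched. At a much lower yield the comparison flips, so the conclusion depends on the rate.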
It's been awesome for embeddings and document OCR!
3D printing a case for it is on my todo list.
I’m using Qwen3.6:27B at home and mostly Sonnet/Opus (depending on the complexity of the task) at work.
You have to break things down into smaller chunks for the local models. For the bigger cloud ones they can do a lot of the broader thinking.
OpenAI already charges enterprise users a premium purely for that title over on-demand, no-contract usage. Retail users get a good deal. People make a lot of hay about subsidies but this is a very sane approach if you want exposure to these three different types of customers.
If that was true, they would be collaborating with each other and opening up all the results from their work.
The Chinese are genociding Uyghurs as we speak, purely for being Muslim, in numbers that dwarf any harm the US has done.
The list of wars the US is or was actively involved in[0] is SO LONG that the Wikipedia page is split into multiple different pages.
The main relevant ones are 20th[1] and 21st century[2], for which you better get a good grip on your mouse to scroll down.
I urge you to use your favorite AI to give you a rough summary of direct and indirect casualties of just those wars directly caused, started, or provoked by the US, from these lists.
For example, the "war on terror" alone has, so far, seen around 4.5–4.6 million+ people killed, and at least 38 million people displaced.
[0]: https://en.wikipedia.org/wiki/Lists_of_wars_involving_the_Un...
[1]: https://en.wikipedia.org/wiki/List_of_wars_involving_the_Uni...
[2]: https://en.wikipedia.org/wiki/List_of_wars_involving_the_Uni...
https://amnesty.ca/wp-content/uploads/2024/12/Amnesty-Intern...
Nothing China did comes close to this.
It's not. That would require a voted resolution to declare genocide. It was a report from an inquiry by individuals with unknown bias.
He's sitting on a frontier model letting it burn a hole in his wallet that could actually pay for itself.
"Meta has been using Google’s Gemini large language model for most of its moderation and customer support, but staff have recently been told to switch to Meta’s new foundational model, Muse Spark, the people said."
https://www.ft.com/content/39251a31-4a9d-4870-b86c-dc6353d67...
0. https://openrouter.ai/compare/z-ai/glm-5.2/anthropic/claude-...
The Sonnet tier sits below Claude or ChatGPT in terms of price but costs so much more than free models. If you are breaking down tasks now, I'm not sure that 13 cents is worth it.
At work I'm struggling to keep my claude bill around $500.
Also if you run the “loops” they’re now yapping about, it will burn through enormous amounts of usage as well.
People speak of a permanent underclass.
https://www.nytimes.com/2026/04/30/opinion/ai-labor-work-for...
I get a lot more out of a 200/mo subscription now in a week than I did from them in a month.
Now obviously in today’s world they’d be using a 200/mo subscription themselves. But it’s not like money is nothing, software development doesn’t scale down below 1k/mo for anyone competent even in the poorest areas.
So a 200 USD subscription falls between 10% and 33% of an average Brazilian developer's salary.
If you're running a business I agree it's a no-brainer, but the context here is for personal projects.
The median hourly wage in the US is $28/h, so a $200 subscription equates to a little over 7 hours of work. That's a full day of work a month for the average person to use Claude with reasonable limits.
Yes, the people on $28/h may not be the software development types, so their income might not be as high, but these are the people who would probably be vibe coding the most since they aren't day to day programmers!
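The affordability arithmetic in the last few comments is easy to reproduce. A small Python sketch, using only the numbers quoted in the thread ($200/mo, $28/h, and the 10% to 33% share); the function names are made up for illustration, and the monthly salaries are simply what those percentages imply, not independent data.

```python
def hours_to_afford(price_usd: float, hourly_wage_usd: float) -> float:
    """Hours of work needed to pay for a subscription."""
    return price_usd / hourly_wage_usd


def implied_monthly_salary(price_usd: float, share_of_salary: float) -> float:
    """Monthly salary at which price_usd is the given share of income."""
    return price_usd / share_of_salary


if __name__ == "__main__":
    price = 200.0  # the $200/mo subscription discussed above

    # US median wage figure quoted in the thread.
    print(f"hours at $28/h: {hours_to_afford(price, 28.0):.2f}")

    # Salaries implied by the 10%-33% range quoted for Brazil.
    for share in (0.10, 0.33):
        salary = implied_monthly_salary(price, share)
        print(f"{share:.0%} share -> ${salary:.0f}/mo")
```

So the same subscription is about one working day at the US median wage, and somewhere between a tenth and a third of a month's pay at the implied salaries.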
So not really comparable. I use Step 3.7 Flash locally, models are good enough for so many coding tasks even at the lower end! (Though I note that calling a 200B model "lower end" is kind of amusing)