The problem is that the differences between flagship and local models are compounding heavily. A 4% difference could be massive when you keep iterating on the same codebase.
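A back-of-the-envelope sketch of what "compounding" could mean here, assuming (purely for illustration) that each iteration on the codebase independently loses 4% relative to the flagship result. The function name and the multiplicative model are my own assumptions, not a measured property of any model:

```python
def retained_quality(per_step_gap: float, steps: int) -> float:
    """Fraction of the flagship-level result left after `steps` iterations,
    assuming every iteration loses `per_step_gap` multiplicatively.
    This is a hypothetical toy model, not a benchmark."""
    return (1.0 - per_step_gap) ** steps


# Print how a fixed 4% per-iteration gap accumulates over repeated passes.
for steps in (1, 10, 25, 50):
    share = retained_quality(0.04, steps)
    print(f"{steps:>2} iterations: {share:.1%} of the flagship result")
```

Under that assumption a 4% gap leaves roughly two thirds of the result after ten iterations. If errors get caught and fixed between iterations, the real effect would be smaller.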
reply
> The problem is that the differences between flagship and local models are compounding heavily

This depends a lot on how you work, and how much of the architectural thinking you do yourself.

People seem to lose sight of the fact that a flash model today is as powerful as a frontier model from a year ago. If you were happy with GPT 4.x, you should be ecstatic that equivalent power is now basically free...

reply
I am one of those ecstatic folk :)
reply
Thanks for sharing your insight.

Mind if I ask you for a few vibe coding tips? I failed to solve the GH puzzle in your profile, though.

reply
If you are running multiple agents, what you spend on them should be several times less than the ROI they deliver.
reply
My costs are $0, as any token or subscription spend on agents is invoiced to my clients as an expense.
reply
Thanks so much for being bold enough to be fairly open about the costs, how you arrange billing and the advantages that's given you.

I've been fooling around with DeepSeek 4 agentically. It's probably not as good as Anthropic offerings, but even those seem to be roiled in politics and strife and DeepSeek 4 is very good IMHO. I'll later try out GLM.

I'm in Australia. The government has set up a "return and earn" scheme to keep aluminium cans, plastic bottles and paper drink cartons out of the waste stream. A laudable project. The money you make from returned drink containers is pretty low: AU$0.10 per container. I've participated to get the rubbish out of natural water streams and to make a nano amount of money on the side.

When I looked at the costs of an app I was getting DeepSeek to help me with, I realised that the several hours I'd spent learning and building had cost something like 8 recycled containers. In my head, after doing some DeepSeek stuff, I calculate a "cans per app" metric for myself for fun. I may even set up a simple graph to view my costs that way.
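The "cans per app" sum is simple enough to sketch. This assumes the API spend has already been converted to Australian cents, and the function name and the 80 cent figure are just illustrative (8 containers at 10 cents each):

```python
REFUND_CENTS_PER_CONTAINER = 10  # AU$0.10 refund per returned container


def cans_per_app(spend_cents: int,
                 refund_cents: int = REFUND_CENTS_PER_CONTAINER) -> float:
    """Express an API bill (in Australian cents) as returned containers.
    Working in whole cents avoids floating-point noise on small amounts."""
    if refund_cents <= 0:
        raise ValueError("refund_cents must be positive")
    return spend_cents / refund_cents


# Hypothetical bill: 80 cents of API usage for one small app.
print(f"cans per app: {cans_per_app(80):.1f}")
```

Collecting one value per project would be enough to feed a simple graph later.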

I kind of hope the Anthropics of the world get enough price competition from sources like DeepSeek and GLM to drop their prices significantly. Time will tell.

I'm using the Chinese DeepSeek provider, so everything done there could potentially be taken and used by the CCP... But this is hobbyist learning.

There is probably a market for Deepseek/GLM served from non CCP available servers. I might even look into how hard that would be to setup here.

I also hope that inference focused hardware will come to the fore, reducing energy use and cost. Realistically this will take time though, on the order of years.

Here in Oz, we have community batteries that community members can charge and later draw from. Their electricity prices are competitive. I wonder if someone could set up something like a community battery to run data centres... That way reasonable environmental consideration could be given to inference power generation... This might not work in a market like the US or Europe, but small market size might be an advantage... Who knows.

reply
> There is probably a market for Deepseek/GLM served from non CCP available servers. I might even look into how hard that would be to setup here.

Please do. There is definitely a market for DeepSeek / GLM hosted from non-China servers: there are over 20 providers for GLM 5.2 on OpenRouter alone... and they're all in either Singapore (home of Z.AI / GLM), China, or the US. There is nothing yet listed on OpenRouter from Europe (Inceptron still only has GLM 5.1). And of course, there is absolutely nothing hosted in Australia.

We're in a particularly dire situation in Australia. We're about to be cut off from Claude Fable and premium American models. The European Mistral models are garbage, at least in comparison to US models. Our only hope is going to be Chinese models (GLM 5.2 is good), and we're not even hosting them in Australia.

By the way, if you haven't tried an Anthropic model, it's worth spending at least $20 one month to give Opus 4.8 a try. I only got one night of access to Fable before I was cut off, but one single evening of Fable provided plans that I've been working through for about a week afterwards with Opus 4.8... and that was only Fable, not even Mythos. That's the kind of intelligence lead Australia is about to be cut off from.

(And kudos on the Containers For Change, that's something I do as well - mostly as an exercise incentive to walk to the local recycling machine, because the money certainly doesn't compensate for the time spent on the recycling.)

reply
Cortecs (EU router) lists GLM 5.2 from Tensorix and Nebius https://cortecs.ai/detailedServerlessView/glm-5.2

So two European providers at least

reply
Hosting in Australia is not feasible at Australian electricity prices.

(Speaking as a not-so-proud Australian.)

reply
Same issue in Canada - domestic inference capability for the open models is woefully behind.
reply
Canada has fewer excuses, given sparsely populated places that are cold with nearly infinite water and extremely cheap electricity.
reply
Yep, agreed. Main issue in Canada is a notoriously slow and stingy investment ecosystem. Resource-wise we're incredibly well positioned.
reply
Would you happen to know why there are so many Canadian investments in American telecom?
reply
It's very easy to use other providers. See https://openrouter.ai/ which also lets you filter by where the provider is hosted and their data retention policy.

Jeremy Howard was recommending fireworks.ai as a host if you want to go direct. Or there's Cloudflare.

For subscription alternatives people here on HN seem to mention Open Code Go a lot too https://opencode.ai/go

reply
> I'm using the Chinese DeepSeek provider, so everything done there could potentially be taken and used by the CCP

As opposed to Anthropic or OpenAI where everything done could potentially be taken and used by the US government.

Also, replace "could potentially" with "will definitely" in both cases, there's no conspiracy here.

We're stuck between two bad positions, so just use the one that's best for you, and wait for a better solution to arrive.

reply
You don't seem to like the "CCP" and their political views, but why are you using their sponsored models?

Why don't you exclusively host and use the open-weight western models, even if right now they don't perform as well?

I'd like to know the psychology behind this, because your actions feel contradictory to me.

reply
AI is the first technology that doesn't incentivize offshoring, and incentivizes co-location of talent.

An NYC dev and a dev in India have the same AI costs; based on the ratio of token spend to salary, being in NYC becomes less of a comparative disadvantage.
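To make the ratio concrete, here is a rough sketch with made-up numbers. The salaries and the per-developer AI spend are hypothetical placeholders, not real market data:

```python
def cost_ratio(nyc_salary: float, india_salary: float, ai_spend: float) -> float:
    """How many times more an NYC dev costs than a dev in India,
    when both carry the same yearly AI spend on top of salary."""
    return (nyc_salary + ai_spend) / (india_salary + ai_spend)


# Hypothetical yearly figures in USD, chosen only to show the direction.
NYC_SALARY = 150_000
INDIA_SALARY = 30_000

for ai_spend in (0, 12_000, 60_000):
    ratio = cost_ratio(NYC_SALARY, INDIA_SALARY, ai_spend)
    print(f"AI spend ${ai_spend:>6,}: NYC costs {ratio:.2f}x as much")
```

The fixed AI cost is a bigger share of the cheaper salary, so the gap between the two locations narrows as AI spend grows. It never closes entirely on cost alone.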

Now combine that with the fact that AI makes generating code a smaller share of the job, and getting and refining requirements a larger share, and you have a decent shift.

reply
Errr you just responded to someone that is offshore and is using AI to be much cheaper than local talent.
reply