I agree. My company pays for my tokens, so I use the best models I can. I'm more worried about the quality of the work and the speed of accomplishing tasks than I am about saving the most money on every token.

Now, if they come back and tell me I can't spend as much on tokens, I'll have to change my strategy. But everything I'm hearing so far is that we're going to be increasing our token spend this year and probably next year too. Not crazy increases, but maybe enough to keep using the latest models without being anxious about every prompt.

reply
Not even SotA models are good enough to generate code (beyond functions or small, very simple modules) that I'd be happy shipping, so I've decided to just not have them do that. And given this, it has basically turned out that what's left is information gathering + analysis + design overview stuff.

I've just recently started trying out DeepSeek 4 Flash and I was very skeptical at first because I've had some really good experiences with GPT-5.{4,5}, and couldn't possibly believe that this model they charge nothing for could give me similar results, but it absolutely shreds through things and ends up giving me very good answers in almost no time. I also like that it doesn't really seem to have much personality, it's given me mostly just facts and data so far without any additions to the prompt by me.

In my own agent I also specifically prompt to remove flowery language, snark, etc., but I haven't tried that with models like GPT-5.x, which I've found have too much personality and try too hard to make it seem like I'm talking to a human.

reply
I feel similarly. I'll gladly pay to use the most intelligent model I can find on the best harness I have. Sometimes this is GPT Pro, sometimes this is Opus.

I ask AI a lot of questions, not only about code but about my personal life, and I would be willing to pay very large sums to have the best quality output.

reply
I think that's true for now, but eventually we will reach a point where a model is good enough (we're approaching that right now with frontier models) and there will be diminishing returns. I don't need a PhD-level genius to build me an analytics dashboard, for example, so why would I pay for a model with that level of intelligence when I can (eventually) self-host a good-enough model and run queries for the cost of electricity + hardware?
reply
I think we are approaching that now, with correct expectations. With frontier large models you can often one-shot tasks with vague prompts for stuff like creating CRUD APIs and dashboards around a simple data model, since it's such a solved problem now. With something like Qwen3.6 27B or 35B-A3B and a Strix Halo level computer or a MBP with 32GB or more of RAM, you may need to be more explicit, stay involved, and be a little more patient, but you can absolutely get work done with it or delegate tasks to it successfully.

My Framework Desktop does a lot of the same work as my Claude subscription at work (Cowork, chats) for 100W of power draw and a little patience waiting for a slow GPU with limited memory bandwidth to crunch the numbers. Agentic coding is obviously weaker, but CRUD development and visualization dashboards are within reach, and I'm usually pleasantly surprised at its ability to self-manage devops.

reply
You pay $3k/year for personal use? Or out of your own pocket but for your job?
reply
I started out paying $100/month a few years ago and now pay ~$5k a year out of pocket for personal use, to learn and grow in my position at work.
reply
It's through my startup, so both I guess. Generally I find my bottleneck to be attention and focus, and the opportunity cost of not going back to work at my prior employers absolutely dwarfs the amount of money I spend on tools, so it's not hard for me to justify spending $200/mo on something I use every day that makes me more productive and generally removes bullshit from my life.

At my prior job there was still what felt like a strong enough correlation between my actual performance and my pay that I don't think I would have had a hard time justifying the expense there either; now I absolutely don't. With the current state of the models, it's baffling to me to hear about professional software developers planning their work around their $20/mo subscription's quotas.

Obviously it's more complicated than more tokens = more productive, but I see them less like SaaS and more like gasoline, where if I run out or need more to do what I'm doing, as long as I'm not being wasteful, I just buy more. Why would I waste a day walking 30 miles on foot when I can just pay $5 for gasoline and drive?

reply
I do that for personal use too (although it's $2.4k/yr for me because I only have a Claude Max subscription). Outside of my hobby projects, Opus also manages my personal accounting, researches and organizes info (travel plans, what to buy and where to buy it, etc.), helps me reply to emails when I'm working in the kitchen, etc. I consider it well worth the price. Tbh I'm willing to pay more than what I currently do, but competition is good for consumers.
reply
I thought the same way until I tried DeepSeek. I am genuinely impressed at how capable it is.
reply