> I love everything about this direction except for the insane inference costs.

If this direction holds, the total cost still comes out lower, so the ROI is better.

Instead of employing 4 people (Customer Support, PM, Eng, Marketing), you will have 3-5 agents, and the whole ticket flow might cost you ~$20.

But I hope we won't go this far: when things fail, every customer will be impacted, because there will be no one left who understands the system well enough to fix it.
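To make the arithmetic above concrete, here is a rough break-even sketch in Python. Only the ~$20 per ticket flow and the headcount of 4 come from the comment; the monthly cost per person is a placeholder assumption, and `breakeven_ticket_flows` is a made-up helper name, so plug in your own numbers.

```python
def breakeven_ticket_flows(monthly_staff_cost: float,
                           cost_per_ticket_flow: float = 20.0) -> float:
    """Ticket flows per month at which agent spend equals the staff cost.

    Below this volume the agents are cheaper; above it the staff are.
    """
    if cost_per_ticket_flow <= 0:
        raise ValueError("cost_per_ticket_flow must be positive")
    return monthly_staff_cost / cost_per_ticket_flow


# ASSUMPTION: $10,000 per person per month is a placeholder, not a real figure.
ASSUMED_COST_PER_PERSON = 10_000
staff_cost = 4 * ASSUMED_COST_PER_PERSON  # the 4 roles named in the comment

print(breakeven_ticket_flows(staff_cost))  # 2000.0
```

Under those assumed numbers, the agents stay cheaper as long as you handle fewer than 2,000 ticket flows a month. This ignores the failure risk raised above, which is the harder thing to price.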

reply
Inference costs at least seem like the easiest thing to bring down, and there's plenty of demand to drive innovation. There's a lot less uncertainty here than with architectural/capability scaling. To your point, tomorrow's commodity hardware will eventually handle today's demands (though we'll probably have even more inference demand by then).
reply
I worry about the costs from an energy and environmental impact perspective. I love that AI tools make me more productive, but I don't like the side effects.
reply
The environmental impact of AI is greatly overstated. The average person will have a bigger positive impact on the environment by cutting their meat intake by 25% than by giving up flying and AI use combined.
reply
This is the wrong way to see it. If a technology gets cheaper, people will use more and more and more of it. If inference costs drop, you can throw way more reasoning tokens and a combination of many many agents to increase accuracy or creativity and such.
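A toy way to see when cheaper inference leads to more total spend, as a sketch only: this assumes a constant-elasticity demand curve, which is a textbook simplification and not something measured for inference. The function name and all numbers are illustrative.

```python
def total_spend(price: float, elasticity: float, scale: float = 1.0) -> float:
    """Total spend under constant-elasticity demand.

    ASSUMPTION: quantity demanded = scale * price ** (-elasticity).
    """
    quantity = scale * price ** (-elasticity)
    return price * quantity


# Halve the price per token (illustrative units).
for elasticity in (0.5, 2.0):
    before = total_spend(1.0, elasticity)
    after = total_spend(0.5, elasticity)
    print(f"elasticity={elasticity}: spend {before:.2f} -> {after:.2f}")
```

With elasticity above 1, halving the price more than doubles usage, so total spend goes up. With elasticity below 1, spend falls. The claim above amounts to betting that demand for reasoning tokens is in the first regime.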
reply
> throw way more reasoning tokens and a combination of many many agents to increase accuracy or creativity and such.

But this is just not true. Otherwise, companies that can already afford such high prices would have outpaced their competitors by now.

reply
No company at the moment has enough money to operate with 10x the reasoning tokens of its competitors, because they're all bottlenecked by GPU capacity (or other physical constraints). Maybe in lab experiments, but not for generally available products.

And I sense you would have to throw orders of magnitude more tokens at a problem to get meaningfully better results (if anyone has access to experiments with GPT-5-class models geared up to use marginally more tokens with good results, please call me out though).

reply
I mean, theoretically, if there are many competitors, the cost of the product should generally drop because of competition.

Sadly, I have not seen this happen in a long time.

reply