
> Rumors are worth squat

You can make some educated guesses and put some bounds on inference cost by looking at third-party providers on platforms like OpenRouter. You can get a median cost per token for a given model size. Then make some educated guesses about SotA model sizes, and you get an estimate of the pure cost of serving a model. Error bars and all that, of course. But it's still a range, with some limits.
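To make the "median cost per token" step concrete, here is a minimal sketch. It assumes you have already collected per-provider prices by hand for one model size; the function name `price_band` and the numbers in the usage line are hypothetical placeholders, not real quotes from any provider.

```python
from statistics import median


def price_band(prices_usd_per_mtok):
    """Return (low, median, high) for a list of provider prices.

    Prices are in USD per million tokens, all for the same model size.
    The low/high values are the rough "error bars" around the median.
    """
    if not prices_usd_per_mtok:
        raise ValueError("need at least one provider price")
    return (
        min(prices_usd_per_mtok),
        median(prices_usd_per_mtok),
        max(prices_usd_per_mtok),
    )


# Placeholder prices only, to show the shape of the result.
low, mid, high = price_band([0.5, 0.6, 0.9])
print(f"low={low} median={mid} high={high} USD per 1M tokens")
```

The band is only as good as the assumption that third-party providers price near their serving cost, which is exactly the part that needs error bars.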

by shimman 4 hours ago


No, you can't really make educated guesses unless people start opening their books. Especially in an industry where the vast majority of firms make up valuations out of thin air rather than basing them on any reproducible insights.

by NitpickLawyer 3 hours ago


Opening their books would let you know things like profitability. I'm talking about cost per token; model development and human costs are irrelevant to that.

by whattheheckheck 1 hour ago


Yeah, take the GPU rental cost, what it can run, and how many tokens per second come out, and you can see the true rate per token. Plus the margin on the harness special sauce.
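As a rough sketch of that arithmetic, assuming you know the hourly rental price per GPU, how many GPUs one serving instance needs, and its aggregate output throughput. The function name, the `utilization` parameter, and the numbers in the usage line are hypothetical, picked only to show the calculation; any margin would be applied on top of the result.

```python
def cost_per_million_tokens(gpu_hourly_usd, num_gpus, tokens_per_second,
                            utilization=1.0):
    """Estimate raw serving cost in USD per million tokens.

    gpu_hourly_usd:    rental price of one GPU per hour (assumed input)
    num_gpus:          GPUs needed to host one model instance
    tokens_per_second: aggregate throughput of that instance
    utilization:       fraction of the hour spent serving traffic (0 to 1]
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    hourly_cost = gpu_hourly_usd * num_gpus
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return hourly_cost / tokens_per_hour * 1_000_000


# Placeholder inputs: 8 GPUs at $2/hour each, 1000 tokens/s aggregate.
print(round(cost_per_million_tokens(2.0, 8, 1000), 2))
```

Lower utilization raises the figure proportionally, so the fully loaded number is a floor rather than a typical cost.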