What open source model and what non-subsidized provider specifically?
reply
GLM 4.7 Flash is $0.07/1M tokens in, $0.40/1M tokens out on AWS Bedrock us-east-1. That's less than 1/10 the price of Haiku 4.5.

Bedrock isn't even the cheapest option, though I'm fairly sure they aren't being VC-subsidized.

There are definitely cheap tokens out there. The big gotcha is "for tasks that can tolerate slightly less quality"
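For reference, the price gap works out roughly like this. The Haiku 4.5 figures below ($1/1M in, $5/1M out) are an assumption on my part and may be outdated; the workload mix is made up for illustration:

```python
# Rough per-million-token cost comparison (USD).
glm_in, glm_out = 0.07, 0.40      # GLM 4.7 Flash on Bedrock us-east-1 (from above)
haiku_in, haiku_out = 1.00, 5.00  # assumed Haiku 4.5 list pricing

# Hypothetical workload: 3M input tokens, 1M output tokens.
tokens_in, tokens_out = 3, 1

glm_cost = glm_in * tokens_in + glm_out * tokens_out
haiku_cost = haiku_in * tokens_in + haiku_out * tokens_out

print(f"GLM: ${glm_cost:.2f}, Haiku: ${haiku_cost:.2f}, "
      f"ratio: {glm_cost / haiku_cost:.3f}")
# → GLM: $0.61, Haiku: $8.00, ratio: 0.076
```

Under those assumed prices the ratio stays under 0.1 regardless of the input/output mix, since both per-token rates are less than a tenth of Haiku's.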

reply
Yes, but how cheap is it to run four at the same time? It's tough to run even one good model locally, and running four concurrently, which I commonly do with Claude and Codex, just doesn't seem to be happening anytime soon.
reply
I'm referring to hosted models, such as via OpenRouter or the model providers' own services.

I think everyone claiming that inference is getting more expensive is unaware that there are more LLM providers than Google, Anthropic, and OpenAI.

reply
Fair - there are bets both ways, though I wouldn't consider it a certainty. The revenue pressure on this AI build-out is going to be real and multifold.
reply