You said you'd need $80k in hardware for "good performance". I'm saying a local AI inference workflow can be a lot more flexible about performance than that, and can get away with something vastly cheaper, in line with hardware the user already owns.
> paying $100/month

There will never be a monthly subscription for LLM tokens. The economics aren't there.

Local tokens will always be cheaper.

What's the basis for saying local tokens will always be cheaper? As others have outlined, an LLM serving one user at a time is pretty expensive, but batching concurrent users is much more cost-effective (assuming there's enough RAM for all the contexts). If "local" to you means ~10 hours of daily use by a team of employees, the company still has to weigh that against cloud services that amortize their non-recurring costs over 24 hours of service per day. A rough sketch of that amortization is below.
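To make the amortization argument concrete, here's a back-of-envelope calculation. Every number in it (hardware price, lifetime, throughput, power draw, electricity rate) is a made-up assumption for illustration, and `local_cost_per_mtok` is just a hypothetical helper, not anyone's actual pricing model:

```python
# Back-of-envelope: amortized cost per million tokens for a local rig.
# All inputs are illustrative assumptions, not measured data.

def local_cost_per_mtok(hw_cost_usd: float, lifetime_years: float,
                        hours_per_day: float, tokens_per_sec: float,
                        power_watts: float = 0.0,
                        usd_per_kwh: float = 0.15) -> float:
    """Spread hardware and electricity cost over total tokens produced."""
    seconds_of_use = lifetime_years * 365 * hours_per_day * 3600
    total_mtok = tokens_per_sec * seconds_of_use / 1e6
    energy_usd = (power_watts / 1000) * (seconds_of_use / 3600) * usd_per_kwh
    return (hw_cost_usd + energy_usd) / total_mtok

# Hypothetical rig: $10k, 3-year life, 30 tok/s, 500 W, used 10 h/day.
print(f"${local_cost_per_mtok(10_000, 3, 10, 30, 500):.2f}/Mtok")  # ≈ $9.15
```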
Why wouldn't a team of employees be able to run AI workloads 24/7? Not all workloads are time-sensitive.
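Plugging 24-hour utilization into the hypothetical sketch above (same made-up numbers):

```python
# Same assumed rig, but kept busy around the clock with batch jobs.
print(f"${local_cost_per_mtok(10_000, 3, 24, 30, 500):.2f}/Mtok")  # ≈ $4.22
```

Under those assumptions, queueing non-urgent jobs overnight roughly halves the amortized per-token cost, which is exactly the point about workloads not being time-sensitive.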