Those yottabytes of VRAM are also consuming electricity constantly.
The difference is that an LLM request is not an operating system. Requests are compartmentalized and ephemeral, so you can easily distribute them across your available hardware and switch off machines during periods of low activity.
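A minimal sketch of that scaling logic in Python; the per-machine capacity, node names, and fleet size are assumptions for illustration, not any provider's real system:

    import math

    CAPACITY_RPS = 50   # assumed requests/second one machine can serve
    MIN_MACHINES = 1    # always keep at least one machine warm

    def machines_needed(current_rps: float) -> int:
        # How many machines the current request rate justifies.
        return max(MIN_MACHINES, math.ceil(current_rps / CAPACITY_RPS))

    def rebalance(fleet: list[str], current_rps: float):
        # Because requests are stateless and short-lived, any machine can
        # serve any request; draining one just means routing new requests
        # elsewhere until its in-flight work finishes.
        n = machines_needed(current_rps)
        return fleet[:n], fleet[n:]

    # Overnight traffic of 120 rps needs only 3 of 10 machines:
    active, idle = rebalance([f"gpu-node-{i}" for i in range(10)], 120)
    print(len(active), len(idle))  # 3 7

It's the same elasticity argument cloud autoscalers make, and exactly what a persistent OS workload can't do.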
Your capital costs for buying those machines don't go away.
That problem already exists in power generation and delivery, and it has long been solved there: bills are sums of fixed terms and variable terms.
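Concretely, a power-style bill in Python; the standing fee and per-token rate are made-up placeholders:

    FIXED_MONTHLY_FEE = 20.00   # assumed standing charge covering capital costs
    RATE_PER_MTOK = 3.00        # assumed $ per million tokens used

    def monthly_bill(tokens_used: int) -> float:
        # Fixed term: paid even at zero usage, so the machines get financed.
        # Variable term: scales with actual consumption.
        return FIXED_MONTHLY_FEE + RATE_PER_MTOK * tokens_used / 1_000_000

    print(monthly_bill(0))           # 20.0  -> idle customer still covers capacity
    print(monthly_bill(50_000_000))  # 170.0 -> heavy user pays proportionally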
Custom payment schemes are late-stage profit generation. They require hordes of salespeople, or an AI that can actually do math.

It's just how hyperscaling works. You are not wrong, but in the wrong timeline.

I'm not talking about custom, negotiated service contracts; I'm talking about simply charging people for what they use.
But that would mean using (a special Claude Code version of) the API; as it stands now, I have tried the current API for fun and hit $200 well within an hour. So if they charged for real usage, no one would use it, since there are still competitors with less harsh limits on tiered plans. If those all go away, I will run open models on vast.ai or similar, as those are now viable (I've been testing GLM 5 and it's great for coding). So tiered subscriptions cannot go away; dropping them would kill those companies fast.
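For scale, a back-of-envelope cost check in Python; the rates are the published Opus-class API prices as I recall them ($15/$75 per million input/output tokens), and the token counts are a guess at a heavy agentic coding session, so treat all the numbers as assumptions:

    INPUT_PER_MTOK = 15.0    # assumed $ per million input tokens
    OUTPUT_PER_MTOK = 75.0   # assumed $ per million output tokens

    def api_cost(input_tokens: int, output_tokens: int) -> float:
        return (input_tokens * INPUT_PER_MTOK
                + output_tokens * OUTPUT_PER_MTOK) / 1_000_000

    # An agent loop that resends a large context every turn can plausibly
    # burn ~10M input and ~0.7M output tokens in an hour:
    print(api_cost(10_000_000, 700_000))  # ~202.5

Resending the whole context each turn is what makes the input side dominate the bill.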