upvote
> AI is different.

I agree. The other thing here is that, once you can run LLMs on a single piece of commodity hardware (whether that includes one GPU or several), the difference between cloud vs. on-premise LLMs will largely be about where your hardware is located. There will be very little software configuration involved (just an HTTP endpoint that talks to the GPU). This is decidedly different from cloud products where the moat of hyperscalers is largely in the software and services on top of the hardware, not the hardware itself. (Sure, GPUs will eventually break & need replacement, too, but there's no state to lose, so that's already orders of magnitude easier than replacing hard drives.)

reply
There's no economic reason why running a model locally should be better than using a cloud hosted version.
reply
Sure there is. Keeping your IP in house.
reply