Personally I wouldn't want a couple dozen apps installed all with their own model.

It seems easier to have industry specs that define a common interface for local models.

I also assume the OS can, or would need to, be involved in providing the models. That may not be a good thing depending on your views of OS vendors, but sharing a single local model does seem more like an OS concern.

reply
I mean, the OpenAI API is the de facto industry standard for letting apps communicate with models: llama-server has it, oMLX has it, ollama has it, vLLM has it, and lmstudio as well. I don't think this is such a hard thing to do, but it requires people to set it up.
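
A minimal sketch of what that looks like from an app's side; the base URL, port, and model name here are assumptions (ollama, llama-server, and LM Studio each default to different ports, all behind a /v1 prefix):

    # Talk to whichever OpenAI-compatible server is running locally.
    # Base URL and model name are placeholders; adjust for llama-server,
    # ollama, vLLM, LM Studio, etc.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

    resp = client.chat.completions.create(
        model="llama3.1:8b",  # whatever model the local server exposes
        messages=[{"role": "user", "content": "Summarize this file for me."}],
    )
    print(resp.choices[0].message.content)

The point being that the client code doesn't care which server implementation is behind the URL.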
reply
I don't know enough about that API surface to know if it's a particularly good one for the use cases we'd have, but yes, defining a universal spec for all implementors to support wouldn't be a big lift, and it's done in plenty of other areas already.
reply
There is no other way than shipping your own model, because you will want an abstracted API over the inference, and you don't know what the user has installed. Also, you can ship a 9B fp4 model, but it all just depends.
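
One way to square that: probe for an existing local server first and only fall back to the model you ship. Rough sketch, assuming the standard OpenAI-compatible /v1/models endpoint; the candidate ports are common defaults and the bundled-model fallback is left as a stand-in:

    # Prefer an already-running local server; otherwise fall back to the
    # model bundled with the app (however that runtime is shipped).
    import requests

    CANDIDATES = [
        "http://localhost:11434/v1",  # ollama default
        "http://localhost:8080/v1",   # llama-server default
        "http://localhost:1234/v1",   # LM Studio default
    ]

    def find_local_endpoint():
        for base in CANDIDATES:
            try:
                r = requests.get(f"{base}/models", timeout=0.5)
                if r.ok:
                    return base
            except requests.RequestException:
                continue
        return None  # caller then loads the bundled model instead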
reply
Knowing what's installed would have to be an OS API, which works if local LLMs expose a standard API surface to the OS, likely including metadata related to feature support.
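
Something like the following is what I'd imagine that metadata looking like. Purely hypothetical, no OS exposes such an API today, and the field names are invented for illustration:

    # Hypothetical shape of an OS-level "what local models are available" answer.
    from dataclasses import dataclass

    @dataclass
    class LocalModelInfo:
        name: str              # e.g. "system-default-8b"
        context_length: int    # max tokens the model supports
        quantization: str      # e.g. "fp4", "q4_k_m"
        supports_tools: bool   # function/tool calling available?
        supports_vision: bool  # image inputs accepted?
        endpoint: str          # OpenAI-compatible URL the OS hands out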
reply
You can know what the user has installed if the OS vendor exposes an API for it.
reply