upvote
> If they care about privacy, they can rent cloud instances in order to setup, run, close and it will be both cheaper, faster (if they can afford it) but also with no upfront cost per project. This can be done with a lot of scaffolding, e.g. Mistral, HuggingFace, or not, e.g. AWS/Azure/GoogleCloud, etc.

I'm a somewhat tech heavy guy (compiles my own kernel, uses online hosting, etc).

Reading your comment doesn't sound appealing at all. I do almost no cloud stuff. I don't know which provider to choose. I have to compare costs. How can I trust they won't peek at my data (no, a Privacy Policy is not enough - I'd need encryption with only me having the key). What do I do if they suddenly jack up the rates or go out of business? I suddenly need a backup strategy as well. And repeat the whole painful loop.

I'll lose a lot more time figuring this out than with a Mac Studio. I'll probably lose money too. I'll rent from one provider, get stuck, and having a busy life, sit on it a month or two before I find a fix (paying money for nothing). At least if I use the Mac Studio as my primary machine, I don't have to worry about money going to waste because I'm actually utilizing it.

And chances are, a lot of the data I'll use it with (e.g. mail) is sitting on the same machine anyway. Getting something on the cloud to work with it is yet-another-pain.

reply
To your second issue/question, all the cloud provide CMEK services/features (for many years now).
reply
> suddenly jack up the rates or go out of business?

There is basically no lock-in, you don't even "move" your image, your data is basically some "context" or a history of prompts which probably fits in a floppy disk (not even being sarcastic) so if you know the basic about containerization (Docker, podman, etc) which most likely the cloud provider even takes care of, then it takes literally minutes to switch from one to another. It's really not more complex that setting up a PHP server, the only difference is the hardware you run on and that's basically a dropdown button on a Web interface (if you don't want to have scripts for that too) then selecting the right image (basically NVIDIA support).

Consequently even if that were to happen (which I have NEVER seen! at worst it's like 15% increase after years) then it would actually not matter to you. It's also very unlikely to happen based of the investment poured into the "industry". Basically everybody is trying to get "you" as a customer to rely on their stack.

... but OK, let's imagine that's not appealing to you, have you not done the comparison of what a Mac Studio (or whatever hardware) could actually buy otherwise?

reply
Ok. I think I misunderstood. So the idea is to simple set up the LLM service on the server and access it with an API like I would with any LLM provider? This way whatever application I want to use it for stays at home?

That's a bit more appealing. How much would it cost per month to have it continually online?

reply
Well it depends entirely on what you need. You can even do the training yourself on that infrastructure to rent if you want. The more you do yourself, the more private but also the more expensive it will be.

I don't want to make an ad here but I'm going to point to HuggingFace https://endpoints.huggingface.co (and to avoid singling them out just https://replicate.com/pricing too but I don't know them well) as an example with pricing.

The "beauty" IMHO of such solutions is that again you pay for what you want. If you want to use the endpoint only for 5min to test that the model and its API fits your need? OK. You want the whole month? Sure. You want 1 user, namely you? Fine, not a lot of power, you want your whole organization to use that endpoint? Scale up.

I'm going to give very rough approximation because honestly I'm not really into this so someone please adjust with source :

Apple Mac Studio M3 Ultra 96GB = $4K

~NVIDIA A100 with 80G ~ 10x perf compared to M3 Pro (obviously depends on models)

So on Replicate today a one can get an A100 for ~$5/hr which is ... about a month. But that's for 10x speed and electricity included. So very VERY approximately if you use a Mac Studio for 10 months on AI non stop (days and night) then it's arguably worth it.

If you use it less, say 2hrs/day only for inference, then I imagine it takes few years to have the equivalent and by that time I bet Replicate or HuggingFace is going to rent much faster setup for much cheaper simply because that's what they have ALL done for the last few years.

reply
Well, full disclosure (despite my comments above): I'm not interested in buying a Mac Studio. I was merely explaining why I thought people may prefer it.

For my own use, I'm just looking at absolute price (and convenience).

I haven't explored open weights models, so I have no idea which I'd want. It would be great to get a "frontier" model like Minimax-M2.5, but at $10/hr, it's not worth it - let alone $40/hr for GLM-5. I'd have to explore use cases for cheaper models. Likely for things related to reading emails, I can get by with a much cheaper model.

If I set one of these up, how easily is it for me to launch one of these (on the command line on my home PC) and then shut it down. Right now, when I write any app (or use OpenCode), it's frictionless. My worry is that either turning it on will be a hassle, and even worse, I'll forget to turn it off and suddenly get a big pointless bill.

If there are any guides out there on how people manage all this, it would be much appreciated.

reply
Honestly I doubt it's worth it, hence my suggestion to make a "cold" estimation of both options.

Well it's not exactly a guide and honestly it's quite outdated (because I stop keeping track as I just don't get the quality of results I hope for versus huge trade offs that aren't worth it for me) but I listed plenty of models and software solutions for self-hosting, at home or in the cloud at https://fabien.benetou.fr/Content/SelfHostingArtificialIntel...

Feels free to check it out and if there is something I can clarify, happy to try.

reply
I think the main use case is home automation. You don't want details of your home setup leaking out.
reply
Genuine question: If I were to fine-tune a model with 10 years of business data in a competitive space, would you feel safe with cloud training?
reply
If you already have those 10 years of bussiness data on Microsoft or Google services or their respective clouds, are you feeling safe?
reply
I'm not a lawyer but technically most if not all cloud providers, specific to AI ("neo-cloud") or not, to provide Customer-managed encryption keys (CMEK) as someone else pointed out.

That being said if I were to be in such a situation, and if somehow the guarantees wouldn't be enough then I'd definitely expect to have the budget to build my own data center with GB300 or TPUs. I can't imagine that running it on a Mac Studio.

reply
People store that data in databases in the same data centre so it's really the same level of trust needed that your provider adheres to the no training on your data. Trust and lawyers.
reply