Which takes a $20k Thunderbolt cluster of two 512GB Mac Studio Ultras to run at full quality…
reply
"Full quality" being a relative assessment, here. You're still deeply compute constrained, that machine would crawl at longer contexts.
reply
70B dense models are way behind SOTA. Even the aforementioned Kimi 2.5 has fewer active parameters than that, and it's quantized to int4 on top. We're at the point where some near-frontier models may run out of the box on Mac Mini-grade hardware, with perhaps no real need to even upgrade to a Mac Studio.
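The intuition, as a hedged sketch (the model size and bandwidth below are assumed, not quoted from anywhere): on unified memory, decode speed is bounded by how many bytes you read per token, which scales with active parameters, not total.

  # Decode-speed upper bound; all numbers are assumptions.
  active_params = 32e9        # a big MoE with ~32B active params
  bytes_per_param = 0.5       # int4 quantization
  bytes_per_token = active_params * bytes_per_param  # ~16 GB read per token
  bandwidth = 800e9           # ~800 GB/s, Ultra-class unified memory
  print(f"~{bandwidth / bytes_per_token:.0f} tok/s upper bound")  # ~50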
reply
>may

I'm completely over these hypotheticals and "testing grade" claims.

I know Nvidia VRAM works, not some marketing about "integrated RAM". Heck, look at /r/locallama/. There's a reason it's entirely Nvidia.

reply
Are you an NVIDIA fanboy?

This is a _remarkably_ aggressive comment!

reply
Which, while expensive, is dirt cheap compared to a comparable Nvidia or AMD system.
reply
It's still very expensive compared to using the hosted models, which are currently massively subsidised. Have to wonder what the fair market price for these hosted models will be after the free money dries up.
reply
Inference is profitable. Maybe we'll hit a limit and won't need as many expensive training runs in the future.
reply
Inference APIs are probably profitable, but I doubt the $20-$100 monthly plans are.
reply
For sure Claude Code isn't profitable.
reply
What speed are you getting with that level of hardware, though?
reply
The article mentions https://unsloth.ai/docs/basics/claude-codex

I'll add on https://unsloth.ai/docs/models/qwen3-coder-next

The full model is supposedly comparable to Sonnet 4.5. But you can run the 4-bit quant on consumer hardware as long as your RAM + VRAM has room to hold 46GB. The 8-bit needs 85GB.
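As a quick sanity check on those numbers (the KV-cache overhead below is a made-up placeholder, and real loaders need extra headroom):

  # Does a quant fit in combined RAM + VRAM? A sketch, not a loader.
  def fits(model_gb, ram_gb, vram_gb, kv_overhead_gb=6):
      return model_gb + kv_overhead_gb <= ram_gb + vram_gb

  print(fits(46, ram_gb=64, vram_gb=24))  # 4-bit: 52 <= 88 -> True
  print(fits(85, ram_gb=64, vram_gb=24))  # 8-bit: 91 <= 88 -> False, needs more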

reply
LOCAL models. No one is running Kimi 2.5 on their MacBook or RTX 4090.
reply
On MacBooks, no. But there are a few lunatics like this guy:

https://www.youtube.com/watch?v=bFgTxr5yst0

reply
Having used K2.5, I'd judge it to be a little better than that. Maybe as good as proprietary models from last June?
reply