Which takes a $20k Thunderbolt cluster of two 512GB Mac Studio Ultras to run at full quality…
reply
"Full quality" being a relative assessment, here. You're still deeply compute constrained, that machine would crawl at longer contexts.
reply
70B dense models are way behind SOTA. Even the aforementioned Kimi 2.5 has fewer active parameters than that, and it's quantized to int4 on top. We're at the point where some near-frontier models may run out of the box on Mac Mini-grade hardware, with perhaps no real need to even upgrade to a Mac Studio.
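The intuition, as a hedged sketch (the model size and bandwidth below are assumed, not quoted from anywhere): on unified memory, decode speed is bounded by how many bytes you read per token, which scales with active parameters, not total.

  # Decode-speed upper bound; all numbers are assumptions.
  active_params = 32e9        # a big MoE with ~32B active params
  bytes_per_param = 0.5       # int4 quantization
  bytes_per_token = active_params * bytes_per_param  # ~16 GB read per token
  bandwidth = 800e9           # ~800 GB/s, Ultra-class unified memory
  print(f"~{bandwidth / bytes_per_token:.0f} tok/s upper bound")  # ~50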
reply
>may

I'm completely over these hypotheticals and "testing grade" claims.

I know Nvidia VRAM works, not some marketing about "integrated RAM". Heck, look at /r/locallama/. There's a reason it's entirely Nvidia.

reply
Are you an NVIDIA fanboy?

This is a _remarkably_ aggressive comment!

reply
Which, while expensive, is dirt cheap compared to a comparable Nvidia or AMD system.
reply
It's still very expensive compared to using the hosted models, which are currently massively subsidised. Have to wonder what the fair market price for these hosted models will be after the free money dries up.
reply
Inference is profitable. Maybe we'll hit a limit and won't need as many expensive training runs in the future.
reply
Inference APIs are probably profitable, but I doubt the $20-$100 monthly plans are.
reply
For sure Claude Code isn't profitable.
reply
What speed are you getting with that level of hardware, though?
reply
The article mentions https://unsloth.ai/docs/basics/claude-codex

I'll add on https://unsloth.ai/docs/models/qwen3-coder-next

The full model is supposedly comparable to Sonnet 4.5. But you can run the 4-bit quant on consumer hardware as long as your RAM + VRAM has room to hold 46GB. The 8-bit needs 85GB.
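As a quick sanity check on those numbers (the KV-cache overhead below is a made-up placeholder, and real loaders need extra headroom):

  # Does a quant fit in combined RAM + VRAM? A sketch, not a loader.
  def fits(model_gb, ram_gb, vram_gb, kv_overhead_gb=6):
      return model_gb + kv_overhead_gb <= ram_gb + vram_gb

  print(fits(46, ram_gb=64, vram_gb=24))  # 4-bit: 52 <= 88 -> True
  print(fits(85, ram_gb=64, vram_gb=24))  # 8-bit: 91 <= 88 -> False, needs more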

reply
LOCAL models. No one is running Kimi 2.5 on their MacBook or RTX 4090.
reply
On MacBooks, no. But there are a few lunatics like this guy:

https://www.youtube.com/watch?v=bFgTxr5yst0

reply
Having used K2.5, I'd judge it to be a little better than that. Maybe as good as proprietary models from last June?
reply