"Full quality" is a relative assessment here. You're still deeply compute-constrained; that machine would crawl at longer contexts.

by PlatoIsADisease 54 minutes ago

[flagged]

by 29 minutes ago

deleted

by zozbot234 47 minutes ago

70B dense models are way behind SOTA. Even the aforementioned Kimi 2.5 has fewer active parameters than that, and it is quantized to int4 on top of that. We're at a point where some near-frontier models may run out of the box on Mac Mini-grade hardware, with perhaps no real need to even upgrade to the Mac Studio.

by PlatoIsADisease 40 minutes ago

>may

I'm completely over these hypotheticals and 'testing grade'.

I know Nvidia VRAM works, not some marketing about 'integrated RAM'. Heck, look at /r/locallama/. There is a reason it's entirely Nvidia.

by sealeck 37 minutes ago

Are you an NVIDIA fanboy?

This is a _remarkably_ aggressive comment!

by teaearlgraycold 2 hours ago

Which, while expensive, is dirt cheap compared to a comparable Nvidia or AMD system.

by SchemaLoad 2 hours ago

It's still very expensive compared to using the hosted models, which are currently massively subsidised. Have to wonder what the fair market price for these hosted models will be after the free money dries up.

by cactusplant7374 1 hour ago

Inference is profitable. Maybe we hit a limit and we don't need as many expensive training runs in the future.

by paxys 22 minutes ago

Inference APIs are probably profitable, but I doubt the $20-$100 monthly plans are.

by teaearlgraycold 32 minutes ago

For sure Claude Code isn’t profitable.

by blharr 2 hours ago

What speed are you getting at that level of hardware though?