I suspect that with those specs, you're not yet in the game for reliably using local models for code generation. The easiest way in is a MacBook with at least 32GB of RAM, which should be able to run a 4-bit quantization of Qwen 3.6 in the MLX format really well.

by Otternonsenz, 1 day ago

Now that I’m dipping more into this space, I’m going to see what I can upgrade on the motherboard I have, but with RAM pricing as it is, I’ll need to be smart about when I upgrade.

I very much appreciate the frank response, as it makes me feel less defeated to know that my understanding of how it should work is not the whole issue, hahaha

by fumeux_fume, 23 hours ago

M-series Macs are usually used for running these LLMs locally because the GPU and CPU share the same pool of RAM at very low latency. If you upgrade the RAM on a different kind of chipset, one without the Unified Memory Architecture, it'll be much slower at producing all the tokens you need. Just another data point to add to your upgrade equation.

by jboss1023 hours ago|

I have 8GB VRAM but 32GB RAM. Qwen 3.6 35B runs nicely.

You should look at gemma-4-26B-A4B. 16+8=24GB, and Q4 is about 16GB. That leaves not much room for context, but it might run.
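
The memory arithmetic in that comment can be sketched roughly as below. This is a back-of-the-envelope estimate, not a measurement: it reads "16+8" as 16GB of system RAM plus 8GB of VRAM (an assumption), takes the ~16GB Q4 figure from the comment, and uses a hypothetical 4GB reserve for the OS and other programs. The helper names are made up for illustration.

```python
def quantized_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough size of quantized weights in GB (1 GB = 1e9 bytes).

    Real quantized files are usually somewhat larger than this, since
    they also store scales and keep some tensors at higher precision.
    """
    return params_billions * bits_per_weight / 8


def headroom_gb(ram_gb: float, vram_gb: float, model_gb: float, reserved_gb: float) -> float:
    """Memory left for context (KV cache) after the model and a reserve."""
    return ram_gb + vram_gb - model_gb - reserved_gb


if __name__ == "__main__":
    # Lower bound for a 26B-parameter model at exactly 4 bits per weight.
    print(quantized_weights_gb(26, 4))  # 13.0

    # Figures from the comment: 16 + 8 = 24GB total, Q4 file about 16GB.
    # reserved_gb=4.0 is an assumed allowance for the OS and other apps.
    print(headroom_gb(ram_gb=16, vram_gb=8, model_gb=16.0, reserved_gb=4.0))  # 4.0
```

Under those assumptions only a few GB remain for context, which matches the "not much context left" caveat.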

by jboss1023 hours ago|

I have 8GB VRAM, but 32GB of system RAM. I can run Qwen 3.6 35B at 30 tok/s. I also use Pi, and the model is smart enough to extend it (multi-shot, and maybe after a few tries).

For you, you could try gemma-4-26B-A4B.

by Otternonsenz, 4 hours ago

Thank you for the recommendation, and so far, it has been working great (within reason, haha). It doesn’t kill my rig when thinking, but it definitely needs more training wheels to nudge it towards the goal.

It seemed to get the idea of my prompt to extend the footer info (I want it to show the model abilities, like tool calling or reasoning, where the context percent thing is). It made a plan and wrote the file, but then got hung up on implementation because it couldn’t figure out how Pi renders that part of the UI in PowerShell.

So possibly trying a different terminal might help on that front, haha

by fluoridation, 1 day ago

I think at 16 GB you'd struggle to run even the regular development tools nowadays, let alone any interesting inference.

by Otternonsenz, 1 day ago

Fully agreed, and my hope is that as open models grow and change, getting some amount of this working on prosumer hardware will become more attainable.

But it certainly seems like we are a few years away from that, sadly.

Am I also screwed in being able to train my own small model or adjust another one with such a non-workhorse PC?

by fluoridation, 1 day ago

Training requires even beefier hardware than inference.

by spaqin, 19 hours ago

I've got 32GB of RAM and a 6GB VRAM card, and it's a laptop; I tried both the 27B and the 35B with Pi. Speed isn't exactly a concern for me, since I can enjoy real life while the agent is doing its thing. And while the models appear smart enough at first glance, once one reads a file that's more than 100 lines it loses all memory of anything I asked it to do. The lack of a failure state, or of any indication of what might be wrong, is just frustrating. Guess local models aren't for me, unless I move to Silicon Valley and redeem my free MacBook at a local Starbucks.

by jadbox, 1 day ago

[dead]