No, I've just seen benchmarks showing most models start degrading around 4-5 bits. That's not to say they become useless, just that down to about 6 bits (with careful hybrid quantizations like Unsloth's, where some of the layers aren't quantized or are quantized at higher bit depths) the quality isn't measurably degraded, while below that there are measurable differences in performance.

People report good results from DeepSeek V4 Flash at 2 bits; the DwarfStar 4 folks are doing it. I've tried it on my Strix Halo, but it's too slow to be usable there, so I haven't bothered to figure out whether it's actually smart enough to use for anything.

Anyway, it's obvious models have to degrade in terms of knowledge at any quantization, even though it may not show up clearly on benchmarks until lower bit depths. If you halve the number of bits available to store the weights, the model necessarily loses some information about the world.
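
To make the "fewer bits, less information" point concrete, here's a rough sketch that rounds some random numbers to fewer and fewer levels and measures how far they drift from the originals. It's deliberately naive: plain uniform rounding over Gaussian stand-in "weights", not the hybrid schemes Unsloth or anyone else actually ships. The function names are made up for illustration.

```python
import math
import random

def quantize(weights, bits):
    """Round each weight to the nearest of 2**bits evenly spaced levels."""
    lo, hi = min(weights), max(weights)
    steps = 2 ** bits - 1            # number of intervals between lo and hi
    step = (hi - lo) / steps
    return [lo + round((w - lo) / step) * step for w in weights]

def rms_error(original, rounded):
    """Root-mean-square difference between two equal-length lists."""
    total = sum((x - y) ** 2 for x, y in zip(original, rounded))
    return math.sqrt(total / len(original))

random.seed(0)
# Assumption: Gaussian noise as a stand-in for real model weights.
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

errors = {bits: rms_error(weights, quantize(weights, bits)) for bits in (8, 6, 4, 2)}
for bits, err in errors.items():
    print(f"{bits}-bit RMS error: {err:.4f}")
```

The error grows every time bits are removed, which is the information loss. Whether that loss shows up in task performance is a separate question that this toy can't answer.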

by hedgehog, 17 minutes ago

The data I've seen is mostly things like the KL divergence comparisons that Unsloth does, which show that the quantized model's output distribution shifts, but not clearly whether there's an observable or significant difference in task performance.
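
For anyone who hasn't looked at what that metric is, here's a minimal sketch of KL divergence between a full-precision and a quantized next-token distribution. The logits are invented for illustration (a 4-token vocabulary), and this is not Unsloth's actual evaluation code.

```python
import math

def softmax(logits):
    """Turn raw logits into a probability distribution."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(P || Q) in nats, with P as the reference distribution."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits; a real comparison would average over many token positions.
full_precision_logits = [2.0, 1.0, 0.1, -1.0]
quantized_logits = [1.8, 1.2, 0.0, -0.9]

p = softmax(full_precision_logits)
q = softmax(quantized_logits)

kl = kl_divergence(p, q)
same_top_token = p.index(max(p)) == q.index(max(q))
print(f"KL divergence: {kl:.5f} nats, same top token: {same_top_token}")
```

In this toy case the divergence is nonzero but the most likely token doesn't change, which is why a KL number alone doesn't say much about task performance.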

by akulbe, 39 minutes ago

One of the things I'm wondering about is what I'm missing for $LLM to create files on the local FS like Claude and Codex do. What I see instead is stuff just printing to stdout, rather than files on the filesystem.

What am I missing?

by SwellJoe, 7 minutes ago

You're missing an agent. The model uses tool calls to interact with the filesystem, run commands on the system, and optionally search the web (for that you need a search MCP server, like Brave or Exa, and an API key).

I usually use the Zed Agent built into the Zed editor for self-hosted models, but you could use Pi, OpenCode, Hermes, Claude Code, etc. There are many, many agents.

by hedgehog, 21 minutes ago

The model just predicts text; Claude Code etc. parse the output and do the actual file creation (or run shell commands that do it). If you have Claude Code installed, look in ~/.claude/projects/... and you can see the transcripts of your actual sessions, or install Mini-SWE-Agent and play with that to get a feel for what's going on.
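
Here's a toy version of that loop, just to show the division of labor. Everything in it is hypothetical: `fake_model` stands in for a real LLM, and the JSON tool-call shape is invented for this sketch, not the format Claude Code or any other agent actually uses.

```python
import json
import tempfile
from pathlib import Path

def fake_model(prompt: str) -> str:
    """Stand-in for an LLM: it only returns text, here a JSON tool call."""
    return json.dumps({
        "tool": "write_file",
        "args": {"path": "hello.txt", "content": "hello from the model\n"},
    })

def write_file(workdir: Path, path: str, content: str) -> str:
    """The harness, not the model, is what touches the filesystem."""
    root = workdir.resolve()
    target = (root / path).resolve()
    # Refuse paths that escape the working directory.
    if not target.is_relative_to(root):
        raise ValueError(f"path escapes workdir: {path}")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return f"wrote {len(content)} characters to {path}"

def run_step(workdir: Path, prompt: str) -> str:
    """One agent step: ask the model, parse its text, execute the tool call."""
    reply = fake_model(prompt)
    try:
        call = json.loads(reply)
    except json.JSONDecodeError:
        return reply                      # plain text: nothing to execute
    if not isinstance(call, dict):
        return reply
    if call.get("tool") == "write_file":
        return write_file(workdir, **call["args"])
    return f"unknown tool: {call.get('tool')}"

workdir = Path(tempfile.mkdtemp())
result = run_step(workdir, "Create hello.txt")
print(result)
print((workdir / "hello.txt").read_text(), end="")
```

Without the `run_step` part, all you'd ever get is the JSON printed to stdout, which is what happens when you talk to a bare model. A real agent also feeds the tool result back to the model and keeps looping.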