by christina97 18 hours ago

by aimxhaisse 16 hours ago

It even fits on a 3060 with turboquant / Q4 at a decent speed (40 T/s) for ~$200 (:

by 2ndorderthought 17 hours ago

Some of the early quants for Qwen 3.6 were broken. It's still finicky, but with a little hand-holding it's crazy.

Local models are the future. It's awesome.

by jszymborski 18 hours ago

The A4B model is blazing fast and super good at general inquiries. It's notably worse than Qwen 3.6 for coding tasks, but that says more about the Qwen model.

by maille 14 hours ago

Bad at coding, but would it be good at code review?

by avadodin 2 hours ago

Good compared to what? Nothing? Probably better.

by moffkalast 15 hours ago

The 31B is surprisingly fast too, for a dense model. Token generation (tg) runs at least twice as fast as it ought to on my machine compared to other 30B models, probably due to the hybrid attention. Prompt ingestion is somewhat slower, though.
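
For a rough sense of what "as fast as it ought to" could mean: token generation on a dense model is usually limited by memory bandwidth, so a back-of-envelope ceiling is bandwidth divided by the size of the weights. Below is a minimal sketch of that estimate. The function names are hypothetical, and the 4.5 bits per weight and 360 GB/s figures are assumptions picked for illustration, not numbers from this thread.

```python
def weight_bytes(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return params_billions * 1e9 * bits_per_weight / 8


def decode_ceiling_tps(params_billions: float,
                       bits_per_weight: float,
                       bandwidth_gb_s: float) -> float:
    """Rough upper bound on tokens/s for a dense model.

    Assumes every weight is read from memory once per generated token
    and ignores KV-cache reads and compute time.
    """
    bytes_per_token = weight_bytes(params_billions, bits_per_weight)
    return bandwidth_gb_s * 1e9 / bytes_per_token


if __name__ == "__main__":
    # Illustrative, assumed numbers: 31B dense model, ~4.5 bits per weight,
    # 360 GB/s of memory bandwidth.
    size_gb = weight_bytes(31, 4.5) / 1e9
    tps = decode_ceiling_tps(31, 4.5, 360)
    print(f"~{size_gb:.1f} GB of weights, ceiling ~{tps:.1f} tokens/s")
```

This only bounds the weight reads. KV-cache traffic comes on top of it and grows with context length, which is one place where the attention design can change real-world speed.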