On a model of this size, quantization has much less impact on output quality. I'm running a 3-bit version and find it comparable to Sonnet, almost Opus.
It is not a flat quant but a dynamic one.