Definitely could be, but in the time I spent talking to the 4-bit models, compared with the 16-bit original, they still seemed surprisingly capable. I do recommend benchmarking quantized models on the specific tasks you care about.
Yes, I was wondering why they mentioned those numbers without explaining their practical significance.