upvote
Accuracy isn't a meaningful metric here without reference to a specific task.
reply
Additionally, I'd imagine quantization to have more side-effects than just slightly lower performance (on whatever task). You are basically removing information, and that information could be by chance what the model needs to fulfill it exactly the way you'd want to do - although it's still fully capable. I am not sure if this is really different from "lower performance" but open to hear your opinions.
reply
And that 2%-4% makes all the difference.
reply
Yes, it's like saying "we took off a big chunk of his brain but look! He can still breathe autonomously, swallow food and walk almost straight, which is like 95% of what he did before!"
reply