Anthropic serves quantized versions of their models and you can run q8 locally.
I don't even use Sonnet anymore. The current version feels worse than Claude 3.5 did a couple of years ago. Have they quantized it that much? I switched to GPT 5.5, let's see how long it stays good.