They serve it about 2x slower, so it must have about 2x the active parameters.

It could still be 10x larger overall, though that would not make it 10x more expensive.
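A rough sketch of that arithmetic, assuming per-token decode time and per-token serving cost both scale linearly with active parameters on identical hardware. Neither assumption is confirmed here, and both fail if the provider allocates hardware differently per model. The function names and all numbers are hypothetical, not measurements:

```python
def active_param_ratio(baseline_tok_s: float, other_tok_s: float) -> float:
    """Estimate how many times more active parameters the slower model has.

    Assumption: tokens/sec is inversely proportional to active parameters,
    with the same hardware and batching for both models.
    """
    return baseline_tok_s / other_tok_s


def relative_cost_per_token(active_ratio: float, total_ratio: float) -> float:
    """Estimate relative per-token serving cost.

    Assumption: cost tracks active parameters only. total_ratio is accepted
    to make the point that total size (e.g. unused experts in a
    mixture-of-experts model) does not enter this simplified estimate.
    """
    return active_ratio


# Hypothetical speeds: baseline at 100 tok/s, the other model at 50 tok/s.
ratio = active_param_ratio(100.0, 50.0)

# Hypothetical: 10x the total parameters, but only `ratio` times the active ones.
cost = relative_cost_per_token(ratio, total_ratio=10.0)
print(ratio, cost)
```

Under these assumptions a model could be 10x larger in total while costing only about 2x as much per token, since only the active parameters are exercised for each token.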

reply
I agree that Opus almost definitely isn't anywhere near that big, but AWS throughput might not be a great way to measure model size.

According to OpenRouter, AWS serves the latest Opus and Sonnet at roughly the same speed. It's likely that they simply allocate hardware differently per model.
