upvote
You’re right, but this poses a separate issue as the providers then do FP4 PTQ, which is quite lossy. Reduces the model size and optimizes for Blackwells at the (imo severe) cost of performance.
reply