Hacker News
by kouteiheika, 3 hours ago
knollimar, 1 hour ago:
Isn't it not completely quantized? I thought there were some dense parts but most is int4?
theanonymousone, 2 hours ago:
But the huggingface link mentions BF16, F16, and I32?
kouteiheika, 2 hours ago:
Not every weight is quantized. For example, weights that don't take much space or are highly important are left in higher precision. State-of-the-art quantization of weights is never done uniformly (i.e. to all weights and in the same way).
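To make the idea of non-uniform quantization concrete, here is a minimal sketch in plain Python. It assumes a simple symmetric 4-bit scheme with one scale per tensor; the names `quantize_int4`, `quantize_model`, `keep_high_precision`, and `min_size` are hypothetical and are not taken from any particular checkpoint or library. Real schemes typically use per-group scales and more careful selection rules.

```python
def quantize_int4(values):
    """Symmetric 4-bit quantization: floats -> integers in [-8, 7] plus one scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7 if max_abs > 0 else 1.0  # avoid dividing by zero
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return q, scale


def quantize_model(tensors, keep_high_precision, min_size=1024):
    """Quantize only large tensors that are not explicitly protected.

    tensors: dict mapping a tensor name to a flat list of floats.
    keep_high_precision: set of names to leave untouched (assumed "important").
    min_size: tensors with fewer elements are left untouched (assumed "small").
    """
    out = {}
    for name, values in tensors.items():
        if name in keep_high_precision or len(values) < min_size:
            out[name] = ("fp", values)            # left in higher precision
        else:
            q, scale = quantize_int4(values)
            out[name] = ("int4", q, scale)        # quantized to 4 bits
    return out
```

With this rule, a small normalization vector and a protected tensor stay in floating point while a large weight matrix is stored as 4-bit integers plus a scale, which is how a single checkpoint can end up listing several dtypes.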
zackangelo, 1 hour ago:
I don't believe safetensors has a native int4 dtype, so they packed 4 int4s into a bf16 in this checkpoint.
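For illustration, packing several 4-bit values into one wider word can be sketched as below. This is only an assumption-laden example: the nibble order (lowest index in the lowest bits) and the helper names `pack_int4` and `unpack_int4` are hypothetical, and the actual bit layout used by the checkpoint is not stated in this thread. The resulting 16-bit pattern could then be stored under any 16-bit dtype purely as a container.

```python
def pack_int4(values):
    """Pack four signed 4-bit integers (each in [-8, 7]) into one 16-bit word."""
    if len(values) != 4:
        raise ValueError("expected exactly four values")
    word = 0
    for i, v in enumerate(values):
        if not -8 <= v <= 7:
            raise ValueError(f"{v} does not fit in a signed 4-bit integer")
        # Keep the two's-complement nibble; index 0 lands in the lowest 4 bits.
        word |= (v & 0xF) << (4 * i)
    return word


def unpack_int4(word):
    """Recover the four signed 4-bit integers from a 16-bit word."""
    out = []
    for i in range(4):
        nibble = (word >> (4 * i)) & 0xF
        # Nibbles 8..15 represent the negative values -8..-1.
        out.append(nibble - 16 if nibble >= 8 else nibble)
    return out
```

For example, `pack_int4([7, -8, 0, -1])` yields `0xF087`, and `unpack_int4(0xF087)` returns the original four values. The same approach extends to eight values in a 32-bit word.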