by scrlk 7 hours ago

by johnnyApplePRNG 3 hours ago

>unquantised -> FP8 is pretty much lossless

Claude Shannon is rolling in his grave.

by gpm 1 minutes ago

I don't know, that sounds quite similar to his rate-distortion theory (analyzing the minimum number of bits per symbol you need to stay under some fixed amount of distortion). I.e. lossy compression with a maximum amount of loss. I.e. "pretty much lossless" compression.

https://en.wikipedia.org/wiki/Rate%E2%80%93distortion_theory
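
To make the bits-versus-distortion trade-off concrete, here is a minimal sketch. It is an assumption-laden toy and not how FP8 works: it uses a hypothetical uniform scalar quantiser (`quantise`, clipped at 4 standard deviations) on unit-variance Gaussian samples rather than a floating-point format on real model weights. It compares the measured mean squared error at a few bit widths against the Gaussian rate-distortion bound D(R) = 2^(-2R), which no scheme at that rate can beat.

```python
import random

def quantise(x, bits, clip=4.0):
    """Uniform scalar quantiser on [-clip, clip] with 2**bits levels (illustrative only)."""
    levels = 2 ** bits
    step = 2 * clip / levels
    # Clamp to the representable range, then snap to the centre of the cell.
    x = max(-clip, min(clip, x))
    index = min(int((x + clip) / step), levels - 1)
    return -clip + (index + 0.5) * step

def mse(bits, samples):
    """Empirical mean squared error after quantising every sample."""
    return sum((x - quantise(x, bits)) ** 2 for x in samples) / len(samples)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.gauss(0.0, 1.0) for _ in range(20000)]

results = {}
for bits in (2, 4, 8):
    measured = mse(bits, samples)
    bound = 2.0 ** (-2 * bits)  # Gaussian rate-distortion bound for unit variance
    results[bits] = (measured, bound)
    print(f"{bits} bits: measured MSE {measured:.2e}, R-D bound {bound:.2e}")
```

The distortion shrinks quickly as bits are added but never reaches zero, which is the sense in which a fixed bit budget can be "pretty much lossless" without being lossless.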

by ComputerGuru 4 hours ago

Do infra providers reveal that level of implementation detail?

by scrlk 4 hours ago

I've seen a few articles from providers talking about KV cache quantisation, but it's not something they explicitly point out like they do with weights.

So you could end up paying more for unquantised weights, only to get silently hit with a quantised KV cache...