It's KV cache compression, so it reduces how much memory the model needs as its context grows. It does not affect the size of the weights.
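To put rough numbers on that, here is a back-of-the-envelope sketch. The model shape (32 layers, 8 KV heads, head dimension 128, fp16) and the 4x compression ratio are made-up assumptions for illustration, not figures for any particular model or method.

```python
def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=8, head_dim=128,
                   bytes_per_elem=2, compression=1.0):
    """Approximate KV cache size for one sequence.

    All defaults are hypothetical. `compression` is an assumed ratio
    (1.0 means no compression).
    """
    # Each token stores one key and one value vector per layer per KV head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return int(seq_len * per_token / compression)


def weight_bytes(n_params, bytes_per_param=2):
    """Weight memory depends only on parameter count and precision."""
    return n_params * bytes_per_param


if __name__ == "__main__":
    gib = 1024 ** 3
    for ratio in (1.0, 4.0):
        cache = kv_cache_bytes(8192, compression=ratio)
        print(f"8192 tokens, {ratio}x compression: {cache / gib:.2f} GiB of KV cache")
    # Hypothetical 8B-parameter model in fp16; unchanged by KV cache compression.
    print(f"weights: {weight_bytes(8_000_000_000) / gib:.2f} GiB")
```

The cache grows linearly with context length, which is the part compression shrinks. The weight figure stays the same no matter what ratio you pick.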