An n-tuple notation would have been more readable and mathematically accurate, e.g. (Q=K, V), (Q, K=V), and (Q=K=V).
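To make the tuple reading concrete, here's a rough sketch of how I'd interpret it, assuming each "=" means the two projections share one weight matrix (that's my reading, not something I've checked against the paper's code). The names `make_projections` and `attention` are made up for illustration, and this is plain single-head attention with no masking.

```python
import math
import random


def matmul(a, b):
    # Plain nested-list matrix product: (n x m) @ (m x p) -> (n x p).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    # Subtract the max first so exp() never overflows.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def make_projections(variant, d_model, rng):
    """Return (W_q, W_k, W_v). Tied projections are the *same* object.

    Assumption: "=" in the tuple notation means weight sharing.
    """
    w_a = rand_matrix(d_model, d_model, rng)
    w_b = rand_matrix(d_model, d_model, rng)
    w_c = rand_matrix(d_model, d_model, rng)
    if variant == "(Q, K, V)":   # standard attention, nothing tied
        return w_a, w_b, w_c
    if variant == "(Q=K, V)":    # query and key share weights
        return w_a, w_a, w_b
    if variant == "(Q, K=V)":    # key and value share weights
        return w_a, w_b, w_b
    if variant == "(Q=K=V)":     # one matrix for everything
        return w_a, w_a, w_a
    raise ValueError(f"unknown variant: {variant}")


def attention(x, w_q, w_k, w_v):
    # Single-head scaled dot-product attention over a (seq_len x d_model) input.
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v)


rng = random.Random(0)
x = rand_matrix(3, 4, rng)  # 3 tokens, d_model = 4
w_q, w_k, w_v = make_projections("(Q, K=V)", 4, rng)
out = attention(x, w_q, w_k, w_v)
```

Written this way, each variant is unambiguous about which matrices are tied, which is exactly what the minus-sign spelling obscures.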
In fact, they discuss this very problem on the second-to-last page of the paper. For the Q-K=V model, performance clearly tracks sequence length. The sample is small (only three lengths: 512, 1024, and 2048), but the degradation shrinks from 5.4% to 2.2% as context grows, which suggests that short sequences are unlikely to be the reason K=V performs acceptably.
I agree with GP that it's super confusing to use the minus sign as a delimiter between formulas. The tuple notation suggested elsewhere would be way clearer.