Which bits are the best is completely dependent on the particular hash function.

For instance, if the last operation in computing a hash was the kind of integer multiplication where the result has the same length as the operands, then the top bits are the best, because they depend on more of the input bits than the bottom bits of the result do.
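A tiny Python sketch of why (the odd multiplier constant is just an arbitrary choice of mine):

```python
MASK64 = (1 << 64) - 1
C = 0x9E3779B97F4A7C15  # an arbitrary odd multiplier (my choice)

def mul_hash(x):
    # multiplication mod 2^64: result bit k depends only on input bits 0..k
    return (x * C) & MASK64

x = 0x0123456789ABCDEF
y = x ^ (1 << 40)        # flip input bit 40

low40 = (1 << 40) - 1
# bits 0..39 of the result cannot see input bit 40 at all
assert mul_hash(x) & low40 == mul_hash(y) & low40
# the top bits do change, since they depend on all the input bits below them
assert mul_hash(x) >> 40 != mul_hash(y) >> 40
```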

On the other hand, if the last operation was the kind of integer multiplication where the result has a length that is the sum of the lengths of the operands (i.e. where the high bits of the result are not discarded), then the best bits of the result are neither the top bits nor the bottom bits, but the middle bits. The reason is the same as before: they are the bits that depend on more of the input bits than either the bottom or the top bits of the result. (This becomes obvious if you do a long multiplication with pen and paper. After you compute the partial products and start to add them, you can see that the top and bottom digits are sums of far fewer partial-product digits than the middle digits, so the middle digits are more thoroughly mixed as functions of the input digits.)
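A quick empirical check in Python (my own sketch, exhaustively measuring how balanced each bit is over all 8-bit operand pairs):

```python
# Exhaustively multiply all pairs of 8-bit operands and measure how far
# each of the 16 product bits is from being 50/50 balanced.
counts = [0] * 16
for a in range(256):
    for b in range(256):
        p = a * b
        for bit in range(16):
            counts[bit] += (p >> bit) & 1

total = 256 * 256
bias = [abs(ones / total - 0.5) for ones in counts]
best = min(range(16), key=lambda bit: bias[bit])
# the most balanced bit is a middle bit, not bit 0 or bit 15
assert 0 < best < 15
assert bias[best] < bias[0] and bias[best] < bias[15]
```

Bit 0 is just the AND of the two lowest input bits (set only 25% of the time), and the very top bit is rarely set because products skew low, so the middle bits come out closest to an even split.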

When the last operation is an addition, the carry bits propagate from the bottom bits towards the top bits, so the top bits are the best: the higher a bit is in the result, the more input bits it depends on.
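In Python terms (the addend is an arbitrary constant of mine):

```python
MASK64 = (1 << 64) - 1

x = (1 << 32) - 1                    # 32 low bits set
s1 = (x + 1) & MASK64
s2 = ((x ^ 1) + 1) & MASK64          # flip only input bit 0
# the flipped low bit changed result bit 32, via a 32-bit carry chain
assert (s1 >> 32) != (s2 >> 32)

C = 0x0123456789ABCDEF               # arbitrary addend (my choice)
h1 = (x + C) & MASK64
h2 = ((x ^ (1 << 50)) + C) & MASK64  # flip input bit 50
# carries never travel downward, so bits below 50 are untouched
assert (h1 & ((1 << 50) - 1)) == (h2 & ((1 << 50) - 1))
```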

reply
I think the reason real-world implementations don't do this is to speed up access when the key is a small integer. Say, if your IDs are spread uniformly between 1 and 1000, taking the bottom 7 bits is a great hash, while the top 7 bits would just be zeros. So it's optimizing for a trivial hash rather than a general-purpose fast hash.
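Concretely (a Python sketch; this assumes the common identity hash for small integers and a 64-bit hash width):

```python
TABLE_BITS = 7                # 128 buckets
NUM_BUCKETS = 1 << TABLE_BITS

ids = range(1, 1001)          # keys spread uniformly over 1..1000
# assume the identity hash for small integers, as many languages use
bottom = {i & (NUM_BUCKETS - 1) for i in ids}  # index by bottom 7 bits
top = {i >> (64 - TABLE_BITS) for i in ids}    # index by top 7 bits of 64

assert len(bottom) == NUM_BUCKETS  # bottom bits hit all 128 buckets
assert top == {0}                  # top bits put every key in bucket 0
```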

And since most languages require each data type to provide its own hash function, you kind of have to assume that the hash is half-assed and bottom bits are better. I think only Rust could make decisions differently here, since it's parametric over hashers, but I haven't seen that done.

reply
I think older processors had slower shift implementations, which made this approach slower.

Nowadays SwissTable and other similar hash tables use the top bits plus SIMD/SWAR techniques to quickly filter out collisions after determining the starting bucket.
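Roughly like this (a hedged Python sketch of the hashbrown-style split; the names are mine, and the exact bit allocation varies between implementations):

```python
def split_hash(h, num_buckets):
    # num_buckets must be a power of two
    index = h & (num_buckets - 1)  # low bits pick the starting bucket
    tag = h >> 57                  # top 7 bits of a 64-bit hash become the control tag
    return index, tag
```

The 7-bit tags are stored in a compact control array, and a whole group of them is compared at once with SIMD/SWAR, so most non-matching slots are rejected without ever touching the keys.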

reply
Sorry, I've read and reread TFA but the concept still evades me. Is it that, since it's easier for a hash function to have higher entropy in the higher bits than in the lower ones, it would be more logical for hash tables to discard the lower bits and keep the higher ones?
reply
Higher entropy in the higher bits is a property of addition and multiplication when the result has the same size as the operands (i.e. the result is taken modulo 2^N, which folds back the values exceeding the size of the result).

When other operations are used, there may be other bits with higher entropy. For example, when full-length multiplication is used (i.e. where the length of the result is the sum of the lengths of the operands), the middle bits have the highest entropy, not the top bits. On CPUs like the Intel/AMD x86-64 CPUs, where fast long multiplication instructions are available, this can be exploited in more performant hash functions and PRNGs.
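For example, a "multiply and fold" mixing step (a sketch of the general idea, not any particular library's code) XOR-folds the two halves of the full product, so the well-mixed middle bits of the 128-bit product end up spread across the whole 64-bit result:

```python
MASK64 = (1 << 64) - 1

def mum(a, b):
    p = (a & MASK64) * (b & MASK64)  # full 128-bit product
    return (p >> 64) ^ (p & MASK64)  # XOR-fold the high and low halves
```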

In hash functions, additions and multiplications are frequently combined with rotations, which redistribute the entropy from the top bits to the bottom bits before the following operations.
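A rotation in Python terms (sketch):

```python
MASK64 = (1 << 64) - 1

def rotl64(x, r):
    # rotate left: the top r bits wrap around to the bottom
    return ((x << r) | (x >> (64 - r))) & MASK64

h = 0xABCD << 48                          # entropy only in the top 16 bits
assert h & 0xFFFF == 0                    # the bottom bits carry nothing
assert rotl64(h, 16) & 0xFFFF == 0xABCD   # after rotating, they do
```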

reply
Honorable mention: byte-swapping instructions (originally added to CPUs for endianness conversion) can also be used to redistribute entropy, but they're slightly slower than rotations on Intel, which I suspect is why they aren't used much.
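In Python, the equivalent of the x86 BSWAP instruction looks like this (sketch):

```python
def bswap64(x):
    # reverse the byte order of a 64-bit word (what BSWAP does on x86)
    return int.from_bytes(x.to_bytes(8, "little"), "big")

h = 0xFF << 56              # entropy only in the top byte
assert bswap64(h) == 0xFF   # ...moved to the bottom byte
```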
reply
Yes, that's my point. It's not true that all hash functions have this characteristic, but most fast ones do. (And if you're using a slow-and-high-quality hash function, the distinction doesn't matter, so might as well use top bits.)
reply