undefined

points

[-]

Hash functions and PRNGs are closely related, they share many properties and they can be built from the same algorithmic components, so for many kinds of PRNGs there are corresponding kinds of hash functions and vice-versa.

Nevertheless, the purposes of hash functions and PRNGs are different and complementary.

A PRNG receives a short fixed-length value (the seed) and it expands it into a long pseudo-random sequence of arbitrary length.

A hash function receives a long input sequence of arbitrary length and it generates a short fixed-length pseudo-random value.

Good PRNGs are injective functions and good hash functions are surjective functions.

Normally the design methods for PRNGs and for hash functions should be presented together, because it is easy to interconvert algorithms for one of them with algorithms for the other. For instance, given a good hash function one could make a PRNG by computing the hashes of a sequence of numbers or the hashes of a sequence of strings of increasing length, and given a good PRNG one could make a hash function by accumulating somehow the input into a PRNG seed and taking the first generated number, or better by using input chunks as seeds and then accumulating the first generated numbers into a single value.

However for a successful conversion between PRNG and hash function algorithms, the source algorithm may have have to be overdesigned, to guarantee good enough properties even after the conversion.

When an algorithm is designed directly as a hash function or as a PRNG, with clearly specified requirements, it can be designed only as good as strictly necessary, enabling thus a better performance.

by derriz3 hours ago|

prev|

[-]

Could you explain what you mean here?

Hashes are _functions_ so provide the same output given the same input.

If you don't reseed the RNG after every hash computation, then you break this vital property of hashes.

And if you do reseed, then your claim boils down to "every hash function is just an XOR against a contstant" which certainly is not true either.

by eru4 hours ago|

prev|

[-]

Sorry, what?

That might we one very particular way to write a hash function, but it's far from the only one.

Believe it or not, for some purposes taking the first few bytes of a string or even just the length of the string are good hash functions.

by andai3 hours ago|

parent|

[-]

Well, that's technically also a deterministic random number generator! (I want to say it's not a great one, but... that's apparently context-dependent!)

What are those purposes?

by eru1 hours ago|

parent|

[-]

If your input is i.i.d. random, then truncating works great. Eg if your keys are UUIDs then truncating can work well.

Another use:

Suppose you write a tool like rmlint that is looking for duplicate files. Generally, you compute some hash for each file, see if you got any duplicates, and then compare the relevant files directly.

A traditional hash like crc or sha256 takes O(n) to compute. But for files you can start with some cheaper hashes, like file length. After all, files of different length can't have the same content. Taking the first few bytes of your file is another cheap 'hash' you can compute.

Only when these cheap 'hashes' show that you have a potential duplicate, do you go and pay for a more expensive hash.