upvote
O(1) only if you're working in a language with length-stored strings (like Pascal[0]), right?

In something like C with its generic strings[1], it would surely have to be O(n) since you have to scan the entire string to calculate its length?

(I have always been terrible at big-O, mind.)

[0] There's probably more of them by now.

[1] ie. not a specific length-stored string type.

reply
Yes. But that's a known misfeature of C and no other language does it like that. Plus I kind of meant arbitrary byte strings where you can have embedded zeroes and thus have to know the length.
reply
You'd probably want 'return len(str)', if this is Python?

In any case, for some applications this is indeed a great hash function. Programs like rmlint use it as part of their checks for duplicate files: if files have different lengths, they can't have the same content after all.

reply
Yes, too much Rust :D Eliding the `return` becomes so intuitive after a while that you sort of forget that most other languages require it.
reply
Haskell and the Lisps also work like this.
reply
Re: missing return keyword

Actually, in Python, None is a valid key...

(I'm so sorry. JavaScript has ruined me.)

reply
Any `return c` for some constant is a valid and correct hash function. It just has a lot of collisions and degenerates hash-maps to terrible performance. That was in fact my first thought when I read "simplest hash functions".
reply
That's why I said "probably".
reply
If you use the identity function as your hashing function then is it O(0) because you are done before you start?
reply
For an arbitrarily long input, you still have to compress it to constant size somehow.
reply
Or pad all entries with 0s to an arbitrarily long size. The 0s can be assumed, and not actually stored. Therefore arbitrarily long entries need not be shortened.
reply