> The cache coherency protocol is one of the hardest parts of a modern CPU to make both fast and correct. Most of the complexity involved comes from supporting a language in which data is expected to be both shared and mutable as a matter of course.
I feel like we live in a world where everyone works very hard to pretend that C is our best low-level language, when in reality an APL-like purely functional array language would be a better candidate.