upvote
MapReduce is nice, but it doesn't, by itself, help you reason about pushdowns, for one. Parquet, for example, can push down select/project/filter, and that's lost if all you have is MapReduce. And a reduce is just a shuffle + map, not very different from a distributed join. MapReduce as an escape hatch over what is fundamentally still relational algebra may be a good intuition.
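To make the "reduce is just a shuffle + map" point concrete, here's a minimal pure-Python sketch (all names here, like shuffle and reduce_phase, are illustrative, not from any framework):

```python
# Sketch: MapReduce's "reduce" phase recast as a shuffle
# (group records by key) followed by a map over the groups.
from collections import defaultdict

def shuffle(pairs):
    """Group (key, value) pairs by key -- the 'shuffle' step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(pairs, combine):
    """A reduce is a shuffle plus a map of `combine` over each group."""
    return {key: combine(values) for key, values in shuffle(pairs).items()}

# Word count: the map phase emits (word, 1); reduce sums per key.
mapped = [("a", 1), ("b", 1), ("a", 1)]
print(reduce_phase(mapped, sum))  # {'a': 2, 'b': 1}
```

The shuffle is exactly the co-location step a distributed hash join also needs, which is why the two don't look so different.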
reply
Performance aside, it seems you could do most, maybe all, of the ops with those three. I say three because your sneaky plus is a union operation. So: map, reduce, and union.

But you are also allowing arbitrary code expressions. So it is less lego-like.

reply
Reductions are painful because they specify a sequence of ordered operations. Runtime is O(N), where N is the sequence length, regardless of the amount of hardware. So you want to work at a higher level, where you can exploit commutativity and the independence of some (or even most) operations.
reply
You can reduce in parallel. That was the whole point of MapReduce. For example, the sum a+b+c+d+e+f+g+h can be found by first computing a+b, c+d, e+f, g+h; then (a+b)+(c+d) and (e+f)+(g+h); then the final result (a+b+c+d)+(e+f+g+h). That's just three steps to compute seven sums.
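The tree-shaped reduction described above can be sketched in a few lines (tree_reduce is a hypothetical helper, not a library function; each round's pairwise combines are independent, so with enough hardware the whole thing takes O(log N) rounds):

```python
import operator

def tree_reduce(op, xs):
    """Pairwise tree reduction: O(log N) rounds of independent combines."""
    xs = list(xs)
    while len(xs) > 1:
        # Every pair in a round is independent and could run in
        # parallel; here the pairs run sequentially for clarity.
        xs = [op(xs[i], xs[i + 1]) if i + 1 < len(xs) else xs[i]
              for i in range(0, len(xs), 2)]
    return xs[0]

# a..h as 1..8: three rounds of pairwise sums, seven additions total.
print(tree_reduce(operator.add, range(1, 9)))  # 36
```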
reply
You're right, it's primarily a runtime + compiler + language issue. I really don't understand why people tried to force functional programming into environments without decent algebraic-reasoning mechanisms.

Modern graph reducers have inherent confluence and aren't reliant on explicit commutation. They can do everything in parallel and out of order (until they have to talk to something extrinsic, like reading input or writing output), including arbitrary side-effectful mutation. We really live in the future.

reply
Reduce is massively parallel for associative operations (commutativity isn't even required, since the pairs keep their order)
reply