>which buffers everything in memory

GNU sort can spill to disk. It has a --buffer-size (-S) option if you want to manually control the RAM buffer size, and a --temporary-directory (-T) option for telling it where to spill data during the sort if need be.

> I’d like to know the memory profile of this. The bottleneck is obviously sort which buffers everything in memory.

That's not obvious to me. I checked the manuals for sort(1) in GNU and FreeBSD, and neither of them buffers everything in memory by default. Instead they read chunks into an in-memory buffer, sort each chunk, and (if there are multiple chunks) use the filesystem as temporary storage for an external mergesort.

This sorting program was originally developed with memory-starved computers in mind, and the legacy shows.
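
For anyone who wants to see the chunk-then-merge idea concretely, here is a rough Python sketch of an external mergesort. It is not how GNU or FreeBSD sort is actually implemented; `external_sort`, `chunk_size`, and `tmpdir` are names I made up for illustration, and it compares lines by code point rather than by locale.

```python
import heapq
import os
import tempfile


def external_sort(lines, chunk_size=100_000, tmpdir=None):
    """Yield lines in sorted order, buffering at most chunk_size lines for sorting.

    Hypothetical helper for illustration only; not the algorithm of any real sort(1).
    """
    chunk_paths = []  # temp files, each holding one sorted chunk
    chunk = []        # the in-memory buffer

    def spill():
        # Sort the buffer and write it out as one sorted run on disk.
        chunk.sort()
        fd, path = tempfile.mkstemp(dir=tmpdir, text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(chunk)
        chunk_paths.append(path)
        chunk.clear()

    try:
        for line in lines:
            if not line.endswith("\n"):
                line += "\n"  # normalize so every record is newline-terminated
            chunk.append(line)
            if len(chunk) >= chunk_size:
                spill()

        if not chunk_paths:
            # Everything fit in a single chunk: no temporary files needed.
            chunk.sort()
            yield from chunk
            return

        if chunk:
            spill()  # flush the final partial chunk

        # Merge phase: heapq.merge reads the sorted runs lazily,
        # holding roughly one line per run in memory at a time.
        files = [open(p) for p in chunk_paths]
        try:
            yield from heapq.merge(*files)
        finally:
            for f in files:
                f.close()
    finally:
        for p in chunk_paths:
            os.remove(p)
```

Something like `sys.stdout.writelines(external_sort(sys.stdin, chunk_size=1_000_000))` would use it as a filter. Here `chunk_size` plays roughly the role of the buffer size and `tmpdir` the role of the temporary directory, though real implementations size the buffer in bytes rather than lines.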
