Still confused though. For a standard TCP/IP networking stack, that support is all there anyway, as it's not meant for point-to-point links, and out-of-order delivery is a thing that happens on the Internet. I haven't tried thunderbolt-net, but it says it implements Apple's ThunderboltIP, so I'd expect it's IP-based networking on top, and so it'd all work? Is it that out-of-order delivery is far more common than usual, and this path is so much slower (by impairing LRO/GRO) that it's not worth aggregating at all?
I'd understand if each pair is logically represented as a separate networking device, and then you have to set up link aggregation on top of that. (And iirc at least with some forms of aggregation a particular flow is bound to one link, so you'd have to have a bunch of streams to actually get bandwidth benefits.) So caveats for sure but I'd expect something to be possible. But does it just not support using both pairs at all?
Even with using one pair I still don't understand why you'd only get about 10G rather than 20G on a pair. I do see chapter 4 of the (your?) article talks about the single DMA ring maybe imposing the 10 Gbps limit but I don't have any good intuition for why. I don't know say how large the rings are or what latencies to expect on their operations or what packet sizes are supported which might help me understand.