BTW, the whole situation with IRQ accounting disabled reminds me of the -fomit-frame-pointer case. For a long time there was no practical performance reason for it, yet the option kept being used, which made building stacks slower and harder, both for perf analysis and for stack unwinding in languages like C++.

After reading carefully, I'm surprised that such small IRQ squares add up to 30%. I should search for interrupts the next time I inspect our flamegraphs.
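
For what it's worth, here is a rough sketch of how one might total up those small IRQ squares from folded stacks (the "frame;frame;frame count" lines that flamegraph tooling consumes). The function name `irq_share` and the substring heuristic are my own assumptions, not something from the article; real frame names vary by kernel version and architecture.

```python
# Assumed heuristic: a frame is interrupt-related if its name contains one of
# these substrings (e.g. asm_common_interrupt, handle_irq_event). Adjust as needed.
IRQ_MARKERS = ("irq", "interrupt")


def irq_share(folded_lines):
    """Return the fraction of samples whose stack contains an IRQ-like frame."""
    total = 0
    in_irq = 0
    for line in folded_lines:
        line = line.strip()
        if not line:
            continue
        # Folded format: semicolon-separated frames, a space, then the sample count.
        stack, _, count = line.rpartition(" ")
        samples = int(count)
        total += samples
        frames = stack.lower().split(";")
        if any(marker in frame for frame in frames for marker in IRQ_MARKERS):
            in_irq += samples
    return in_irq / total if total else 0.0
```

Summing over all stacks like this is the point: each individual IRQ square can look negligible while the total is not.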

reply
I was doing over 11M IOPS during that test ;-)

Edit: I wrote about that setup and other Linux/PCIe root complex topology issues I hit back in 2021:

https://news.ycombinator.com/item?id=25956670

reply
FYI 11M IOPS in terms of AWS EBS is 138 gp3 volumes (80K IOPS each), which costs about $56K/month or about $1.3M over 2 years. If anyone was considering using EBS for high-IOPS workloads, don't.

I think your test had ten 980 Pro SSDs, which were probably around $120 each at the time (~$1200 total). SSDs are wildly more expensive now, but even if you spend $500 each, it's nowhere close to EBS.
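
The arithmetic, as a quick sketch. All inputs are the rough figures quoted in this comment (the monthly cost is taken as given, not recomputed from a price list), and the variable names are just for illustration.

```python
import math

TARGET_IOPS = 11_000_000    # IOPS reached in the test discussed above
IOPS_PER_VOLUME = 80_000    # per-gp3-volume figure quoted in this comment
MONTHLY_COST_USD = 56_000   # this comment's estimate, taken as given

# Volumes needed to reach the target, rounded up to whole volumes.
volumes = math.ceil(TARGET_IOPS / IOPS_PER_VOLUME)

# Total over 2 years (24 months) at the estimated monthly cost.
two_year_cost = MONTHLY_COST_USD * 24

# Ten local SSDs at the guessed prices: then (~$120 each) and now ($500 each).
ssd_cost_then = 10 * 120
ssd_cost_now = 10 * 500

print(f"{volumes} volumes, ${two_year_cost:,} over 2 years")
print(f"SSDs: ${ssd_cost_then:,} then, ${ssd_cost_now:,} at $500 each")
```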

It's apples vs oranges, but sometimes you just want fruit.

reply
That's super hot. Especially the update with the 37M IOPS reference. It might be very useful for my next tasks related to a setup with 6 NVMe disks: 1. Get all disks saturated through the network (including RDMA usage). 2. Play with io_uring to share a polling thread. Currently, no luck: if I share a kernel poller between two devices, the improvement is just +30% (at the cost of 1 core). I'm considering alternative schemes now.
reply
Unfortunately, we don't have proper measurements for IOPOLL mode with and without the IOMMU, because initially we didn't configure IOPOLL properly. However, I bet that this mode will be affected as well, because the disk still has to write through the IOMMU.

You suggest a very interesting measurement. I will keep it in mind and try it in the next experiments. I wish I had read this earlier so I could have applied it during the past runs :)

reply
Yeah you'd still have the IOMMU DMA translation, but would avoid the interrupt overhead...
reply