upvote
That reminds me of one of the easiest big wins I've had in my career. SystemD was causing issues, so I slapped in Gentoo with the real-time kernel patch. Peak latency (practically speaking, the only core metric we cared about -- some control loop doing a bunch of expensive math and interacting with real hardware) went down 5000x.

That specific advice isn't terribly transferable (you might choose to hack up SystemD or some other components instead, maybe even the problem definition itself), but the general idea of measuring and tuning the system running your code is solid.

reply
Agreed. For benchmarking I used this <https://github.com/david-alvarez-rosa/CppPlayground/blob/mai...> which relies on GoogleBenchmark and pins producer/consumer threads to dedicated CPU cores

What else could be improved? Would like to learn :)

Maybe using huge pages?

reply
kernel tickrate is a pretty big one, most people don't bother and use what their OS ships with.

Disabling c-states, pinning network interfaces to dedicated cores (and isolating your application from those cores) and `SCHED_FIFO` (chrt -f 99 <prog>) helps a lot.

Transparent hugepages increase latency without you being aware of when it happens, I usually disable that.

Idk, there's a bunch but they all depend on your use-case. For example I always disable hyperthreading because I care more about latency than processing power- and I don't want to steal cache from my workload randomly.. but some people have more I/O bound workloads and hyperthreading is just and strict improvement in those situations.

reply
Thanks. Do you happen to know why hyperthreading should be disabled?

In prod most trading companies do disable it, not sure about generic benchmarks best practices

reply
It eliminates cache contention between siblings, which leads to increased latency (randomly)
reply
There are some microarchitectural resources that are either statically divided between running threads, or "cooperatively" fought over, and if you don't need to hide cache miss latency, which is the only thing hyperthreading is really good at, you're probably better off disabling the supernumerary threads.
reply