Weird reasoning.
You already caught our attention with your article. But not everyone has the time or means to go and re-do the tests.
However, such information is really important to surface when making infra decisions. If one of the reader's brain cells later pops up and says "20-80% perf improvement" rather than "there were some perf improvements", which would be more convincing as a reason to research the topic when the time comes for the reader to benefit from your research?
You're testing "variability" and latency, and you even mention that "modern Intel CPUs tend to ramp frequency..." but entirely neglect to mention which specific Windows Power Profile you were using.
Fundamentally, you're benchmarking a server operating system on laptops and/or desktop-class hardware, and not the same spec either. I.e., you're not controlling for differences in memory bandwidth, SSD performance, etc.
Even on server hardware the power profiles matter! A lot more than you think!
One of my gimmicks in my consulting gig is to change Intel server power settings from "Balanced" to "Maximum Performance" and gloat as the customer makes the Shocked Pikachu face because their $$$ "enterprise grade server" instantly triples in performance for the cost of a button press.
Not to mention that by testing this in VMs, you're benchmarking three layers: The outer OS (and its power management), the hypervisor stack, and the inner guest OS.
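For anyone re-running this kind of test, one way to make the power plan part of the recorded results is to capture the output of `powercfg /getactivescheme` alongside the numbers. A minimal Python sketch, assuming English-locale output of the form `Power Scheme GUID: <guid>  (<name>)`; the function names here are made up for illustration and are not from the article:

```python
import re
import subprocess
import sys

# Assumed output shape (English locale):
#   Power Scheme GUID: 381b4222-f694-41f0-9685-ff5bb260df2e  (Balanced)
_SCHEME_RE = re.compile(r"Power Scheme GUID:\s*([0-9a-fA-F-]{36})\s*\((.+)\)")


def parse_active_scheme(output):
    """Return (guid, name) parsed from powercfg output, or None if no match."""
    match = _SCHEME_RE.search(output)
    if match is None:
        return None
    return match.group(1).lower(), match.group(2).strip()


def active_power_scheme():
    """Query the active Windows power plan; returns None on other platforms."""
    if sys.platform != "win32":
        return None
    result = subprocess.run(
        ["powercfg", "/getactivescheme"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_active_scheme(result.stdout)


if __name__ == "__main__":
    # Log this next to the benchmark numbers so readers can compare like with like.
    print(active_power_scheme())
```

In a VM setup this would have to be run on the host as well as in the guest, since each layer has its own power management.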
> Processors are always locked at the highest performance state (including "turbo" frequencies). All cores are unparked. Thermal output may be significant.
> Processors are always locked at the highest performance state (including "turbo" frequencies).
Unless performance state means something idiosyncratic in MS terminology.
Normally you'd want to let idle cores apply power-saving measures, including downclocking, to donate some unused power envelope to busy cores, increasing overall performance.
But this varies across Linux-based platforms. For example on RHEL (https://docs.redhat.com/en/documentation/red_hat_enterprise_...):
"throughput-performance: A server profile optimized for high throughput that disables power savings mechanisms. It also enables sysctl settings to improve the throughput performance of the disk and network IO.
accelerator-performance: A profile that contains the same tuning as the throughput-performance profile. Additionally, it locks the CPU to low C states so that the latency is less than 100us. This improves the performance of certain accelerators, such as GPUs.
latency-performance: A server profile optimized for low latency and disables power savings mechanisms and enables sysctl settings that improve latency. CPU governor is set to performance and the CPU is locked to the low C states (by PM QoS)."
Here the latency-performance profile sounds most like the Windows Server mode (but different from throughput-performance).
You might be benchmarking the chassis fans more than the CPUs!
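On the Linux side, the CPU governor that the quoted profile descriptions mention can be checked per core through the cpufreq sysfs files. A rough Python sketch, assuming the usual `/sys/devices/system/cpu/cpuN/cpufreq/scaling_governor` layout (not every system or VM exposes it); `read_governors` is an illustrative name, not part of any tool:

```python
from collections import Counter
from pathlib import Path


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Count how many cores report each cpufreq governor.

    Returns a dict such as {"performance": 8}; an empty dict means no
    cpufreq information was found under cpu_root.
    """
    counts = Counter()
    for path in Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        try:
            counts[path.read_text().strip()] += 1
        except OSError:
            # Core offline or file unreadable: skip it rather than fail.
            continue
    return dict(counts)


if __name__ == "__main__":
    print(read_governors())
```

A mix of governors across cores, or an empty result inside a guest, is itself worth noting in the write-up, since it says something about which layer is actually controlling frequency.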