At least in the early 2000s, Bloomberg had strict requirements about this. Their financial terminal has a ton of math calculations. The requirement was that they always had live servers running with two different hardware platforms with different operating systems and different CPU architectures and different build chains. The math had to agree to the same bitwise results. They had to turn off almost all compiler optimisations to achieve this, and you had to handle lots of corner cases in code: can't trust NaN or Infinity or underflow to be portable.

They could transparently load balance a user from one backend platform to the other with zero visible difference to the user.
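
For anyone wondering what "the same bitwise results" means in practice, here is a minimal sketch in Python. The helper names `f64_bits` and `bitwise_equal` are made up for illustration and have nothing to do with Bloomberg's actual tooling. The point is that `==` is not the right check: it treats 0.0 and -0.0 as equal and NaN as unequal to itself, while a comparison of the raw 64-bit patterns does the opposite.

```python
import struct

def f64_bits(x: float) -> int:
    """Return the raw 64-bit IEEE 754 pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def bitwise_equal(x: float, y: float) -> bool:
    """True only if both values have exactly the same bit pattern."""
    return f64_bits(x) == f64_bits(y)

# == says the two zeros are equal, but their sign bits differ.
print(0.0 == -0.0, bitwise_equal(0.0, -0.0))

# == says NaN differs from itself, but the bits are identical.
nan = float("nan")
print(nan == nan, bitwise_equal(nan, nan))
```

A cross-platform check along these lines would compare the bit patterns produced by each backend for the same inputs, rather than comparing with `==` or with a tolerance.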

reply
Ah the old Enterprise Service Bus...
reply
> If you want this to work across ARM and x86 (or even multiple ARM vendors), you are screwed, and need to restrict yourself to using only the basic arithmetic operations and reimplement everything else yourself.

Is this problematic for WASM implementations? The WASM spec requires IEEE 754-2019 compliance with the exception of NaN bits. I guess that could be problematic if you're branching on NaN bits, or serializing, but ideally your code is mostly correct and you don't end up serializing NaN anyway.
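
If you do end up serializing floats, one way to sidestep the NaN-bits problem is to canonicalize on the way out. A rough sketch in Python, where `serialize_f64` and the choice of canonical pattern are my own assumptions, not something the WASM spec mandates for application code:

```python
import math
import struct

# Assumption: one fixed quiet-NaN pattern (positive sign, zero payload)
# chosen to stand in for every NaN that gets written out.
CANONICAL_NAN_BITS = 0x7FF8000000000000

def serialize_f64(x: float) -> bytes:
    """Little-endian encoding of x, with every NaN mapped to one pattern."""
    if math.isnan(x):
        return struct.pack("<Q", CANONICAL_NAN_BITS)
    return struct.pack("<d", x)
```

With this, two peers whose hardware produced NaNs with different sign or payload bits still write identical bytes. Code that branches on NaN payloads is a separate problem and is not helped by this.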

reply
I'm sure you know, but for others reading: even on the same architecture, there is more to floating point determinism than just running the same "x = a + b" code on each system. There's also the state of the FPU (eg: rounding modes) that can affect results.

On older versions of DirectX (maybe even in some modern Windows APIs?) there were cases where it would internally change the FPU mode, causing chaos for callers trying to use floats deterministically[1].

[1] https://gafferongames.com/post/floating_point_determinism/ (see the Elijah quote, especially)
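
To make the rounding-mode point concrete, here is a small Python sketch. Python cannot portably change the FPU rounding mode, so this only simulates round-toward-zero using exact rational arithmetic; `add_toward_zero` is a hypothetical helper, it assumes finite inputs that do not overflow, and it needs Python 3.9+ for `math.nextafter`.

```python
import math
from fractions import Fraction

def add_toward_zero(a: float, b: float) -> float:
    """Simulate a + b under round-toward-zero (finite inputs, no overflow)."""
    nearest = a + b                    # assumes the default round-to-nearest-even
    exact = Fraction(a) + Fraction(b)  # exact sum of the two doubles
    if abs(Fraction(nearest)) > abs(exact):
        # Round-to-nearest went away from zero, so step one double back.
        return math.nextafter(nearest, 0.0)
    return nearest

print(repr(0.1 + 0.2))                   # default mode
print(repr(add_toward_zero(0.1, 0.2)))   # simulated round-toward-zero
```

Same inputs, same "x = a + b", and the two modes give neighbouring but different doubles. That is the kind of divergence you get when a library silently changes the FPU state underneath you.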

reply
As far as I know, the ARM (at least aarch64) situation should be about the same as x86-64. Anything specific that's bad about it? (there's aarch32 NEON with no subnormal support or whatever, but you can just not use it if determinism is the goal)

That RECIP14 link is AVX-512, i.e. not available on a bunch of hardware (incl. the newest Intel client CPUs), so you wouldn't ever use it in a deterministic-simulation multiplayer game anyway, even if you restrict yourself to x86-64 only; so you're still stuck with the basic IEEE-754 ops even on x86-64.

x86-64 is worse than aarch64 in one very important aspect: baseline x86-64 doesn't have fused multiply-add, whereas aarch64 does (granted, the x86-64 FMA extension came out around the same time as aarch64/armv8, but it's still a concern, such is life). Of course you can choose to not use FMA, but that's throwing perf away. (Regardless, you'll want -ffp-contract=off or equivalent to make sure compiler optimizations don't screw things up, so any FMA use will need to be manual fma calls anyway.)
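
For anyone who hasn't seen why FMA matters for determinism, a quick Python sketch. `math.fma` only exists in very recent Python versions, so this emulates the single rounding with exact rationals; `fused_mul_add` and `unfused_mul_add` are illustrative names, and the emulation assumes finite inputs and ignores the sign of a zero result.

```python
from fractions import Fraction

def unfused_mul_add(a: float, b: float, c: float) -> float:
    """a*b + c with two roundings, as without FMA contraction."""
    product = a * b      # rounded to double here
    return product + c   # rounded again here

def fused_mul_add(a: float, b: float, c: float) -> float:
    """Emulated fma: exact a*b + c, rounded to double once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))

a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# The exact product is 1 - 2**-54, which rounds to 1.0 as a double.
print(unfused_mul_add(a, b, c))  # the low bits were lost before the add
print(fused_mul_add(a, b, c))    # the low bits survive: -2**-54
```

So if one peer's compiler contracts `a*b + c` into an FMA and another's doesn't, they can disagree on results, which is why the contraction setting has to be pinned down either way.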

reply
The Steam hardware survey currently has FMA support at 97%, which is the same level as F16C, BMI1/2, and AVX2. Personally, I would consider all of these extensions to be baseline now; the amount of hardware not supporting them is too small to be worth worrying about anymore.
reply
We use floating point operations with deterministic lockstep across a server compiled with GCC on Linux, a Windows client compiled with MSVC, and an iOS client running on ARM, which I believe is compiled with clang.

Works fine.

This is not a small code base, and no particular care has been taken with the floating point operations used.

reply
I'm pretty sure he is talking about deterministic output.
reply