The JS code had been written carefully to avoid allocations and to avoid the built-in JavaScript BigInt. I rolled my own BigInt instead, using an array of numbers. Each number, despite being a double, was basically a 48-bit integer. Long multiplication requires splitting each 48-bit integer into two 24-bit halves, so that every intermediate product fits in 48 bits and stays within the range where a double represents integers exactly (up to 2^53).
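For anyone curious what that split looks like, here is a minimal sketch of multiplying two 48-bit limbs stored as doubles. It is not the original code: the names mulLimbs, BASE_24, BASE_48 and the out-array convention are made up for illustration, and it assumes both inputs are non-negative integers below 2^48.

```javascript
// Hypothetical names throughout; a sketch, not the original implementation.
const BASE_24 = 2 ** 24;
const BASE_48 = 2 ** 48;

// Multiply two 48-bit limbs (non-negative integers < 2^48 held in doubles).
// Writes the low 48 bits of the product to out[0] and the high 48 bits to
// out[1]. The caller supplies `out`, so the function allocates nothing.
function mulLimbs(a, b, out) {
  // Split each operand into 24-bit halves.
  const aHi = Math.floor(a / BASE_24);
  const aLo = a % BASE_24;
  const bHi = Math.floor(b / BASE_24);
  const bLo = b % BASE_24;

  // Each partial product is below 2^48, so it is exact in a double.
  const lo = aLo * bLo;
  const hi = aHi * bHi;
  const mid = aHi * bLo + aLo * bHi; // below 2^49, still exact

  // Fold the middle term into the low and high limbs.
  const midLo = mid % BASE_24;
  const midHi = Math.floor(mid / BASE_24);

  const lowSum = lo + midLo * BASE_24; // below 2^49, still exact
  const carry = Math.floor(lowSum / BASE_48);

  out[0] = lowSum - carry * BASE_48;
  out[1] = hi + midHi + carry;
}
```

Usage would be something like `const out = [0, 0]; mulLimbs(x, y, out);` with `out` reused across calls. A full long multiplication would call this once per pair of limbs and accumulate the results with carries.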
The C version used 32x32=64-bit integer math. (It would have been nice if WASM had supported 64x64=128-bit multiplication.)
Even with the overhead of using doubles instead of integers, the JavaScript and C versions ran at nearly the same speed. I think the C version was slightly faster, but not significantly. The C version took a lot longer to load, as it had to instantiate a WebAssembly module and run glue code to copy things in and out of WebAssembly memory.
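To give a rough idea of the kind of glue involved, here is a sketch of copying 32-bit limbs into and out of a WebAssembly linear memory. It is an illustration only: copyIn and copyOut are hypothetical helper names, no compiled module is loaded, and it assumes the limbs are already 32-bit unsigned values and that the byte offset is a multiple of 4.

```javascript
// Hypothetical helpers; a stand-alone memory is used instead of a real module.
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

// Copy an array of 32-bit limbs into linear memory at byteOffset.
function copyIn(limbs, byteOffset) {
  const view = new Uint32Array(memory.buffer, byteOffset, limbs.length);
  view.set(limbs);
}

// Copy `count` 32-bit limbs back out of linear memory into a JS array.
function copyOut(byteOffset, count) {
  const view = new Uint32Array(memory.buffer, byteOffset, count);
  return Array.from(view);
}
```

Every call across the boundary pays for copies like these on both sides, in addition to the call itself.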
Not surprising. The FFI boundary is always a bottleneck. If you can eliminate it, you will see where the WASM JIT shines. With C/WASM you have far more control over mechanical sympathy than with JavaScript, though it is still far from perfect.
Also, consider publishing your findings and asking for reviews to find optimization opportunities.
https://hacks.mozilla.org/2026/02/making-webassembly-a-first...