points
Running different GPU kernels / inference engines also matters. It's easy to write an implementation that is faster and thus cheaper but numerically much noisier / less accurate.