If you want a deterministic LLM, just build 'Plain old software'.
reply
It's not even hard, just slow. You could do it on a single cheap server (compared to a rack full of GPUs): run a CPU LLM inference engine and limit it to a single thread.
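A sketch of why the single-threaded approach is reproducible: with one thread, every floating-point operation happens in the same fixed order on every run, so the result is bit-identical. (Illustrative Python standing in for an inference engine's reductions, not actual LLM code.)

```python
import random

# Stand-in for model activations: same seed, same values every run.
rng = random.Random(0)
vals = [rng.uniform(-1.0, 1.0) for _ in range(100_000)]

def seq_sum(xs):
    # Single-threaded, fixed left-to-right reduction order.
    total = 0.0
    for x in xs:
        total += x
    return total

# Same sequence of operations => bit-identical result, every time.
a = seq_sum(vals)
b = seq_sum(vals)
assert a == b
```

A multi-threaded reduction would combine partial sums in whatever order threads finish, which is where run-to-run variation creeps in.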
reply
Except that one is built to be deterministic and the other is built to be probabilistic. Sure, you can technically force determinism, but it's going to be very hard. Even just verifying that your GPU is actually doing what it should is hard, much like debugging a CPU. But again: one is built for determinism and the other for concurrency.
reply
GPUs are deterministic. It's not that hard to ensure determinism when running the exact same program every time. Floating point isn't magic: execute the same sequence of instructions on the same values and you'll get the same output. The issue is that you're typically not executing the same sequence of instructions every time, because it's more efficient to run different sequences depending on load.
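A minimal illustration of that last point: floating-point addition is not associative, so summing the same values in a different order can give a different answer, while any one fixed order is perfectly repeatable. (Toy Python with values chosen to make the effect obvious.)

```python
# Floating-point addition is not associative, so the *order* of a
# reduction changes the result. Kernels may pick different orders
# depending on batch size or load -- that, not flaky hardware, is
# the usual source of "nondeterminism".
xs = [0.1, 1e16, -1e16, 0.2]

def left_to_right(vals):
    total = 0.0
    for v in vals:
        total += v
    return total

a = left_to_right(xs)          # 0.1 is absorbed by 1e16, survives: 0.2
b = left_to_right(sorted(xs))  # different order: both small terms absorbed
print(a, b)                    # 0.2 0.0
```

Run either order twice and you get the same bits both times; only changing the order changes the answer.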

This is a good overview of why LLMs are nondeterministic in practice: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...

reply