However, if you evaluate carefully the LLM core function, i.e. in a fixed order, you will obtain perfectly deterministic results (except on some consumer GPUs, where, due to memory overclocking, memory errors are frequent, which causes slightly erroneous results with non-deterministic errors).
So if you want deterministic LLM results, you must audit the programs that you are using and eliminate the causes of non-determinism, and you must use good hardware.
This may require some work, but it can be done, similarly to the work that must be done if you want to deterministically build a software package, instead of obtaining different executable files at each recompilation from the same sources.
This is a good overview of why LLMs are nondeterministic in practice: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...