I'll grant that you can guarantee the length of the output and, being a computer program, it's possible (though not always in practice) to rerun and get the same result each time, but that's not guaranteeing anything about said output.

by satvikpendem 10 hours ago

What do you want to guarantee about the output? That it follows a given structure? Unless you map out all inputs and outputs, no, that's not possible. But to say that non-determinism is a fundamental property of LLMs is false, which is what I inferred you meant; perhaps that was not what you implied.

by program_whiz 9 hours ago

Yeah, I think people are using two definitions of determinism, which is causing confusion. In a strict sense, LLMs can be deterministic, meaning the same input can generate the same output (or as close as desired to the same output). However, I think what people mean is that for slight changes to the input, it can behave in unpredictable ways (e.g. its output is not easily predicted by the user based on the input alone). People mean "I told it 'don't do X', then it did X", which indicates a kind of randomness or non-determinism: the output isn't strictly constrained by the input in the way a reasonable person would expect.

by yunwal 7 hours ago

The correct word for this IMO is "chaotic" in the mathematical sense. Determinism is a totally different thing that ought to retain its original meaning.

by wat10000 7 hours ago

They didn't say LLMs are fundamentally nondeterministic. They said there's no way to deterministically guarantee anything about the output.

Consider parameterized SQL. Absent a bad bug in the implementation, you can guarantee that certain forms of parameterized SQL query cannot produce output that will perform a destructive operation on the database, no matter what the input is. That is, you can look at a bit of code and be confident that there's no Little Bobby Tables problem with it.

You can't do that with an LLM. You can take measures to make it less likely to produce that sort of unwanted output, but you can't guarantee it. Determinism in input->output mapping is an unrelated concept.
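A minimal sketch of the parameterized SQL guarantee described above, using Python's standard `sqlite3` module. The table name and the hostile input are illustrative assumptions, not taken from the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")

# Hostile input in the style of "Little Bobby Tables" (illustrative).
hostile = "Robert'); DROP TABLE students;--"

# The ? placeholder binds the value as data; it is never parsed as SQL,
# so no input string can turn this INSERT into a different statement.
conn.execute("INSERT INTO students (name) VALUES (?)", (hostile,))

rows = conn.execute("SELECT name FROM students").fetchall()
table_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'table' AND name = 'students'"
).fetchone()[0]
```

The property holds for every possible value of `hostile`, which is the kind of input-independent guarantee the comment says an LLM cannot offer.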

by silon42 9 hours ago

You can guarantee what you have test coverage for :)

by rightofcourse 8 hours ago

Haha, you are not wrong. It's just that when a dev gets a tool to automate the _boring_ parts, tests are usually the first thing to take the hit.

by bdangubic 8 hours ago

depends entirely on the quality of said test coverage :)

by mhitza 8 hours ago

If you self-host an LLM you'll learn quickly that even batching and caching can affect determinism. I've run mostly self-hosted models with temp 0 and seen these deviations.

by phlakaton 7 hours ago

But you cannot predict a priori what that deterministic output will be – and in a real-life situation you will not be operating in deterministic conditions.

by zbentley 9 hours ago

Practically, the performance loss of making it truly repeatable (which takes parallelism reduction or coordination overhead, not just temperature and randomizer control) is unacceptable to most people.
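One reason parallelism interacts with repeatability is that floating-point addition is not associative, so the order in which partial sums are combined can change the result. The sketch below only illustrates that arithmetic fact with hand-picked values; it is an assumption that this resembles what happens inside any particular inference stack.

```python
from functools import reduce
import operator


def ordered_sum(values):
    # Plain left-to-right float addition. The built-in sum() is avoided
    # because newer Python versions may use compensated summation.
    return reduce(operator.add, values, 0.0)


# Same three numbers, two different accumulation orders.
first = ordered_sum([1e16, 1.0, -1e16])   # the 1.0 is absorbed by 1e16
second = ordered_sum([1e16, -1e16, 1.0])  # the large terms cancel first

# A smaller, classic case of the same effect.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
```

Here `first` and `second` differ even though the inputs are identical, so forcing a fixed accumulation order is what buys repeatability.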

by wat10000 7 hours ago

It's also just not very useful. Why would you re-run the exact same inference a second time? This isn't like a compiler where you treat the input as the fundamental source of truth, and want identical output in order to ensure there's no tampering.

by 4ndrewl 10 hours ago

If you also control the model.

by simianparrot 10 hours ago

A single byte change in the input changes the output. The sentence "Please do this for me" and "Please, do this for me" can lead to completely distinct output.

Given this, you can't treat it as deterministic even with temp 0 and fixed seed and no memory.

by dwattttt 10 hours ago

Interestingly, this is the mathematical definition of "chaotic behaviour"; minuscule changes in the input result in arbitrarily large differences in the output.

It can arise from perfectly deterministic rules... the Logistic Map with r=4, x(n+1) = 4*x(n)*(1 - x(n)), is a classic.
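A short sketch of that behaviour, using the standard form of the logistic map, x(n+1) = r*x(n)*(1 - x(n)). The starting value, the size of the perturbation, and the step count are arbitrary choices for illustration.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole orbit."""
    x = x0
    orbit = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-12, 60)  # tiny perturbation of the input

# Gap after one step is still tiny; the largest gap over the run is not.
early_gap = abs(a[1] - b[1])
max_gap = max(abs(p - q) for p, q in zip(a, b))
```

Rerunning `logistic_orbit(0.2, 60)` reproduces `a` exactly, so the rule is deterministic, yet the two orbits end up far apart.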

by satvikpendem 10 hours ago

Correct, it's akin to chaos theory or the butterfly effect, which even then can be predictable for many ranges of input: https://youtu.be/dtjb2OhEQcU

by adrian_b 9 hours ago

Which is also the desired behavior of the mixing functions from which the cryptographic primitives are built (e.g. block cipher functions and one-way hash functions), i.e. the so-called avalanche property.
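The avalanche property can be observed with Python's standard `hashlib`. This sketch hashes the two sentences quoted earlier in the thread with SHA-256 and counts how many of the 256 output bits differ; the helper name `bit_difference` is made up for the example.

```python
import hashlib


def bit_difference(a: bytes, b: bytes) -> int:
    """Count differing bits between the SHA-256 digests of a and b."""
    da = hashlib.sha256(a).digest()
    db = hashlib.sha256(b).digest()
    return sum(bin(x ^ y).count("1") for x, y in zip(da, db))


# The two inputs differ only by one comma.
diff = bit_difference(b"Please do this for me", b"Please, do this for me")
same = bit_difference(b"Please do this for me", b"Please do this for me")
```

For a good hash, `diff` is expected to land near half of the 256 bits, while `same` is always zero because the function is deterministic.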

by satvikpendem 10 hours ago

Well yeah, of course changes in the input result in changes to the output. My only claim was that LLMs can be deterministic (i.e. produce exactly the same output each time for a given input) if set up correctly.

by layer8 10 hours ago

You still can’t deterministically guarantee anything about the output based on the input, other than repeatability for the exact same input.

by exe34 9 hours ago

What does deterministic mean to you?

by layer8 8 hours ago

In this context, it means being able to deterministically predict properties of the output based on properties of the input. That is, you don’t treat each distinct input as a unicorn, but instead consider properties of the input, and you want to know useful properties of the output. With LLMs, you can only do that statistically at best, but not deterministically, in the sense of being able to know that whenever the input has property A then the output will always have property B.

by peyton 7 hours ago

I mean, can't you have a grammar on both ends and just set the probability of out-of-language tokens to zero? I thought one of the APIs had a way to staple a JSON schema to the output, for example.

We’re making pretty strong statements here. It’s not like it’s impossible to make sure DROP TABLE doesn’t get output.
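A toy sketch of masking out-of-language tokens before sampling. The vocabulary, the logit values, and the `constrained_probs` helper are all hypothetical; this is not any provider's API, only the basic idea of zeroing the probability of disallowed tokens.

```python
import math


def constrained_probs(logits, allowed):
    """Softmax over logits with every token outside `allowed` forced to 0."""
    if not any(tok in allowed for tok in logits):
        raise ValueError("no allowed token in the vocabulary")
    # Disallowed tokens get -inf, so exp() maps them to exactly 0.0.
    masked = {
        tok: (score if tok in allowed else -math.inf)
        for tok, score in logits.items()
    }
    top = max(masked.values())
    exps = {tok: math.exp(score - top) for tok, score in masked.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical next-token scores; "DROP" is the model's favourite.
logits = {"SELECT": 1.2, "DROP": 3.5, "UPDATE": 0.3, "INSERT": -0.4}
allowed = {"SELECT", "INSERT"}

probs = constrained_probs(logits, allowed)
choice = max(probs, key=probs.get)
```

Under these assumptions `DROP` can never be emitted, whatever the raw scores are, because its probability is exactly zero after masking.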

by layer8 5 hours ago

You still can’t predict whether the in-language responses will be correct or not.

As an analogy: If, for a compiler, you verify that its output is valid machine code, that doesn’t tell you whether the output machine code is faithful to the input source code. For example, you might want to have the assurance that if the input specifies a terminating program, then the output machine code represents a terminating program as well. For a compiler, you can guarantee that such properties are true by construction.

More generally, you can write your programs such that you can prove from their code that they satisfy properties you are interested in for all inputs.

With LLMs, however, you have no practical way to reason about relations between the properties of inputs and outputs.

by satvikpendem 6 hours ago

And you could also run the LLM output through a keyword-blacklist checker afterwards; that's probably the easiest filter.
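A rough sketch of such a post-hoc filter. The forbidden keyword list and the `is_allowed` helper are assumptions made up for illustration; a real deployment would need a proper SQL parser rather than a regular expression.

```python
import re

# Hypothetical list of keywords to reject in generated SQL.
FORBIDDEN = ("DROP", "TRUNCATE", "DELETE")
_PATTERN = re.compile(r"\b(" + "|".join(FORBIDDEN) + r")\b", re.IGNORECASE)


def is_allowed(sql_text: str) -> bool:
    """Return False if any forbidden keyword appears as a whole word."""
    return _PATTERN.search(sql_text) is None


ok = is_allowed("SELECT name FROM students")
blocked = is_allowed("drop table students")
# Keyword matching is crude: this harmless string literal is rejected too.
false_positive = is_allowed("SELECT * FROM notes WHERE body = 'drop'")
```

The filter itself is fully deterministic, but as the last case shows, it trades missed attacks for false positives.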

by tsimionescu 7 hours ago

I think they mean having some useful predicates P, Q such that for any input i and for any output o that the LLM can generate from that input, P(i) => Q(o).

by exe34 2 hours ago

If you could do that, why would you need an LLM? You'd already know the answer...

by tsimionescu 21 minutes ago

Having that property is still a looooong way away from being able to get a meaningful answer. Consider P being something like "asks for SQL output" and Q being "is syntactically valid SQL output". This would represent a useful guarantee, but it would not in any way mean that you could do away with the LLM.

by idiotsecant 10 hours ago

You don't think this is pedantry bordering on uselessness?

by WithinReason 9 hours ago

No, determinism and predictability are different concepts. You can have a deterministic random number generator for example.
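A seeded pseudorandom generator shows the distinction in a few lines, here with Python's standard `random` module. The `sample` helper and the seed values are arbitrary choices for the example.

```python
import random


def sample(seed, n=5):
    """Draw n integers in [0, 99] from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(n)]


first_run = sample(42)
second_run = sample(42)   # same seed, same sequence
other_seed = sample(43)   # nearby seed, unrelated sequence
```

The output is fully repeatable for a given seed, yet there is no easy way to guess it from the seed without running the generator.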

by satvikpendem 10 hours ago

It's correcting a misconception that many people have regarding LLMs: that they are inherently and fundamentally non-deterministic, as if they were a true random number generator. They are closer to a pseudorandom number generator, in that they are deterministic with the right settings.

by albedoa 6 hours ago

The comment that is being responded to describes a behavior that has nothing to do with determinism and follows it up with "Given this, you can't treat it as deterministic" lol.

Someone tried to redefine a well-established term in the middle of an internet forum thread about that term. The word that has been pushed to uselessness here is "pedantry".

by exe34 9 hours ago

Let's eat grandma.

by yunohn 10 hours ago

I initially thought the same, but apparently with the inaccuracies inherent to floating-point arithmetic and various other such accuracy leakage, it’s not true!

https://arxiv.org/html/2408.04667v5

by layer8 10 hours ago

This has nothing to do with FP inaccuracies, and your link does confirm that:

“Although the use of multiple GPUs introduces some randomness (Nvidia, 2024), it can be eliminated by setting random seeds, so that AI models are deterministic given the same input. […] In order to support this line of reasoning, we ran Llama3-8b on our local GPUs without any optimizations, yielding deterministic results. This indicates that the models and GPUs themselves are not the only source of non-determinism.”

by yunohn 5 hours ago

I believe you've misread: the Nvidia article and your quote support my point. Only by disabling the FP optimizations are the authors able to stop the inaccuracies.

by layer8 3 hours ago

First, the “optimizations” are not IEEE 754 compliant. So nondeterminism with floating-point operations is not an inherent property of using floating-point arithmetic; it’s a consequence of disregarding the standard by deliberately opting in to such nondeterminism.

Secondly, as I quoted, the paper is explicitly making the point that there is a source of nondeterminism outside of the models and GPUs, hence ensuring that the floating-point arithmetic is deterministic doesn’t help.