upvote
This sort of theoretical result is not always as clear-cut as you suggest.

Computers are finite machines. There is a theorem that although a machine with finite memory can add, multiplication requires unbounded memory. Somehow we muddle along and use computers for multiplication anyway.

More to your point there is a whole field of people who write useful programs using languages in which every program must be accompanied by a proof that it halts on all inputs.

(See for example https://lean-lang.org/ or David Turner's work on Total Functional Programming from about 20 years ago.)

Other examples are easy to find. The simplex algorithm for linear optimization requires exponential time in general, and the problem it solves is NP-hard, but in practice works well on problems of interest and is widely used. Or consider the dynamic programming algorithms for problems like subset-sum.

Theory is important, but engineering is also important.

reply
> There is a theorem that (...) multiplication requires unbounded memory

What theorem is that?

The multiplication of any two integers below a certain size (called "words") fits in a "double word" and the naive multiplication algorithm needs to store the inputs, an accumulator and at most another temporary for a grand total of 6*word_size

Sure, you can technically "stream" carry-addition (which is obvious from the way adders are chained in ALU-101) and thus in a strict sense addition is O(1) memory but towards your final point:

> Theory is important, but engineering is also important.

In practice, addition requires unbounded memory as well (the inputs). And it's definitely compute-unbounded, if your inputs are unbounded.

I dislike the term "we muddle along". IEEE 754 has well specified error bars and cases, and so does all good data science. LLMs do not, or at least they do not expose them to the end user

So then, how exactly do we go about proving that the result of chaining prompts is within a controllable margin of error of the intended result? Because despite all the specs, numerical stability is the reason people don't write their own LAPACK.

reply
But it's not like these systems make theory go away, they make compromises. So the question is, what's the compromise required for an algorithm that can check the conformance of computer programs to natural language specifications that doesn't involve hoping for the best?
reply
Natural language specifications often aren't specifications at all: interpreting them requires context that is not available to the computer, and often not even available to the specification's authors without further research / decision-making.

LLMs address this problem by just making things up (and they don't do a great job of comprehending the natural language, either), which I think qualifies as "hoping for the best", but I'm not sure there is another way, unless you reframe the problem to allow the algorithm to request the information it's missing.

reply