But Forth taken holistically is a do-anything-anytime imperative language, not just "concatenative" or "postfix". It has a stack, but the stack is an implementation detail, not a robust abstraction. If you want to do larger-scale things you don't pile more things on the stack; you start doing load and store and random access, inventing the idioms as you go along to load more and store more. This breaks all kinds of tooling models that rely on robust abstractions with compiler-enforced boundaries. I briefly tested to see what LLMs would do with it and gave up quickly because it was a complete rewrite every single time.
Now, if we were talking about a simplistic stack machine it might be more relevant, but that wouldn't be the same model of computation.
Not exactly. The stack is central to the design of Forth, though it is not the whole story (see my comment over there [1]).
It seems to me that a point-free language like Forth would be highly problematic for an LLM, because it has to work with things that literally are not in the text. I suppose it has to make a lot of guesses to build a theory of the semantics of the words it can see.
Nearly every time the topic of Forth is discussed on HN, someone points out that the cognitive overload* of full point-free style is not viable.
This won't show up in a smaller benchmark, because the clutching at straws tends to happen nearer to the edge of the window. That is the point where you need it to give up on the obvious things that don't work and actually explore the problem space you've given it.
What I’m investigating is whether more compact languages work for querying data.
What makes you think it’s going to clutch at straws more? What makes you think it won’t do better with a more compact, localized representation?
If you're executing the operations interactively, you're seeing what's happening on the stack, and so it's easy to keep track of where you are, but if you're reading postfix expressions, it's significantly harder.
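To make the interactive-versus-reading difference concrete, here is a minimal sketch in Python, assuming a toy postfix evaluator rather than a real Forth. The `trace` function and the sample program are made up for illustration. It records the stack after every token, which is roughly the view you get at an interactive prompt and exactly what is missing when you only read the source text.

```python
def trace(program):
    """Evaluate a postfix program, recording the stack after each token."""
    stack = []
    steps = []
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    for token in program.split():
        if token in ops:
            # Binary operators consume the top two items and push one result.
            b = stack.pop()
            a = stack.pop()
            stack.append(ops[token](a, b))
        else:
            # Anything else is treated as an integer literal (toy assumption).
            stack.append(int(token))
        steps.append((token, list(stack)))
    return steps


# Print the token on the left and the resulting stack on the right.
for token, stack in trace("3 4 + 2 *"):
    print(f"{token:>2}  {stack}")
```

With the trace in front of you, each step is trivial to follow. Reading only `3 4 + 2 *`, you have to rebuild that right-hand column in your head.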
Playing with APL has really changed the way I look at both.
LLMs handle deeply nested syntax just fine - parentheses and indentation are not the hard part. Linearization is not a meaningful advantage.
In fact, it’s much more likely to be a disadvantage, much as it is for humans. Stack effects are implicit, so correct composition requires global reasoning. A single missing dup breaks everything downstream. LLMs, and humans, are much more effective when constraints are named and localized, not implicit and global.
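As a rough illustration of the missing-dup point, here is a sketch using a toy postfix evaluator in Python. It is not Forth, and `run` and both programs are hypothetical. Dropping one `dup` early in the program produces no error at the point of the mistake; the failure only shows up several tokens later, when the stack finally runs dry.

```python
def run(program):
    """Evaluate a tiny postfix program and return the final stack."""
    stack = []
    for position, token in enumerate(program.split()):
        if token == "dup":
            # Duplicate the top of the stack.
            stack.append(stack[-1])
        elif token in ("+", "-", "*"):
            if len(stack) < 2:
                raise IndexError(
                    f"stack underflow at token {position} ({token!r})"
                )
            b = stack.pop()
            a = stack.pop()
            stack.append({"+": a + b, "-": a - b, "*": a * b}[token])
        else:
            # Anything else is treated as an integer literal (toy assumption).
            stack.append(int(token))
    return stack


good = "10 3 dup * 4 dup * + -"  # 10 - (3*3 + 4*4)
bad = "10 3 * 4 dup * + -"       # same program with the first dup dropped

print(run(good))
try:
    run(bad)
except IndexError as err:
    # The mistake is at token 2, but the error is reported at the last token.
    print(err)
```

Nothing in the text of `bad` marks where the stack depth went wrong; you only find out by simulating the whole thing from the start, which is the global reasoning the comment is talking about.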