>This shows the downside of using AI to write up your project. I see the eloquent sentences, but don't get the message.

Not really sure what this obsession with calling things you don't like AI-generated is, but it's poor form. If you have something to say about the text, then say it. Otherwise, leave baseless accusations out of it.

>What's the benefit? Is it speed? Where are the benchmarks? Is it that you can backprop through this computation? Do you do so?....

It's pretty clearly an ideological thing. Some people are firmly in the 'some sort of symbolic logic is necessary' camp. From the article: 'A system that cannot compute cannot truly internalize what computation is.'

Some things are just interesting for the sake of it. This is one of those things. I don't agree with the authors on the above and I'm still glad they shared. It's a very interesting read regardless.

reply
> If you have something to say about the text then say it.

I could point out the individual phrases and describe the overall impression in detail, or I can communicate all that compactly with the word "AI". If it bothers you, read it as "AI-like", so there's no pretension of certainty.

I have no problem with using AI for writing. I do it too, especially for documentation. But you need to read it, iterate with it, and give it enough raw input context. If you don't give it info about your actual goals, intentions, judgments, etc., the AI will substitute washed-out, averaged-out, no-meat-on-the-bone fluff. It may sound good on first read and give you a warm wow-effect that makes you hit publish, but that's because you read into it all the context you have in your head, and readers don't have that.

Formatting and language are cheap now. We need a new culture around calling out sloppy work. You would have had no problem calling out a badly composed, rambling article 5 years ago. But today you can easily slap an AI filter on it that makes it look grammatical and feel narratively engaging, so now it's all about the deeper content. And if one points that out, replies can always say "oh, you can't prove that, can you?"

reply
>"This shows the downside of using AI to write up your project."

I just find phrases like this a bit obnoxious at times.

>You would not have had a problem with calling out a badly composed rambling article 5 years ago.

Then why not just say that? It's rambling, bla bla bla. What's so hard about that? Why invent a reason for the issues, as if rambling articles didn't get written 5 years ago.

Like, no, being written by an LLM or not is not the reason the article has no benchmarks or interpretability results. Those things would be there regardless, if the author were interested in them, so again, there seems to be little point in making such assertions.

reply
It's very hard to discuss this. To some people it's obvious, to some it isn't. To me, every single paragraph is obvious fluffy AI writing. One problem with it is the repetitiveness and the schmoozing-salesman feel. The other is the lack of benchmarks and such. It's both. The two are connected, because the AI has to lean into its bullshitter persona when it's not given enough raw material to write up something strong. But whenever an AI writes in its default voice like this, it also indicates that the context was not well curated.

But anyway, yes, I can also just move on to the next article. Most of the time I indeed do that.

reply
For what it’s worth, I agree with you; the article is LLM-written, though without the usual gotchas, so the tells are more subtle.

The subtle ones like this I don’t mind too much, as long as they get the content correct, which in this case leaves quite a bit to be desired.

I’m also noticing that some people around me appear to just be oblivious to some LLM signals that bother me a lot, so people consume media differently.

I absolutely do believe that AI generated content needs to be called out, although at this point it’s safe to say that pretty much all online content is LLM written.

reply
I'm glad they shared too! I wish they'd shared it without letting the LLM process it so heavily; that makes it too hard to read, because it gives monotone importance to every piece of text. Mostly it does this by elevating everything to a slight over-importance with tone and fluff language, and by turning everything into dry statements of fact.

As to why people call this out without going into great detail about the problems with the actual text, it's because this is happening all over the place and it's very disrespectful to readers, who dig into an article that looks very well written on the surface, only to discover it's a lot of labor to decode and often (but not always) a total waste of time. Asking for a critical report of the text is asking even more of a reader who already feels duped.

reply
I got the same impression as the parent post. Even if it's not AI-generated, the text reads like a politician's speech in a lot of places. Talks a lot, says little.

The idea itself was very cool, so I endured it. But it was not a pleasant read.

reply
This is a nice case study of the downside of creating explicit "no AI comments" policies without a technical method of enforcing them. I am sure Hacker News comment quality will suffer almost as much from an escalating culture of accusation and paranoia as it will from LLM comments themselves.
reply
Agreeing first that it is genuinely interesting, let me make a constructive comment on the text: Early on, there are too many small paragraphs that don't on their own make a cogent argument. That important but easily overlooked structural work is pushed back to the reader. I felt rewarded in pushing past that though. Bravo.
reply
> Not really sure what this obsession with calling things you don't like AI generated is but it's poor form

Admonishing someone for correctly identifying AI-written or AI-edited blog posts is poor form, friend.

It is without a doubt written by an LLM. All of the telltale signs are there. I work with these tools 8-20 hours a day and after a while the verbiage and grammatical structures stick out like a sore thumb.

Get off the high horse. I too think this is a very interesting read. I was fascinated by the subject, but the presentation was nauseatingly distracting, and it immediately raises yellow flags about how Percepta operates and what kind of quality they're willing to settle for. It tells me they are more interested in appearances and superficiality.

The numbers that are there categorically cannot be trusted, because hallucinating such details is quite common for models. There is simply no indication that a human adequately proofread this, and therefore any of its claims must be taken with a grain of salt. Don't forget the recent Cloudflare+Matrix debacle: https://news.ycombinator.com/item?id=46781516

I share the same concerns as OP; this post lacks metrics and feels like someone did something cool and raced to get an AI to post about it, instead of giving it a proper treatment.

reply
I don't care how sure you are. Honestly, it's irrelevant. 99% of the time, it's a more pleasant and productive conversation for everyone involved if you just focus on the issues you had with the text itself rather than on any nebulous AI involvement.

From my point of view, all you've done is said a lot of nonsense and fabricated a convoluted explanation for why you think the text is bad. I'm fine on my horse thanks.

reply
So people can no longer freely point out that a piece of work being automated, and its lack of meat, are red flags for the veracity of the content, but your antagonistic metacommentary toward the people pointing out factual information is welcome discourse?

You claimed "this obsession with calling things you don't like AI generated" is "poor form", attacking the parent commenter by claiming they are lying about the nature of the content. However, multiple people have pointed out the clear signs which you missed, and the consensus is that you were wrong. Now you suddenly don't care about this point, and have introduced a new argument instead.

"From my point of view, all you've done is said a lot of nonsense and fabricated a convoluted explanation for why you think the text is bad"

What a bad-faith response. Categorically dismissive, vague, antagonistic and ultimately failing to critically engage with anything I said.

reply
Whether a piece of work is automated and 'lacks meat' is ultimately not something you can know for sure as a reader. Articles like this existed plenty pre-AI and will exist plenty post-AI, involvement or not, so yeah, it's pretty pointless to focus on that. It adds nothing, and all we have to go on is your own surety, which is fallible. If you can't recognize that, then there's not much to say.

I didn't miss anything. I never cared about it one way or another. What clear signs have people pointed out? This is the problem. It's apparently so obvious, yet even the original commenter admits "It's things humans do too". What is clear about that?

reply
Your inability to recognize the clear imprint of current-generation language models on this article doesn't mean they're not present.

All knowledge is ultimately fallible, but ignoring or not being able to appreciate the high statistical likelihood of this article being LLM edited/generated doesn't change reality.

You're asking me to share my expertise with you so that you can understand, but your antagonistic overtones make it not feel worth the time and effort. Other readers have also pointed out that it has characteristic idiosyncrasies. Feel free to look into it yourself, but it would also be wise to learn to defer these kinds of attacks until you have all the information.

reply
The post is the perfect example of the kind of writing about AI that dupes people that don't really understand how things like LLMs actually work and are actually trained. Anyone who properly understands these things finds the complete and total lack of detail about training and the loss function (and of course real metrics / benchmarks) to be a monstrous red flag here.

Especially egregious to me is the claim "Because the execution trace is part of the forward pass, the whole process remains differentiable: we can even propagate gradients through the computation itself". This is total weasel language: we can propagate gradients through any transformer architecture, and through all sorts of much more insane architectural designs, but that is irrelevant if you don't have a continuous, differentiable loss function that can properly weight partially correct solutions or the likelihood / plausibility of arbitrary model outputs. You also need a clear source of training data (or a way to generate synthetic data).
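To make that distinction concrete, here is a toy numpy sketch (all numbers made up, nothing here is from the article): gradients flow through a soft, softmax-style selection, but a hard argmax step is locally constant, so being "technically differentiable" in the forward pass doesn't by itself give you a usable training signal:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

values = np.array([10.0, 20.0, 30.0])  # things we might "select"
x = np.array([1.0, 2.0, 0.5])          # logits
eps = 1e-6

# Soft selection: a softmax-weighted average. Perturbing a logit changes
# the output, so the (numerical) gradient is nonzero.
soft = lambda v: softmax(v) @ values
g_soft = (soft(x + np.array([eps, 0.0, 0.0])) - soft(x)) / eps

# Hard selection: argmax is locally constant, so the gradient is exactly zero.
hard = lambda v: values[np.argmax(v)]
g_hard = (hard(x + np.array([eps, 0.0, 0.0])) - hard(x)) / eps

assert abs(g_soft) > 0.0
assert g_hard == 0.0
```

A loss built on discrete, exact program outputs faces exactly this kind of flat landscape, which is why "remains differentiable" alone proves very little.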

So for AlphaFold, e.g., we needed to figure out a loss function that continuously approximated the energy of various molecular configurations, and this is what really allowed it to do something. Otherwise, you are stuck with slow and expensive reinforcement-based systems.

The other tells are garbage analogies ("Humans cannot fly. Building airplanes does not change that; it only means we built a machine that flies for us"). Such analogies add nothing to understanding, and indeed distract from serious, real understanding. Only dupes and fools think you can gain any meaningful understanding of mathematics and computer science through simplistic linguistic analogies and metaphors, without learning the proper actual (visuospatial, logical, etc.) models. Thus, people with real and serious mathematical understanding despise such trite metaphors.

But then, since understanding something like this properly requires serious mathematical understanding, copy like that is a huge tell that the authors / company / platform puts bullshitting and sales above truth and correctness. I.e., yes, a huge yellow flag.

reply
> Is it speed?

> Is it that you can backprop through this computation? Do you do so?

With respect, I feel that you may not have read the article.

> Because the execution trace is part of the forward pass, the whole process remains differentiable: we can even propagate gradients through the computation itself. That makes this fundamentally different from an external tool. It becomes a trainable computational substrate that can be integrated directly into a larger model.

and,

> By storing points across nested convex hulls, this yields a decoding cost of O(k+log⁡ n).

and,

> Regardless of their eventual capability ceiling, they already suggest a powerful systems primitive for speeding up larger models.

So yes, and yes.

> Where are the benchmarks?

Not clear what they should benchmark it against. They do compare speed to a normal KV cache. As for performance... if it's actually executing a Sudoku solver with a 100% success rate, it seems pretty trivial to find any model doing < 100%. Sure, it would be nice to see the data here; I agree with you there.

Personally, I think it would be really interesting to see if this method can be combined with a normal model, MoE-style. It's likely possible; the router module should pick up quite quickly that this expert predicts the right tokens deterministically for some subset of problems. I like the idea of embedding all sorts of general solvers directly into the model, a Prolog solver for example. In fact, it never would have occurred to me to just go straight for WASM; directly embedding a VM is a pretty interesting choice. But it makes me wonder what "smaller" interpreters could be useful in this context.

reply
I read the article and had the same question. It's written in such a way that it feels like it's answering these questions without actually doing so.

The right thing to benchmark against isn't a regular transformer, it's a transformer that writes programs that are then interpreted. They have a little visual demo where it looks faster but only because they make Python absurdly slow, and it's clearly not meant to be a real benchmark.

I spent the whole article thinking, wow, cool, but also ... how is this better than an LLM steering a regular computer? The closest we get is a statement about the need to "internalize what computation is" which doesn't say anything to me.

Fundamentally, running actual instructions on a real CPU is always going to be faster than running them via a neural network. So the interesting part is where they say you can backprop through it. But, ok, backprop is for cases where we don't know how to encode a function using strict logic. Why would you try to backprop through a Sudoku solver? Probably my imagination is just limited, but I could have used more on that.

reply
Benchmark it against a fast Python interpreter optimized for AI tool calling, like Monty: https://github.com/pydantic/monty
reply
Did you read the post you are responding to? It says:

> What's the benefit? Is it speed? Where are the benchmarks? Is it that you can backprop through this computation? Do you do so?

The correct parsing of this is: "What's the benefit? [...] Is it [the benefit] that you can backprop through this computation? Do you do so?"

There are no details about training nor the (almost-certainly necessarily novel) loss function that would be needed to handle partial / imperfect outputs here, so it is extremely hard to believe any kind of gradient-based training procedure was used to determine / set weight values here.

reply
> There are no details about training

My understanding was that they are not training at all, which would explain that. They are compiling an interpreter down to a VM that has the shape of a transformer.

I.e., they are calculating the transformer weights needed to execute the operations of the machine they are generating code for.
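A toy numpy sketch of what that could mean; the weight values here are hand-chosen and purely illustrative, not the article's actual construction. With sharply scaled one-hot queries and keys, softmax attention degenerates into a hard table lookup, so the layer executes a fixed operation rather than a learned one:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

n, d = 4, 4
values = np.array([[7.0], [1.0], [3.0], [9.0]])  # one scalar per memory slot
pos = np.eye(n)                                  # positional one-hots as input

# Hand-chosen (not trained) weights: position 2 should read memory slot 0.
scale = 50.0
Wq = np.zeros((n, d)); Wq[2, 0] = scale  # query of position 2 asks for slot 0
Wk = np.eye(n) * scale                   # each key identifies its own slot

Q, K = pos @ Wq, pos @ Wk
attn = softmax(Q @ K.T / np.sqrt(d))     # row 2 is effectively one-hot
out = attn @ values

assert abs(out[2, 0] - 7.0) < 1e-9       # position 2 read slot 0 exactly
```

Scale the logits high enough and the softmax saturates to an exact selection, which is one way attention layers can carry out deterministic VM-style reads without any gradient descent.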

reply
This is my interpretation as well.

EDIT: Actually, they do make this clear(ish) at the very end of the article, technically. But there is a huge amount of vagueness and, IMO, outright misleading / deliberately deceptive stuff early on (e.g. about the potential differentiability of their approach, even though they admit later that they aren't sure the differentiable approach can actually work for what they are doing). It is hard to tell what they are actually claiming unless you read it like a lawyer, but that's likely due to a lack of human editing and too much AI assistance.

reply
Well, for one, by eliminating external tool calling, the model gains a measure of security. Tools called by an LLM can be corrupted, and in this scenario there are no external tool calls to corrupt.
reply
The key difference is that the model is able to write the program as it’s executing it.

Before, it needed to write the code and have an external program execute it. Here it can change its mind mid-execution, kind of like the “aha moment” observed in CoT training.

reply
What are the AI tells? The only one I found is redundancy, but it makes sense because this is trying to be approachable to laymen.

Like, you have a great point (the benefit of this approach isn't explained), but that's a mistake humans frequently make.

reply
Here is a rough list, some may be contentious individually, but the more of these appear, the more you should suspect an LLM:

Cadence and rhythm: LLMs produce sentences with extremely low variability in the number of clauses. Normal people run on from time to time, (bracket in lots of asides), or otherwise vary their cadence and rhythm within clauses more than LLMs tend to.

Section headings that are intended to be "cute" and "snappy" or "impactful" rather than technically correct or compact: this is especially a tell when the cuteness/impactfulness is deeply mismatched with the seriousness or technical depth of the subject matter.

Horrible trite analogies that show no real understanding of the actual logical, mathematical, or visuospatial relationships involved. I.e., analogies based on linguistic semantics, not on, e.g., mathematical isomorphism or core dynamics. "Humans cannot fly. Building airplanes does not change that; it only means we built a machine that flies for us". I can't imagine a more useless analogy for something as complex as the article's topic.

Verbose repetition: The article names two workarounds, "tool use" and "agentic" orchestration, defines them, and then, in the paragraph immediately following, says the exact same thing again. There are multiple small paragraphs that say nothing more than the sentence "LLMs do not reliably perform long, exact computations on their own, so in practice we often delegate the execution to external tools or orchestration systems".

Pseudo-profound bullshit: (https://doi.org/10.1017/S1930297500006999). E.g. "A system that cannot compute cannot truly internalize what computation is." There is thankfully not too much of this in the article, and it appears mostly early on.

Missing key / basic logic (or failing to state such points clearly) where any serious practitioner or expert would expect it: e.g., in this article, we should have seen some simple, nice, centered LaTeX showing the scaled dot-product self-attention equation, and then some simple notation to represent the `.chunk` call and the subsequent linear projection, something like H = [H1 | H2]; I shouldn't have to squint at two small lines of PyTorch code to find this. It should also be clear immediately that this model is not trained, and that this is essentially just compiling a VM into a transformer, rather than being revealed clearly only at the end.
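For concreteness, the two-head split-and-concatenate I'm describing is roughly the following (a minimal numpy sketch; the article's code uses PyTorch's `.chunk`, and the shapes and names here are illustrative):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

rng = np.random.default_rng(0)
n, d_model = 5, 8
Q = rng.standard_normal((n, d_model))
K = rng.standard_normal((n, d_model))
V = rng.standard_normal((n, d_model))

# Split each projection in half along the feature axis (torch's .chunk)...
Q1, Q2 = np.split(Q, 2, axis=-1)
K1, K2 = np.split(K, 2, axis=-1)
V1, V2 = np.split(V, 2, axis=-1)
H1, H2 = attention(Q1, K1, V1), attention(Q2, K2, V2)

# ...then concatenate: H = [H1 | H2], normally followed by a linear projection.
H = np.concatenate([H1, H2], axis=-1)

assert H.shape == (n, d_model)
```

Three lines of notation would have communicated this more clearly than the article's buried PyTorch snippet.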

reply
I read a lot of LLM text every day, so I'm quite good at spotting the cadence, the narrative structures, and the phrasing styles. It's not just "it's not just X but Y" or em-dashes. I could point them out, and you would say oh, humans use this trope or phrasing style too, and of course that's true. It's still a tell. But it's pointless to argue about this.
reply
deleted
reply
Honestly, the most interesting thing here is definitely that just 2-D heads are enough to do useful computation (at least, enough to simulate an interpreter), and that there is an O(log n) algorithm to compute argmax attention with 2-D heads. It seems you could make an efficient pseudo-symbolic LLM with some frozen layers that perform certain deterministic operations, alongside other layers that are learned.
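For reference, "argmax attention" in its brute-force O(n)-per-query form looks like this (a toy numpy sketch; the article's nested-convex-hull O(k + log n) decoding is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
Q = rng.standard_normal((n, 2))  # 2-D heads: queries and keys live in the plane
K = rng.standard_normal((n, 2))
V = rng.standard_normal((n, 3))

# Instead of a softmax-weighted average, each query copies the value of its
# single best-matching key. Maximizing an inner product over 2-D points is
# what makes convex-hull tricks applicable; here we just scan all n keys.
idx = np.argmax(Q @ K.T, axis=-1)
out = V[idx]

assert out.shape == (n, 3)
```

The point of the 2-D restriction is exactly that inner-product maximization over points in the plane reduces to a geometric search, which is where the claimed O(log n) comes from.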
reply
I wish people put half as much energy into actually doing things as they do into complaining about AI-generated text. We'd have ascended to energy-based beings about 18 months ago.
reply