>It needs surgical guardrails at the exact moments where its output layer flinches.

This article is very clearly shitty LLM output. Abstract noun and verb combos are the tipoff.

It's actually quite horrible, it repeats lines from paragraph to paragraph.

reply
I know that's one of the tells of AI-generated text, but if anything there's too much of it on this page. The article barely has any complete sentences. I think a human learned "sentence fragments == punchy" and then had too much fun writing at least some of this article.
reply
My guess is they used the 2b model to write the article as a proof of concept. Which did not prove the concept.
reply
clever guess but no lol. used claude for the writeup. the proof isn't the prose, it's the tape and the code. run it on your machine, you'll have a free private agent custom to whatever you need. that's the proof of concept.
reply
I don't care anymore if this happens to violate HN guidelines: Please, authors. Please write your own damn articles. We can absolutely tell that you're using Claude, I promise. (It might not be Claude specifically this time, but frankly I'd be willing to bet on it.) The AI writing is like nails on a chalkboard to me.
reply
The worst part is the phrases don't actually mean anything. It's the LLM equivalent of flowery prose. The author admitted below that the article was Claude. So there you go.
reply
"Surgical" is the kind of wordage that LLMs seem to love to output. I have had to put in my .md file the explicit statement that the word "surgical" should only be used when referring to an actual operation at the block...
reply
you're right, they are tools. that's kind of the point. PAL is a subprocess that runs a python expression. Z3 is a constraint solver. regex is regex. calling them "surgical" is just about when they fire, not what they are. the model generates correctly 90%+ of the time. the guardrails only trigger on the 7 specific patterns we found in the tape. to be clear, the ~8.0 score is the raw model with zero augmentation. no tools, no tricks. just the naive wrapper. the guardrail projections are documented separately. all the code is in the article for anyone who wants to review it.
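for anyone curious what "fires on a pattern" means concretely, here's a simplified sketch of a PAL-style guardrail (function names and the trigger regex are made up for illustration, not copied from the repo): a narrow pattern decides when to route the model's output through a python subprocess, and everything else passes through untouched.

```python
# simplified PAL-style guardrail sketch. the names and the single
# arithmetic trigger pattern here are illustrative, not from the repo.
import re
import subprocess

# hypothetical trigger: plain arithmetic expressions only
ARITHMETIC = re.compile(r"^[\d\s+\-*/().]+$")

def pal_eval(expr: str) -> str:
    """Run a python expression in a subprocess and return its printed result.

    Note: building the command from the expression is only safe because the
    regex above restricts input to digits and arithmetic operators.
    """
    out = subprocess.run(
        ["python3", "-c", f"print({expr})"],
        capture_output=True, text=True, timeout=5,
    )
    return out.stdout.strip()

def answer(model_output: str) -> str:
    # guardrail: fire the tool only on the narrow pattern,
    # trust the raw model output everywhere else
    text = model_output.strip()
    if ARITHMETIC.fullmatch(text):
        return pal_eval(text)
    return model_output
```

the point is that the tool is dumb and the trigger is narrow: the model handles the 90%+ it gets right on its own, and the subprocess only runs on the handful of patterns where it reliably fails.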
reply
The core issue is that the LLM is using rhetoric to try to convince or persuade you. That's what you need to tell it not to do.
reply
Which will not work. Don't think of pink genitalia, I mean an elephant...
reply