It is too bad, though. People bad at English will be reading this forever now, thinking this is the way real people write and speak, or are supposed to.
It's many things. The relentless enthusiasm about everything. Prefacing any answer to a question with an affirmation that it was a good question. And yes, sorry, pedants of the web who feel witch-hunted because you knew how to employ keyboard shortcuts and used em-dashes in 2015 and have the receipts to prove it -- you never used 17 in the span of a single page. I think that was the first em-dash I can remember ever using, and I had to contrive a way to do it where a semicolon wouldn't clearly work better.
The purpose of language is to communicate meaning and intent, not to sound or feel a particular way, unless you're reading for entertainment or enjoyment.
This is the second post I'm commenting on within a span of like 30 minutes where someone did some really good work and shared it, but the top comments are complaining about AI usage.
Either LLM-assisted content needs to be banned entirely (maybe it should be), or complaining about it should be considered a breach of etiquette on tech-centric sites like HN.
Calling it a spellchecker is simply wrong if you give an LLM some bullet points and then instruct it to write an article. I find it more insincere because it's an extra layer between the author and the reader which substantially affects every aspect of the piece of writing, not just the spelling of individual words, or Microsoft Word nagging you to avoid passive voice.
If OP is not a native English speaker and is using an LLM to produce reasonable prose, then it might be the best way for them to try and communicate their ideas. It's probably better than Google Translate. It does affect how the reader interprets the writing, though.
My other point, which I also stand by, is that I find the default writing style of current LLMs exhausting to read. It feels like a college student has submitted an assignment on engaging writing and decided to use every technique they could find in their textbook, because they want to get top marks. It just feels forced to me.
--------------------------------
As an example, I asked Claude to make my argument more "clear". Here's how it rewrote it:
Style isn't separate from content — it is content. The way something is written shapes how a reader interprets its meaning, and that's always been true. Calling an LLM a "spellchecker" only holds if it's catching typos. The moment you hand it bullet points and ask it to produce an article, it's not correcting your writing — it's replacing it. That's a fundamentally different thing.
I'll grant one exception: if someone isn't a fluent English speaker and uses an LLM to bridge that gap, that's a legitimate trade-off, even if it still changes how the reader experiences the piece.
But my broader complaint stands independent of that debate: current LLMs produce a recognizable, exhausting prose style. Every sentence is engineered to be "engaging." Every paragraph hits the expected beats. It reads like someone who learned to write from a listicle about writing — technically compliant, but hollow. The effort to sound compelling ends up undercutting any sense that a real person with a real perspective is behind it.
That's just crazy. Do you think people don't get discriminated against because of that? They'll probably get flagged and blacklisted from HN just for sharing a post riddled with grammar mistakes; it will look like spam to many. If they get lucky, the top comments will be corrections of their grammar mistakes, not discussion of the content.
If you didn't talk to me before today, you don't know how I talk. You don't know what sincere is like for me. The term you're looking for is "authentic", not "sincere". Questioning the sincerity of the OP is just wrong. You don't like people having control over how what they have to say is conveyed to others, because you have some irrational bias against the use of a particular tool.
You argue, and even use AI (don't you mind being insincere? I'd like your own original arguments, how about that?), to dismiss content because of style, thereby justifying the need for people to be careful about the style of what they share. Have you considered that had they not used AI, you or others would be dismissing their post for other style-related reasons? Because you care about style so much.
But you're right, style is content; it was wrong of me to claim otherwise. What I meant was probably "meaning". The writing style affects how you read the content: in this case you don't like how it forces you to read it, but the meaning OP is trying to communicate (what I meant by "content") is being glossed over.
The takeaway for me from this discussion is that people need to use better prompts and better models, not that they shouldn't use an LLM, because even when it's their own grammar and spelling that's wrong, they get nitpicked just the same.
> The effort to sound compelling ends up undercutting any sense that a real person with a real perspective is behind it.
That's a fault and a bias of the reader, in my opinion. I didn't even think it was LLM-written; I wasn't looking for it (we tend to find what we're looking for?). My focus was on what was done, validating the claims made, and analyzing the implications. I didn't care how they sounded, because I was able to actually read the content and understand what they were saying. If it were the other way around and I was the OP, I would want people to focus on what I was saying, and appreciate that I took some action to ensure my post is readable.
I think they can use better prompts to make it sound and feel better, but it's a real shame that they have to. It's this sort of interaction that makes me wish we had more LLMs making decisions instead of humans out there. Things like accents, writing styles, even last names, and spelling mistakes decide the fate of many today. The real value people bring, the real human potential, is dismissed (not in this case, just a general observation); cosmetic and performative factors override all else.
> it's not correcting your writing — it's replacing it. That's a fundamentally different thing.
It is my writing, in that I agreed the meaning of the rewritten content is what I intended to communicate. People get to have agency over how their meaning is conveyed. You don't have any say over that. Your criticism of how it feels, although I disagree, is legitimate, but your criticism based solely on the fact that AI rewrote the content is entirely invalid.
Let's imagine OP had a human copywriter, editing and rewriting the entire piece. Would that change anything? If not, why are we talking about LLMs instead of the specifics of what bothered you, so that people reading this thread can use better prompts to avoid those annoying pitfalls?
I didn't even pick up on this being AI-rewritten; I'm only taking yours and others' word for it. My biggest concern these days is that kids are growing up interacting with LLMs a lot, and their original work will be dismissed by older people because it sounds like an LLM. There are many cases of students having their work and exams dismissed, even facing disciplinary action leading up to lawsuits, where teachers/academics wrongly claimed it was LLM-generated content (which is why I keep feeling that perhaps LLMs should replace those biased academics and teachers, if possible).
LLM usage isn't going away. Perhaps prompts and models will improve, but more likely than not, it is more economical and practical for humans to be forced to adapt, one way or another, to regular LLM usage by other humans.

If you skip through books or news stories in 50-year increments, you'll see how different the writing style and "feel" are. There is a very distinctive "feel" to how people on HN write, compared to Reddit, gaming Discord servers, Twitter, Bluesky, or the comments section of some conservative site. You'll see some groups use terms like "bro" and "bruh" a lot, others end everything with "lol", others yet include emoji in everything. All of this would feel very weird and inappropriate to someone from the 1800s.

I am not saying all that to dismiss your observations, but to say that this stuff isn't all that important. If you hadn't thought the cause of the annoying writing style was an LLM, I doubt you would have commented on it, so my suggestion is not to comment on it at all. There was no writing-style offense so egregious that we need to talk about it instead of the actual work OP is sharing.
I think the idea of sharing the raw prompt traces is good. Then I can feed that to an LLM and get the original information prior to expansion.
> What’s needed is something different:
> | Requirement | ptrace | seccomp | eBPF | Binary rewrite |
> |---|---|---|---|---|
> | Low overhead per syscall | No (~10-20µs) | Yes | Yes | Yes |
> | [...] | | | | |
Even if you disallow executing anything outside of the .text section, you still need the syscall trap to protect against adversarial code which hides the instruction inside an immediate value:
foo:
    mov eax, 0xc3050f   ; return a perfectly harmless constant
    ret
    ...
    call foo+1
(This could be detected if the tracing went by control flow instead of linearly from the top, but what if it's called through a function pointer?)

I first assumed it was redirecting them to a library in user mode somehow, but actually the syscall is replaced with "int3", which also traps to the kernel. The whole reason the "syscall" instruction was introduced in the first place was that it's faster than the old software-interrupt mechanism, which has to load segment descriptors.
So why not simply use KVM to intercept syscall (as well as int 80h) and then emulate its effect directly, instead of replacing the opcode with something else? That should be both faster and less obviously detectable.
It is possible to restrict the control-flow graph to avoid the case you described; the canonical references here are the CFI and XFI papers by Ulfar Erlingsson et al. In XFI they/we did have a binary rewriter that tried to handle all the corner cases, but I wouldn't recommend going that deep; instead you should just patch the compiler (which, funnily, we couldn't do, because the MSVC source code was kept secret even inside MSFT, and GCC source code was strictly off-limits due to being GPL-radioactive...)
[1] <https://github.com/google/gvisor/blob/master/pkg/sentry/plat...>
gVisor (aka runsc) is a container runtime as well. And it doesn't gatekeep syscalls but chooses to re-implement them in userland.
I wonder if there's any mechanism that works for intercepting static ELFs, like Go programs and such.
Inside the guest, there's no kernel to attach strace to — the shim IS the syscall handler. But we do have full observability: every syscall that hits the shim is logged to a trace ring buffer with the syscall number, arguments, and TSC timestamp. It's more complete than strace in some ways — you see denied calls too, with the policy verdict, and there's no observer overhead because the logging is part of the dispatch path.
So existing tools don't work, but you get something arguably better: a complete, tamper-proof record of every syscall the process attempted, including the ones that were denied before they could execute. I'll publish a follow-on tomorrow that details how we load and execute this rewritten binary and what the VMM architecture looks like.
How secure does this make a binary? For example, would you be able to run untrusted binary code inside a browser using a method like this? Could websites then just use C++ instead of JavaScript, for example?
A process is already a hermetically sealed sandbox. Running untrusted code in a process is safe. But then the kernel comes along and pokes holes in your sandbox without your permission.
On Linux you should be able to turn off the holes by using seccomp.
Here's an example of how we used it in the early 2000s to implement pre-Linux-namespace containerization:

https://www.usenix.org/legacy/publications/library/proceedin... (note the shepherd, and where Kubernetes arguably got the "pod" name from)

...and security policies on top of it:

https://www.usenix.org/legacy/event/lisa07/tech/full_papers/...
What's stopping the process from reading its own memory and seeing that the syscall was patched?
This is the kind of foundation that I would feel comfortable running agents on. It's not the whole solution, of course ("yes, agent, you're allowed to delete this email but not that one" can't be solved at this level)... let me know when you tackle that next :-)
Fortunately libc doesn't mmap that much internally, so I think I can get away with just interposing libc's mmap call.