You can also look at past posts by the same author (before LLM usage proliferated) if you’re curious.

The project is still very cool, but it’s a little less enjoyable to read when everything sounds the same. It would be just as annoying for people to manually write in a corporate/marketing style, because humanity is what makes the small web interesting.

https://blog.tymscar.com/posts/privategithubcicd/

reply
I’m glad I started this blog before the AI wave so I can prove to people I’m just weird at writing.

It grinds my gears how so many people just talk about my writing style instead of the content.

reply
This. Setting aside the LLM issue, the post deals with hardware in ways that, one would think, would be celebrated on HN of all places. But we focus on presentation.
reply
Because their custom training data contains an emphasis on such verbiage. It doesn't come from the God-knows-how-many TB of web content the model is pre-trained on. There, such phrasing is only a drop in the sea. But the "yes, you're right" phrases, the em dash, etc., come from the later stage, for which content is created according to some (probably overprecise) guidelines.
reply
Right. The overuse of "genuinely" most of all. Seems like they put Claude through a few good rounds of training to always answer questions about its consciousness, thoughts, etc., with something about how it's "genuinely unsure," and as a result, the model learned to use "genuinely" as an intensifier in all sorts of inappropriate contexts.
reply
Oi, I personally use adverbs everywhere. Genuinely, kids these days.
reply
Marketing content.
reply
> Where do you think llms learned to write that way?

Not from individual human content, that's for sure - maybe MLM marketing copy? Sleazy 4AM ads?

I mean, every time this response comes up, I keep asking the person to point at something written prior to 2022 that gets 80%+ on the LLM detectors, and yet no one can find anything.

Maybe you, postalrat, can find something written in this style that was published prior to 2022.

reply
I have written the blog post. I know empirically that I have used 0% AI while writing it. I also know LLM detectors are total BS and they don't really work. I have tried a couple on this exact blog post, and QuillBot, for example, gave me 0% AI detected on it.

I have then used a blog post of mine from 2021. QuillBot gave me 8%...

The King James version of the Bible came out at almost 100% AI generated a while ago. It was on the HN front page.

Stop thinking that if someone writes in a way that is fun or looks like what you would think an AI writes, then it is AI generated. Loads of the time it is, but sometimes it's not, and it really hurts those like me.

reply
It's a function of the LLM "thought process"! It's not really modeled after human speech: it is in short segments, but not in long form. That's the same reason you see the same rather odd nuances in LLM-generated code.

If the way you thought was to run a bunch of if statements, generate content, then feed that content back to get a "score" of what seems the most plausible, run the if statements again, and adjust / merge responses, then you would write similarly. The recognizable cadence of LLM-generated content is pretty clearly the result of a lot of if statements being fused together.

reply