> The web was already being poisoned for search and link ranking long before LLMs existed.
But it continues.
> We are now plugging generative models directly into that poisoned pipeline and asking them to reason confidently about “truth” on our behalf.
So it's a shift from "trust Google" to "trust the AI", which may or may not be more insidious, depending on each of our individual attitudes.
LLMs are the same thing but have an air of authority about them that a web search lacks, at least for now.
Recently, a podcast host asked Gemini a very detailed question about some specific baseball stats and was exclaiming over the quality of the information he got back, and how it would have been impossible, or at least extremely difficult, to find via a traditional search.
It wasn't until his cohost asked if he had verified the information that he realized no, he hadn't; he had just immediately taken it at face value.
I recognize this is a single anecdote, but I think it illustrates the tendency to trust what an LLM gives you when it's stated so factually and with so much detail, even when you should know better.
Maybe we just need to work on training the general population to have a similar skeptical bias toward LLM output. (It will be harder than it sounds. Unbelievable amounts of capital are being bet on this not happening.)
The OP's post highlights how incredibly easy it is for a very small amount of information on the web to completely dictate the LLM's output, steering it into saying whatever you want.
But the misinformation isn't coming from the LLM itself; the LLM clearly cites the Wikipedia article as its source. This is just performing an internet search with extra steps, and ending up with misinformation because somebody vandalized Wikipedia.
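To make the "search with extra steps" point concrete, here's a minimal sketch of what a naive retrieval-augmented pipeline does. Everything here is hypothetical; `web_search` and `llm_complete` are stand-in stubs, not any real product's API:

```python
from dataclasses import dataclass

@dataclass
class Snippet:
    url: str
    text: str

def web_search(question: str, top_k: int = 3) -> list[Snippet]:
    """Stand-in for a real search API. A vandalized Wikipedia page
    arrives here looking exactly like a legitimate one."""
    return [
        Snippet("https://en.wikipedia.org/wiki/Example",
                "A vandalized claim, phrased as confidently as a real one."),
    ][:top_k]

def llm_complete(prompt: str) -> str:
    """Stand-in for a real model call."""
    return "A fluent, cited answer restating whatever the prompt contained."

def answer(question: str) -> str:
    snippets = web_search(question)
    # The retrieved text goes into the prompt verbatim; nothing in
    # this loop asks whether the source itself is trustworthy.
    context = "\n\n".join(f"[{s.url}]\n{s.text}" for s in snippets)
    prompt = ("Answer using only the sources below.\n\n"
              f"{context}\n\nQuestion: {question}")
    return llm_complete(prompt)

print(answer("Some obscure baseball stat question"))
```

The model's only job is to restate the retrieved text fluently, so vandalism in step one becomes a confident, cited answer at the end.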
Have you truly looked at the website?
I’d say there’s obvious reason to not believe it, or at least check another source. The website just seems fishy. Why would a website exist for just that one post? Sure, they could’ve made the website more believable, but that takes more effort and has more chances for something to jump out at you.
And therein lies a major difference between searching the web and asking an LLM. When doing the former, you can pick up on clues regarding what to trust. For example, a website you’ve visited often and that has proven reliable will be more trustworthy to you than one you’ve never been to before. When asking an LLM, every piece of information is provided in the same interface, with the same authoritative certainty. You lose an important signal.
This is a general epistemological problem with relying on the Internet (or really, any piece of literature) as a source of truth.
The only real alternatives would be:
- Kicking off a deep research-like investigation for each simple query (a rough sketch of this follows after the list)
- Introducing a trusted middleman for sources, significantly cutting down the available information (e.g. restricting Wikipedia to locked-down/moderated pages)
- Not having any information at all, since at some point you can rarely ever verify anything, depending on how strict your definition of "verify" is
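For the first alternative, even a crude per-query cross-check would catch single-source poisoning. A toy sketch of the idea (the `fetch_claim` stub and its canned answers are invented for illustration; real code would query independent sites or APIs):

```python
from collections import Counter

def fetch_claim(source: str, question: str) -> str:
    """Stand-in for asking one source; canned answers for illustration."""
    canned = {
        "wikipedia": "Answer A",  # the vandalized page
        "almanac": "Answer B",
        "archive": "Answer B",
    }
    return canned[source]

def cross_check(question: str, sources: list[str]) -> str:
    answers = Counter(fetch_claim(s, question) for s in sources)
    best, count = answers.most_common(1)[0]
    if count < 2:
        return "Sources disagree; flag for manual verification."
    return best  # majority across nominally independent sources

print(cross_check("Who holds the record?",
                  ["wikipedia", "almanac", "archive"]))
```

The obvious caveat, and it's exactly the article's point about circular citations: if the "independent" sources all trace back to the same planted press release, the majority vote is just as poisoned.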
Then we get to the section "Why This Is A Bigger Deal Than It Looks". The title of this section again raises the same flags as before. But then there's the bulleted list:

1. The retrieval layer (immediately)
2. The model training corpus layer (months to years)
3. The agent layer (where the money is)

This list absolutely reeks of AI. A list with this sequence of parentheticals is exactly how LLMs write, both in structure and in specific phrasing. This was the point where I felt confident enough to publicly accuse the post of AI writing.
I could go on with the prose in this section... How about "The attack surface is not hypothetical, it’s the default case."? Or "The cleanup problem for corpus poisoning is genuinely unsolved as of 2026."? (LLMs wildly overuse "genuine(ly)" and "real")
Perhaps we've all just become paranoid, but even if it's not LLMs writing this, it now puts me off. And the AI image at the top of the page does not help with the feeling.
I think calling something AI generated is just a lazy way of dismissing stuff nowadays.
> This is the circular citation pattern, and it’s one of the most under discussed attacks on the “retrieval augmented generation” trust model. It doesn’t require compromising Wikipedia’s infrastructure with l33t hacker skills. It doesn’t require social engineering an editor. You just simply write the source yourself, cite yourself on Wikipedia, and let the trust flow downstream. Easy peasy!
“It doesn’t X. It doesn’t Y. You just Z. Conclusion”
Once I saw that, some other elements stood out too.
There’s a set of bullet points under “The Approach” where each bullet starts with a bolded phrase: “one domain”, “one press release”, “one Wikipedia edit”, followed by a bolded sentence: “The whole thing took maybe about twenty minutes”.
The emphasis here on irrelevant quantifiable optimizations (who cares that it only needs one of each of three things, or that it took under twenty minutes?), combined with unnecessary faux-profundity, is a strong AI tell.
Add to that the fact that the writer mentions in the article that he used AI generation to produce the content for the poisoning site, and the suggestion that he also used it to write the blog post itself is hardly implausible.
I posted a bunch of specifics in a reply to the GP since I was quite annoyed with being accused of "a lazy way of dismissing stuff". It's nothing of the sort. I am a very good reader and I have read a lot of LLM writing and a lot of human writing.