And therein lies the rub; for years now Google's search results have returned useless SEO garbage. For now, it definitely seems like an LLM answer is better than what was being returned and I guess this is the reason why Google ripped it out.
Agentic AI has its faults, but one thing I've found it to be very good at is surfacing the "unknown unknowns": things I didn't know I should have searched for but that are directly relevant to my problem.
Sometimes that is fine, sometimes it is not.
I've got a pretty solid algorithm for checking correctness: I ask the LLM for its sources, I try to find 3-5 independent ones (that are not just copying each other's answers), and if they all agree, that's very likely to be the correct answer. Simple math here: if you have 5 sources and they are each 60% likely to be correct, then an LLM choosing at random from them would have a 60% success rate, while someone checking all 5 of them for agreement is only misled when all 5 are wrong, which (if their errors are independent) happens with probability 0.4^5 ≈ 1%, so they have a 1 - (0.4^5) ≈ 99% chance of not being led astray. It's a good algorithm for doing other things like verifying scientific papers, too: you look for independent research groups that have all reproduced the same findings.
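To make the arithmetic above concrete, here is a minimal sketch in Python. It assumes the sources err independently, which real web pages often do not. The function names are hypothetical, and the second function adds a stricter reading that is not in the comment: a worst-case yes/no question, where sources that are wrong automatically agree with each other.

```python
def p_all_wrong(p_correct: float, n: int) -> float:
    """Probability that all n independent sources are wrong."""
    return (1.0 - p_correct) ** n


def p_unanimous_is_correct(p_correct: float, n: int) -> float:
    """Probability that a unanimous answer is right, assuming a yes/no
    question, so that wrong sources necessarily agree with each other."""
    all_right = p_correct ** n
    all_wrong = (1.0 - p_correct) ** n
    return all_right / (all_right + all_wrong)


single = 0.6  # each source is right 60% of the time (figure from the comment)
n = 5

at_least_one_right = 1.0 - p_all_wrong(single, n)
print(f"{at_least_one_right:.3f}")                 # 0.990
print(f"{p_unanimous_is_correct(single, n):.3f}")  # 0.884
```

The first figure is the roughly 99% from the comment. The second is lower because, in the yes/no case, five wrong sources cannot disagree. For open-ended questions with many possible wrong answers, unanimous agreement on the same wrong answer is rarer, so the real value probably sits somewhere between the two.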
I did the same thing with ten-blue-links websearch as well, and I would hope everyone else has the same habit. (Although I know they didn't, because I worked on Google websearch 15 years ago, on a project to increase the credibility of search results, and we did cafeteria UX studies about "What makes a credible result?" and everybody said "Because it appears as the top result on Google.")
Say I want to look up some game from my childhood that I barely remember any details about. Going to Google and trying is likely to be very difficult unless I happen to get lucky with some key element. But if an LLM can get it right even a minority of the time, it can lead to me quickly finding the game I'm looking for.
This does depend upon the ability to evaluate the answer, like checking against a source or some other option where you can tell a good answer from a bad one. If you can't, then it does become much more dangerous. Perhaps that is part of the reason AI seems to empower experts more than novices in some domains?
I worry that the LLMs are just the equivalent of a ‘lagging indicator’ of web quality though - that they will also soon be overwhelmed with the sheer volume of plausible nonsense that is the web now, just like search engines are.
Model collapse everywhere.
The other bots either make up links or simply don't provide any information that is distinguishable from the LLM predictive output.
Ironically Gemini is also very bad at this, while it should have been the best at Web search.
Gemini also does something very patchy, which is to provide "links" which are in fact GET queries into classic Google search. I'm guessing they did it this way because the links generated/hallucinated by the LLM were too unreliable.
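For readers unfamiliar with what such a "link" looks like, here is a small sketch. It assumes the familiar `https://www.google.com/search?q=...` URL shape. Both helper names are hypothetical, and this is not a description of how Gemini builds its links.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def google_search_link(query: str) -> str:
    """Build a plain GET link into classic Google search for a query."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def is_search_link(url: str) -> bool:
    """True if the URL is a search query rather than a direct source."""
    parsed = urlparse(url)
    host = parsed.netloc.lower()
    is_google = host == "google.com" or host.endswith(".google.com")
    return is_google and parsed.path == "/search" and "q" in parse_qs(parsed.query)


link = google_search_link("model collapse LLM")
print(link)                  # https://www.google.com/search?q=model+collapse+LLM
print(is_search_link(link))  # True
print(is_search_link("https://example.com/article"))  # False
```

A check like `is_search_link` is one way to separate citations that point at an actual page from ones that merely hand the query back to a search engine.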
Type your question in Android/Chrome search bar:
"Is …?"
AI Overview on the search results page:
"No…"
Click through to the AI mode tab/"Dive deeper with AI" CTA:
"Yes…"
Sorry, no, I hate that.
I know that DeepSeek provides links for every step of its reasoning chain, so you can read the source, and it's actually a good idea to check them.
LLMs that can supply valid links give me a completely different variety of results. Either I am too dumb to search manually, too impatient, or Google search is just broken, but Gemini usually gives me something I can work with. I just wish I could blacklist some sources like Medium.
This will remove any results from there for you.
Alternatively, site:news.ycombinator.com would search this website explicitly.
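The operator the first reply refers to did not survive in the thread. A common way to exclude a domain in Google search is the `-site:` prefix, and that is what the sketch below assumes. The helper `build_query` and the example search terms are hypothetical.

```python
def build_query(terms, include_sites=(), exclude_sites=()):
    """Compose a search query string with site: and -site: operators."""
    parts = [terms]
    parts += [f"site:{site}" for site in include_sites]    # restrict to a domain
    parts += [f"-site:{site}" for site in exclude_sites]   # drop a domain
    return " ".join(parts)


print(build_query("llm hallucinated links", exclude_sites=["medium.com"]))
# llm hallucinated links -site:medium.com
print(build_query("llm hallucinated links", include_sites=["news.ycombinator.com"]))
# llm hallucinated links site:news.ycombinator.com
```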
Even though the result is often coherent and confidently synthesizes information from multiple experiences, it can also hallucinate, suffer from recency bias, or accidentally merge memories from different decades. AFAICT, without access to the underlying telemetry, human responses are for entertainment purposes only.
Have you tried explicitly asking for links to primary sources?
I have seen it hallucinate things confidently, but that is usually when it has no direct sources to pin the output down.