Let's say the AI gives them faulty advice that makes them over-confident, and they try something and fail. Usually that just means a relatively benign mistake (since AIs generally avoid advising anything genuinely risky), and after they recover they'll have the benefit of more real-world experience, which raises their odds of eventually trying again and succeeding.
Sometimes trying something, anything, is better than nothing. Action — regardless of the outcome — is its own discovery process.
And much of what you learn by acting in the world is generally applicable, not just domain-specific knowledge.
I just want all sides of the question explored, instead of reflexively framing AI's impact as harmful.
Every other AI I've tried is a real sycophant.
He was noodling around with an admittedly "way out there", highly speculative idea and using the LLM to research prior work in the area. This evolved into the LLM giving him direct feedback. It told him his concept was brilliant and constructed detailed reasoning to support that conclusion. Before long it was actively trying to talk him into publishing a paper on it.
This went on for quite a while, and at first he was buying into it, but eventually he started to suspect that maybe "something was off", so he reached out to me for perspective. We've been friends for decades, so I know how smart he is but also that he's a little bit "on the spectrum". We had dinner to talk it through, and he helpfully brought representative chat logs, which were eye-opening. It turned into a long dinner. Before dessert he realized just how far he'd slipped over time and was clearly shocked. In the end, he resolved to go "cold turkey" on the LLMs with a 'prime directive' prompt like the one I use (basically: never offer opinion, praise, flattery, etc.). Of course, even then, it will still occasionally try to ingratiate itself in more subtle ways, which I have to keep watch for.
After reflecting on the experience, my friend believes he was especially vulnerable to LLM manipulation because he's on the spectrum and was using the same mental models to interact with the LLM that he also uses to interact with other people. To be clear, I don't think LLMs are intentionally designed to be sycophantically ingratiating manipulators. I think it's just an inevitable consequence of RLHF.
"You're exactly right, you organized and paid for the date, that created a social debt and she failed to meet her obligation in that implicit deal."
"You're exactly right, no one can understand your suffering, nothingness would be preferable to that."
"You're exactly right, that politician is a danger to both the country and the whole world, someone stopping him would become a hero."
We have already seen how personalized content algorithms that prioritize nothing but keeping the user engaged can foment extremism. It will be incredibly dangerous if we go down that same path with AI.
For "chat" chat, strict hygiene is a matter of mind-safety: no memory, long and exact instructions, minimal follow-ups, avoiding first and second person where possible, etc.
Relevant video for that.
But I think you are on to something here with the origin of the sycophancy, given that most of these models are owned by billionaires.
In the "like being kicked in the head by a horse every day" sense.