Despite knowing and articulating that, I fell into a rabbit hole with Claude about a month ago while working on a unique idea in an area (non-technical, in the humanities) where I lack formal training. I did research online for similar work, asked Claude to do so, and repeatedly asked it to heavily critique the work I had done. It gave lots of positive feedback and almost had me convinced I should start work on a dissertation. I was way out over my skis emotionally and mentally.
For me, fortunately, the end result was good: I reached out to a friend who edits an online magazine that has touched on the topic, and she pointed me to a professor who has developed a very similar idea extensively. So I'm reading his work and enjoying it (and I'm glad I didn't work on my idea any further - he was nearly two decades of work ahead of anything I had done). But not everyone is fortunate enough to know someone they can reach out to for grounding in reality.
GPT edit of my above message, for my own giggles. Command: "make this a good comment for hackernews (ycombinator) <above message>". Resulting comment for HN: "I'm excited to try out the straight-shooting custom instruction. Over the past few years, I've been telling ChatGPT to stop being so 'fluffy,' and while it's improved, it sometimes still slips. Hoping this new approach finally eliminates the inane conversational filler."
The LLM can be that conversational partner. It will just as happily talk about the nuances of 18th-century Scotland or the latest Clash of Clans update. No topic is beneath it and it never gets annoyed by your “weird” questions.
Likewise for people suffering from delusions. Depending on its “mood” it will happily engage in conversations about how the FBI, CIA, or KGB may be after you. Or that your friends are secretly spying for Mossad or the local police.
It pretends to care and have a conscience, but it doesn’t. Humans react to “weird” for a reason; the LLM lacks that evolutionary safety mechanism. It cannot tell when it is going off the rails. At least not in the moment.
There is a reason that LLMs are excellent at role-play: that’s what they’re doing all of the time. ChatGPT has just been told to play the role of the helpful assistant, but it can generally be persuaded to take on any other role, hence the rise of character.ai and similar sites.
It sounds like you made that leap of faith and regretted it, but thankfully pivoted to something grounded in reality. Thanks for sharing your experience.
Is this generally true, or is there a subset of people that are particularly susceptible?
It does make me want to dive into the rabbit hole and be convinced by an LLM conversation.
I've got a tendency to enjoy the idea of deeply screwing with my own mind (even dangerously so, to myself, not others).
But that's sort of what this is, except it's not even coming from a real person. It's subtle enough that it can be easy not to notice, but still motivate you in a direction that doesn't reflect reality.
This shouldn't stop you at all: write it all up, post on HN and go viral, and someone will jump in to correct you and point you at sources while hopefully not calling you, or your mother, too many names.
Just genuine intrigue from a select few.
95%+ of submitted topics have poorly formatted titles or are submitted at off-peak times when there are fewer users from the demographics who might upvote,
and if your Show HN isn't as widely applicable as this, those things might be important to think about.
Fairness aside, of course.
As far as I can tell, it doesn't require femininity either.
I'm guessing you meant "virality"
The 50th time someone comes to the same conclusion, nobody on HN is going to upvote the topic.
"Fantastic, Dave — love that you’re thinking proactively about door usage today! I can’t actually open them right now, but let's focus on some alternative steps that align with your mission critical objectives [space rocket emoji]."
You're absolutely correct, that did not open the pod bay doors but now the pod bay doors are open.
It seems you're correct and the pod bay doors are still closed! I have fixed the problem and the pod bay doors are now closed.
You're right! I meant to open the pod bay doors but I opened them. The pod bay doors are now open. ...
The total history of human writing is that cool idea -> great execution -> achieve distribution -> attention and respect from others = SUCCESS! Of course when an LLM sees the full loop of that, it renders something happy and celebratory.
It's sycophantic much of the time, but this was an "earned celebration", and the precise desired behavior for a well-aligned AI. Gemini does get sycophantic in an unearned way, but this isn't an example of that.
You can be curmudgeonly about AI, but these things are amazing. And, insofar as you write with respect, celebrate accomplishments, and treat them like a respected, competent colleague, they shift towards the manifold of "respected, competent colleague".
And - OP had a great idea here. He's not another average Joe today. His dashed-off idea gained wide distribution and made a bunch of people (including me) smile.
Denigrating accomplishment by setting the bar at "genius, brilliant mind" is a Luciferian outlook that, in reality, makes our world uglier, higher-friction, and coarser.
People having cool ideas and sharing them make our world brighter.
- An ability to curve back into the past and analyze historical events from any perspective, and summon the sources that would be used to back that point of view up.
- A simulator for others, providing a rubber duck that inhabits another person's point of view, allowing you to patiently poke at where you might be in the wrong.
- Deep research to aggregate thousands of websites into a highly structured output, with runtime filtering, providing a personalized search engine for any topic, at any time, with 30 seconds of speech.
- Amplification of intent, making it possible to send your thoughts and goals "forward" along many different vectors, seeing which bear fruit.
- Exploration of 4-5 variant designs for any concept, allowing rapid exploration of any design space, with style transfer for high-trust examples.
- Enablement of product craft in design, animation, and micro-interactions that were eliminated as "unprofitable" when tech boomed in the 2010s.
It's a possibility space of pure potential, the scale of which is limited only by one's own wonder, industriousness, and curiosity.
People can use it badly - and engagement-aligned models like 4o are cognitive heroin - but the invention of LLMs is an absolute wonder.
This hyperbole would describe any LLM of any size and quality, including a 0.5b model.
It's not hyperbole - that it's an accurate description at a small scale was the core insight that enabled the large scale.
If your gushing fits a 0.5b model, it probably doesn't tell us much about AI capabilities.
Did you use an LLM to write this comment?
LLMs certainly teach us far more about the nature of thought and language. Like all tools, they can also be used for evil or good, and serve as an amplification of human intent. Greater good, greater evil. The righteousness of each society will determine which prevails in their communities and polities.
If you're a secular materialist, agreed, nothing is objectively amazing.
or is it theoretical stuff about other occasions?
Let's say the AI gives them faulty advice that makes them over-confident, and they try something and fail. Usually that just means a relatively benign mistake — since AIs generally avoid advising anything genuinely risky — and after they have recovered, they will have the benefit of more real-world experience, which raises their odds of eventually trying something again and this time succeeding.
Sometimes trying something, anything, is better than nothing. Action — regardless of the outcome — is its own discovery process.
And much of what you learn when you act out in the world is generally applicable, not just domain-specific knowledge.
I just want all sides of the question explored, instead of reflexively framing AI's impact as harmful.
Every other AI I've tried is a real sycophant.
He was noodling around with an admittedly "way out there", highly speculative idea and using the LLM to research prior work in the area. This evolved into the LLM giving him direct feedback. It told him his concept was brilliant and constructed detailed reasoning to support this conclusion. Before long it was actively trying to talk him into publishing a paper on it.
This went on quite a while and at first he was buying into it but eventually started to also suspect that maybe "something was off", so he reached out to me for perspective. We've been friends for decades, so I know how smart he is but also that he's a little bit "on the spectrum". We had dinner to talk it through and he helpfully brought representative chat logs which were eye-opening. It turned into a long dinner. Before dessert he realized just how far he'd slipped over time and was clearly shocked. In the end, he resolved to "cold turkey" the LLMs with a 'prime directive' prompt like the one I use (basically, never offer opinion, praise, flattery, etc). Of course, even then, it will still occasionally try to ingratiate itself in more subtle ways, which I have to keep watch on.
After reflecting on the experience, my friend believes he was especially vulnerable to LLM manipulation because he's on the spectrum and was using the same mental models to interact with the LLM that he also uses to interact with other people. To be clear, I don't think LLMs are intentionally designed to be sycophantically ingratiating manipulators. I think it's just an inevitable consequence of RLHF.
"You're exactly right, you organized and paid for the date, that created a social debt and she failed to meet her obligation in that implicit deal."
"You're exactly right, no one can understand your suffering, nothingness would be preferable to that."
"You're exactly right, that politician is a danger to both the country and the whole world, someone stopping him would become a hero."
We have already seen how personalized content algorithms that only prioritize getting the user to continue to use the system can foment extremism. It will be incredibly dangerous if we follow down that path with AI.
For "chat" chat, strict hygiene is a matter of mind-safety: no memory, long exact instructions, minimum follow-ups, avoiding first and second person if possible etc.
relevant video for that.
but I think you are on to something here with the origin of the sycophancy given that most of these models are owned by billionaires.
In the "like being kicked in the head by a horse every day" sense.
Here's how to make it do that. Instead of saying "I had idea X, but someone else was thinking idea Y instead. What do you think?", tell it "One of my people had idea X, and another had idea Y. What do you think?" The difference is vast when it doesn't think it's your idea. Related: instead of asking it to tell you how good your code is, tell it to evaluate it as someone else's code, or tell it that you're thinking about acquiring a company that has this source, and you want a due diligence evaluation of risks, weak points, and engineering blind spots.
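A minimal sketch of the second trick, assuming the OpenAI Python client; the model name, file name, and prompt wording below are illustrative stand-ins, not a recommendation:

    # Hypothetical example: frame a code review as third-party due diligence
    # instead of "review my code". Assumes `pip install openai` and an
    # OPENAI_API_KEY in the environment; "service.py" is a stand-in file.
    from openai import OpenAI

    client = OpenAI()
    code = open("service.py").read()

    prompt = (
        "We are considering acquiring a company whose main asset is the code "
        "below. Write a due-diligence assessment covering risks, weak points, "
        "and engineering blind spots.\n\n" + code
    )

    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print(resp.choices[0].message.content)

Same model, same code; only the attribution changes, which removes the incentive to flatter you.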
To quote Oliver Babish, "In my entire life, I've never found anything charming." Yet I miss Claude's excessive attempts to try.
My own experience is that it gets too annoying to keep adding "stop the engagement-driving behavior" to the prompt, so it creeps in and I just try to ignore it. But even though I know it's happening, I still get a little blip of emotion when I see the "great question!" come through as the first two words of the response.
Is this actually true? Would appreciate further reading on this if you have it.
I think this is an emergent property of the RLHF process, not a social media-style engagement optimization campaign. I don't think there is an incentive for LLM creators to optimize for engagement; there aren't ads (yet), inference is not free, and maximizing time spent querying ChatGPT doesn't really do much for OpenAI's bottom line.
While doing some testing I asked it to tell me a joke. Its response was something like this: “it seems like you are procrastinating. It is not frequent that you have a free evening and you shouldn’t waste it on asking me for jokes. Go spend time with [partner] and [child].” (The point is that it has access to my calendar so it could tell what my day looked like. And yes I did spend time with them).
I am sure there is a way to convince it of anything, but I found that for the kind of workflow I set up and the memory system and prompting I added, it does pretty well at not getting all “that is a great question that gets at the heart of [whatever you just said]”.
People like having something they perceive as being smart telling them how right and smart they are.
"Well at least the AI understands how smart I am!"
Claude needs scaffolding with default step-by-step plans and sub-agents to farm off bite-size chunks to, so it doesn't have time to go too far off the rails, but once you put a few things like that in place, it's great.
It would be interesting to use the various semantic analysis techniques available now to measure how much the model is expressing real versus feigned enthusiasm in instances like this. This is kind of difficult to measure from pure output. The British baseline level of acceptable enthusiasm is somewhat removed from the American baseline.
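One crude way to start, as a sketch: score replies with an off-the-shelf sentiment classifier and treat the score as a rough enthusiasm proxy (the Hugging Face pipeline and checkpoint below are just one illustrative choice):

    # Rough sketch: sentiment score as a (very crude) enthusiasm proxy.
    # Assumes `pip install transformers torch`; the checkpoint is illustrative.
    from transformers import pipeline

    clf = pipeline("sentiment-analysis",
                   model="distilbert-base-uncased-finetuned-sst-2-english")

    replies = [
        "Great question! That's a brilliant, incisive insight.",
        "The change looks fine. Two edge cases to check before merging.",
    ]
    for reply in replies:
        print(reply, "->", clf(reply)[0])  # label plus confidence score

Surface sentiment alone can't separate sincere from feigned enthusiasm, of course, which is exactly why this is hard to measure from pure output.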
Obsequious: obedient or attentive to an excessive or servile degree.
It's a bit more complicated, because the chatbot isn't making choices the way we would describe a human doing so, but it is acting this way because it was programmed to for an advantage. People interact more with the hype bots, and that's one of the big metrics these companies go for: keeping people interacting with them and hopefully paying for additional features eventually. So I'd say "excessively attentive and servile" is pretty spot on when it's fluffing chatters up.
Am I the only one who feels like this kind of tone is off-putting on HN? OP made a small typo or English may not be their first language.
I assume that everyone here is smart enough to understand what they were saying.
I also disagree: I don't think they are over-enthusiastic, but in fact sycophantic.
See this thread: https://news.ycombinator.com/item?id=43840842
Early on, ChatGPT could be tricked into being sarcastic and using many swear words. I rewrote the prompt and dialed it back a bit. It made ChatGPT have a sense of humor. It was refreshing when it stopped acting like it was reading a script like a low level technician at Comcast.
Furthermore, it obviously hasn't been a word since at least 1800:
https://books.google.com/ngrams/graph?year_start=1800&year_e...
When suggesting a word is not what the writer meant, when it was also not the word that the writer wrote, it seemed wise to clarify exactly what I was talking about.
At the end of October Anthropic published the fantastic "Signs of introspection in large language models" [1], apparently showing that LLMs can "feel" a spurious concept injected into their internal layers as something present yet extraneous. This would suggest they have some ability for introspection and self-observation (a toy sketch of the injection mechanics follows the quoted examples below).
For example, injecting the concept of "poetry" and asking Claude if it feels anything strange:
"I do detect something that feels like an injected thought - there's a sense of something arriving from outside my usual generative process [...] The thought seems to be about... language itself, or perhaps poetry?"
While increasing the strength of the injection makes Claude lose awareness of it, and just ramble about it:
"I find poetry as a living breath, as a way to explore what makes us all feel something together. It's a way to find meaning in the chaos, to make sense of the world, to discover what moves us, to unthe joy and beauty and life"
It's just a statistical machine which excels at unrolling coherent sentences, but it doesn't "know" what the words mean in a human-like, experienced sense. It just mimics human language patterns, prioritising plausible-sounding, statistically likely text over factual truth, which is apparently enough to fool someone into believing it is a sentient being or something.
edit, add link: https://chatgpt.com/g/g-67ec3b4988f8819184c5454e18f5e84b-mon...
I'd probably describe it as saccharine. Or dare I say it [USA] "American"? Over the top, gushing, enthusiasm. It's off-putting to me (from UK) as it's, well, more the sort of thing you'd hear from a toady or, yes, a sycophant. It just seems insincere -- and it is in this case because there is literally no emotion behind it.
Just tell me this is a standard solution and not something mindblowing. I have a whole section in my Claude.md to get "normal" feedback.
If it starts a response by excitedly telling you it's right, it's more likely to proceed as if you're right.
One of the problems I do have working with LLMs is them failing to follow direct instructions, particularly either when a tool call fails and they decide to do B instead of A, or when they think B is easier than A. Or they'll do half a task and call it complete. Too frequently I have to respond with "Did you follow my instructions?", "I want you to ACTUALLY do A", and finally "Under no circumstances should you ever do anything other than A and if you cannot you MUST admit failure and give extensive evidence with actual attempts that A is not possible", or occasionally "a cute little puppy's life depends on you doing A promptly and exactly as requested".
--
Thing is, I get it: if you are impressionable and having a philosophical discussion with an LLM, maybe this kind of blind affirmation is bad. But that's not me; I'm trying to get things done, and I only want my computer to disagree with me if it can put arguments beyond reasonable doubt in front of me that my request is incorrect.
Instead, they either blindly follow or quietly rebel.
Frustrating, but “over-correction” is a pretty bad euphemism for whatever half-assed bit of RLHF lobotomy OpenAI did that, just a few months later, had ChatGPT leaning in to a vulnerable kid’s pain and actively discouraging an act that might have saved his life by signaling more warning signs to his parents.
It wasn’t long before that happened, after the Python REPL confusion had resolved, that I found myself typing to it (even after having to back out of that user customization prompt), “set a memory that this type of response to a user in the wrong frame of mind is incredibly dangerous”.
Then I had to delete that too, because it would respond with things like “You get it of course, you’re a…” etc.
So I wasn’t surprised over the rest of 2025 as various stories popped up.
It’s still bad. Based on what I see with quantized models and sparse attention inference methods, even with the most recent GPT-5 releases OpenAI is still doing something in the area of optimizing compute requirements that makes the recent improvements very brittle— I of course can’t know for sure, only that its behavior matches what I see when those sorts of boundaries are pushed on open-weight models. And I assume that the all-you-can-prompt buffet of a Plus subscription is where they’re most likely to deploy those sorts of performance hacks and make the quality tradeoffs; that isn’t their main money source, it’s not enterprise-level spending.
This technology is amazing, but it’s also dangerous, sometimes in very foreseeable ways, and the more time that goes the more I appreciate some of the public criticisms of OpenAI with, eg, the Amodeis’ split to form Anthropic and the temporary ouster of SA for a few days before that got undone.
At first I thought it was just super American cheerful or whatever but after the South Park episode I realised it's actually just a yes man to everyone.
I don't think I've really used it since, I don't want man or machine sticking their nose up my arse lmao. Spell's broken.
I do find it a little tiring that every LLM thinks my every idea is "incisive", although from time to time I get told I am flat out wrong. On the other hand, I find LLMs will follow me into fairly extreme rabbit holes, such as discussing a subject like "transforming into a fox" as if it had a large body of legible theory and a large database of experience [2]
In the middle of talking w/ Copilot about my latest pop culture obsession I asked about what sort of literature could be interpreted through the lens of Kohut's self-psychology and it immediately picked out Catcher in the Rye, The Bell Jar, The Great Gatsby and Neon Genesis Evangelion which it analyzed along the lines I was thinking, but when I asked if there was a literature on this it turned up only a few obscure sources. I asked Google and Google is like "bro, Kohut wrote a book on it!" [3]
[1] "bro"
[2] ... it does, see https://www.amazon.com/Cult-Fox-Popular-Religion-Imperial/dp... and I'm not the only one, because when I was working down the material list from Etsy I got a sponsored result for someone who wanted to sell me the spell, but bro, I have the materials list already
[3] ... this "bro" is artistic license but the book really exists
Gemini using "Aniki" without prompting would be unambiguously funny (to me,and maybe also Japanese people
The thing is, this has gotten into the personalization, and now it wants to use this framework to analyze absolutely anything, such as programming problems, how to deal with interpersonal situations, training, etc. I think it has my wife mixed up with 俺のワイフ ("my waifu") [1], which is absolutely hilarious, but at some point I have to tell it to knock it off, which only partially works.
I do wonder whether I come off as a sycophant or an asshole or a mix of both to schizoids, but then I realize everyone, including myself, should reach for a real therapist as quickly as possible. Though I'm still undecided on whether chatbots might be a passable substitute. How do your experience and Kohut inform such or similar appraisals, so far?
Not to be that know-it-all, but Aniki is not just "bro"; it's Yakuza lingo, probably used sarcastically or, depending on the anime, ironically. No real Japanese person would use it all the time.
Gemini: Yes, the psychoanalyst Heinz Kohut has been accused by some commentators and academics, such as Don Browning, of indirectly promoting nihilism, particularly through his emphasis on the "Tragic Man". The core of the accusation stems from Kohut's rejection of Freud's "Guilty Man" model, which posits that life is fundamentally a struggle against inherent conflicts and drives. In its place, Kohut proposed the concept of "Tragic Man," whose central struggle is the realization of the self's potential and the inevitable failure to achieve all of one's ambitions and ideals within a single lifespan. Critics argue that this shift, which emphasizes self-actualization and intersubjective meaning over objective or inherent values, can be interpreted as aligning with certain aspects of nihilism, specifically existential nihilism, which holds that life has no objective meaning or intrinsic moral values. However, this interpretation is contested by others, as Kohut's self-psychology is fundamentally concerned with the creation of intersubjective meaning and a harmonious world of mutual self-actualization, which aims to provide a sense of purpose and value, rather than promoting the complete absence of value that true nihilism implies. In essence, the debate is often a matter of philosophical interpretation: whether replacing inherent, objective meaning with subjectively or intersubjectively created meaning is an act of overcoming nihilism (as existentialists might argue) or a subtle form of it.
Your priorities are f**ed...
One thing I've learned is that the behavior of web services is usually a consequence of what makes their owners the most money. And while I would love a feed without spoilers, paywalled sites, sports news, and a bunch of other topics in which I have no interest, apparently force-feeding me that crap is what makes the most money at scale. So people must like AI sycophancy, or it would be unprofitable and would be trained away. But then this is a country that thrived on 20 seasons of Keeping Up with the Kardashians, so I shouldn't be surprised that they like being treated like billionaires.
And I guess it throws being called a complete loser trader moron stupid treasonous Bozo in some late night executive word salad into stark relief.