Can you reproduce irreproducibility?

Give me a question that an LLM answers vastly differently across runs.

I keep hearing how it's dumb and wrong, but no one ever shares the chat or the prompt.

reply
Try this with ChatGPT, Grok, or Claude:

How many days of the week contain the letter d?

The answer I get is 3 with ChatGPT and Grok, and 6 with Claude.
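
For reference, the ground truth is easy to check outside any chatbot. Here is a minimal sketch in Python, assuming the standard English day names and a case-insensitive match; `DAYS` and `count_days_with_letter` are names made up for this illustration.

```python
# Hypothetical helper for illustration: counts English day names containing a letter.
DAYS = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def count_days_with_letter(letter: str) -> int:
    """Return how many day names contain `letter`, ignoring case."""
    target = letter.lower()
    return sum(1 for day in DAYS if target in day.lower())


if __name__ == "__main__":
    print(count_days_with_letter("d"))  # prints 7
```

Every name ends in "day", so all seven contain a d.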

reply
I just tried this with ChatGPT only, twice: the web interface in a Firefox private window and in a Chrome incognito window. I asked both the identical question: "How many names of the days of the week contain the letter D?"

In Firefox I got 6. In Chrome I got 7. LLMs are not even self-consistent.

I have the screenshots if anyone cares.

reply