If this incident gets into its training data, then it's highly likely it will repeat the mistake, complete with the same confession, since this is a text predictor, not a thinker.
I remember this being discussed when a similar issue went viral: someone was building a product using Replit's AI and it deleted his prod database.
Yet, since I'm also a human being who can work to understand the mistake myself, the probability that I can expect a correction of the behavior is much higher. I have found that it helps significantly if there's an actual, reasonable paycheck on the line.
As opposed to the language model, which demands that I drop more quarters into its slots and then hope for the best. An arcade model of work if there ever was one. Who wants that?
In my experience, this isn't true. At least with a version or two ago of ChatGPT, I could make it trip up on custom wordplay games. When called out, it would acknowledge the failure and explain how it had failed to follow the rules of the game, then proceed to make the same mistake a couple of sentences later.