It's not a first experiment; it's experiment 50 or 60, and a reaction to AI gaslighting.
Most people don't really understand coding, but shopping is a far simpler task, so it's easier to see how and where the AI fails (i.e., with even mildly complex instructions).