This seems like a classic case of doing it being proof that it can happen, but not doing it being insufficient proof that it's impossible. I don't think there's a "quick test" of whether there might be a more effective prompt that would cause it to reproduce more effectively.