The question reveals that the model has a model of language but not of reality. It knows which words go together, but not the real-world concepts behind them.
For most of them, we’d worry that a human who produced the same answer while trying their hardest was having a stroke.
Also, naysayers apparently DO have a compelling reason.