The issue is that those hypothetical scenarios need not resemble how patients actually interact with the tool.

Real-life use is full of ill-posed questions, open-ended statements, inaccurate assessments of symptoms, and conclusory remarks sprinkled in between. Real use of chatbots for health by non-clinicians looks very different from scenario-based evaluation.

You would give those hypothetical scenarios to doctors too, and the results would then be assessed by doctors who don't know whether a given answer came from an AI or a doctor.

From the paper:

> Three physicians independently assigned gold-standard triage levels based on cited clinical guidelines and clinical expertise, with high inter-rater agreement

You can start by comparing "doctor" care vs. "doctor who also uses AI" care.