Also, you need to see an analysis of the incorrect calls. The goal of a human doctor is not to get the highest accuracy; it's to limit total harm to the patient. There can be cases where the odds favor picking X (though perhaps not by much), but the safe thing to do is to rule out some other option first, or to start a safe treatment that covers several other possible options.

Simply getting the "high score" on this evaluation is not necessarily good medical treatment.
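
To make that concrete, here is a toy sketch with made-up numbers. The condition names, probabilities, and harm scores are all hypothetical and not taken from any study. It only shows that the most likely diagnosis and the harm-minimizing action can differ.

```python
# Hypothetical numbers for illustration only.
# Probability of each underlying condition given the presentation.
probs = {"benign": 0.70, "serious": 0.30}

# harm[action][condition]: harm (arbitrary units) if we take `action`
# and the true condition turns out to be `condition`.
harm = {
    "treat_as_benign": {"benign": 0, "serious": 100},
    "rule_out_serious_first": {"benign": 5, "serious": 10},
}

def expected_harm(action):
    """Probability-weighted harm of an action across all conditions."""
    return sum(probs[c] * harm[action][c] for c in probs)

# The "high score" answer: pick whatever is most probable.
most_likely = max(probs, key=probs.get)

# The cautious answer: pick the action with the lowest expected harm.
safest_action = min(harm, key=expected_harm)

print(most_likely)    # benign
print(safest_action)  # rule_out_serious_first
```

With these assumed numbers, "benign" wins on accuracy 70% of the time, yet ruling out the serious option first has a far lower expected harm (roughly 6.5 versus 30 units).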

Exactly this. Most diagnosis isn’t about pinpointing the exact underlying cause; it’s about ruling out the really bad stuff and minimising harm. Differential diagnosis just isn’t real-world medicine.
Yeah 100% this. We've all used AI. It's obvious that it can sometimes outperform humans in a "did it get the right answer" benchmark while being wildly worse overall because of worse failure modes.

I bet the AI's incorrect answers are less "I don't know, let's get a second opinion" and more "you're perfectly fine, 0% chance this is cancer".

At many (otherwise) world-leading facilities, even just reviewing the patient history is a slog. There is rarely any ability to keyword-search the records, or even to filter them by location or by the title and occupation of the healthcare professional who made the entry. Very ill people in particular will have hundreds and hundreds of recent entries.

And stepping through those entries isn’t like browsing a modern local-first app [1], where you can scroll through dozens of entries in milliseconds. It’s not even like the slightly older and slightly slower Gmail interface. You’re clicking on each record and waiting 400ms-3s for it to load, as if instead of a 25Gb fiber connection you’re on dialup, requesting each record from Epic’s headquarters in the US and proxying it via Australia.
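
For a rough sense of scale, here is a back-of-the-envelope sketch. The record count of 300 is my assumption (a stand-in for "hundreds and hundreds"); the per-record load times are the 400ms-3s range quoted above.

```python
# Assumed record count; the only figure given is "hundreds and hundreds".
n_records = 300

# Per-record load time range in seconds (400ms to 3s).
fast_s, slow_s = 0.4, 3.0

# Total time spent purely waiting for records to load, in minutes.
best_minutes = n_records * fast_s / 60
worst_minutes = n_records * slow_s / 60

print(f"{best_minutes:.0f} to {worst_minutes:.0f} minutes of pure waiting")
```

Under those assumptions that is somewhere between 2 and 15 minutes per patient spent just waiting for pages to load, before reading anything.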

[1] https://bugs.rocicorp.dev/p/roci
