upvote
Then Google shouldn't be using something so unreliable for anything important. Arguing that random users should know the difference between cheap and frontier models is also not compelling. It's all the same "AI" to most people.
reply
You are mistaken. ChatGPT Health [1] is a model specifically designed for health applications and was co-developed with a benchmark suite, HealthBench [2], for testing against health conditions. This study suggests that the people working on HealthBench have some concerning external validity problems.

[1] https://openai.com/index/introducing-chatgpt-health/

[2] https://cdn.openai.com/pdf/bd7a39d5-9e9f-47b3-903c-8b847ca65...

reply
GP was referring to Google's search AI not ChatGPT Health.
reply