What did they create? Two key innovations:
— Dr. Dominic Ng (@DrDominicNg) June 30, 2025
1. SDBench: A testing environment using 304 real medical mysteries from NEJM where AI starts with just "29yo woman with sore throat" and must decide what to ask/test next
2. MAI-DxO: An AI system that simulates 5 doctors working… pic.twitter.com/G8orzGAuOk
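To make the "starts with one line, must decide what to ask/test next" setup concrete, here is a minimal sketch of a sequential-diagnosis environment. It is not the SDBench code: the `Case` and `SequentialEnv` names, the action strings, and all findings are hypothetical placeholders. Only the opening vignette is quoted from the thread.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """One hidden case; only `vignette` is visible at the start."""
    vignette: str
    findings: dict[str, str]  # action -> result revealed on request
    diagnosis: str


@dataclass
class SequentialEnv:
    """Reveals findings one request at a time and records every request."""
    case: Case
    log: list[str] = field(default_factory=list)

    def start(self) -> str:
        return self.case.vignette

    def request(self, action: str) -> str:
        # Every question or test is logged, so a scorer could penalise over-testing.
        self.log.append(action)
        return self.case.findings.get(action, "not available")

    def submit(self, diagnosis: str) -> bool:
        # Case-insensitive exact match; a real grader would need to be more lenient.
        return diagnosis.strip().lower() == self.case.diagnosis.lower()


# Toy case: the vignette is from the thread, everything else is a placeholder.
toy = Case(
    vignette="29yo woman with sore throat",
    findings={
        "ask: duration": "placeholder answer",
        "test: throat swab": "placeholder result",
    },
    diagnosis="placeholder diagnosis",
)
env = SequentialEnv(toy)
opening = env.start()
first = env.request("ask: duration")
missing = env.request("test: biopsy")
correct = env.submit("Placeholder Diagnosis")
```

Because the environment logs every request, the same loop that scores accuracy could also count how many tests an agent ordered before committing to a diagnosis.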
MAI-DxO isn't a new model but instead a framework built on top of existing LLMs (ChatGPT, Claude, Gemini).
How does this framework work?
It asks the LLM to simulate a virtual panel of 5 specialised AI doctors:
Dr. Hypothesis (tracks diagnoses)
Dr. Test-Chooser (selects optimal… pic.twitter.com/sYCV6zKmuU
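A rough sketch of what "asking the LLM to simulate a panel" could look like as a prompt builder. This is an illustration only, not the actual MAI-DxO prompt: `build_panel_prompt` is a hypothetical helper, the wording of the instructions is assumed, and only the two roles named in the thread are included because the rest of the list is truncated in the source.

```python
def build_panel_prompt(case_summary: str, roles: dict[str, str]) -> str:
    """Assemble a single prompt asking one LLM to role-play several doctors."""
    lines = [
        f"Simulate a panel of {len(roles)} doctors discussing this case.",
        f"Case so far: {case_summary}",
        "Panel members:",
    ]
    lines += [f"- {name}: {duty}" for name, duty in roles.items()]
    # Assumed action set: the thread only says the AI decides what to ask/test next.
    lines.append("After the discussion, reply with exactly one next step: ask, test, or diagnose.")
    return "\n".join(lines)


# Only the roles visible in the thread; the second description is cut off in the source.
roles = {
    "Dr. Hypothesis": "tracks diagnoses",
    "Dr. Test-Chooser": "selects optimal… (description truncated in the source)",
}
prompt = build_panel_prompt("29yo woman with sore throat", roles)
```

The resulting string would be sent to whichever underlying model is in use, which is consistent with the point that the framework sits on top of existing LLMs rather than being a new model.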
But there are five issues I see:
1. They used ZERO healthy patients
95% of sore throats are viral and this AI was only tested on incredibly rare diagnostic cases.
We don't know if it will order biopsies on every patient with a sore throat "just to rule out rhabdomyosarcoma."
3. The physician comparison was rigged
Docs were banned from:
❌ Googling symptoms
❌ Consulting colleagues
❌ Using UpToDate/medical databases
❌ Calling specialists
That's not how we practice!!
It's like testing a chef who can't use recipes or taste their food.
5. No "When to Stop" Testing
Great doctors know when NOT to test. This AI was never evaluated on:
"This headache is just stress"
"Let's wait and see"
"More tests will cause more harm than good"
The benchmark rewards finding zebras, not recognising horses.
Final thought: We don't need AI that can diagnose every rare disease. We need AI that knows when to diagnose and when to reassure. That's the real art of medicine.
But what do you think?
If you liked this post, please follow me @DrDominicNg and retweet.
It takes me some time to read and write these posts so I'd love to get more people's thoughts on it!
I've also just started a new newsletter on neuroscience: https://t.co/l0Ii2PPLCg