Reasoning language models are less accurate on medical multiple-choice questions when "None of the other answers" replaces the correct response.