AI thought knee X-rays show if you drink beer -- they don't

Study shows how easily AI models can give right answers for the wrong reasons

Date:
December 11, 2024
Source:
Dartmouth College
Summary:
A new study highlights a hidden challenge of using AI in medical imaging research -- the phenomenon of highly accurate yet potentially misleading results known as 'shortcut learning.' The researchers analyzed thousands of knee X-rays and found that AI models can 'predict' unrelated and implausible traits such as whether patients abstained from eating refried beans or drinking beer. While these predictions have no medical basis, the models achieved high levels of accuracy by exploiting subtle and unintended patterns in the data.
FULL STORY

Artificial intelligence can be a useful tool to health care professionals and researchers when it comes to interpreting diagnostic images. Where a radiologist can identify fractures and other abnormalities from an X-ray, AI models can see patterns humans cannot, offering the opportunity to expand the effectiveness of medical imaging.

But a study in Scientific Reports highlights a hidden challenge of using AI in medical imaging research -- the phenomenon of highly accurate yet potentially misleading results known as "shortcut learning."

The researchers analyzed more than 25,000 knee X-rays from the National Institutes of Health-funded Osteoarthritis Initiative and found that AI models can "predict" unrelated and implausible traits such as whether patients abstained from eating refried beans or drinking beer. While these predictions have no medical basis, the models achieved surprising levels of accuracy by exploiting subtle and unintended patterns in the data.

"While AI has the potential to transform medical imaging, we must be cautious," says the study's senior author, Dr. Peter Schilling, an orthopaedic surgeon at Dartmouth Health's Dartmouth Hitchcock Medical Center and an assistant professor of orthopaedics in Dartmouth's Geisel School of Medicine.

"These models can see patterns humans cannot, but not all patterns they identify are meaningful or reliable," Schilling says. "It's crucial to recognize these risks to prevent misleading conclusions and ensure scientific integrity."

The researchers examined how AI algorithms often rely on confounding variables -- such as differences in X-ray equipment or clinical site markers -- to make predictions rather than medically meaningful features. Attempts to eliminate these biases were only marginally successful -- the AI models would just "learn" other hidden data patterns.
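
To make the idea of a confounding shortcut concrete, here is a minimal sketch on synthetic data. It is not the study's method or data: the two clinical sites, the beer-drinking rates, the "site marker" feature, and the simple one-threshold classifier are all assumptions chosen for illustration. The toy model scores well while beer drinking happens to track the clinical site, then falls to roughly chance once that link is removed.

```python
import random

def make_patients(n, p_beer_site_a, p_beer_site_b, rng):
    """Synthetic 'X-rays': each row is [site_marker, anatomy].

    Hypothetical setup: the label (drinks beer) depends only on which
    site the patient visited, never on the anatomy feature.
    """
    rows, labels = [], []
    for _ in range(n):
        site = rng.choice([0, 1])
        p_beer = p_beer_site_a if site == 0 else p_beer_site_b
        labels.append(1 if rng.random() < p_beer else 0)
        site_marker = site + rng.gauss(0, 0.1)  # e.g. a scanner artifact
        anatomy = rng.gauss(0, 1)               # unrelated to the label
        rows.append([site_marker, anatomy])
    return rows, labels

def predict(stump, row):
    feature, threshold, sign = stump
    return 1 if sign * (row[feature] - threshold) > 0 else 0

def accuracy(stump, rows, labels):
    hits = sum(predict(stump, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)

def fit_stump(rows, labels):
    """Pick the single feature/threshold/direction with best training accuracy."""
    best_stump, best_acc = None, -1.0
    for feature in range(len(rows[0])):
        for threshold in sorted({r[feature] for r in rows}):
            for sign in (1, -1):
                stump = (feature, threshold, sign)
                acc = accuracy(stump, rows, labels)
                if acc > best_acc:
                    best_stump, best_acc = stump, acc
    return best_stump

rng = random.Random(0)

# Training data where beer drinking happens to track the clinical site.
train_rows, train_labels = make_patients(500, 0.9, 0.1, rng)
stump = fit_stump(train_rows, train_labels)

# Held-out data with the same confound, and data where the confound is gone.
confounded_rows, confounded_labels = make_patients(2000, 0.9, 0.1, rng)
balanced_rows, balanced_labels = make_patients(2000, 0.5, 0.5, rng)

acc_confounded = accuracy(stump, confounded_rows, confounded_labels)
acc_balanced = accuracy(stump, balanced_rows, balanced_labels)

print("feature used:", "site marker" if stump[0] == 0 else "anatomy")
print("accuracy with confound:   ", round(acc_confounded, 3))
print("accuracy without confound:", round(acc_balanced, 3))
```

A held-out test set drawn from the same sites would not expose the problem in this sketch, because it carries the same confound as the training data. Only the evaluation on data where the site no longer predicts the label reveals that the model learned the shortcut.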

"This goes beyond bias from clues of race or gender," says Brandon Hill, a co-author of the study and a machine learning scientist at Dartmouth Hitchcock. "We found the algorithm could even learn to predict the year an X-ray was taken. It's pernicious -- when you prevent it from learning one of these elements, it will instead learn another it previously ignored. This danger can lead to some really dodgy claims, and researchers need to be aware of how readily this happens when using this technique."

The findings underscore the need for rigorous evaluation standards in AI-based medical research. Overreliance on standard algorithms without deeper scrutiny could lead to erroneous clinical insights and treatment pathways.

"The burden of proof just goes way up when it comes to using models for the discovery of new patterns in medicine," Hill says. "Part of the problem is our own bias. It is incredibly easy to fall into the trap of presuming that the model 'sees' the same way we do. In the end, it doesn't."

"AI is almost like dealing with an alien intelligence," Hill continues. "You want to say the model is 'cheating,' but that anthropomorphizes the technology. It learned a way to solve the task given to it, but not necessarily how a person would. It doesn't have logic or reasoning as we typically understand it."

Schilling, Hill, and study co-author Frances Koback, a third-year medical student in Dartmouth's Geisel School, conducted the study in collaboration with the Veterans Affairs Medical Center in White River Junction, Vt.


Story Source:

Materials provided by Dartmouth College. Note: Content may be edited for style and length.


Journal Reference:

  1. Brandon G. Hill, Frances L. Koback, Peter L. Schilling. The risk of shortcutting in deep learning algorithms for medical imaging research. Scientific Reports, 2024; 14 (1) DOI: 10.1038/s41598-024-79838-6

Cite This Page:

Dartmouth College. "AI thought knee X-rays show if you drink beer -- they don't." ScienceDaily. ScienceDaily, 11 December 2024. <www.sciencedaily.com/releases/2024/12/241211143855.htm>.
Dartmouth College. (2024, December 11). AI thought knee X-rays show if you drink beer -- they don't. ScienceDaily. Retrieved January 26, 2025 from www.sciencedaily.com/releases/2024/12/241211143855.htm
Dartmouth College. "AI thought knee X-rays show if you drink beer -- they don't." ScienceDaily. www.sciencedaily.com/releases/2024/12/241211143855.htm (accessed January 26, 2025).