New AI model efficiently reaches clinical-expert-level accuracy in complex medical scans

New framework has wide adaptability across a variety of imaging modalities

October 1, 2024

By Kevin McClanahan

5 min read

UCLA researchers have developed a deep-learning framework that teaches itself quickly to automatically analyze and diagnose MRIs and other 3D medical images – with accuracy matching that of medical specialists in a fraction of the time. An article describing the work and the system’s capabilities is published in Nature Biomedical Engineering(Link is external).

Unlike the few other models being developed to analyze 3D images, the new framework has wide adaptability across a variety of imaging modalities. The developers have studied it with 3D retinal scans (optical coherence tomography) for disease risk biomarkers, ultrasound videos for heart function, 3D MRI scans for liver disease severity assessment, and 3D CT for chest nodule malignancy screening. They say it provides a foundation that could prove valuable in numerous other clinical settings as well, and studies are planned.

Artificial neural networks train themselves by performing many repeated calculations and screening extremely large datasets examined and labeled by clinical experts. Unlike standard 2D images that show length and width, 3D imaging technologies add depth, and these “volumetric,” or 3D, images take more time, skill and attention for an expert’s interpretation. For example, a 3D retinal imaging scan may be composed of nearly 100 2D images, requiring a few-minute close inspection by a highly trained clinical specialist to detect delicate disease biomarkers, such as measuring the volume of an anatomical swelling.

“While there are many AI (artificial intelligence) methods for analyzing 2D biomedical imaging data, compiling and annotating large volumetric datasets that would be required for standard 3D models to exhaust AI’s full potential is infeasible with standard resources. Several models exist, but their training efforts typically focus on a single imaging modality and a specific organ or disease,” said Oren Avram, PhD,(Link is external) a postdoctoral researcher at UCLA Computational Medicine(Link is external) and a co-first author of the paper.

The UCLA computer model, called SLIViT, for SLice Integration by Vision Transformer, consists of a unique combination of two artificial intelligence components and a unique learning approach that researchers say allow it to accurately predict disease risk factors from medical scans across multiple volumetric modalities with moderately sized labeled datasets.

“SLIViT overcomes the training dataset size bottleneck by leveraging prior ‘medical knowledge’ from the more accessible 2D domain,” said Berkin Durmus, a UCLA PhD student and co-first author of the article. He and Avram are researchers affiliated with the UCLA Samueli School of Engineering(Link is external) and other UCLA schools and departments.

“We show that SLIViT, despite being a generic model, consistently achieves significantly better performance compared to domain-specific state-of-the-art models. It has clinical applicability potential, matching the accuracy of manual expertise of clinical specialists while reducing time by a factor of 5,000. And unlike other methods, SLIViT is flexible and robust enough to work with clinical datasets that are not always in perfect order,” Durmus said.

Avram said SLIViT’s automated annotation may benefit patients and clinicians by improving diagnostic efficiency and timeliness, and it advances and expedites medical research by reducing data acquisition costs and duration. Additionally, it provides a foundation model to accelerate development of future predictive models.

“What thrilled me most was SLIViT’s remarkable performance under real-life conditions, particularly with low-number training datasets,” said SriniVas R. Sadda, MD, a professor of Ophthalmology at UCLA Health and the Artificial Intelligence & Imaging Research director at the Doheny Eye Institute(Link is external). “SLIViT thrives with just hundreds – not thousands – of training samples for some tasks, giving it a substantial advantage over other standard 3D-based methods in almost every practical case related to 3D biomedical imaging annotation.”

Eran Halperin, PhD(Link is external), an adjunct professor of computer science at UCLA Samueli and a professor at the Computational Medicine Department, which is affiliated with both UCLA Samueli and the UCLA David Geffen School of Medicine(Link is external), said that even if financial resources were unlimited, ongoing research will always face challenges posed by limited training datasets – in clinical environments, for instance, or when considering emerging biomedical-imaging modalities.

“When a new disease-related risk factor is identified, it can take months to train specialists to accurately annotate the new factor at scale in biomedical images,” he said. “But with a relatively small dataset, which a single trained clinician can annotate in just a few days, SLIViT can dramatically expedite the annotation process for many other non-annotated volumes, achieving performance levels comparable to clinical specialists.”

Halperin and Sadda are co-senior authors of the paper.

In addition to expanding their studies to include additional treatment modalities, the researchers plan to investigate how SLIViT can be leveraged for predictive disease forecasting to enhance early diagnosis and treatment planning. To promote its clinical applicability, they also will explore ways to ensure that systematic biases in AI models do not contribute to health disparities.

# # #

Authors A full list of authors is provided with the article.

Funding This work was supported by National Institutes of Health/National Institute of General Medical Sciences grant 5R25GM135043, NIH/National Institute of Biomedical Imaging and Bioengineering grant R01EB035028, NIH/National Eye Institute grants R01EY023164 and 1R01EY030614, and an Unrestricted Grant from Research to Prevent Blindness Inc. This research was conducted using the UK Biobank Resource under application #33127.

Competing interests E.H. has an affiliation with Optum. S.R.S. has affiliations with Abbvie/Allergan, Alexion, Amgen, Apellis, ARVO, Astellas, Bayer, Biogen, Boerhinger Ingelheim, Carl Zeiss Meditec, Centervue, Character, Eyepoint, Heidelberg, iCare, IvericBio, Jannsen, Macula Society, Nanoscope, Nidek, NotalVision, Novartis, Optos, OTx, Pfizer, Regeneron, Roche, Samsung Bioepis, Topcon.

Article(Link is external) Accurate prediction of disease-risk factors from volumetric medical scans by a deep vision model pre-trained with 2D scans. Nature Biomedical Engineering DOI: 10.1038/s41551-024-01257-9.

View All News & Insights