In 2012, the era of deep learning AI got legs with the convolutional neural network (AlexNet) that won the ImageNet challenge. Those images were everyday objects, animals, and scenes, unrelated to health and medicine. Over 7 years ago, I wrote a review in Nature Medicine entitled High-Performance Medicine that summarized the remarkable progress being made in AI interpretation of medical images. Now virtually every type of medical image has undergone extensive assessment with AI, including X-rays, CT, MRI, ultrasound, pathology slides, skin abnormalities, electrocardiograms, endoscopy, and retinal photos.

A few weeks ago here in Ground Truths, and subsequently in The Lancet, I wrote about 3 AI tools that should be used for every mammogram, based on the largest randomized trial of >100,000 women and 2 recent FDA approvals. There have been 44 randomized trials of AI for colonoscopy that consistently, and in aggregate, demonstrate a substantial advantage of AI assistance for detecting adenomatous polyps compared with gastroenterologists working without AI, yet this has not been made part of standard medical practice.

In this edition of Ground Truths, I’m going to review the striking and paradoxical contrast between the adoption of AI in the deep learning era (DL, the pre-transformer models) and that of contemporary large language models (LLMs), a.k.a. generative AI, an outgrowth of the transformer model (yes, still a form of deep learning), made widely known by the release of ChatGPT in late 2022.