Dear colleagues,
Since our prior Perspectives piece on artificial intelligence (AI) in GI and Hepatology in 2022, the field has seen almost exponential growth. Expectations are high that AI will revolutionize our field and significantly improve patient care. But as the global discussion on AI has shown, there are real challenges with adoption, including issues with accuracy, reliability, and privacy.
In this issue, Dr. Nabil M. Mansour and Dr. Thomas R. McCarty explore the current and future impact of AI on gastroenterology, while Dr. Basile Njei and Yazan A. Al-Ajlouni assess its role in hepatology. We hope these pieces will inform your discussions about incorporating or researching AI in your own practices. We welcome your thoughts on this issue on X @AGA_GIHN.
Gyanprakash A. Ketwaroo, MD, MSc, is associate professor of medicine, Yale University, New Haven, Conn., and chief of endoscopy at West Haven (Conn.) VA Medical Center. He is an associate editor for GI & Hepatology News.
Artificial Intelligence in Gastrointestinal Endoscopy
BY THOMAS R. MCCARTY, MD, MPH; NABIL M. MANSOUR, MD
The last few decades have seen an exponential increase in interest in artificial intelligence (AI) and in the adoption of deep learning algorithms within healthcare and patient care services. The field of gastroenterology and endoscopy has similarly seen a tremendous uptake in acceptance and implementation of AI for a variety of gastrointestinal conditions. The spectrum of AI-based applications includes detection and diagnostic tools as well as therapeutic assistance tools. From the first US Food and Drug Administration (FDA)-approved device that uses machine learning to assist clinicians in detecting lesions during colonoscopy, to other more innovative machine learning techniques for small bowel, esophageal, and hepatobiliary conditions, AI has dramatically changed the landscape of gastrointestinal endoscopy.
Approved applications for colorectal cancer
In an attempt to improve colorectal cancer screening and outcomes related to screening and surveillance, efforts have been focused on procedural performance metrics, quality indicators, and tools to aid in lesion detection and improve quality of care. One such tool has been computer-aided detection (CADe), with early randomized controlled trial (RCT) data showing significantly increased adenoma detection rate (ADR) and adenomas per colonoscopy (APC).1-3
Ultimately, these data led to FDA approval of the CADe system GI Genius (Medtronic, Dublin, Ireland) in 2021.4 Additional systems have since been FDA approved or 510(k) cleared, including Endoscreener (Wision AI, Shanghai, China), SKOUT (Iterative Health, Cambridge, Massachusetts), MAGENTIQ-COLO (MAGENTIQ-EYE LTD, Haifa, Israel), and CAD EYE (Fujifilm, Tokyo), all of which have shown increased ADR and/or increased APC and/or reduced adenoma miss rates in randomized trials.5
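For readers less familiar with the two metrics cited above, the sketch below shows how they are conventionally computed: ADR as the proportion of colonoscopies in which at least one adenoma is found, and APC as the total number of adenomas divided by the number of colonoscopies. The class name, function names, and sample counts are hypothetical and chosen only for illustration; they are not drawn from any trial or CADe product.

```python
from dataclasses import dataclass


@dataclass
class Colonoscopy:
    """One colonoscopy; `adenomas` is the number of adenomas found (hypothetical record)."""
    adenomas: int


def adenoma_detection_rate(procedures):
    """Fraction of procedures in which at least one adenoma was found."""
    if not procedures:
        raise ValueError("need at least one procedure")
    return sum(1 for p in procedures if p.adenomas >= 1) / len(procedures)


def adenomas_per_colonoscopy(procedures):
    """Mean number of adenomas found per procedure."""
    if not procedures:
        raise ValueError("need at least one procedure")
    return sum(p.adenomas for p in procedures) / len(procedures)


# Hypothetical counts for illustration only, not data from any study.
sample = [Colonoscopy(n) for n in (0, 0, 1, 3, 0, 2, 0, 0, 1, 0)]
adr = adenoma_detection_rate(sample)    # 4 of 10 procedures had >= 1 adenoma
apc = adenomas_per_colonoscopy(sample)  # 7 adenomas across 10 procedures
print(f"ADR = {adr:.0%}, APC = {apc:.2f}")
```

Because ADR counts each procedure at most once, two endoscopists can share the same ADR while differing in APC, which is one reason trials often report both.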
Yet despite the promise of improved quality and subsequent translation to better patient outcomes, there has been a noticeable disconnect between RCT data and more real-world literature.6 In a recent study, no improvement in ADR was seen after implementation of a CADe system for colorectal cancer screening, among either higher- or lower-ADR performers. Tracked over time after implementation, CADe had no positive effect in any group, in contrast to early RCT data. In a more recent multicenter, community-based RCT, CADe again did not result in a statistically significant difference in the number of adenomas detected.7 The differences between some of these more recent “real-world” studies and the majority of data from RCTs raise important questions regarding the potential for bias (due to unblinding) in prospective trials, as well as the role of the human-AI interaction.
Importantly for RCT data, both cohorts in these studies met adequate ADR benchmarks, though it remains unclear whether a truly increased ADR translates into better patient outcomes: is higher always better? In addition, an important consideration in evaluating any AI/CADe system is that these systems often undergo frequent updates, each promising improved accuracy, sensitivity, and specificity. This creates a dilemma and raises questions about the enduring relevance of studies conducted using an outdated version of a CADe system.
Additional unanswered questions remain regarding an ideal ADR for implementation, preferred patient populations for screening (especially younger individuals), and the role and adoption of computer-aided polyp diagnosis/characterization (CADx) within the United States. Furthermore, questions regarding procedural withdrawal time, impact on sessile serrated lesion detection, cost-effectiveness, and preferred adoption strategies have begun to be explored, though more data are required to define a best-practice approach. Ultimately, answers to some of these unknowns may explain the discordant results and help guide future implementation measures.
Innovative applications for alternative gastrointestinal conditions
Given the fervor and excitement, as well as the outcomes associated with AI-based colorectal screening, it is not surprising these techniques have been expanded to other gastrointestinal conditions. At this time, all of these are fledgling, mostly single-center tools, not yet ready for widespread adoption. Nonetheless, these represent a potentially important step forward for difficult-to-manage gastrointestinal diseases.
Machine learning CADe systems have been developed to help identify early Barrett’s neoplasia, assess the depth of invasion of gastric cancer, and detect lesions in small bowel video capsule endoscopy.8-10 Endoscopic retrograde cholangiopancreatography (ERCP)-based applications for cholangiocarcinoma and indeterminate stricture diagnosis have also been studied.11 Additional AI-based algorithms have been employed for complex procedures such as endoscopic submucosal dissection (ESD) or peroral endoscopic myotomy (POEM) to delineate vessels, better define tissue planes for dissection, and visualize landmark structures.12,13 Furthermore, AI-based scope guidance/manipulation, bleeding detection, landmark identification, and lesion detection have the potential to revolutionize endoscopic training and education. The impact that generative AI can potentially have on clinical practice is also an exciting prospect that warrants further investigation.
Artificial intelligence adoption in clinical practice
Clinical practice with regard to AI and colorectal cancer screening largely mirrors the disconnect in the current literature, with “believers” and “non-believers” as well as innovators and early adopters alongside laggards. In our own academic practices, we continue to struggle with the adoption and standardized implementation of AI-based colorectal cancer CADe systems, despite the RCT data showing positive results. It is likely that AI uptake will follow the technology predictions of Amara’s Law — i.e., individuals tend to overestimate the short-term impact of new technologies while underestimating long-term effects. In the end, more widespread adoption in community practice and larger scale real-world clinical outcomes studies are likely to determine the true impact of these exciting technologies. For other, less established AI-based tools, more data are currently required.
Conclusions
Ultimately, AI-based algorithms are likely here to stay, with continued improvement and evolution driven by provider feedback and patient care needs. Current tools, while not all-encompassing, have the potential to dramatically change the landscape of endoscopic training, diagnostic evaluation, and therapeutic care. It is critically important that relevant stakeholders, both endoscopists and patients, be involved in the design of future applications to improve efficiency and quality outcomes overall.
Dr. McCarty is based in the Lynda K. and David M. Underwood Center for Digestive Disorders, Houston Methodist Hospital. Dr. Mansour is based in the section of gastroenterology, Baylor College of Medicine, Houston. Dr. McCarty reports no conflicts of interest. Dr. Mansour reports having been a consultant for Iterative Health.
The Promise and Challenges of AI in Hepatology
BY BASILE NJEI, MD, MPH, PHD; YAZAN A. AL-AJLOUNI, MPHIL
In the dynamic realm of medicine, artificial intelligence (AI) emerges as a transformative force, notably within hepatology. The discipline of hepatology, dedicated to liver and related organ diseases, is ripe for AI’s promise to revolutionize diagnostics and treatment, pushing toward a future of precision medicine. Yet, the path to fully realizing AI’s potential in hepatology is laced with data, ethical, and integration challenges.
The application of AI, particularly in histopathology, significantly enhances disease diagnosis and staging in hepatology. AI-driven approaches remedy traditional histopathological challenges, such as interpretative variability, providing more consistent and accurate disease analyses. This is especially evident in conditions like metabolic dysfunction-associated steatohepatitis (MASH) and hepatocellular carcinoma (HCC), where AI aids in identifying critical gene signatures, thereby refining therapy selection.
Similarly, deep learning (DL), a branch of AI, has attracted significant interest globally, particularly in image recognition. AI’s incorporation into medical imaging marks a significant advancement, enabling early detection of malignancies like HCC and improving diagnostics in steatotic liver disease through enhanced imaging analyses using convolutional neural networks (CNN). The abundance of imaging data alongside clinical outcomes has catalyzed AI’s integration into radiology, leading to the swift growth of radiomics as a novel domain in medical research.
AI has also been shown to identify nuanced alterations in electrocardiograms (EKGs) associated with liver conditions, potentially detecting the progression of liver diseases at an earlier stage than currently possible. By leveraging complex algorithms and machine learning, AI can analyze EKG patterns with a precision and depth unattainable through traditional manual interpretation. Given that liver diseases, such as cirrhosis or hepatitis, can induce subtle cardiac changes long before other clinical symptoms manifest, early detection through AI-enhanced EKG analysis could lead to timely interventions, potentially halting or reversing disease progression. This approach further enriches our understanding of the intricate interplay between liver function and cardiac health, highlighting the potential for AI to transform not just liver disease diagnostics but also to foster a more integrated approach to patient care.
Beyond diagnostics, the burgeoning field of generative AI introduces groundbreaking possibilities in treatment planning and patient education, particularly for chronic conditions like cirrhosis. Generative AI produces original content, including text, visuals, and music, by identifying and learning patterns from its training data. Generative AI built on large language models (LLMs) relies on models with many parameters that are trained on vast collections of text. A notable instance of generative AI employing LLMs is ChatGPT (Chat Generative Pre-trained Transformer). By simulating disease progression and treatment outcomes, generative AI can foster personalized treatment strategies and empower patients with knowledge about their health trajectories. Yet realizing this potential requires overcoming data quality and interpretability challenges, and ensuring AI outputs are accessible and actionable for clinicians and patients.
Despite these advancements, leveraging AI in hepatology is not devoid of hurdles. The development and training of AI models require extensive and diverse datasets, raising concerns about data privacy and ethical use. Addressing these concerns is paramount for successfully integrating AI into clinical hepatology practice, necessitating transparent algorithmic processes and stringent ethical standards. Ethical considerations are central to AI’s integration into hepatology. Algorithmic biases, patient privacy, and the impact of AI-driven decisions underscore the need for cautious AI deployment. Developing transparent, understandable algorithms and establishing ethical guidelines for AI use are critical steps towards ethically leveraging AI in patient care.
In conclusion, AI’s integration into hepatology holds tremendous promise for advancing patient care through enhanced diagnostics, treatment planning, and patient education. Overcoming the associated challenges, including ethical concerns, data diversity, and algorithm interpretability, is crucial. As the hepatology community navigates this technological evolution, a balanced approach that marries technological advancements with ethical stewardship will be key to harnessing AI’s full potential, ensuring it serves the best interests of patients and propels the field of hepatology into the future.
We predict a trajectory of increased use and adoption of AI in hepatology. AI in hepatology is likely to meet the test of pervasiveness, improvement, and innovation. The adoption of AI in routine hepatology diagnosis and management will likely follow Amara’s Law and the five stages of the hype cycle. We believe that we are still in the infancy of adopting AI technology in hepatology, and this phase may last 5 years before there is a peak of inflated expectations. The trough of disillusionment and slope of enlightenment may only be observed in the coming decades.
Dr. Njei is based in the Section of Digestive Diseases, Yale School of Medicine, New Haven, Conn. Mr. Al-Ajlouni is a senior medical student at New York Medical College School of Medicine, Valhalla, N.Y. They have no conflicts of interest to declare.
The guidelines emphasize four-hour gastric emptying studies over two-hour testing. How do you see this affecting diagnostic workflows in practice?
Dr. Staller: Moving to four-hour solid-meal scintigraphy will actually simplify decision-making. Two-hour reads miss a meaningful proportion of delayed emptying; standardizing on four hours reduces false negatives and the “maybe gastroparesis” purgatory that leads to repeat testing. Practically, it means closer coordination with nuclear medicine (longer slots, a consistent standardized meal), updating order sets to default to a four-hour protocol, and educating front-line teams so patients arrive appropriately prepped. The payoff is fewer equivocal studies and more confident treatment plans.
Metoclopramide and erythromycin are the only agents conditionally recommended for initial therapy. How does this align with what is being currently prescribed?
Dr. Staller: This largely mirrors real-world practice. Metoclopramide remains the only FDA-approved prokinetic for gastroparesis, and short “pulsed” erythromycin courses are familiar to many of us—recognizing tachyphylaxis limits durability. Our recommendation is “conditional” because the underlying evidence is modest and patient responses are heterogeneous, but it formalizes what many clinicians already do: start with metoclopramide (lowest effective dose, limited duration, counsel on neurologic adverse effects) and reserve erythromycin for targeted use (exacerbations, bridging).
Several agents, including domperidone and prucalopride, received recommendations against first-line use. How will that influence discussions with patients who ask about these therapies?
Dr. Staller: Two points I share with patients: evidence and access/safety. For domperidone, the data quality is mixed, and US access is through an FDA IND mechanism; you’re committing patients to EKG monitoring and a non-trivial administrative lift. For prucalopride, the gastroparesis-specific evidence isn’t strong enough yet to justify first-line use. So, our stance is not “never,” it’s just “not first.” If someone fails or cannot tolerate initial therapy, we can revisit these options through shared decision-making, setting expectations about benefit, monitoring, and off-label use. The guideline language helps clinicians have a transparent, evidence-based conversation at the first visit.
The guidelines suggest reserving procedures like G-POEM and gastric electrical stimulation for refractory cases. In your practice, how do you decide when a patient is “refractory” to medical therapy?
Dr. Staller: I define “refractory” with three anchors.
1. Adequate trials of foundational care: dietary optimization and glycemic control; an antiemetic; and at least one prokinetic at appropriate dose/duration (with intolerance documented if stopped early).
2. Persistent, function-limiting symptoms: ongoing nausea/vomiting, weight loss, dehydration, ER visits/hospitalizations, or malnutrition despite the above—ideally tracked with a validated instrument (e.g., GCSI) plus nutritional metrics.
3. Objective correlation: delayed emptying on a standardized 4-hour solid-meal study that aligns with the clinical picture (and medications that slow emptying addressed).
At that point, referral to a center with procedural expertise for G-POEM or consideration of gastric electrical stimulation becomes appropriate, with multidisciplinary evaluation (GI, nutrition, psychology, and, when needed, surgery).
What role do you see dietary modification and glycemic control playing alongside pharmacologic therapy in light of these recommendations?
Dr. Staller: They’re the bedrock. A small-particle, lower-fat, calorie-dense diet—often leaning on nutrient-rich liquids—can meaningfully reduce symptom burden. Partnering with dietitians early pays dividends. For diabetes, tighter glycemic control can improve gastric emptying and symptoms; I explicitly review medications that can slow emptying (e.g., opioids; consider timing/necessity of GLP-1 receptor agonists) and encourage continuous glucose monitor-informed adjustments. Pharmacotherapy sits on top of those pillars; without them, medications will likely underperform.
The guideline notes “considerable unmet need” in gastroparesis treatment. Where do you think future therapies or research are most urgently needed?
Dr. Staller: I see three major areas.
1. Truly durable prokinetics: agents that improve emptying and symptoms over months, with better safety than legacy options (e.g., next-gen motilin/ghrelin agonists, better-studied 5-HT4 strategies).
2. Endotyping and biomarkers: we need to stop treating all gastroparesis as one disease. Clinical, physiologic, and microbiome/omic signatures that predict who benefits from which therapy (drug vs G-POEM vs GES) would transform care.
3. Patient-centered trials: larger, longer RCTs that prioritize validated symptom and quality-of-life outcomes, include nutritional endpoints, and reflect real-world medication confounders.
Our guideline intentionally highlights these gaps to hopefully catalyze better trials and smarter referral pathways.
Dr. Staller is with the Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston.


