Ancillary Features Don’t Improve LI-RADS Accuracy
The study raises questions about whether ancillary features still warrant inclusion in liver imaging workflows.
09/05/2025
A large analysis of 9257 liver observations on CT and MRI from 46 studies found that adding ancillary features to the Liver Imaging Reporting and Data System (LI-RADS) did not improve its diagnostic accuracy for hepatocellular carcinoma (HCC).
The findings suggest that radiologists may not gain additional diagnostic value from using ancillary features (AFs) beyond the major features already defined in LI-RADS.
LI-RADS is a standardized system for interpreting liver imaging in adults at risk for HCC. It categorizes liver lesions based on imaging findings from contrast-enhanced CT or MRI. While major features in LI-RADS have been extensively studied, the value of AFs (optional imaging findings such as fat in mass, restricted diffusion, or T2 hyperintensity) has remained unclear.
Researchers used individual participant data (IPD) from 7811 adults (6792 men, 1019 women; mean age, 58.7 years; standard deviation, 10.7) to compare three classification strategies. The first used only major features. The second included individual AFs. The third allowed AFs that favor malignancy or HCC to upgrade category LR-4 lesions to LR-5.
Lesions were confirmed as HCC, other malignancies, or benign using histopathology or composite imaging reference standards. Of the 9257 observations, 6034 were HCC, 755 were non-HCC malignancies, and 1265 were benign. Most observations were evaluated with MRI (8159), and the remainder with CT (1098).
Across all strategies, there were no significant differences in diagnostic accuracy. The area under the receiver operating characteristic curve (AUC), which measures how well a test distinguishes between disease and no disease, was not affected by the use of AFs. P values for differences in AUC ranged from .65 to >.99.
There were also no differences in other diagnostic performance measures. The positive predictive value (PPV), sensitivity, and specificity of the LR-5 category for diagnosing HCC remained unchanged across strategies. P values ranged from .11 to >.99.
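For readers less familiar with these measures, the following is a minimal sketch of how sensitivity, specificity, and PPV are derived from a 2x2 table when LR-5 is treated as a positive call for HCC. The function name and the counts are hypothetical and chosen only for illustration; they are not data from the study, and this is not the study's bivariate mixed-effects method.

```python
def lr5_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, ppv) for a binary LR-5 vs. not-LR-5 call.

    tp: HCC lesions categorized LR-5
    fp: non-HCC lesions categorized LR-5
    fn: HCC lesions not categorized LR-5
    tn: non-HCC lesions not categorized LR-5
    """
    sensitivity = tp / (tp + fn)  # share of HCC lesions called LR-5
    specificity = tn / (tn + fp)  # share of non-HCC lesions not called LR-5
    ppv = tp / (tp + fp)          # share of LR-5 calls that are truly HCC
    return sensitivity, specificity, ppv


# Hypothetical counts for illustration only (not from the study).
sens, spec, ppv = lr5_metrics(tp=60, fp=5, fn=40, tn=95)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} PPV={ppv:.3f}")
```

Comparing strategies then amounts to recomputing these values after each reclassification rule is applied and testing whether the differences are statistically significant.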
A secondary analysis limited to 9 low-risk-of-bias studies confirmed these findings. Allowing AFs to upgrade LR-4 lesions to LR-5 also did not improve diagnostic performance.
The proportion of LR-4 lesions upgraded to LR-5 based on AFs varied. Using AFs favoring malignancy, 5.2% to 76.0% of lesions were upgraded. Using AFs favoring HCC, upgrades ranged from 4% to 14.7%. Despite these reclassifications, diagnostic accuracy remained the same.
The findings suggest that LI-RADS major features alone are sufficient for accurate HCC diagnosis. Although AFs are associated with malignancy or benignity, they did not improve diagnostic outcomes when applied individually. The study also noted that AFs may introduce complexity and variability between readers without improving diagnostic utility.
Researchers used a bivariate mixed-effects model and adhered to strict IPD meta-analysis methods to ensure robustness. These results suggest that AFs may be unnecessary for improving LI-RADS diagnostic performance in clinical practice.
Full disclosures can be found in the published study.
Source: Radiology
The guidelines emphasize four-hour gastric emptying studies over two-hour testing. How do you see this affecting diagnostic workflows in practice?
Dr. Staller: Moving to four-hour solid-meal scintigraphy will actually simplify decision-making. Two-hour reads miss a meaningful proportion of delayed emptying; standardizing on four hours reduces false negatives and the “maybe gastroparesis” purgatory that leads to repeat testing. Practically, it means closer coordination with nuclear medicine (longer slots, a consistent standardized meal), updating order sets to default to a four-hour protocol, and educating front-line teams so patients arrive appropriately prepped. The payoff is fewer equivocal studies and more confident treatment plans.
Metoclopramide and erythromycin are the only agents conditionally recommended for initial therapy. How does this align with what is currently being prescribed?
Dr. Staller: This largely mirrors real-world practice. Metoclopramide remains the only FDA-approved prokinetic for gastroparesis, and short “pulsed” erythromycin courses are familiar to many of us—recognizing tachyphylaxis limits durability. Our recommendation is “conditional” because the underlying evidence is modest and patient responses are heterogeneous, but it formalizes what many clinicians already do: start with metoclopramide (lowest effective dose, limited duration, counsel on neurologic adverse effects) and reserve erythromycin for targeted use (exacerbations, bridging).
Several agents, including domperidone and prucalopride, received recommendations against first-line use. How will that influence discussions with patients who ask about these therapies?
Dr. Staller: Two points I share with patients: evidence and access/safety. For domperidone, the data quality is mixed, and US access is through an FDA IND mechanism; you’re committing patients to EKG monitoring and a non-trivial administrative lift. For prucalopride, the gastroparesis-specific evidence isn’t strong enough yet to justify first-line use. So, our stance is not “never,” it’s just “not first.” If someone fails or cannot tolerate initial therapy, we can revisit these options through shared decision-making, setting expectations about benefit, monitoring, and off-label use. The guideline language helps clinicians have a transparent, evidence-based conversation at the first visit.
The guidelines suggest reserving procedures like G-POEM and gastric electrical stimulation for refractory cases. In your practice, how do you decide when a patient is “refractory” to medical therapy?
Dr. Staller: I define “refractory” with three anchors.
1. Adequate trials of foundational care: dietary optimization and glycemic control; an antiemetic; and at least one prokinetic at appropriate dose/duration (with intolerance documented if stopped early).
2. Persistent, function-limiting symptoms: ongoing nausea/vomiting, weight loss, dehydration, ER visits/hospitalizations, or malnutrition despite the above—ideally tracked with a validated instrument (e.g., GCSI) plus nutritional metrics.
3. Objective correlation: delayed emptying on a standardized 4-hour solid-meal study that aligns with the clinical picture (and medications that slow emptying addressed).
At that point, referral to a center with procedural expertise for G-POEM or consideration of gastric electrical stimulation becomes appropriate, with multidisciplinary evaluation (GI, nutrition, psychology, and, when needed, surgery).
What role do you see dietary modification and glycemic control playing alongside pharmacologic therapy in light of these recommendations?
Dr. Staller: They’re the bedrock. A small-particle, lower-fat, calorie-dense diet—often leaning on nutrient-rich liquids—can meaningfully reduce symptom burden. Partnering with dietitians early pays dividends. For diabetes, tighter glycemic control can improve gastric emptying and symptoms; I explicitly review medications that can slow emptying (e.g., opioids; consider timing/necessity of GLP-1 receptor agonists) and encourage continuous glucose monitor-informed adjustments. Pharmacotherapy sits on top of those pillars; without them, medications will likely underperform.
The guideline notes “considerable unmet need” in gastroparesis treatment. Where do you think future therapies or research are most urgently needed?
Dr. Staller: I see three major areas.
1. Truly durable prokinetics: agents that improve emptying and symptoms over months, with better safety than legacy options (e.g., next-gen motilin/ghrelin agonists, better-studied 5-HT4 strategies).
2. Endotyping and biomarkers: we need to stop treating all gastroparesis as one disease. Clinical, physiologic, and microbiome/omic signatures that predict who benefits from which therapy (drug vs G-POEM vs GES) would transform care.
3. Patient-centered trials: larger, longer RCTs that prioritize validated symptom and quality-of-life outcomes, include nutritional endpoints, and reflect real-world medication confounders.
Our guideline intentionally highlights these gaps to hopefully catalyze better trials and smarter referral pathways.
Dr. Staller is with the Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston.