Commercial AI fracture detection: meta-analysis of 17 studies finds good overall accuracy, with weaker performance on ribs and spine
Source: Scientific Reports · Published: October 2024
Authors: Husarek J, Hess S, Razaeian S, Ruder TD, Sehmisch S, Müller M, Liodakis E · DOI: 10.1038/s41598-024-73058-8 · Open Access
Key figure: Figure 4, a forest plot of pooled sensitivity and specificity for stand-alone commercial AI, broken out by anatomical region, showing the sharp performance drop on rib and spine radiographs.
Bottom line: In a pooled dataset of 38,978 radiographs, four of the five commercial AI fracture detection systems evaluated in stand-alone mode achieved sensitivities above 90 percent and specificities in the 80 to 90 percent range. Accuracy dropped on rib and spine radiographs. The best configuration was neither AI alone nor human alone, but human plus AI.
What the study did
The authors systematically searched PubMed and Embase for peer-reviewed diagnostic test accuracy studies of commercially available AI fracture detection solutions used on conventional radiography. Seventeen studies covering seven certified products and 38,978 x-rays with 8,150 fractures were pooled using a bivariate random-effects model. Pre-specified subgroup analyses examined product, rater type (stand-alone AI, human unaided, human aided by AI), anatomical region, and the influence of industry funding.
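To make the pooling step concrete, here is a minimal Python sketch. It is not the authors' bivariate model. It pools a single proportion (for example, sensitivity) on the logit scale with the simpler univariate DerSimonian-Laird random-effects estimator, which ignores the correlation between sensitivity and specificity that a bivariate model estimates jointly. The function name `pool_proportions` and the per-study counts in the usage note are hypothetical and are not taken from the paper.

```python
import math


def inv_logit(x):
    """Map a log-odds value back to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def pool_proportions(events, totals):
    """Univariate DerSimonian-Laird random-effects pooling on the logit scale.

    events[i] / totals[i] is one study's proportion, e.g. true positives
    over all fractures for sensitivity. Returns
    (pooled, ci_low, ci_high, tau2), where tau2 is the estimated
    between-study variance on the logit scale.
    """
    logits, variances = [], []
    for e, n in zip(events, totals):
        # 0.5 continuity correction keeps the logit finite when e == 0 or e == n.
        a = e + 0.5
        b = (n - e) + 0.5
        logits.append(math.log(a / b))
        variances.append(1.0 / a + 1.0 / b)

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, logits)) / sum_w
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, logits))

    # Method-of-moments estimate of between-study variance, truncated at zero.
    k = len(logits)
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    y_re = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    return (
        inv_logit(y_re),
        inv_logit(y_re - 1.96 * se),
        inv_logit(y_re + 1.96 * se),
        tau2,
    )
```

With three made-up studies, `pool_proportions([90, 180, 45], [100, 200, 60])` returns a pooled sensitivity, an approximate 95 percent confidence interval, and the between-study variance. A nonzero `tau2` widens the interval relative to a fixed-effect analysis, which is the practical reason random-effects models are preferred when studies differ in population and reference standard.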
What they found
Four of five products evaluated in stand-alone mode achieved sensitivity above 90 percent with specificity of 80 to 90 percent; one outlier product had sensitivity below 60 percent and specificity above 95 percent. Pooled accuracy was good to excellent across most anatomical regions, with two clear exceptions: rib radiographs (poor sensitivity, moderate specificity) and spine radiographs (excellent sensitivity, poor specificity). Human radiologists aided by AI achieved significantly higher specificity than stand-alone AI (p < 0.001) with no loss of sensitivity (p = 0.316). Industry-funded studies reported sensitivity 5 percent higher and specificity 4 percent lower than independently funded studies, a small but measurable effect.
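The trade-off between the typical product profile and the outlier can be illustrated with a short Python sketch that converts sensitivity and specificity into predictive values. Everything numeric here is an assumption for illustration: the two profiles are single points picked from inside the ranges reported above, not pooled estimates from the paper, and the prevalence is a rough figure obtained by dividing the 8,150 fractures by the 38,978 radiographs, which treats each fracture as one positive radiograph.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from 2x2 confusion-matrix counts."""
    return tp / (tp + fn), tn / (tn + fp)


def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a given prevalence, using Bayes' rule."""
    tp = sensitivity * prevalence
    fn = (1.0 - sensitivity) * prevalence
    tn = specificity * (1.0 - prevalence)
    fp = (1.0 - specificity) * (1.0 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Assumption: rough prevalence from the pooled totals (fractures / radiographs).
PREVALENCE = 8150 / 38978

# Assumption: illustrative points inside the reported ranges, not study results.
profiles = {
    "typical stand-alone product": (0.92, 0.85),
    "outlier product": (0.58, 0.96),
}

for name, (sens, spec) in profiles.items():
    ppv, npv = predictive_values(sens, spec, PREVALENCE)
    print(f"{name}: PPV={ppv:.2f}, NPV={npv:.2f}")
```

Under these assumptions, the high-specificity outlier produces fewer false alarms per positive call but misses more fractures, so its negative predictive value is lower. That is the profile least suited to a rule-out role in a busy emergency department.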
Why it matters for orthopedic practice
The clearest fit for this technology is in emergency departments and urgent-care clinics, where radiograph volume and time-to-disposition pressure are highest. The data support AI as an assistive layer, not a replacement for radiologist review. Orthopedic trainees reading shoulder, wrist, ankle, and hand films with AI prefill should expect a genuine boost in specificity, alongside a smaller benefit for sensitivity. The lower performance on rib and spine imaging is a concrete limitation to communicate when these tools are deployed in trauma and spine services.
Limitations
The included diagnostic test accuracy studies vary in reference standard, patient population, and spectrum of injuries. Only four rib studies and four spine studies were available for those subgroup estimates, so those numbers carry wider uncertainty. The funding analysis compared four industry-funded studies against eleven studies without industry funding, and only six percent of commercial AI evaluations in the broader literature are fully independent. External validation across diverse scanner vendors and patient demographics was inconsistent across studies. Clinical outcome benefits, including missed-fracture rates and downstream treatment decisions, were not measured.
Husarek J, Hess S, Razaeian S, Ruder TD, Sehmisch S, Müller M, Liodakis E. Artificial intelligence in commercial fracture detection products: a systematic review and meta-analysis of diagnostic test accuracy. Sci Rep. 2024;14:23053. doi:10.1038/s41598-024-73058-8
OSCRSJ News items are editorial summaries for educational purposes. They are not clinical recommendations, endorsements, or substitutes for the primary literature. Always consult the source paper and applicable specialty-society guidelines before changing practice.