In an independent prospective cohort of 1,458 patients (1,904 lesions), a widely used smartphone AI skin-cancer app failed to capture images for 16.6% of lesions (317/1,904), even under optimal conditions. Among captured lesions, the CNN’s sensitivity was 82.5% and specificity 76.8% against the final clinical/histopathologic diagnosis. In-app teledermatology was available for 65.7% of images, and adding teledermatology review increased specificity to 86.8% (trial: ClinicalTrials.gov NCT05246163).
Why It Matters To Your Practice
Real-world usability is a clinical performance issue: a 16.6% capture failure rate means a meaningful fraction of “lesions of concern” won’t generate any AI output at the point of care.
The app’s baseline tradeoff (82.5% sensitivity, 76.8% specificity) implies both missed cancers and avoidable false alarms—relevant to triage, reassurance, and referral pathways.
The cohort included 32 melanomas, underscoring that high-stakes lesions were in the tested mix.
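The reported operating point can be translated into expected outcomes per 1,000 lesions of concern. A minimal sketch, using the study's reported sensitivity and specificity but assuming a hypothetical 5% cancer prevalence (an illustrative figure, not one taken from the study):

```python
# Expected triage outcomes per 1,000 assessed lesions at the reported
# operating point (82.5% sensitivity, 76.8% specificity).
# The 5% prevalence is an illustrative assumption, not a study value.
def triage_outcomes(n_lesions: int, prevalence: float,
                    sensitivity: float, specificity: float) -> dict:
    """Split a cohort into true/false positives and negatives."""
    cancers = n_lesions * prevalence
    benign = n_lesions - cancers
    tp = cancers * sensitivity        # cancers correctly flagged
    fn = cancers - tp                 # cancers missed
    fp = benign * (1 - specificity)   # benign lesions flagged (false alarms)
    tn = benign - fp                  # benign lesions correctly cleared
    return {"tp": tp, "fn": fn, "fp": fp, "tn": tn,
            "ppv": tp / (tp + fp), "npv": tn / (tn + fn)}

outcomes = triage_outcomes(1000, 0.05, 0.825, 0.768)
print({k: round(v, 1) for k, v in outcomes.items()})
```

At this assumed prevalence, roughly 9 of 50 cancers per 1,000 lesions would be missed and about 220 benign lesions would be flagged, which is the tradeoff described above.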
Clinical Implications
If patients bring app results, treat them as adjunctive data: confirm with standard history/exam/dermoscopy and follow usual biopsy/referral thresholds.
Expect “no result” scenarios: build a workflow for failed image capture (e.g., default to clinician assessment rather than repeated patient retakes that delay care).
When telederm is available, it may reduce false positives: combining CNN + telederm increased specificity to 86.8%, potentially lowering unnecessary urgent referrals.
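The specificity gain translates directly into fewer false alarms. A back-of-envelope sketch, using the summary's figures per 1,000 benign lesions:

```python
# False positives avoided per 1,000 benign lesions when specificity
# rises from 76.8% (CNN alone) to 86.8% (CNN + teledermatology review).
def false_positives(n_benign: int, specificity: float) -> float:
    """Benign lesions incorrectly flagged at a given specificity."""
    return n_benign * (1 - specificity)

fp_cnn_only = false_positives(1000, 0.768)       # CNN alone
fp_with_telederm = false_positives(1000, 0.868)  # CNN + teledermatology
print(fp_cnn_only - fp_with_telederm)            # false alarms avoided
```

Per 1,000 benign lesions, the higher specificity spares about 100 patients a false alarm and its downstream referral.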
Insights
This was prospective and user-representative (lesions of concern in an early-access consultation), which is closer to real practice than retrospective image sets.
Performance depended not just on the model but on the end-to-end system: image acquisition, device variability, and whether telederm review was actually available (65.7% of images).
The study explicitly tested different photographic conditions (angle, lighting, user) and smartphone models—highlighting that “model accuracy” can degrade via implementation factors.
The Bottom Line
In prospective real-world use, a popular smartphone CNN skin-cancer tool had substantial capture failures (16.6%) and only moderate standalone accuracy (82.5% sensitivity/76.8% specificity).
Adding teledermatology review improved specificity (86.8%), suggesting clinician-in-the-loop designs may be safer and more practical than AI-only outputs.