We read with great interest the report by Mayourian et al on the use of an artificial intelligence–enabled electrocardiogram (AI-ECG) to inform pulmonary valve replacement (PVR) timing in repaired tetralogy of Fallot (rTOF). In a cohort of 605 PVR patients from Boston Children’s Hospital and Toronto General Hospital, the authors demonstrate that pre-PVR AI-ECG risk predicts post-PVR mortality (c-index 0.77) and improves discrimination beyond an established imaging-based benchmark (combined c-index 0.84). They also report that PVR is associated with improved survival overall (HR 0.28; 95% CI, 0.13-0.60), with exploratory evidence of benefit in the intermediate- and high-risk tertiles but not in the low-risk group. These findings represent an important step toward scalable, ECG-based decision support in a population for whom optimal intervention timing remains an enduring clinical challenge.
We wish to highlight one methodological consideration that may influence interpretation of the reported treatment-effect heterogeneity: time alignment in the observational PVR versus non-PVR comparison. Because AI-ECG probabilities were derived from ECGs obtained within three months before PVR, the “time zero” for PVR patients is well defined. For propensity score–matched non-PVR comparators, however, it is less clear how an analogous index date was assigned. If controls were indexed on an ECG not temporally aligned to a comparable clinical decision point, or if eligibility required surviving long enough to be matched, immortal time bias and time-dependent confounding may arise. Such biases would affect both the overall hazard ratio and the apparent effect modification across AI-ECG risk strata. Given the low event rate (3.6% mortality over a median follow-up of 7.5 years), even modest time-related bias could materially alter subgroup-level signals.
We respectfully suggest that the study’s translational message could be strengthened by addressing three aspects. First, explicitly describing the index-date assignment for non-PVR patients (specifically, how the “pre-PVR” ECG window was mirrored for controls) and demonstrating covariate balance at the index date would help readers assess the comparability of groups at baseline. Second, a target trial emulation–style sensitivity analysis, employing time-varying treatment models such as marginal structural models or cloning–censoring–weighting approaches, would further address the potential for evolving clinical status to confound PVR timing effects. Third, reporting calibration summaries and, if feasible, decision-curve analysis would allow clinicians to evaluate the net benefit of AI-ECG–guided PVR thresholds across clinically plausible scenarios, moving the discussion from prognostic enrichment toward actionable decision support.
These clarifications would help delineate whether AI-ECG primarily enhances risk stratification or robustly identifies patients most likely to derive a survival benefit from PVR, a distinction with direct implications for how this promising biomarker could be integrated alongside imaging into clinical decision-making for this growing population.
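To make the decision-curve suggestion concrete, the net-benefit quantity it rests on can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis used in the original report: the function names `net_benefit` and `net_benefit_treat_all` are hypothetical, the outcome and risk values are toy data invented for the example, and a binary outcome at a fixed horizon is assumed (censored survival data would need a time-to-event adaptation).

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of intervening on patients with predicted risk >= threshold.

    y_true: 0/1 outcomes (1 = event); risk: predicted probabilities;
    threshold: risk threshold strictly between 0 and 1.
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged patients who had the event.
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    # False positives: flagged patients who did not have the event.
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy of intervening on everyone."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data (hypothetical, not from the study): 10 patients, 2 events.
outcomes = [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
risks = [0.9, 0.8, 0.1, 0.3, 0.2, 0.1, 0.05, 0.05, 0.1, 0.1]

for pt in (0.10, 0.25, 0.50):
    model_nb = net_benefit(outcomes, risks, pt)
    all_nb = net_benefit_treat_all(outcomes, pt)
    print(f"threshold={pt:.2f}  model={model_nb:.3f}  treat-all={all_nb:.3f}")
```

A decision curve plots these quantities across a range of thresholds, alongside the "treat none" strategy, whose net benefit is zero by definition. A model adds clinical value at a given threshold only if its net benefit exceeds both reference strategies.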
Declaration of generative AI and AI-assisted technologies in the writing process