Estimating the Sensitivity of Holter to Detect Atrial Fibrillation After Stroke or Transient Ischemic Attack Without a Gold Standard is Challenging




The sensitivity of a diagnostic test is ideally calculated by comparing the test's assessments with a truth determined by another (perfect) test considered to be the gold standard. However, in many cases, no perfect gold standard exists, and even when it does, assessment by the gold standard can be inaccessible, costly, or highly invasive. Using the best available but imperfect diagnostic test as the gold standard can lead to substantial error in the estimation of diagnostic accuracy. The detection of atrial fibrillation (AF) after ischemic stroke is no exception.


In the absence of an accepted gold standard, most studies assessing electrocardiographic monitoring strategies for the detection of poststroke AF have reported diagnostic yields (the proportion of patients diagnosed with AF among the entire screened population). Using diagnostic yield as a proxy for test sensitivity in poststroke AF can be misleading because of variation in the cohorts studied: all patients with stroke or transient ischemic attack versus only patients with cryptogenic stroke; patients previously monitored only for very short periods or with a single technology versus patients who have already undergone 2 or 3 AF monitoring evaluations. Accordingly, we performed a meta-analysis assessing the diagnostic yields of different monitoring methods within standardized phases of AF screening that account for these factors, and we found that, through a sequence of monitoring strategies, AF can be detected in up to 23.7% (95% CI 17.2 to 31.0) of patients with stroke or transient ischemic attack without previously known AF.


In their recent study, Choe et al propose 12 months of monitoring by implantable loop recorder (ILR) after ischemic stroke as the gold standard for the detection of poststroke AF. Using this gold standard, they aimed to estimate the sensitivity of various simulated durations and sequences of Holter monitoring. They estimated that 24-hour Holter monitoring had a sensitivity of 1.3%, whereas 30-day monitoring had a sensitivity of 22.8%. On the basis of these results, the odds of detecting AF with 24-hour Holter monitoring, one of the most commonly used diagnostic strategies, are exceptionally low. These findings are inconsistent with previous research on the diagnostic yield of Holter monitoring.


The proportion of patients who test positive (the diagnostic yield) is calculated as positives = prevalence × sensitivity + (1 − prevalence) × (1 − specificity). Assuming that the specificity of an adjudicated Holter is 100% (no one is falsely diagnosed), the second term vanishes and the equation can be rearranged as prevalence = positives/sensitivity. In our meta-analysis, the summary diagnostic yield for ambulatory Holter (performed in patients who had already tested negative on in-hospital continuous monitoring or in-hospital Holter monitoring) was 10.7% (95% CI 5.6% to 17.2%). If we use this to calculate the AF prevalence implied by Choe et al, we find it to be 823% (prevalence = positives/sensitivity = 0.107/0.013). Even using the sensitivity for 7-day Holter estimated by Choe et al (8%), the calculated AF prevalence in the poststroke population would be 134% (prevalence = 0.107/0.08). Furthermore, assuming a less-than-perfect specificity for adjudicated Holter does not correct these implausible results. Because prevalence cannot exceed 100%, we sought alternative explanations for the findings of Choe et al, which may have substantially underestimated the sensitivity of alternative AF monitoring technologies.
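The arithmetic above can be reproduced in a few lines (the figures are those cited in the text; the function name is ours, and the calculation assumes, as stated, a specificity of 100%):

```python
def implied_prevalence(diagnostic_yield, sensitivity):
    """Rearranged from: yield = prevalence * sensitivity (when specificity = 1,
    the false-positive term (1 - prevalence) * (1 - specificity) vanishes)."""
    return diagnostic_yield / sensitivity

# Summary ambulatory Holter diagnostic yield from our meta-analysis.
holter_yield = 0.107

# Sensitivities estimated by Choe et al for two Holter durations.
for label, sensitivity in [("24-hour Holter", 0.013), ("7-day Holter", 0.08)]:
    prevalence = implied_prevalence(holter_yield, sensitivity)
    print(f"{label}: implied AF prevalence = {prevalence:.0%}")
# 24-hour Holter: implied AF prevalence = 823%
# 7-day Holter: implied AF prevalence = 134%
```

Both implied prevalences exceed 100%, which is impossible, confirming that the reported yields and the estimated sensitivities cannot both be correct.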


First, we suspect Choe et al answered a different question than the one they posed. Rather than estimating the sensitivity of different Holter monitoring strategies, they answered how many AF diagnoses would have been missed if Holter monitoring had been used instead of ILR. For example, regardless of the sensitivity of Holter monitoring, a poststroke AF investigative strategy using Holter monitoring would never have diagnosed AF in a patient whose first AF episode occurred 7 months after the qualifying stroke. However, in calculating test sensitivity, it is wrong to count this patient as a "missed" AF diagnosis because the patient did not have AF at the time of Holter monitoring.
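This distinction can be illustrated with a toy simulation (all numbers are hypothetical; it assumes first AF episodes are spread uniformly over the year and, deliberately, that Holter detects every episode occurring during its window):

```python
import random

random.seed(0)

N = 10_000
HOLTER_DAYS = 30   # simulated 30-day Holter window starting at the stroke
ILR_DAYS = 365     # 12 months of ILR monitoring (the proposed gold standard)

# Hypothetical first-AF-episode times, uniform over the year by construction.
af_onset_days = [random.uniform(0, ILR_DAYS) for _ in range(N)]

ilr_detected = N  # every onset falls within the 12-month ILR window
holter_detected = sum(1 for day in af_onset_days if day <= HOLTER_DAYS)

# A patient whose AF begins on day 200 has no AF during the Holter window;
# counting that patient as "missed by Holter" deflates apparent sensitivity.
apparent_sensitivity = holter_detected / ilr_detected
print(f"Apparent 30-day Holter 'sensitivity' vs ILR: {apparent_sensitivity:.0%}")
```

Even though the simulated Holter detects 100% of episodes occurring during its window, the ratio against the 12-month ILR reference comes out near 30/365 (about 8%), showing how episode timing, not test performance, can drive such estimates downward.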


Second, in a recent study including >10,000 patients who underwent cardiac monitoring with implanted devices for at least 3 months, >10% were diagnosed with AF lasting ≥5 minutes within the first month. This proportion is much greater than that in the study of Choe et al, in which only 8 of 168 patients (4.8%) who underwent cardiac monitoring with ILR were diagnosed with AF in the first month. A low detection yield during the first months of ILR monitoring may have produced artificially low Holter sensitivity estimates.


Third, in Cryptogenic Stroke and Underlying Atrial Fibrillation, only AF episodes lasting >30 seconds were considered in the analysis. Although the risk of stroke seems to increase with the duration of AF, very short AF paroxysms (i.e., lasting ≤30 seconds) represent more than half of overall poststroke AF cases, and their associated stroke risk should not be neglected until better investigated. Furthermore, there is no consensus regarding the minimum AF duration that warrants anticoagulation. Because Holter monitoring is technically able to detect such short AF paroxysms whereas ILR is not, some poststroke AF cases could have been missed, or their diagnosis delayed (e.g., longer AF episodes occurring after a first short paroxysm), by ILR but potentially detected by Holter monitoring, implying an even greater true Holter sensitivity.


Fourth, the first weeks after stroke appear to be the time window in which patients are most likely to be diagnosed with AF. In Cryptogenic Stroke and Underlying Atrial Fibrillation, the mean (±SD) time between the index event and randomization was 38.1 ± 27.6 days. Most patients (88.5%) were implanted with the loop recorder 10 days after randomization, meaning that the mean time between the qualifying event and implantation was almost 50 days. This could have resulted in many missed AF episodes because, during that period, 71.2% of participants received only a mean of 23 hours of Holter monitoring and 29.7% underwent telemetry monitoring for a median of 68 hours. In fact, in our recent systematic review and meta-analysis, we showed a significantly shorter time to initiation of monitoring in studies using mobile cardiac outpatient telemetry than in those using ILR (26.7 ± 21.9 vs 74.8 ± 17.9 days, p <0.001). This at least partially explained a fivefold greater probability of AF detection with mobile cardiac outpatient telemetry than with ILR within the first 21 days of monitoring (adjusted hazard ratio 5.8, 95% CI 3.3 to 10.2; p <0.0001). Using 7-day, 21-day, or 30-day Holter monitoring immediately after stroke would have shown much greater detection yields and, therefore, greater Holter "sensitivity."


The gold standard for AF detection after stroke remains to be defined. The simultaneous use of multiple methods may be a reasonable proxy for a gold standard, enabling estimation of the sensitivity of each technology and each duration of monitoring. In the meantime, caution should be exercised to minimize bias when assessing the usefulness of different screening methods.

Nov 27, 2016 | Posted in CARDIOLOGY