A Brief History of Predicting Response to Cardiac Resynchronization Therapy
Thousands of articles have been published on cardiac resynchronization therapy (CRT) in patients with heart failure. Many of these articles focus on refining selection criteria for CRT to improve the rate of individual patient response, and arguably none have succeeded in improving clinical practice. The lack of standardization of both baseline predictors and outcome or “response criteria” is impairing progress in this field and must be addressed.
Echocardiographic measures of dyssynchrony should undergo rigorous evaluation and comparison studies before testing their ability to help improve response to CRT. Additionally, a common end point to quantify “response to CRT” should be decided upon, as the many published end points do not agree with one another. The purpose of this editorial is to discuss these issues in the field of CRT research within the framework of an article showing that the often-ignored right ventricle may play a role in improving CRT for patients with heart failure. Much-needed studies to address shortcomings in this field are proposed, and echocardiography remains poised to play a crucial role in the continued development of this exciting field.
The dilemma articulated by Claude Bernard in 1865 still haunts clinicians today: the response of the “average” patient to a therapy is not necessarily the response of the individual patient standing before the clinician. The “average” response to cardiac resynchronization therapy (CRT) is undeniably favorable for the current criteria by which we select patients. CRT is a class IA recommendation for patients in sinus rhythm with QRS duration > 120 msec, left ventricular (LV) ejection fraction ≤ 35%, and New York Heart Association (NYHA) class III or ambulatory class IV heart failure symptoms despite optimal medical therapy. Fourteen randomized controlled trials enrolling 4,420 patients have demonstrated that CRT improves LV ejection fraction, quality of life (QOL), and NYHA class while reducing hospitalizations by 37% and all-cause mortality by 22%.
However, CRT is invasive, costly, and associated with rare but serious complications, including peri-implantation death (0.3%–0.5%), wound or device infection (2%), and device or lead failure (5%). Additionally, the initial application for CRT submitted by Medtronic (Minneapolis, MN) to the US Food and Drug Administration showed that 32% to 55% of patients did not respond to CRT according to the primary end points of improved QOL, improved 6-min walking distance (6MWD), and improved NYHA class (Table 1). These factors demanded an attempt to limit CRT to those patients who will truly benefit, or “respond.” Although initial single-center studies showed promise in using LV mechanical dyssynchrony (defined using echocardiographic measures) as a baseline predictor of response to CRT, multicenter and subsequent single-center studies have not been able to reproduce these results.
Response criterion | Percentage of patients who responded according to the given end point (pacer off) ∗ | Percentage of patients who responded according to the given end point (pacer on) ∗ | NNT |
---|---|---|---|
NYHA | 38% | 68% | 3.4 |
QOL | 44% | 58% | 7.4 |
6MWD | 27% | 45% | 5.5 |
6MWD + QOL | 18% | 33% | 6.9 |
6MWD + NYHA | 15% | 38% | 4.4 |
QOL + NYHA | 26% | 47% | 4.7 |
6MWD + QOL + NYHA | 12% | 30% | 5.6 |
CCS | 39% | 65% | 3.9 |
Alive at 1 y | 87% | 90% | 32.7 |
Alive at 2 y | 75% | 83% | 13.1 |
Alive at 3 y | 63% | 77% | 7.4 |
∗ Percentages are rounded to the nearest whole number for display, while actual ratios were used to quantify the NNT.
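The NNT column can be read as the reciprocal of the absolute difference in response rate between the pacer-on and pacer-off groups. The following is a minimal Python sketch under that assumption; the function name is illustrative, and the inputs are the rounded percentages displayed in Table 1. Because the published NNT values were computed from the unrounded ratios, the output agrees with the table only approximately.

```python
def number_needed_to_treat(rate_treated: float, rate_control: float) -> float:
    """Return 1 / absolute benefit, with both rates given as fractions."""
    absolute_benefit = rate_treated - rate_control
    if absolute_benefit <= 0:
        raise ValueError("no benefit in the treated group; NNT is undefined")
    return 1.0 / absolute_benefit


# Rounded percentages from Table 1, stored as (pacer off, pacer on).
table_1 = {
    "NYHA": (0.38, 0.68),
    "QOL": (0.44, 0.58),
    "6MWD": (0.27, 0.45),
    "Alive at 1 y": (0.87, 0.90),
}

for criterion, (pacer_off, pacer_on) in table_1.items():
    nnt = number_needed_to_treat(pacer_on, pacer_off)
    print(f"{criterion}: NNT is approximately {nnt:.1f}")
```

A small absolute difference, as for survival at 1 year, inflates the NNT sharply, which is why that row is so sensitive to rounding.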
Right Ventricular Assessment and Cardiac Resynchronization Therapy
The search for better selection criteria for CRT has largely been dominated by LV indices of dyssynchrony or function. Baseline right ventricular (RV) dysfunction prior to CRT increases mortality and reduces the likelihood of both clinical and echocardiographic response after CRT. Additionally, CRT improves RV ejection fraction and peak systolic myocardial velocities in the RV free wall. A recent study even suggested that a subset of patients with exaggerated LV-RV interaction show a reduced pulmonary artery pressure after CRT without LV remodeling. Thus, the assessment of RV function is emerging as an important aspect of CRT that has been mostly overlooked.
In this issue of JASE, Szulik et al report that the addition of RV dyssynchrony parameters at baseline improved the ability to predict clinical response to CRT. The authors generated a multivariate model of baseline predictors, including parameters of both LV and RV morphology and LV, RV, and interventricular dyssynchrony. The model showed that the addition of RV dyssynchrony parameters improved the power of their model to predict response to CRT, with an area under the receiver operating characteristic curve equal to 1. Interestingly, no LV dyssynchrony measures were helpful in predicting response to CRT, but this may be because the authors required baseline interventricular or intra-LV dyssynchrony for inclusion in the study.
The report by Szulik et al provides evidence that, as the authors say, “it might be useful to consider an analysis of baseline RV function and dyssynchrony” in the search for refined CRT selection criteria. However, several limitations deserve mention. These limitations are in fact limitations of the entire field of CRT research; the work of Szulik et al provides a nice framework for a much-needed discussion.
The Dys-synchrony Among Dyssynchrony Parameters
Szulik et al investigated 13 different dyssynchrony parameters in their study. This large number is not surprising; anyone reading the literature on predicting response to CRT will likely be overwhelmed by the number of different dyssynchrony parameters. Some questions an investigator is faced with when deciding how to quantify dyssynchrony include the following:
1. Which imaging modality should be used (echocardiography, computed tomography, magnetic resonance imaging, nuclear medicine)?
2. If echocardiography is used, which type should be used (M mode, pulsed Doppler, tissue Doppler velocity, strain, strain rate or displacement, speckle tracking, three-dimensional)?
3. Which industrial platform should be used (Philips, GE Healthcare, etc)?
4. Should radial, longitudinal, or circumferential motion be examined?
5. How many segments or locations should be analyzed (two, four, six, 12, or 16 segments)?
6. Should time to onset or time to peak be measured, or should more data be used?
7. Should the analysis be limited to systole or isovolumic periods or include both?
8. Should the standard deviation or the maximum difference among timings be assessed?
Although studies have addressed some of these issues, many are contradictory, and none have made a good case for which methodology best quantifies dyssynchrony with both high accuracy and good reproducibility. Additionally, multiple studies have shown significant differences both in the diagnosis and the magnitude of dyssynchrony among the different published dyssynchrony parameters.
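To illustrate how the choice in question 8 alone can change the apparent magnitude of dyssynchrony, the sketch below computes both summaries from the same set of segmental timings. It is only an illustration: the timing values are invented, the function name is hypothetical, and the use of the sample standard deviation (rather than the population form) is an assumption.

```python
from statistics import stdev


def timing_dispersion(times_to_peak_ms):
    """Return (standard deviation, maximum difference) of segmental timings."""
    if len(times_to_peak_ms) < 2:
        raise ValueError("at least two segments are required")
    spread_sd = stdev(times_to_peak_ms)  # sample SD; an assumption
    max_difference = max(times_to_peak_ms) - min(times_to_peak_ms)
    return spread_sd, max_difference


# Invented times to peak (ms) in 12 segments for two hypothetical patients.
patient_a = [150] * 11 + [250]      # one markedly delayed segment
patient_b = [120] * 6 + [200] * 6   # two groups of segments, moderately apart

for label, timings in (("A", patient_a), ("B", patient_b)):
    sd, max_diff = timing_dispersion(timings)
    print(f"patient {label}: SD = {sd:.1f} ms, max difference = {max_diff} ms")
```

In this constructed example, patient A has the larger maximum difference while patient B has the larger standard deviation, so the two summaries rank the same pair of patients in opposite order.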
Assessing the Accuracy of Dyssynchrony Parameters
Response to CRT should not be the gold standard by which we determine the accuracy of a dyssynchrony parameter. Perhaps the major reason we do not have a gold standard dyssynchrony parameter is that we are using response to CRT as the metric for determining the accuracy of a parameter. A patient’s individual response is affected by too many other variables, including the total scar burden and location, the region of latest activation, RV function as noted by Szulik et al, and comorbidities such as diabetes, renal failure, and atrial fibrillation. We need to document and compare the accuracy of dyssynchrony parameters in well-controlled studies such as acute pacing-induced dyssynchrony, for which subjects can serve as their own controls.
Quantifying the Reproducibility of Dyssynchrony Parameters
Reproducibility was either overlooked or grossly inaccurate in many initial publications on dyssynchrony parameters. Unfortunately, overlooking reproducibility led to problems in future studies. For example, the dyssynchrony parameter with arguably the most evidence supporting its use in predicting response to CRT is the standard deviation of times to peak systolic velocity (TsSD) in the 12 basal and midwall segments of the left ventricle. The initial publications on TsSD did not report reproducibility and instead cited an article apparently documenting “low inter- and intraobserver variability of <5%.” The cited article reports that “the mean difference between observations was <5% of the mean value of the observations for measurement of both amplitudes and durations.” There are four problems to note here that apply not only to the example case but to all dyssynchrony parameters in general:
1. Reporting the mean difference is a grossly inaccurate measure of reproducibility, and 95% limits of agreement should be reported at a minimum.
2. Reproducibility should be determined for the dyssynchrony parameter itself, not just for individual components of a dyssynchrony parameter (such as the time to peak velocity of one of the 12 locations involved in calculating TsSD), because these are not equivalent.
3. If the parameter is to be used in clinical practice as a diagnostic (i.e., to diagnose a patient with dyssynchrony), then good reproducibility of the diagnosis is more important than showing good reproducibility of the magnitude of the parameter. Diagnostic reproducibility should be assessed with κ coefficients.
4. Reproducibility should be documented both between observers and between tests. Test-retest agreement is more relevant, and the few studies that have documented this on dyssynchrony parameters have shown poor results.
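The first point can be made concrete with a short sketch. Assuming the usual Bland-Altman definition (bias ± 1.96 standard deviations of the paired differences), the Python below computes the mean difference and the 95% limits of agreement for two sets of readings of a composite parameter such as TsSD. The readings are invented for illustration, and the function name is hypothetical.

```python
from statistics import mean, stdev


def limits_of_agreement(first, second):
    """Return (bias, lower limit, upper limit) for paired measurements."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    differences = [a - b for a, b in zip(first, second)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)  # assumes roughly normal differences
    return bias, bias - half_width, bias + half_width


# Invented TsSD readings (ms) of the same six patients by two observers.
observer_1 = [30, 45, 25, 50, 35, 40]
observer_2 = [40, 35, 35, 40, 45, 30]

bias, lower, upper = limits_of_agreement(observer_1, observer_2)
print(f"mean difference = {bias:.1f} ms")
print(f"95% limits of agreement = {lower:.1f} to {upper:.1f} ms")
```

Here the mean difference is zero because the disagreements cancel, yet any single pair of readings may differ by roughly 20 ms in either direction. Following the second point, the inputs should be values of the composite parameter itself rather than of one of its components.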
The Dys-synchrony Among Response Criteria
Methods to assess response to CRT are nearly as varied as the numerous published dyssynchrony parameters. Qualitative lack of agreement among response criteria was apparent from the original Multicenter InSync Randomized Clinical Evaluation (MIRACLE) trial data showing lower numbers of patients who responded to more than one end point (Table 1). The discrepancy was mostly overlooked until a recent study identified 17 different response criteria from the 26 most cited publications on predicting response to CRT. The percentage of patients showing a positive response to CRT ranged widely from 32% to 91% for the different criteria. Agreement among the methods was strong only 4% of the time and poor 75% of the time. It is therefore impossible to make comparisons across multiple studies that used different response criteria. Szulik et al attempted to address this problem by using two different methods in their study to predict response: reverse LV remodeling (> 15% reduction in LV end-systolic volume [LVESV]) and “clinical response” (alive, no hospitalization for decompensation, improved NYHA class ≥ 1, and 10% increases in both peak ventilatory oxygen uptake and 6MWD). The authors found a different result for each end point, so the reader is left wondering which end point is more important; unfortunately, this question has no good answer. Additionally, their measure of clinical response has not been used before and therefore may end up only adding confusion to the end point conundrum.
The Fallacy of Our “Best” Surrogate End Point
More than 150 clinical, hemodynamic, or exercise variables have been identified as predictors of survival in heart failure. The major motivation for using surrogate end points is to reduce sample size and study duration. The most widely used surrogate in quantifying response to CRT is reduction in LVESV > 15%, for three main reasons:
1. A substudy of the MIRACLE trial showed that the benefits of CRT on 6MWD, NYHA class, and QOL occurred predominantly in patients with objective changes in LV geometry.
2. The 95% limit of variability in the measurement of LVESV is equal to ±15%.
3. A study by Yu et al showed that a reduction of > 10% in LVESV predicted all-cause and cardiovascular mortality, while changes in NYHA class, 6MWD, and QOL did not.
However, a closer look at the data suggests that a reduction in LVESV is not a good surrogate at all. Table 2 shows data deduced from the aforementioned study by Yu et al on using LVESV response to predict cardiovascular mortality. The authors proposed that these data provided good evidence that LVESV response predicts cardiovascular mortality and could be used as an adequate surrogate end point. Suppose we identify a baseline predictor that has 100% accuracy for predicting LVESV response to CRT. If we limit CRT enrollment criteria to this perfectly accurate LVESV predictor, then, on the basis of data in Table 2, we will exclude 38 of 141 patients (27%) from receiving CRT who have not had cardiovascular death and may have actually benefited from CRT. This is further illustrated by the fact that the κ value derived from Table 2 is equal to 0.3, suggesting poor agreement between LVESV response and cardiovascular death.
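For readers who wish to check an agreement statistic of this kind, the following is a minimal sketch of Cohen's κ computed from a 2 × 2 table that cross-classifies a surrogate response against a hard outcome. The counts are purely hypothetical and are not the values in Table 2; the function and argument names are likewise illustrative.

```python
def cohen_kappa_2x2(both_yes, first_only, second_only, both_no):
    """Cohen's kappa for two binary classifications of the same patients."""
    total = both_yes + first_only + second_only + both_no
    if total == 0:
        raise ValueError("the table is empty")
    observed = (both_yes + both_no) / total
    first_yes = (both_yes + first_only) / total
    second_yes = (both_yes + second_only) / total
    # Agreement expected by chance if the two classifications were independent.
    expected = first_yes * second_yes + (1 - first_yes) * (1 - second_yes)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical counts for 100 patients (NOT the data in Table 2).
# "first" = surrogate response; "second" = free of the hard outcome.
responder_and_alive = 55
responder_but_died = 5
nonresponder_but_alive = 30
nonresponder_and_died = 10

kappa = cohen_kappa_2x2(
    responder_and_alive,
    responder_but_died,
    nonresponder_but_alive,
    nonresponder_and_died,
)
print(f"kappa = {kappa:.2f}")
```

In this made-up table, the two classifications agree in 65% of patients, but 57% agreement is expected by chance alone, so κ is low even though most responders survive.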