Myocardial deformation measurements using two-dimensional speckle-tracking echocardiography (STE) are known to vary among vendors. The intervendor agreement of three-dimensional (3D) deformation indices has not been studied. The goals of this study were to determine the intervendor agreement of 3D STE–based measurements of left ventricular (LV) deformation parameters, to investigate the intrinsic variability of these measurements, and to identify the sources of intervendor differences.
Real-time full-volume images obtained in 30 subjects with normal LV systolic function using two vendors’ equipment (V1 and V2) on the same day were analyzed by two independent observers using two software packages (S1 and S2). Agreement between three technique combinations (V1/S1, V2/S2, and V1/S2) and their intrinsic reproducibility (interobserver and intraobserver agreement) were assessed using intraclass correlation coefficients. Parameters of LV deformation included global longitudinal strain, twist, 3D displacement, and 3D strain and its radial, longitudinal, and circumferential components.
For all three combinations, intertechnique agreement was poor (intraclass correlation coefficient < 0.4), always beyond the intrinsic variability. For all measured parameters, the intertechnique agreement was better when the same software package was used with images from different vendors (V2/S2 vs V1/S2) than when images from the same vendor were analyzed using different software (V1/S2 vs V1/S1).
Three-dimensional STE–derived LV deformation parameters are highly vendor dependent, and the discordance levels are beyond intrinsic measurement variability of any of the tested combinations of imaging equipment and analysis software. This intervendor discordance must be taken into account when interpreting 3D deformation data.
Two-dimensional (2D) speckle-tracking echocardiography (STE) is a relatively new tool that is increasingly used for the analysis of systolic and diastolic cardiac function and myocardial deformation in various conditions, including myocardial ischemia, left ventricular (LV) dyssynchrony in patients with heart failure, hypertrophic cardiomyopathy, and fibrosis and viability. The known limitations of 2D STE are that (1) myocardial deformation measurements are affected by loss of speckles due to motion outside the imaging plane; (2) the technique possesses limited reproducibility, especially when applied to suboptimal images; and (3) there is poor intervendor agreement, attributed to a lack of standardization in image analysis.
More recently, speckle tracking from three-dimensional (3D) echocardiography was developed, allowing more complete and accurate assessment of myocardial deformation in 3D space by avoiding the loss of speckles due to out-of-plane motion. Currently, this concept has been implemented in several commercial imaging systems. To date, its usefulness has been demonstrated for the quantification of LV volumes and the evaluation of LV wall motion in ischemic heart disease. However, there are few data on the reproducibility of 3D STE–derived deformation measurements, and the intervendor variability of these measurements has not been studied. These issues are important, because reasonable levels of reproducibility and vendor independence are imperative for this methodology to become useful both in clinical research and patient care.
Accordingly, the aims of our study were (1) to determine the intervendor agreement of 3D STE–based measurements of LV deformation parameters, (2) to investigate the intrinsic variability of these measurements, and (3) to identify the sources of intervendor differences.
Real-time 3D data sets were acquired on the same day in the same setting by a single experienced sonographer using two different vendors’ imaging systems (coded V1 and V2). The images were analyzed using two different software packages (coded S1 and S2), resulting in four theoretical combinations: V1/S1, V1/S2, V2/S1, and V2/S2. Because one of these combinations (V2/S1) was not technically feasible, only the three remaining combinations were studied (Figure 1). Intertechnique agreement was assessed by comparing the measurements obtained with these three combinations. To put the intertechnique discordance in perspective and ensure that it was not a by-product of simple intermeasurement variability, the intrinsic variability of each technique was assessed using repeated measurements by the same observer ≥1 week later and by a second independent observer. Both observers were blinded to the results of all prior measurements. The influence of image quality on the intertechnique agreement was also studied.
We prospectively studied 30 volunteers (17 men, 13 women; mean age, 34 ± 8 years) with normal LV systolic function as determined by standard 2D echocardiography and with image quality that was visually judged as adequate for speckle tracking. Patients with prior cardiothoracic surgery, known coronary artery disease, or cardiac arrhythmias were excluded. The protocol was approved by the institutional review board, and each subject provided informed consent.
Real-time 3D echocardiographic imaging was performed by a single experienced sonographer using two commercial ultrasound systems equipped with fully sampled matrix-array transducers: an Artida 4D with the PST-25SX transducer (Toshiba Medical Systems, Zoetermeer, The Netherlands) and an iE33 with the X5 transducer (Philips Medical Systems, Andover, MA). A wide-angled acquisition “full-volume” mode was used, in which a number of wedge-shaped subvolumes were obtained over consecutive cardiac cycles during a single breath hold. Special care was taken to include the entire LV cavity within the pyramidal scan volume. After gain settings were optimized for endocardial visualization using each imaging system, three or four data sets were acquired. Data sets were stored digitally for offline analysis.
Images were analyzed using 3D Wall Motion Tracking software (Toshiba Medical Systems) and 4D LV Analysis software (TomTec Imaging Systems, Unterschleissheim, Germany) by two experienced observers. Measurements were performed using the data set with the best image quality, which was selected by consensus of the two readers. Apical two-chamber, four-chamber, and short-axis views at different levels of the left ventricle (base, middle, and apex) were automatically selected at end-diastole. Nonforeshortened apical views were identified by finding the largest long-axis dimensions. In these two planes, the LV boundaries were initialized by manually pointing at a small number of anatomic landmarks (mitral annulus and LV apex). In both cases, papillary muscles were included in the LV cavity (Figure 2). Then, the 3D endocardial surface was automatically reconstructed and tracked in 3D space throughout the cardiac cycle. Subsequently, the endocardial surface was manually adjusted when necessary until a best match with the actual endocardial position was visually verified in the different views. The left ventricle was automatically divided into 16 3D segments using standard segmentation.
The following parameters of global LV deformation were evaluated: global longitudinal strain (GLS) and twist, defined as the maximum difference between basal and apical rotation. In addition, 3D displacement and 3D strain and its radial, longitudinal, and circumferential components were averaged over the 16 segments. The peak value of each index was defined as the value with the largest absolute magnitude, reported with its sign. Because the potential clinical value of this methodology is related to its ability to assess regional myocardial function, the intertechnique concordance was also assessed on a segmental level.
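The two peak definitions above can be illustrated with a minimal sketch. The function names `signed_peak` and `peak_twist`, the apical-minus-basal subtraction order, and the sample curves are assumptions made for illustration; they are not taken from either analysis package.

```python
def signed_peak(curve):
    """Return the sample with the largest absolute value, keeping its sign."""
    return max(curve, key=abs)


def peak_twist(apical_rotation, basal_rotation):
    """Peak of the frame-by-frame difference between apical and basal rotation.

    The subtraction order (apical minus basal) is an assumed convention;
    the text only defines twist as the maximum difference between the two.
    """
    if len(apical_rotation) != len(basal_rotation):
        raise ValueError("rotation curves must have the same number of frames")
    differences = [a - b for a, b in zip(apical_rotation, basal_rotation)]
    return signed_peak(differences)


# Made-up curves over one cardiac cycle (strain in %, rotation in degrees).
longitudinal_strain = [0.0, -6.5, -14.0, -17.2, -9.8, 1.1]
apical = [0.0, 3.0, 7.5, 9.0, 4.0, 0.5]
basal = [0.0, -1.5, -4.0, -5.0, -2.0, 0.0]

print(signed_peak(longitudinal_strain))  # peak strain keeps its negative sign
print(peak_twist(apical, basal))
```

With this definition a shortening strain curve yields a negative peak, which is why the longitudinal and circumferential means in the results are negative.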
For each chosen data set, image quality was independently assessed by the two readers and graded as (1) inadequate for tracking or (2) adequate to optimal. In case of disagreement between the two readers, a consensus was reached by a joint review.
For each deformation parameter, the agreement between each pair of the three techniques (V1/S2 vs V1/S1, V2/S2 vs V1/S1, and V2/S2 vs V1/S2) was studied using Bland-Altman analysis, designed to quantify a systematic difference (bias) between two techniques used to measure the same parameter, as well as the spread of differences in individual pairs of measurements (limits of agreement). Of note, good agreement is indicated by near-zero bias and narrow limits of agreement relative to the measured values. In addition, we used intraclass correlation coefficients (ICCs) with 95% confidence intervals. ICCs are similar to conventional Pearson’s correlation coefficients but take into account the bias, and they are used to assess concordance between continuous variables.
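As a minimal sketch of the Bland-Altman quantities described here, the following computes the bias as the mean of the paired differences and the limits of agreement as the bias ± 1.96 standard deviations of those differences. The 1.96 multiplier, the function name `bland_altman`, and the sample values are assumptions; the text does not state the exact multiplier used.

```python
from statistics import mean, stdev


def bland_altman(technique_a, technique_b, multiplier=1.96):
    """Return (bias, lower limit, upper limit) for paired measurements.

    bias   = mean of the paired differences (a - b)
    limits = bias -/+ multiplier * sample SD of the differences
    The 1.96 default is the conventional choice, assumed here.
    """
    if len(technique_a) != len(technique_b):
        raise ValueError("both techniques must measure the same subjects")
    differences = [a - b for a, b in zip(technique_a, technique_b)]
    bias = mean(differences)
    half_width = multiplier * stdev(differences)
    return bias, bias - half_width, bias + half_width


# Hypothetical paired measurements of one parameter in three subjects.
a = [10.0, 12.0, 14.0]
b = [9.0, 12.0, 12.0]
bias, lower, upper = bland_altman(a, b)
print(bias, lower, upper)
```

Reporting the half-width `multiplier * SD` as a single number, as the results table appears to do, carries the same information as the lower and upper limits once the bias is known.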
All these analyses were performed on the whole sample and separately on a subset of images graded as adequate to optimal. For each of the three techniques, the intrinsic intrareader and interreader reproducibility was assessed from the corresponding repeated measurements using ICCs with 95% confidence intervals. All analyses were performed using R statistical software (R Foundation for Statistical Computing, Vienna, Austria).
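The text does not state which ICC form was used. The sketch below assumes the two-way random-effects, absolute-agreement, single-measurement form (often written ICC(2,1)), which penalizes a systematic offset between techniques. The function name `icc_absolute_agreement` and the sample data are hypothetical, and confidence intervals are omitted.

```python
def icc_absolute_agreement(ratings):
    """ICC(2,1) for an n-subjects by k-techniques table (list of rows).

    Built from the two-way ANOVA mean squares:
    (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between techniques
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols                  # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Two techniques that track each other perfectly but differ by a constant 2.
offset_pairs = [[1.0, 3.0], [2.0, 4.0], [3.0, 5.0]]
print(icc_absolute_agreement(offset_pairs))
```

In this example the two columns are perfectly linearly related, so a Pearson correlation would equal 1, yet the absolute-agreement ICC is only 1/3 because of the constant offset. This is the sense in which ICCs take the bias into account.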
Among the 30 studied subjects, the mean heart rate was 67 ± 11 beats/min. The mean frame rates were 20.8 ± 1.2 frames/sec for the Toshiba data sets and 20.3 ± 5.9 frames/sec for the Philips data sets.
Figure 3 shows examples of images and GLS curves obtained in three patients, depicting three different kinds of intertechnique agreement. In patient A, GLS curves obtained by all three techniques were similar, resulting in similar measurements of the magnitude and timing of peak strain. In contrast, in patients B and C, GLS curves were different, leading to differences in measured time to peak (patient B) and peak value (patient C).
Table 1 shows the results of Bland-Altman analyses and ICCs for the intertechnique comparisons. The intertechnique agreement was not assessed for two parameters, 3D strain and radial strain, because the results reported by the two software packages were not on the same scale and were thus not comparable. For example, the mean values of 3D strain obtained from the same 30 data sets were 21.1 ± 7.6% for V1/S1 and −34.9 ± 4.1% for V1/S2.
| Variable | Mean ± SD | V2/S2 vs V1/S1: Bias ∗ | V2/S2 vs V1/S1: Limit of agreement ∗ | V2/S2 vs V1/S1: ICC (95% CI) | V1/S2 vs V1/S1: Bias ∗ | V1/S2 vs V1/S1: Limit of agreement ∗ | V1/S2 vs V1/S1: ICC (95% CI) | V2/S2 vs V1/S2: Bias ∗ | V2/S2 vs V1/S2: Limit of agreement ∗ | V2/S2 vs V1/S2: ICC (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|
| Twist | 8.16 ± 5.71 | 6.96 (85%) | 11.43 (140%) | 0.03 (−0.09 to 0.12) | 7.73 (95%) | 10.06 (123%) | 0.08 (0 to 0.19) | −0.76 (9%) | 15.28 (187%) | −0.02 (−0.33 to 0.27) |
| Longitudinal strain | −17.25 ± 3.27 | −4.59 (27%) | 7.7 (45%) | 0 (−0.1 to 0.12) | −3.05 (18%) | 5.58 (32%) | 0.13 (−0.07 to 0.31) | −1.55 (9%) | 6.09 (35%) | 0.44 (0.16 to 0.63) |
| GLS | −15.71 ± 3.01 | −2.92 (19%) | 6.94 (44%) | 0.15 (−0.04 to 0.33) | −2.17 (14%) | 5.25 (33%) | 0.27 (−0.02 to 0.54) | −0.75 (5%) | 7.12 (45%) | 0.35 (−0.09 to 0.63) |
| Circumferential strain | −23.57 ± 4.47 | −4.32 (18%) | 12.77 (54%) | −0.05 (−0.36 to 0.22) | −3.32 (14%) | 7.02 (30%) | 0.33 (0.07 to 0.55) | −1 (4%) | 11.94 (51%) | 0.1 (−0.25 to 0.46) |
| 3D displacement | 8.56 ± 1.53 | 0.88 (10%) | 3.73 (44%) | 0.21 (−0.02 to 0.45) | 0.07 (1%) | 2.36 (28%) | 0.59 (0.26 to 0.78) | 0.82 (10%) | 3.5 (41%) | 0.38 (0.07 to 0.61) |

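
The footnote explaining the percentages in parentheses did not survive in this copy of the table. The values are consistent with each bias or limit of agreement being expressed as a percentage of the absolute mean in the second column. The short check below works under that assumption; the helper name `percent_of_mean` is hypothetical.

```python
def percent_of_mean(value, mean_value):
    """Express a bias or limit of agreement as a rounded % of the absolute mean."""
    return round(100 * abs(value) / abs(mean_value))


# Values copied from the table rows for twist and longitudinal strain.
twist_mean = 8.16
longitudinal_mean = -17.25

print(percent_of_mean(6.96, twist_mean))          # twist bias, V2/S2 vs V1/S1
print(percent_of_mean(11.43, twist_mean))         # twist limit, V2/S2 vs V1/S1
print(percent_of_mean(-4.59, longitudinal_mean))  # longitudinal bias, V2/S2 vs V1/S1
```

These three calls give 85, 140, and 27, matching the percentages printed next to the corresponding table entries.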
For all measured parameters, the intertechnique agreement was better when the same software package was used to analyze images from different vendors (V2/S2 vs V1/S2) than when images from the same vendor were analyzed using different software packages (V1/S2 vs V1/S1). Not surprisingly, the worst levels of agreement were found when images from different vendors were analyzed using different software packages (V2/S2 vs V1/S1).
For all three comparisons, twist was the parameter with the worst intertechnique concordance, as reflected by biases of up to 95% of the mean value and ICCs < 0.1. In contrast, 3D displacement showed the best intertechnique agreement, with biases up to only 10% of the mean value and ICCs varying between 0.21 and 0.59.
At the segmental level, the intertechnique concordance data showed that regional measurements had biases and ICCs that were similar to or worse than those measured globally, but the limits of agreement were systematically two to three times wider, being of the same order of magnitude as the mean value (detailed data per segment are not shown for the sake of space and simplicity).
Results of intrinsic variability analysis measured by ICCs are displayed in Figures 4 to 6, together with the intertechnique comparisons: Figure 4 shows the two global indices (GLS and twist), Figure 5 the 3D measurements (strain and displacement), and Figure 6 the three strain components. Among the measured parameters, 3D displacement was the most reproducible, with ICCs consistently > 0.65. In contrast, twist was the least reproducible parameter, with most ICCs < 0.50.