Background
This study was planned by the EACVI/ASE/Industry Task Force to Standardize Deformation Imaging to (1) test the variability of speckle-tracking global longitudinal strain (GLS) measurements among different vendors and (2) compare GLS measurement variability with conventional echocardiographic parameters.
Methods
Sixty-two volunteers were studied using ultrasound systems from seven manufacturers. Each volunteer was examined by the same sonographer on all machines. Inter- and intraobserver variability was determined in a true test-retest setting. Conventional echocardiographic parameters were acquired for comparison. Using the software packages of the respective manufacturer and of two software-only vendors, endocardial GLS was measured because it was the only GLS parameter that could be provided by all manufacturers. We compared GLS AV (the average from the three apical views) and GLS 4CH (measured in the four-chamber view) measurements among vendors and with the conventional echocardiographic parameters.
Results
Absolute values of GLS AV ranged from 18.0% to 21.5%, while GLS 4CH ranged from 17.9% to 21.4%. The absolute difference between vendors for GLS AV was up to 3.7% strain units (P < .001). The interobserver relative mean errors were 5.4% to 8.6% for GLS AV and 6.2% to 11.0% for GLS 4CH, while the intraobserver relative mean errors were 4.9% to 7.3% and 7.2% to 11.3%, respectively. These errors were lower than for left ventricular ejection fraction and most other conventional echocardiographic parameters.
Conclusion
Reproducibility of GLS measurements was good and in many cases superior to conventional echocardiographic measurements. The small but statistically significant variation among vendors should be considered in performing serial studies and reflects a reference point for ongoing standardization efforts.
Highlights
- Strain measurements from nine machine and software vendors were compared in 62 subjects.
- GLS reproducibility was superior to conventional echocardiographic measures.
- Small but significant differences between vendors were detected.
- GLS may be used in clinical practice.
In recent years, speckle-tracking echocardiography (STE) has been successfully applied in research, and different vendors have developed several clinical software tools and a wide variety of postprocessing options. Despite long, reassuring experience with the technique, STE has not yet been fully adopted in routine clinical practice because the robustness of the method has been questioned. Although several authors have demonstrated added clinical value of STE over conventional echocardiography, others have reported insufficient reproducibility and vendor dependence of measurements.
The European Association of Cardiovascular Imaging (EACVI) and the American Society of Echocardiography (ASE) have therefore convened a task force to assess variability in speckle-tracking echocardiographic measurements, to identify its source, and to lead a dialogue with and among industry partners aimed at the standardization of STE as a clinical method. Left ventricular (LV) global longitudinal strain (GLS) was chosen by the task force as its first target parameter because it appeared the easiest to define, the most robust, and therefore the closest to routine clinical use. Measurement of endocardial GLS was specifically chosen because it was the only parameter that could be provided by all vendors. It was the consensus of the task force members that the relative variation in GLS measurement among the different vendors should not exceed 10% before the technique can be recommended for clinical use.
The present study was designed to (1) determine the level of test-retest variability of GLS measurements in the clinical setting per vendor, (2) assess the potential systematic variability of GLS measurements obtained using different analysis software packages from different vendors, (3) compare the variability of GLS measurements with that of LV ejection fraction (EF) and other conventional echocardiographic parameters, and (4) obtain reliable baseline data for future discussions within the task force.
Methods
Study Population
Study subjects were recruited among those referred to the routine echocardiography laboratory of our institution and from the coworkers of our research center. We included both patients with a variety of LV functional states and subjects with normal cardiac function. The main inclusion criteria were age ≥ 18 years, ability to consent, ability to walk and to lie in a supine position for 2 hours, good echocardiographic imaging windows, and regular heart rhythm. In total, 63 subjects were scheduled to participate, with 62 presenting for the study. All subjects provided written informed consent before inclusion. The study was approved by the ethics committee of the University Hospitals Leuven.
Industry Partner Recruitment
All industry partners within the task force were invited to participate in the study by an open letter. All seven ultrasound machine manufacturers responded positively and provided ultrasound machines, speckle-tracking echocardiographic software packages, and application specialists to support data acquisition for the study. Additionally, two manufacturers of generic software solutions for speckle-tracking echocardiography analysis joined the study (see Table 1 ).
Vendor | Ultrasound machine | Type | Software and version |
---|---|---|---|
Hitachi-Aloka | Prosound α7 CV version 6.1 | High end | 2D Tissue Tracking analysis version 4.0 |
Esaote | MyLab Alpha | Portable | SW release 5 |
GE | Vivid E9 version 112.1.3 | High end | EchoPac version 113.0.0 |
Philips | iE 33 Vision 2012 | High end | QLAB version 10.0 |
Samsung | EKO7 | High end | Kardia |
Siemens | SC2000 version 3.5 | High end | Velocity Vector Imaging 3.0 |
Toshiba | Artida version 3.0 | High end | ACP version 3.0 |
Epsilon ∗ | | | EchoInsight |
TomTec ∗ | | | TomTec Image Arena 2D CPA 1.2 |
Study Protocol
All examinations were completed within 1 week during nine scanning sessions of 2 to 3 hours each. Examination beds and ultrasound machines were arranged in a circle in a large room and were separated by shades to provide some privacy for the participants. To reduce interobserver variability in image acquisition, one experienced echocardiographer was assigned to each study subject and performed the examinations on all machines. The echocardiographer was responsible for acquiring high-quality standard images in compliance with the study protocol. To ensure the technical quality of the image data, company representatives were available during all examinations to guide the echocardiographers in handling the machines and to optimize machine settings and frame rate according to the respective manufacturers’ recommendations for strain measurements.
Blood pressure was measured before each examination. Subjects were examined in the left lateral decubitus position. Standard parasternal long-axis views were obtained to measure LV dimensions and wall thickness. Three consecutive cardiac cycles of apical views (four chamber, two chamber, and long axis) were stored for STE-based strain analysis and for conventional EF measurements. Doppler flow velocity traces of the mitral valve inflow were recorded to measure the E/A ratio. After completing the acquisition protocol in the assigned study subject, all echocardiographers moved counterclockwise to the next study subject to obtain a set of LV apical views, as described above, for the assessment of interobserver variability. Finally, all echocardiographers returned to their initially assigned study subjects and acquired a third set of apical views of the left ventricle for the assessment of intraobserver variability. By these means, 21 echocardiographic examinations were performed in each patient, and inter- and intraobserver variability could be assessed in a true test-retest scenario.
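As a quick consistency check on the acquisition protocol, the number of examinations follows directly from the counts given above. A minimal sketch in Python (the variable names are illustrative and not part of the study):

```python
# Counts taken from the study description; the names are illustrative only.
machines = 7           # ultrasound systems, one per manufacturer
sets_per_machine = 3   # first set, interobserver repeat, intraobserver repeat
subjects = 62          # subjects who presented for the study

exams_per_subject = machines * sets_per_machine
total_exams = exams_per_subject * subjects

print(exams_per_subject)  # 21
print(total_exams)        # 1302
```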
All image data were stored as raw data in a proprietary company format if available. In addition, all data were also stored in standard Digital Imaging and Communications in Medicine (DICOM) format to allow postprocessing with the independent software packages.
Data Analysis
GLS Measurements
All GLS measurements were performed by a single observer using the dedicated postprocessing software package of the respective vendor. For the two independent software packages, DICOM images acquired with the GE system were used.
As a first step, the cardiac cycle with the best image quality was selected. If image quality was similar in all three cycles, the middle one was chosen. To reduce bias from user interaction, we accepted as the beginning and end of the cardiac cycle the time points that were automatically defined by the respective software package, usually on the basis of automatic QRS detection on the electrocardiogram. For Samsung data, the peak of the R wave had to be manually selected by the reader because the software did not provide automatic timing.
In the second step, the endocardial border was traced, and a region of interest was constructed according to the recommendations of the respective vendor and the requirements of the software. Manual adjustments were made after visual inspection of the segmental tracking results throughout the cardiac cycle, with the goal of obtaining the best possible tracking in all myocardial segments.
Finally, the peak systolic endocardial GLS value per view, as automatically provided by the software, was noted.
Tracking feasibility in each apical view was rated with a tracking feasibility score (TFS) of 3 when, on visual inspection, tracking of all myocardial segments appeared correct; 2 if it failed in one segment; and 1 if tracking was insufficient in two or more segments. Image views with TFS of 1 were excluded from further analysis.
Mean peak systolic endocardial GLS (GLS AV) was calculated by averaging the peak systolic endocardial GLS values from the available apical views (two or three apical views with TFS > 1).
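The scoring and averaging rules described above can be sketched in Python as follows. This is an illustration only: the function names and the input layout are hypothetical, the example values are not study data, and returning `None` when fewer than two views remain is an assumption about how such cases would be handled.

```python
from typing import Dict, Optional, Tuple


def tracking_feasibility_score(failed_segments: int) -> int:
    """Map the number of segments with failed tracking to a TFS of 3, 2, or 1."""
    if failed_segments < 0:
        raise ValueError("failed_segments must be non-negative")
    if failed_segments == 0:
        return 3
    if failed_segments == 1:
        return 2
    return 1


def gls_av(views: Dict[str, Tuple[float, int]]) -> Optional[float]:
    """Average peak systolic GLS over the apical views with TFS > 1.

    `views` maps a view label to (GLS in %, TFS). Returns None when fewer
    than two usable views remain (assumption, see lead-in).
    """
    usable = [gls for gls, tfs in views.values() if tfs > 1]
    if len(usable) < 2:
        return None
    return sum(usable) / len(usable)


# Illustrative values only, not study data.
example = {
    "4CH": (-20.0, tracking_feasibility_score(0)),
    "2CH": (-18.0, tracking_feasibility_score(1)),
    "3CH": (-25.0, tracking_feasibility_score(3)),  # TFS = 1, excluded
}
print(gls_av(example))  # -19.0
```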
GLS AV was extracted from the data as recommended by the vendors and compared among the seven different vendor-specific and the two generic software packages. Interobserver and intraobserver variability was assessed from the repeated acquisitions by calculating the relative mean error (the absolute difference between the two measurements divided by their mean). The same analysis was performed separately for GLS obtained from the four-chamber view only (GLS 4CH).
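A minimal Python sketch of the relative mean error is given below. The function names are hypothetical, and averaging the per-subject errors across subjects is an assumption about how the per-vendor figure was aggregated.

```python
from typing import Sequence, Tuple


def relative_error_percent(first: float, second: float) -> float:
    """Absolute difference of two measurements divided by their mean, in percent."""
    mean_value = (first + second) / 2.0
    if mean_value == 0:
        raise ValueError("mean of the two measurements must be non-zero")
    return 100.0 * abs(first - second) / abs(mean_value)


def relative_mean_error(pairs: Sequence[Tuple[float, float]]) -> float:
    """Mean of the per-subject relative errors (aggregation is an assumption)."""
    if not pairs:
        raise ValueError("at least one pair of measurements is required")
    errors = [relative_error_percent(a, b) for a, b in pairs]
    return sum(errors) / len(errors)


# Illustrative values only: GLS of -19% and -21% on repeated acquisition.
print(relative_error_percent(-19.0, -21.0))  # 10.0
```

In this example, a 2% absolute difference in strain around a mean of 20% gives a relative error of 10%.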
In the following text, unless noted otherwise, we always refer to the absolute measurement values (i.e., a GLS value of −21% will be described as higher than a GLS value of −19%).
Conventional Echocardiographic Parameters of LV Function
Conventional echocardiographic parameters were measured blinded to the strain analysis and any other patient data. The same reader performed all measurements in one patient.
To comparatively assess inter- and intraobserver variability of LV EF measurements, images from one vendor (GE) were analyzed. This vendor was chosen because of the availability of standard analysis software in our echocardiography laboratory. EF was measured using both the single (EF 4CH) and the biplane Simpson method (EF BI). The relative mean error was calculated, similarly to GLS. For the comparative assessment of the intervendor variability of typical conventional echocardiographic measurements, LV end-diastolic diameter (LVEDD), interventricular septal and posterior wall thicknesses, and mitral inflow E-wave velocity and E/A ratio were measured in images from two vendors (GE and Samsung) using GE EchoPAC software by the same reader.
Statistical Analysis
Data were examined for distribution using the Kolmogorov-Smirnov test. Continuous variables are expressed as mean ± SD. GLS measurements were compared among vendors by repeated-measures analysis of variance (ANOVA). Because there is no gold-standard measure of GLS, the results of each vendor were correlated with each other and with the mean of all vendors. To assess interobserver and intraobserver variability, the relative mean error was calculated as mentioned above (therefore, a reported 10% error in a GLS measurement of 20% indicates a 2% absolute error in strain); the relative mean error was compared among vendors with repeated-measures ANOVA, while the comparison with conventional echocardiographic parameters and with EF was performed by using one-way ANOVA. Pearson correlation coefficients were used to examine the associations between GLS AV and GLS 4CH and among GLS AV measurements from different vendors. All statistical tests were two tailed, and P values ≤ 0.05 were considered to indicate statistical significance. Commercially available statistical software was used for the analysis (SPSS version 18; SPSS, Inc, Chicago, IL).
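For illustration, the Pearson correlation coefficient between the GLS AV values of two vendors can be computed as in the following Python sketch. The study itself used SPSS; the function name and the example values here are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("samples must not be constant")
    return cov / math.sqrt(var_x * var_y)


# Illustrative values only: vendor B reads 1% strain unit lower than vendor A.
vendor_a = [-18.0, -20.0, -22.0, -19.0]
vendor_b = [-17.0, -19.0, -21.0, -18.0]
r = pearson_r(vendor_a, vendor_b)
print(round(r, 3))
```

Because a constant offset between two vendors leaves the correlation coefficient unchanged, correlation alone cannot reveal a systematic intervendor difference; the mean differences and Bland-Altman analyses address that question.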
Results
Sixty-two of the 63 invited persons participated in the scanning, and a total of 1,302 echocardiographic examinations could be performed. For an individual study participant, no relevant changes in blood pressure were observed during the course of the entire scanning session. Furthermore, no differences in blood pressure were observed between machines. EFs in our study population ranged from 35% to 78%, with 14 participants having EFs < 55%. The average EF of the study population was 60%. Overall tracking feasibility was high, with an average TFS of 2.81 in all vendors, ranging from 2.62 to 2.92. Nevertheless, significant differences were observed between vendors in tracking feasibility (Supplemental Figure 1). GLS AV measurements were feasible in all 62 subjects with Hitachi-Aloka, GE, Philips, Epsilon, and TomTec; in 61 patients with Esaote; in 60 patients with Toshiba and Siemens (for the latter, one data set was lost because of storage problems); and in 59 patients with Samsung. In total, 52 single-image acquisitions were excluded from the analysis because of suboptimal tracking (TFS of 1). Additionally, nine single-image acquisitions could not be analyzed because of missing files (n = 3) or electrocardiographic triggering problems (n = 6). A detailed list of dropouts is provided in Table 2.
Vendor | Total | 4CH | 3CH | 2CH | ECG triggering problems |
---|---|---|---|---|---|
Hitachi-Aloka | 2 | 1 | 1 | 0 | 0 |
Esaote | 10 | 2 | 4 | 2 | 4 |
GE | 3 | 1 | 1 | 1 | 0 |
Philips | 3 | 0 | 1 | 2 | 0 |
Samsung | 13 | 5 | 2 | 6 | 0 |
Siemens | 3 | 0 | 1 | 2 | 0 |
Toshiba | 17 | 6 | 4 | 5 | 2 |
Epsilon | 2 | 0 | 1 | 1 | 0 |
TomTec | 3 | 1 | 1 | 1 | 0 |
Intermachine Comparison
The results of GLS AV for each vendor are displayed in Figure 1. Values ranged from −18.0% to −21.5%, with moderate but statistically significant differences observed among vendors (ANOVA P < .001). GLS AV measurements correlated significantly among vendors (Table 3). The highest absolute difference between vendors was 3.7% strain units (Table 4). Furthermore, all vendors showed a significant correlation with the mean of all vendors (Figure 2A). Bland-Altman analyses for both comparisons revealed no value-dependent bias (Figure 2B, Table 4).
Vendor | Hitachi-Aloka | Esaote | GE | Philips | Samsung | Siemens | Toshiba | Epsilon |
---|---|---|---|---|---|---|---|---|
Esaote | .850 | |||||||
GE | .840 | .873 | ||||||
Philips | .902 | .843 | .869 | |||||
Samsung | .873 | .868 | .837 | .894 | ||||
Siemens | .855 | .861 | .836 | .883 | .904 | |||
Toshiba | .815 | .816 | .849 | .864 | .880 | .881 | ||
Epsilon | .804 | .821 | .872 | .819 | .797 | .867 | .825 | |
TomTec | .821 | .854 | .902 | .826 | .853 | .841 | .861 | .884 |
Vendor | Hitachi-Aloka | Esaote | GE | Philips | Samsung | Siemens | Toshiba | Epsilon |
---|---|---|---|---|---|---|---|---|
Esaote | −2.3 (−6.1 to 1.5) | |||||||
GE | −3.0 (−7.1 to 1.1) | −0.7 (−4.5 to 3.1) | ||||||
Philips | −0.9 (−4.1 to 2.2) | 1.4 (−2.5 to 5.2) | 2.1 (−1.6 to 5.8) | |||||
Samsung | −0.3 (−3.8 to 3.2) | 1.9 (−1.8 to 5.5) | 2.7 (−1.6 to 6.9) | 0.5 (−2.8 to 3.8) | ||||
Siemens | −2.0 (−5.8 to 1.8) | 0.3 (−3.6 to 4.2) | 1.0 (−3.1 to 5.2) | −1.1 (−4.5 to 2.3) | −1.2 (−8.4 to 5.9) | |||
Toshiba | −0.7 (−4.7 to 3.3) | 1.65 (−2.4 to 5.7) | 2.4 (−1.7 to 6.5) | 0.3 (−3.3 to 3.8) | −0.3 (−3.7 to 3.1) | 1.4 (−1.9 to 4.8) | ||
Epsilon | −0.5 (−4.8 to 3.7) | 1.7 (−2.6 to 6.0) | 2.5 (−1.3 to 6.3) | 0.4 (−3.7 to 4.6) | −0.2 (−4.8 to 4.5) | 1.5 (−2.0 to 4.9) | 0.1 (−4.0 to 4.2) | |
TomTec | −3.7 (−8.1 to 0.7) | −1.4 (−5.5 to 2.8) | −0.7 (−4.0 to 2.7) | −2.7 (−7.0 to 1.6) | −3.3 (−7.4 to 0.8) | −1.6 (−5.9 to 2.6) | −3.0 (−7.1 to 1.1) | −3.2 (−7.1 to 0.8) |
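
A pairwise Bland-Altman comparison between two vendors can be sketched in Python as below. This is a hedged illustration: the function name is hypothetical, the example values are not study data, and the use of bias ± 1.96 SD for the limits of agreement is the conventional definition, assumed here rather than stated in the text.

```python
import statistics
from typing import Sequence, Tuple


def bland_altman(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float, float]:
    """Return (bias, lower limit, upper limit) for paired measurements.

    Bias is the mean of the paired differences a - b; the limits are
    bias -/+ 1.96 * SD of the differences (conventional choice, assumed).
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Illustrative GLS AV values (%) for two hypothetical vendors.
vendor_a = [-20.0, -19.0, -21.0, -18.0]
vendor_b = [-21.0, -18.0, -22.0, -17.0]
bias, lower, upper = bland_altman(vendor_a, vendor_b)
print(f"{bias:.1f} ({lower:.1f} to {upper:.1f})")
```

The printed string follows the "value (lower to upper)" layout of the pairwise table above, on the assumption that those entries are mean differences with their limits of agreement.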