Strain Imaging in Echocardiography: Converging on Congruence?




Imagine this scenario:


Sonographer: Here’s the study on Mr Jones, your patient with aortic stenosis.

Physician: Wait, we’ve always done Mr Jones on the HDI-5000, and we need to keep doing him on the HDI-5000. Any other machine will give different AS gradients.

Sonographer: But we got rid of the HDI-5000 last week.

Physician: Uh-oh….


Ludicrous, right? The echocardiography community takes it as a matter of faith that structural and Doppler measurements made on one instrument are interchangeable with those made on any other instrument. A 2.3-cm upper septum will be 2.3 cm on any machine, as will a 5 m/sec transaortic jet. That this assumption has only rarely been tested empirically is of little import. We are comfortable with it.


But there is one arena in which we have hard data demonstrating a significant lack of agreement among vendors, and that is left ventricular (LV) systolic strain imaging by echocardiographic speckle-tracking. To choose just two of a multitude of similar studies, the Japanese Ultrasound Speckle Tracking of the Left Ventricle (JUSTICE) study examined 817 subjects using one or more of GE, Philips, and Toshiba instruments to establish normal values of global longitudinal strain (GLS), global circumferential strain, and global radial strain. Mean normal values for the three vendors ranged between −18.9% and −21.3% for GLS, wider for global circumferential strain (−22.2% to −30.5%), and wider still for global radial strain (+36.3% to +54.6%). More concerning data came from the 193 patients who were studied on two or more of the machines. For only one vendor combination (Philips vs GE) and one component (GLS) did r² exceed 0.5 (which means that 50% of the variation in one set of measurements was predicted by the other measurement; for a perfect pair of measurements, r² would equal 1.0). All other component-vendor combinations had r² values < 0.25, and many were not even statistically significant. Complementary data come from Koopman et al., who studied 34 children with GE and Philips equipment. For radial strain, GE equipment gave values almost twice those of Philips equipment, with r² < 0.2. Concerning indeed! No matter how valuable this technique appeared in early studies, it was crippled by these intervendor discrepancies.
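

For readers who want the r² interpretation made concrete, here is a minimal sketch in Python; the paired vendor values are invented for illustration and are not JUSTICE data.

from statistics import correlation  # requires Python 3.10+

# Hypothetical paired GLS measurements (%) from two vendors on the same
# patients; these numbers are invented for illustration, not JUSTICE data.
gls_vendor_a = [-19.5, -17.2, -21.0, -18.4, -20.1, -16.8]
gls_vendor_b = [-18.9, -18.0, -20.2, -17.1, -19.5, -17.9]

r = correlation(gls_vendor_a, gls_vendor_b)  # Pearson correlation coefficient
r_squared = r ** 2  # fraction of variance in one data set predicted by the other

# r_squared = 1.0 only for perfectly correlated measurements; values below
# 0.25 mean less than a quarter of the variation is shared between vendors.
print(f"r^2 = {r_squared:.2f}")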


In response to these and other studies, the European Association for Cardiovascular Imaging and the American Society of Echocardiography jointly organized a task force to bring industry and the societies together to analyze the causes of these differences and to work toward harmonizing the measurements. In addition to the two founding societies, the Japanese and Korean societies of echocardiography have joined the effort, along with seven hardware vendors (Esaote, GE, Hitachi Aloka, Philips, Samsung, Siemens, and Toshiba) and two software vendors (Epsilon and TomTec). Within the cocoon of the Strain Standardization Task Force (protecting them against antitrust charges), vendors are free to discuss technical aspects of their speckle-tracking algorithms. We focused initially on GLS, because it has the most clinical data supporting its use and generally had shown the least intervendor discrepancy. Interestingly, we found that many of the differences related simply to the companies’ using the term “global longitudinal strain” to refer to very different physical entities. For example, vendor A might report strain across the full thickness of the wall, whereas vendor B might measure strain only along the endocardium (which should have higher strain than the midwall or epicardium). Vendor C might average the end-systolic values from each of the segments, whereas vendor D might average the peak strain values regardless of when they occurred during systole. Vendor E might begin strain integration at the onset of the Q wave, whereas vendor F might begin at the peak of the R wave. These seemingly trivial differences in technique could add up to clinically significant differences in reported strain values. To guide the vendors in their technical convergence, the task force provided them with synthetic data sets (courtesy of Jan D’Hooge, PhD) with prespecified strain values and the opportunity to test their algorithms in a large clinical trial (spearheaded by Jens-Uwe Voigt, MD, and his colleagues at the University of Leuven). The first product of this task force was a consensus report that established a common nomenclature for strain reporting and attempted to establish a “baseline” situation for GLS reporting. Vendors are certainly allowed to report other types of GLS but only with full disclosure of precisely what they are reporting.
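

To see how much these definitional choices alone can matter, recall that Lagrangian strain is simply the percentage change in contour length from end-diastole. The minimal sketch below uses invented contour lengths and segmental values (illustrative assumptions, not task force data) to show that an "endocardial" GLS and a "peak segmental" GLS are systematically more negative than their midwall and end-systolic counterparts.

# Illustrative sketch of how definitional choices change "GLS" even with
# perfect tracking. All lengths (cm) and segmental values are invented.
def lagrangian_strain(length_ed, length_now):
    """Lagrangian strain: percentage length change from end-diastole."""
    return 100.0 * (length_now - length_ed) / length_ed

# Hypothetical end-diastolic and end-systolic contour lengths: the
# endocardial contour shortens proportionally more than the midwall contour.
endo_ed, endo_es = 15.0, 12.0
midwall_ed, midwall_es = 16.0, 13.4

print(lagrangian_strain(endo_ed, endo_es))        # -20.0% ("endocardial" GLS)
print(lagrangian_strain(midwall_ed, midwall_es))  # about -16.3% ("midwall" GLS)

# Averaging each segment's peak value, whenever it occurs in systole, gives
# a more negative result than averaging the values at end-systole, because
# each segment is sampled at its own most-shortened moment:
segment_peak = [-21.0, -19.5, -22.3]         # per-segment peak strain, %
segment_end_systole = [-20.1, -18.7, -21.0]  # per-segment value at ES, %
print(sum(segment_peak) / 3)          # about -20.9%
print(sum(segment_end_systole) / 3)   # about -19.9%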


While the synthetic and clinical intervendor reports are currently undergoing peer review, this issue of JASE contains two articles that provide important perspectives (both encouraging and discouraging) on the topic of strain standardization.


First, in the “glass half empty” category, we have another contribution from the JUSTICE investigators. These investigators retrospectively reviewed echocardiograms from 81 of the 193 subjects, all with normal results from a cardiovascular perspective, used to study intervendor variability in the original report. This subset was selected to focus on adult subjects and to limit the analysis to centers that had contributed high-quality images to the original trial. In the 3 years since the JUSTICE study was completed, GE, Philips, and Toshiba have all released updated versions of their strain software, and testing the consistency of results across these serial versions of proprietary software formed the first aim of the study. The second broad aim was to test whether two “nondenominational” software programs (Epsilon and TomTec), designed to operate on Digital Imaging and Communications in Medicine–formatted images from any vendor, would also yield consistent results. As proposed in our strain consensus document, to avoid confusion in discussing changes in a parameter (GLS) that is intrinsically negative, when Nagata et al. speak of an “increase” in strain, they mean that the value is becoming more negative, corresponding to an improvement in LV contraction.


So what was found? For each of the hardware vendors, there was a highly significant change in GLS with successive software versions (GE: decrease in absolute GLS by 1.1%; Philips: decrease by 2.0%; Toshiba: increase by 0.6%), which could have led to reclassification between “normal” and “abnormal” in up to 19% of patients. Importantly for the strain standardization effort, though, the spread among the vendors decreased significantly from the earlier software versions to the later ones, with the range in mean normal values falling from 3.6% to 1.9%, indicating meaningful convergence among the vendors.


As for the vendor-independent software, there were results to cheer both optimists and pessimists. For TomTec, analysis of the same patients from two different hardware platforms yielded mean values that were within 1% of each other, though with 95% limits of agreement that ranged from 7.2% to 9.1% for the various pairs. For Epsilon, the bias was a bit higher than for TomTec (up to 1.25%), but the 95% limits of agreement were ≤6.5%. Interestingly, when compared head-to-head across all images, TomTec gave absolute GLS that was 2.2% higher than Epsilon, perhaps reflecting the fact that Epsilon analyzes strain across the LV wall, whereas TomTec focuses on the endocardium, where strain is known to be highest; this systematic difference is a ripe target for convergence. At the end of this analysis, however, the JUSTICE investigators conclude “that the same ultrasound machine and the same 2D speckle-tracking software should be used to measure GLS in longitudinal studies and cross-sectional studies.”
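

As a reminder of how such bias and 95% limits-of-agreement figures are derived, here is a minimal Bland-Altman-style sketch; the paired GLS values are hypothetical, not data from Nagata et al.

from statistics import mean, stdev

# Hypothetical GLS measurements (%) of the same patients analyzed from two
# different hardware platforms; invented for illustration only.
gls_platform_1 = [-19.0, -17.5, -20.8, -18.2, -21.5]
gls_platform_2 = [-18.4, -18.1, -20.0, -17.0, -20.9]

diffs = [a - b for a, b in zip(gls_platform_1, gls_platform_2)]
bias = mean(diffs)                     # systematic offset between platforms
loa_half_width = 1.96 * stdev(diffs)   # 95% limits of agreement: bias +/- this

print(f"bias = {bias:+.2f}%, limits of agreement = "
      f"[{bias - loa_half_width:.2f}%, {bias + loa_half_width:.2f}%]")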


Moving on to the “glass half full” report in the current issue of JASE, this second study was structured in a fashion largely similar to the first, with intervendor comparison of strain across several versions of software (though without the additional analysis of vendor-neutral software). By way of full disclosure, please note that I am a coauthor of this report. In contrast to the prior study, this one was not focused solely on normal subjects (who constituted only 20% of the study population), although most of the others must have been “pretty normal,” as all mean values of GLS and LV ejection fraction (EF) were quite comparable with those in the JUSTICE reanalysis, despite at least one subject with an EF into the teens.


These subjects were all scanned with GE and Philips equipment, then analyzed using three versions of GE’s EchoPAC (version E11 [released before standardization efforts] and versions E12 and E13 [released subsequently]) and Philips’s QLAB (prestandardization version Q8 and poststandardization versions Q9 and Q10). The calculations are complex, but the key results involve the coefficient of variation (CV) between the various pairs of strain calculations, essentially a measure of the “scatter” between two data sets, defined as the ratio of the SD of the pairwise differences to the mean of the measurements. Lower CVs indicate closer agreement between two sets of measurements and also provide a guide as to how large a change between two successive measurements is needed to have faith that the change is “real” and not just due to random measurement error. Yang et al. asked the interesting question of whether our intervendor strain task force was having a favorable impact by comparing these CV results from before the standardization effort with those afterward. Comparison was also made with the CVs for end-diastolic and end-systolic volumes and EF. These results are most clearly seen in Figure 3 of Yang et al., showing the CV among the strain measurements compared with the 5% CV noted for EF. For software from before the standardization effort, the strain CV was 12% to 13%, more than twice that for EF, whereas for software released after the start of standardization, the strain CV was much lower, only 5% to 6%, and no different from the CV for EF.
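

Because the CV is the pivotal statistic here, a minimal sketch of its calculation, as defined above, may help; the paired GLS values are invented for illustration and are not data from Yang et al.

from statistics import mean, stdev

# Hypothetical GLS measurements (%) of the same image loops analyzed with
# two different software versions; invented for illustration only.
gls_software_a = [-18.5, -20.2, -17.8, -21.0, -19.4]
gls_software_b = [-19.1, -19.8, -18.5, -20.2, -18.8]

diffs = [a - b for a, b in zip(gls_software_a, gls_software_b)]
overall_mean = mean(gls_software_a + gls_software_b)

# CV as defined in the text: SD of the pairwise differences divided by the
# mean of the measurements (absolute value, since GLS is negative).
cv = abs(stdev(diffs) / overall_mean) * 100
print(f"CV = {cv:.1f}%")

# A serial change larger than roughly 1.96 * SD of the differences is
# unlikely to reflect random measurement error alone.
print(f"threshold for a 'real' change = {1.96 * stdev(diffs):.1f}%")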


Encouraging, yes, and consistent with the convergence of mean strain values observed by Nagata et al. So how do we make sense of these two studies? They report largely similar results, but one ends on a discouraging note (“the same…software…should be used to measure GLS”), while the other is upbeat (“the removal of concerns about measurement variability should allow wider use of GLS”). There actually are a few differences between the studies that may be relevant in resolving these discrepant conclusions.


Note first that the JUSTICE study was designed primarily to determine normal ranges of strain, so Nagata et al. examined only normal patients, whereas Yang et al. studied a heterogeneous group that included normal volunteers and patients with cardiovascular disorders and therefore found a wider range of strain values. The relatively small strain variance in Nagata et al.’s study makes it difficult to achieve high correlations between successive measurements, something easier to do in a large-variance data set such as that of Yang et al., which may also be considered more “real world,” as those investigators studied a more typical clinical population. Second, Yang et al. concerned themselves specifically with “convergence” between successive versions of software, whereas Nagata et al. provided more of a snapshot of current disparities, particularly with regard to the two “nondenominational” analysis packages from TomTec and Epsilon. Finally, we may ultimately have to accept the fact that although complete congruence between strain assessments remains elusive, the results are “good enough” for daily clinical work and research. As mentioned at the beginning, intervendor variability for strain has been scrutinized in a way that no other echocardiographic or Doppler parameter (or, for that matter, any computed tomographic or magnetic resonance imaging [MRI] parameter) ever has been. One of the interesting analyses of Yang et al. was to compare the intervendor CV for GLS with that for EF, confirming that with the latest software, GLS was as reproducible as EF. Even more fascinating was the observation that although GE and Philips gave virtually identical EFs (58.0% vs 58.1%), the component volumes were actually somewhat different (e.g., for LV end-diastolic volume, 95 vs 102 mL). This is reminiscent of the well-known discrepancies between volumes by echocardiography and MRI related to the handling of trabeculae and may reflect differing degrees of edge enhancement as the B-mode image is constructed. Regardless, it certainly indicates that GLS is not the only imaging parameter with variability issues.


The strain standardization effort has been a challenging one for the echocardiography community. Although we cannot prove that the convergence reported by Yang et al. resulted directly from the efforts of the European Association for Cardiovascular Imaging and American Society of Echocardiography task force, the field is clearly moving in the right direction. Bringing vendors together to discuss openly the technical aspects of this task and giving them opportunities to tune their software against common data sets are yielding benefits that might fruitfully be applied to other issues in cardiovascular imaging. The issues raised here are neither strain specific nor echocardiography specific, and the computed tomography and MRI communities should be encouraged to ensure that their own measurements are vendor independent. Finally, we must never lose sight of the fact that echocardiography remains by far the best modality for serial studies, by virtue of cost, availability, and safety, without the radiation concerns of computed tomography or the cost and inconvenience of MRI. The convergence shown in both of these studies should cheer the cardiology community while still challenging us to continue striving for improvement.

