Echocardiography Core Laboratory Reproducibility of Cardiac Safety Assessments in Cardio-Oncology




Background


As the potential for cancer therapy–related cardiac dysfunction is increasingly recognized, there is a need for the standardization of echocardiographic measurements and cut points to guide treatment. The aim of this study was to determine the reproducibility of cardiac safety assessments across two academic echocardiography core laboratories (ECLs) at the University of Pennsylvania and the Duke Clinical Research Institute.


Methods


To harmonize the application of guideline-recommended measurement conventions, the ECLs conducted multiple training sessions to align measurement practices for traditional and emerging assessments of left ventricular (LV) function. Subsequently, 25 echocardiograms taken from patients with breast cancer treated with doxorubicin with or without trastuzumab were independently analyzed by each laboratory. Agreement was determined by the proportion (coverage probability [CP]) of all pairwise comparisons between readers that were within a prespecified minimum acceptable difference. Persistent differences in measurement techniques between laboratories triggered retraining and reassessment of reproducibility.


Results


There was robust reproducibility within each ECL but differences between ECLs on calculated LV ejection fraction and mitral inflow velocities (all CPs < 0.80); four-chamber global longitudinal strain bordered acceptable reproducibility (CP = 0.805). Calculated LV ejection fraction and four-chamber global longitudinal strain were sensitive to small but systematic interlaboratory differences in endocardial border definition that influenced measured LV volumes and the speckle-tracking region of interest, respectively. On repeat analyses, reproducibility for mitral velocities (CP = 0.940–0.990) was improved after incorporating multiple-beat measurements and homogeneous image selection. Reproducibility for four-chamber global longitudinal strain was unchanged after efforts to develop consensus between ECLs on endocardial border determinations; these efforts were limited primarily by a lack of established reference standards.


Conclusions


High-quality quantitative echocardiographic research is feasible but requires a commitment to reproducibility, adherence to guideline recommendations, and the time, care, and attention to detail to establish agreement on measurement conventions. These findings have important implications for research design and clinical care.


Highlights





  • Standardizing echocardiography measurements that guide treatment decisions in cancer is needed.
  • The authors assessed the reproducibility of echocardiographic measurements in cancer across two echocardiography core laboratories.
  • There was robust reproducibility within each laboratory but differences between laboratories.
  • Interlaboratory reproducibility for LVEF and GLS was less concordant.
  • Comparing small changes in LVEF and GLS across laboratories should be done cautiously.



As cardiac morbidity in patients with cancer is increasingly recognized, accurate diagnostic tools are critical to identify patients at risk for cancer therapy–related cardiac dysfunction (CTRCD). Echocardiography provides essential structural, functional, and hemodynamic insights into cardiac pathophysiology and, as a low-cost, widely available, and safe test, is frequently used to assess the cardiac consequences of cancer and cancer therapy. However, variability related to imaging quality, biologic variation, and interpretive differences can limit the reliability of echocardiographic results. In clinical practice, variability of conventional echocardiographic parameters of left ventricular (LV) function (i.e., LV ejection fraction [LVEF]) can extend across treatment eligibility thresholds and affect critical decisions regarding cancer therapy. In cancer trials, clinical LVEF data from site echocardiography laboratories are often used to determine study eligibility and evaluate the cardiac consequences of novel cancer therapies, although reproducibility of results across clinical sites is rarely reported. Echocardiographic parameters that assess cardiac mechanics, including myocardial deformation and ventricular-arterial coupling, may improve sensitivity for early CTRCD beyond LVEF. However, data in patients with cancer are predominantly derived from single-center studies that may not account for factors recognized to potentially diminish measurement reproducibility.


As cardio-oncology progresses toward larger clinical trials, standard echocardiographic measurements and thresholds for treatment, with validated reproducibility of such measures, are vital. Multicenter cardiovascular clinical trials generally use echocardiography core laboratories (ECLs) to provide expertise and consistency for image acquisition and measurements as well as for assessments of imaging eligibility criteria and safety end points. In this regard, ECLs can reduce variability of imaging data and ensure the validity of study results. Cardio-oncology studies have used ECLs, but the practice is not widespread.


Against this background, the National Cancer Institute Division of Cancer Prevention awarded substudies of the PREDICT MDA 2007 0914 (ClinicalTrials.gov identifier NCT01032278) and SCUSF 0806 (ClinicalTrials.gov identifier NCT01009918) trials for the central review of echocardiograms to ECLs at the University of Pennsylvania (Penn) and the Duke Clinical Research Institute (DCRI), respectively. As a condition of the awards, the ECLs were instructed to collaborate with the potential goal of pooling echocardiographic data from the trials. To determine the feasibility of pooling the data, as well as the impact of central echocardiography review in cardio-oncology clinical trials, the ECLs at Penn and DCRI aimed to (1) determine the reproducibility of echocardiographic assessments in cardio-oncology within and across two academic ECLs, (2) identify sources of variability and corrective solutions, and (3) propose recommendations for echocardiographic research in the detection and monitoring of CTRCD, with potential implications for clinical care.


Methods


Penn and DCRI ECL Group Reads


To align data collection elements, the Penn and DCRI ECLs reviewed two-dimensional (2D) and Doppler echocardiographic parameters of cardiac size and systolic and diastolic function relevant to clinical cardiovascular outcomes in patients with cancer. A harmonization process between ECLs ensued. Sonographers and principal investigators at both ECLs conducted serial calls and multiple Web-conference group reads from October 2013 to March 2014 to share standard operating procedures, review sample echocardiograms, and align ECL perspectives on image quality, border selection, and tracing conventions. Group readings illustrated differences between ECLs on tracing conventions for certain parameters. These included nonconsensus on the angle of the minor axis for LV internal dimensions (e.g., parallel to the mitral valve plane vs perpendicular to the LV long axis) and LV endocardial border definitions (e.g., depth of exclusion of trabeculae) during measurements of LV internal dimensions (Figure 1) and volumes (Figure 2), respectively. Harmonization efforts were aimed at achieving consensus on the application of measurement conventions outlined in national guideline recommendations and culminated in the development of consensus reading instructions (please see the Online Appendix, available at www.onlinejase.com) as well as a common comprehensive case report form across ECLs.




Figure 1


LV internal dimension tracing conventions by ECL.



Figure 2


LV endocardial border tracing conventions by ECL.


Echocardiographic Acquisition and Creation of Analysis Repository


After developing consensus reading instructions, each ECL contributed echocardiograms for reproducibility analyses. A total of 25 patient echocardiograms were selected from transthoracic echocardiograms previously acquired at both institutions from patients who had completed treatment with potentially cardiotoxic anticancer agents (i.e., doxorubicin with or without trastuzumab) for breast cancer. More detailed clinical data were not made available, as each ECL was blinded to patient characteristics. Selected echocardiograms were required to have visible LV endocardium unobscured by undergaining or artifact and no significant apical foreshortening in 2D acquisitions. Echocardiograms were obtained by dedicated sonographer teams in the Intersocietal Accreditation Commission clinical laboratories at both institutions. All images were acquired using Vivid 7 or E9 machines (GE Healthcare, Milwaukee, WI) at 60 to 90 frames/sec and digitally archived at the acquisition frame rate. Digital echocardiographic images were deidentified and transferred in standard Digital Imaging and Communications in Medicine format to TomTec (TomTec Imaging Systems, Unterschleissheim, Germany) and DigiView (Digisonics, Houston, TX) analysis workstations at the Penn and DCRI ECLs, respectively. Both ECLs used TomTec 2D Cardiac Performance Analysis version 1.1 for strain analysis.


Measurement of Echocardiography Parameters


After image transfer, measurements of echocardiography parameters were assigned to two readers at each ECL (n = 4 total readers). Readers included three highly experienced research sonographers and a cardiologist with level III certification in echocardiography. Each reader independently analyzed two uniquely identified copies of the 25 patient echocardiograms and recorded 50 measurement results per analyzed parameter. Each result was treated independently (i.e., no averaging within or between readers).


At laboratory A, LV volumes and strain were measured by a single reader; Doppler parameters (i.e., velocities and timing intervals) were measured by a separate reader, according to reader expertise and existent laboratory practices. At laboratory B, each reader measured every parameter. The measurement results generated per echocardiogram by the readers in each ECL are depicted in more detail in Figure 3A.




Figure 3


ECL assignment of echocardiography measurements and comparisons of results. (A) For each echocardiogram, there were two results per reader for every measured parameter. At laboratory A, LV volumes and strain were measured by a single reader; Doppler parameters were measured by a separate reader. At laboratory B, each reader measured every parameter. (B) Pairwise comparisons of measurement results between the two readers at laboratory B. For each measured parameter per echocardiogram, there were four pairwise comparisons of two measurement results per reader between the two readers at laboratory B. (C) Pairwise comparisons of measurement results between the readers at laboratories A and B. For each measured parameter per echocardiogram, there were eight pairwise comparisons of two measurement results per reader between one reader at laboratory A and two readers at laboratory B. 4CH, Apical four-chamber; LVEDV, LV end-diastolic volume; LVESV, LV end-systolic volume; LVOT, LV outflow tract; MV, mitral valve.


Intra- and Interlaboratory Reproducibility Testing


Intra- and interlaboratory reproducibility was evaluated by pairwise comparisons of all possible measurement differences (interpretative variability) among all readers for selected echocardiographic parameters, as described below. For each of the 25 patient echocardiograms, there were four pairwise comparisons of measurement results between the two readers at laboratory B (Figure 3B) and eight pairwise comparisons of measurement results between the reader at laboratory A and two readers at laboratory B (Figure 3C) on any selected parameter.
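The comparison counts above can be checked with a short enumeration. This is only an illustrative sketch: the reader labels (`A1`, `B1`, `B2`) and the helper name are hypothetical, and it assumes that each reader contributes two results per echocardiogram and that only results from different readers are paired.

```python
from itertools import product

READS_PER_READER = 2   # each reader analyzed two copies of every echocardiogram
N_ECHOCARDIOGRAMS = 25
N_PARAMETERS = 9

def cross_reader_pairs(readers_x, readers_y, reads=READS_PER_READER):
    """Pair every result from one group of readers with every result from another group."""
    results_x = [(reader, read) for reader in readers_x for read in range(reads)]
    results_y = [(reader, read) for reader in readers_y for read in range(reads)]
    return list(product(results_x, results_y))

# Hypothetical reader labels; the study does not name individual readers.
within_lab_b = cross_reader_pairs(["B1"], ["B2"])
between_labs = cross_reader_pairs(["A1"], ["B1", "B2"])

print(len(within_lab_b))                                     # 4 per echocardiogram
print(len(between_labs))                                     # 8 per echocardiogram
print(len(within_lab_b) * N_ECHOCARDIOGRAMS)                 # 100 per parameter
print(len(between_labs) * N_ECHOCARDIOGRAMS)                 # 200 per parameter
print(len(within_lab_b) * N_ECHOCARDIOGRAMS * N_PARAMETERS)  # 900 in total
print(len(between_labs) * N_ECHOCARDIOGRAMS * N_PARAMETERS)  # 1800 in total
```

Under these assumptions, the per-parameter totals (100 and 200) and the totals over nine parameters (900 and 1,800) follow directly from the per-echocardiogram counts.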


The “acceptable difference,” or limit of measurement variability, for each parameter served as a benchmark for reproducibility. The intra- and interreader acceptable differences were defined prospectively on the basis of literature review of studies with reproducibility data for LV volumes and LVEF, mitral inflow velocities, global longitudinal strain (GLS), and R-wave flow onset and flow end times. All acceptable differences are expressed in absolute terms (Table 1).



Table 1

Prespecified reproducibility standards for echocardiographic parameters

| Echocardiographic parameter | Minimum acceptable difference (for 80% of pairwise comparisons) | Maximum acceptable difference (for 100% of pairwise comparisons) |
|---|---|---|
| LVEDV (mL) | 30 | 60 |
| LVESV (mL) | 30 | 60 |
| LVEF (calculated) (%) | 10 | 20 |
| LVEF (visual) (%) | 10 | 20 |
| MV E velocity (cm/sec) | 5 | 10 |
| MV A velocity (cm/sec) | 5 | 10 |
| GLS (%) | 4 | 8 |
| Time from R wave to LVOT flow onset (msec) | 20 | 40 |
| Time from R wave to LVOT flow end (msec) | 40 | 85 |

LVEDV, LV end-diastolic volume; LVESV, LV end-systolic volume; LVOT, LV outflow tract; MV, mitral valve.


After readers completed the echocardiographic interpretations, result reports were generated, and intra- and interreader pairwise comparisons were displayed in tables and illustrated in dot-plot graphs. Paired measurements with differences exceeding the prespecified acceptable differences (outliers) were reviewed in a group read. If reproducibility between readers was not acceptable for any echocardiographic parameter(s), a process of revisiting image analysis instructions, retraining, and retesting ensued.


Retraining and Retesting


Retraining involved Web conference–based sessions among all readers. Review of individual statistical results guided the retraining process by illustrating the extent and direction of outlier pairs. Open discussion among readers helped identify the source(s) of interpretation and/or measurement error(s). Illustrative case examples and the consensus reading instructions were discussed and revised to promote uniformity of interpretation and help eliminate individual idiosyncrasies. After retraining, all readers reinterpreted the 25 echocardiograms twice for selected parameters with unacceptable reproducibility on initial testing, and reproducibility was reevaluated.


Statistical Analysis


No specific standard or universally preferred index exists for assessing intrareader and interreader reproducibility in ECLs; therefore, multiple statistical approaches for assessing reproducibility were considered. The coverage probability (CP) method was selected given its desirable characteristics as an agreement index in the ECL setting, including its computational simplicity, rapid identification of group and individual reader variability and of specific disagreements for review and retraining, and broad applicability to patient populations and to continuous and categorical variables. CP is the probability that the difference between any two measurements of a parameter on the same echocardiogram is within a prespecified acceptable difference. Specifically, all possible interreader comparisons were examined to determine whether the measurement difference between paired readers was within the prespecified acceptable difference. A nonparametric approach was used to estimate CP: the estimate is the number of pairwise interreader comparisons within the acceptable difference divided by the number of all possible pairwise comparisons. The higher the CP, the better the reproducibility. Perfect reproducibility corresponded to 100% CP (i.e., CP = 1.00), indicating that all pairwise differences were within the prespecified acceptable difference for that parameter.
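As a rough illustration of the nonparametric estimate described above, the following sketch computes CP as a simple proportion. The function name and the sample differences are made up for illustration and are not study data; treating a difference exactly equal to the limit as "within" is also an assumption, because the text does not state how boundary values were handled.

```python
def coverage_probability(differences, acceptable_difference):
    """Nonparametric CP estimate: share of pairwise differences within the limit.

    Assumption: a difference exactly equal to the limit counts as within it.
    """
    if not differences:
        raise ValueError("at least one pairwise difference is required")
    within = sum(1 for d in differences if abs(d) <= acceptable_difference)
    return within / len(differences)

# Hypothetical pairwise LVEF differences (percentage points), not study data.
lvef_differences = [2.0, -4.5, 11.0, 7.5, -12.0, 0.5, 9.0, -3.0]

# With the 10% minimum acceptable difference for LVEF, 6 of 8 pairs qualify.
cp = coverage_probability(lvef_differences, acceptable_difference=10)
print(cp)  # 0.75
```

Because the estimate is a plain count ratio, the pairs that fall outside the limit can be listed directly for review, which is the property the authors cite as useful for retraining.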


The prespecified acceptable differences and the cut point for the CP determine the standard for acceptable reproducibility. In this study, the cut point for the estimated CP was set at 0.80 on the basis of previous studies assessing reproducibility in an ECL setting. Reproducibility was considered acceptable if ≥80% of all possible pairwise comparisons were within the prespecified acceptable difference for each parameter. Additionally, all pairwise comparisons (100% or CP = 1.00) were required to be within twice the acceptable difference. To easily visualize the reproducibility data and rapidly identify outlier pairs, the results of the CP analysis were displayed graphically for continuous parameters. Statistical analyses were performed using SAS version 9.2 (SAS Institute, Cary, NC).




Results


On the initial testing, there was robust reproducibility within each ECL. Intrareader reproducibility was acceptable (CP ≥ 0.80) for the a priori defined acceptable difference for all parameters on the basis of analysis of 225 pairwise comparisons (25 for each of the nine parameters) for each reader. All readers achieved a CP of 1.00 for twice the acceptable difference for all parameters except one reader for mitral valve A peak velocity (CP = 0.960). Interreader reproducibility between two readers at laboratory B demonstrated acceptable reproducibility for all parameters except mitral E velocity (CP = 0.770) and mitral A velocity (CP = 0.770) on the basis of analysis of 900 pairwise comparisons (100 for each of nine parameters). Reproducibility performance results within each ECL (i.e., intrareader CPs for all readers at both ECLs and interreader CPs at laboratory B) are shown in Table 2.



Table 2

Coverage probabilities for echocardiographic reproducibility within each ECL

| Echocardiographic parameter | Intrareader (laboratories A and B): CP minimum acceptable difference (target ≥0.80) | Intrareader (laboratories A and B): CP 2× acceptable difference (target 1.00) | Interreader (laboratory B): CP minimum acceptable difference (target ≥0.80) | Interreader (laboratory B): CP 2× acceptable difference (target 1.00) |
|---|---|---|---|---|
| Single-plane LVEDV (mL) | 0.960–1.000 | 1.000–1.000 | 1.000 | 1.000 |
| Single-plane LVESV (mL) | 1.000–1.000 | 1.000–1.000 | 1.000 | 1.000 |
| Single-plane LVEF (calculated) (%) | 1.000–1.000 | 1.000–1.000 | 1.000 | 1.000 |
| Single-plane LVEF (visual) (%) | 1.000–1.000 | 1.000–1.000 | 1.000 | 1.000 |
| MV E velocity (cm/sec) | 0.840–0.960 | 1.000–1.000 | 0.770 | 0.990 |
| MV A velocity (cm/sec) | 0.840–0.920 | 0.960–1.000 | 0.770 | 0.990 |
| GLS 4CH (%) | 0.920–1.000 | 1.000–1.000 | 0.940 | 1.000 |
| R wave to LVOT flow onset (msec) | 0.960–1.000 | 1.000–1.000 | 0.900 | 1.000 |
| R wave to LVOT flow end (msec) | 0.960–1.000 | 1.000–1.000 | 0.970 | 1.000 |

LVEDV, LV end-diastolic volume; LVESV, LV end-systolic volume; LVOT, LV outflow tract; MV, mitral valve.

Intrareader CP reported as a range across all readers at both laboratories.



Reproducibility performance results between ECLs are shown in Table 3. Analysis of 1,800 pairwise comparisons (200 for each of nine parameters) among three readers (for any given parameter) across both ECLs demonstrated acceptable interlaboratory reproducibility for all parameters except calculated LVEF (CP = 0.675), mitral E velocity (CP = 0.715), and mitral A velocity (CP = 0.760). Measurements of GLS in the apical four-chamber view (GLS 4CH) bordered acceptable reproducibility (CP = 0.805).



Table 3

Initial assessment of echocardiographic reproducibility between ECLs

| Echocardiographic parameter | CP minimum acceptable difference (target ≥0.80) | CP 2× acceptable difference (target 1.00) |
|---|---|---|
| Single-plane LVEDV (mL) | 0.985 | 1.000 |
| Single-plane LVESV (mL) | 0.975 | 1.000 |
| Single-plane LVEF (calculated) (%) | 0.675 | 0.980 |
| Single-plane LVEF (visual) (%) | 0.900 | 1.000 |
| MV E velocity (cm/sec) | 0.715 | 0.910 |
| MV A velocity (cm/sec) | 0.760 | 0.915 |
| GLS 4CH (%) | 0.805 | 1.000 |
| R wave to LVOT flow onset (msec) | 0.810 | 0.995 |
| R wave to LVOT flow end (msec) | 0.955 | 1.000 |

LVEDV, LV end-diastolic volume; LVESV, LV end-systolic volume; LVOT, LV outflow tract; MV, mitral valve.
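For readers who want to screen the Table 3 values programmatically, the sketch below applies only the 0.80 cut point to the CPs at the minimum acceptable difference. The dictionary keys and variable names are illustrative; the CP values are copied from Table 3.

```python
# CP at the minimum acceptable difference, copied from Table 3.
TABLE_3_CP = {
    "Single-plane LVEDV": 0.985,
    "Single-plane LVESV": 0.975,
    "Single-plane LVEF (calculated)": 0.675,
    "Single-plane LVEF (visual)": 0.900,
    "MV E velocity": 0.715,
    "MV A velocity": 0.760,
    "GLS 4CH": 0.805,
    "R wave to LVOT flow onset": 0.810,
    "R wave to LVOT flow end": 0.955,
}
CP_CUT_POINT = 0.80

# Parameters whose interlaboratory CP falls below the cut point.
below_cut_point = [name for name, cp in TABLE_3_CP.items() if cp < CP_CUT_POINT]

# Among parameters that pass, the one closest to the cut point.
passing = {name: cp for name, cp in TABLE_3_CP.items() if cp >= CP_CUT_POINT}
borderline = min(passing, key=lambda name: passing[name] - CP_CUT_POINT)

print(below_cut_point)
# ['Single-plane LVEF (calculated)', 'MV E velocity', 'MV A velocity']
print(borderline)
# GLS 4CH
```

The screen flags calculated LVEF and both mitral inflow velocities, and it identifies GLS 4CH as the passing parameter closest to the cut point.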


Intra- and interreader comparisons for parameters of LV structure and systolic function are illustrated in dot plots, which indicate the direction and absolute magnitude of the measurement differences within a single reader or between pairs of readers for each echocardiogram. Within-ECL comparisons are illustrated in Figures 4 and 5. The differences for calculated LVEF measurements by a single reader at laboratory A (Figure 4A) and between two readers at laboratory B (Figure 4B) were typically within 5% and all were within 10%. The differences for GLS 4CH measurements exceeded 4% (the prespecified minimum acceptable threshold) on only two echocardiograms at each ECL (Figures 5A and 5B). Between-ECL comparisons for LV volumes were acceptably reproducible; however, end-diastolic volumes (Figure 6A) were similar to slightly higher and end-systolic volumes (Figure 6B) were higher for laboratory A compared with laboratory B. Dot-plot comparisons for LVEF and GLS 4CH measurements between ECLs are illustrated in the Online Appendix (Supplemental Figures 1 and 2, available at www.onlinejase.com).




Figure 4


LVEF reproducibility results within ECLs. Pairwise comparisons on testing of calculated LVEF. (A) Intrareader at laboratory A (CP = 1.000). (B) Interreader at laboratory B (CP = 1.000). LVEF was determined in the apical four-chamber view. Red dots indicate reader pairs exceeding the minimum acceptable difference of 10%; green dots indicate those pairs within the acceptable difference. ↑LVEF, Higher LVEF values within each ECL; ↓LVEF, lower LVEF values within each ECL.

Apr 15, 2018 | Posted by in CARDIOLOGY | Comments Off on Echocardiography Core Laboratory Reproducibility of Cardiac Safety Assessments in Cardio-Oncology

Full access? Get Clinical Tree

Get Clinical Tree app for offline access