Evidence-Based Thoracic Surgery
Jemi Olak
Paul J. Karanicolas
“Evidence-based medicine” (EBM) represents a shift in medical paradigms. Traditional approaches to clinical decision making put a relatively high value on intuition, unsystematic clinical observations, pathophysiologic rationale, and the opinions of authority figures. In contrast, EBM emphasizes evidence from clinical research and provides clinicians with tools to construct focused clinical questions, identify relevant evidence, critically appraise the literature, and apply the findings to their clinical practice.11 EBM also emphasizes the role of patient values and preferences in the decision-making process.
EBM acknowledges many sources of evidence. These include unsystematic clinical observations, physiologic studies, observational studies, randomized controlled trials, and systematic reviews that synthesize multiple studies. EBM does not disregard the importance of clinical expertise that comes only through experience but recognizes that unsystematic observations are limited by small sample size and by deficiencies in the human processes of forming inferences.13 Therefore EBM provides the means for clinicians to weigh the strength of evidence available in arriving at clinical decisions.
This chapter provides tools to assist in the practice of evidence-based thoracic surgery. While it is intended primarily for the practicing clinician, both junior researchers and seasoned investigators may find this review helpful in planning and conducting clinical research in thoracic surgery. To illustrate the process, a clinical scenario is presented. This allows discussion of how to frame a focused clinical question, the appropriate study designs, where to find the best evidence, how to critically appraise the evidence, and finally how to interpret some common statistics likely to be encountered.
Clinical Scenario
A general thoracic surgeon has a busy, diverse practice. While making rounds, his team encounters a 70-year-old man who underwent a left lower lobectomy 3 days prior for non-small-cell lung cancer and has developed new onset of atrial fibrillation. The nurse notes that this is the third time in the past week that a thoracic patient has developed atrial fibrillation following surgery and asks, “Is there a role for beta blockade to prevent these patients from developing arrhythmias?” This question is the basis for searching for the best evidence and provides an opportunity to discuss the problem.
Framing the Question
Prior to embarking on a search for new evidence regarding a clinical problem, the clinician should pose the question as clearly and explicitly as possible. For example, in the scenario outlined, one might ask, “Are there methods to prevent atrial fibrillation in patients that I am likely to operate on?” Although this question captures the spirit of the surgeon’s dilemma, it is far too broad to enable a search for relevant evidence. There are too many possible interventions to consider in searching for the answer to this question. Furthermore, what types of patients should be included for the search to be relevant? Should the outcome be limited to the new onset of atrial fibrillation or are studies that have measured other atrial tachyarrhythmias also appropriate?
Table 99-1 The “PICOT” Structure for Framing a Clinical Question
In posing a question regarding an intervention, the clinician should describe the patients, the intervention, the comparison group, and the outcome of interest. It may be useful to remember these items using the mnemonic PICO.12 All clinical questions can be framed explicitly by using this outline, which makes it easier to search for the best evidence and to determine the extent to which the studies identified are applicable to the question. Often one unstructured question can generate multiple structured questions; Table 99-1 displays one possible structured question. Alternatively, structured problems can still be phrased in the form of a question; for example: “In patients undergoing general thoracic surgery, does perioperative beta blockade compared with placebo reduce the incidence of postoperative atrial tachyarrhythmias?”
Notice that several aspects of this new question differ from the original, unstructured question. Some of the terms are more specific than in the original question (such as “patients undergoing general thoracic surgery” rather than “patients that I am likely to operate on”), while others are broader (such as “atrial tachyarrhythmias” rather than “atrial fibrillation”). Each of these terms could be even more specific (such as “elderly males undergoing left lower lobectomy”) or broader (such as “arrhythmia”). The decision about how specific to make the question is a clinical one and requires substantial experience and expertise. Questions that are too broad are unlikely to be clinically useful, since they will yield different answers depending on the nature of the participants, interventions, or outcome, while questions that are too specific are unlikely to yield any relevant studies.
Determining the question type is the final step in framing the question. There are four fundamental types of clinical questions18:
Therapy: What is the impact of an intervention on improving patient outcomes?
Harm: What is the impact of potentially harmful agents (including therapies) on patient function, morbidity, or mortality?
Diagnosis: To what extent can a test differentiate between patients with and without a target disorder?
Prognosis: What is the expected course of a patient’s disease?
Some clinicians include the type of question as the final component of the PICO framework, the so-called PICOT framework. Identifying the appropriate question type is critical because it dictates the best study design to answer that question, as discussed in the next section.
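The PICOT components described above can be captured in a small data structure. The following is only an illustrative sketch: the class and field names (`PicotQuestion`, `QuestionType`, `as_sentence`) are hypothetical and are not part of any standard EBM tool, and the sentence template applies only to therapy questions phrased like the one in this chapter.

```python
from dataclasses import dataclass
from enum import Enum


class QuestionType(Enum):
    """The four fundamental question types listed in the chapter."""
    THERAPY = "therapy"
    HARM = "harm"
    DIAGNOSIS = "diagnosis"
    PROGNOSIS = "prognosis"


@dataclass(frozen=True)
class PicotQuestion:
    """Hypothetical container for one structured clinical question."""
    patients: str
    intervention: str
    comparison: str
    outcome: str
    question_type: QuestionType

    def as_sentence(self) -> str:
        # Template assumed for illustration; it fits a therapy question
        # about reducing the incidence of an outcome, nothing more general.
        return (
            f"In {self.patients}, does {self.intervention} "
            f"compared with {self.comparison} "
            f"reduce the incidence of {self.outcome}?"
        )


# The structured question from the clinical scenario.
beta_blockade_question = PicotQuestion(
    patients="patients undergoing general thoracic surgery",
    intervention="perioperative beta blockade",
    comparison="placebo",
    outcome="postoperative atrial tachyarrhythmias",
    question_type=QuestionType.THERAPY,
)

print(beta_blockade_question.as_sentence())
print("Question type:", beta_blockade_question.question_type.value)
```

Writing each component as a separate field makes it easy to see which term is being narrowed or broadened when the question is revised.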
Study Design
The optimal study design is dependent upon the type of question being asked. For each type of question, a “hierarchy of evidence” exists.11 Figure 99-1 depicts a hierarchy of evidence for a therapeutic question, with the strongest evidence at the top of the pyramid and the weakest evidence at the bottom. Clinicians should be familiar with this pyramid, because as the level of evidence decreases, they should be less confident in the strength of the inferences. Thus for each research question, clinicians should search for evidence at the top of the pyramid and proceed down the pyramid only if they are unable to locate evidence at a higher level.
For example, a systematic review of randomized controlled trials comparing patients who received perioperative beta blockade with patients who received placebo would be ideal to answer the question posed by the clinician earlier in this chapter. If such a study exists (and has been rigorously conducted), the clinician may be confident in its conclusions. If, however, the clinician is forced to rely on biologic studies (such as a study of beta blockade in rats), the inferences are much weaker.
The most common study designs fall into two broad categories, and the distinction between them is fundamental: descriptive studies involve a single group of patients, whereas comparative studies include two or more groups of patients.
Descriptive studies usually draw data from existing databases, hospital records, or public health records to describe patterns of disease and risk factors associated with disease. Their results often help to formulate research questions for prospective studies. There are three types of descriptive studies: correlational studies, case reports or case series, and cross-sectional surveys.
Correlational studies are often the first step in the investigation of a possible link between an exposure (for example, to a carcinogen such as tobacco smoke) and the development of a disease such as lung cancer. These studies examine populations of patients; they are inexpensive and relatively easy to complete because the data are derived from existing databases, hospital records, and so on. Conclusions from correlational studies, however, do not explain cause and effect in an individual patient; inferences are limited to the population of patients studied. Another drawback of this type of study is that the investigator has no ability to control for potential confounding variables.
Case reports and case series examine individuals or groups of individuals who fit criteria predefined by the author. Their results often generate hypotheses and spawn prospective research regarding new diseases and risk factors for the development of disease. The lack of a comparison group in this study design does not allow researchers to draw strong conclusions.
Comparative studies build upon descriptive studies by incorporating two or more groups of participants, which allows researchers to conduct direct comparisons between groups. Most comparative studies share the same basic structure: a group of participants receives an intervention (or is exposed to something) while another group does not (or is not), and some participants in each group achieve an outcome while others do not (Fig. 99-2). The data from a basic study such as this may be summarized in a 2-by-2 table, which allows for the calculation of several summary statistics (see later section on statistics). The three types of comparative studies are distinguished by how the participants are identified and how they are separated into groups.
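To make the 2-by-2 table concrete, the sketch below computes a few commonly reported summary statistics from its four cells. The counts are invented purely for illustration and do not come from any study; the class name `TwoByTwo` and its method names are hypothetical. The statistics shown (risk in each group, relative risk, absolute risk reduction, number needed to treat, and odds ratio) are standard definitions, though which of them the later statistics section discusses is an assumption here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cells of a 2-by-2 table: group (rows) by outcome (columns)."""
    treated_events: int      # intervention group, outcome occurred
    treated_no_events: int   # intervention group, outcome did not occur
    control_events: int      # control group, outcome occurred
    control_no_events: int   # control group, outcome did not occur

    def risk_treated(self) -> float:
        return self.treated_events / (self.treated_events + self.treated_no_events)

    def risk_control(self) -> float:
        return self.control_events / (self.control_events + self.control_no_events)

    def relative_risk(self) -> float:
        return self.risk_treated() / self.risk_control()

    def absolute_risk_reduction(self) -> float:
        return self.risk_control() - self.risk_treated()

    def number_needed_to_treat(self) -> float:
        # Only meaningful when the absolute risk reduction is nonzero.
        return 1.0 / self.absolute_risk_reduction()

    def odds_ratio(self) -> float:
        return (self.treated_events * self.control_no_events) / (
            self.treated_no_events * self.control_events
        )


# Hypothetical counts: 100 patients per group, invented for illustration only.
table = TwoByTwo(
    treated_events=10, treated_no_events=90,
    control_events=20, control_no_events=80,
)

print(f"Risk (treated): {table.risk_treated():.2f}")
print(f"Risk (control): {table.risk_control():.2f}")
print(f"Relative risk:  {table.relative_risk():.2f}")
print(f"ARR:            {table.absolute_risk_reduction():.2f}")
print(f"NNT:            {table.number_needed_to_treat():.1f}")
print(f"Odds ratio:     {table.odds_ratio():.2f}")
```

The same four cells serve any of the comparative designs; what changes between designs is how trustworthy the resulting numbers are, not how they are calculated.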
The most rigorous type of comparative study is a randomized controlled trial (RCT). An RCT involves prospectively and randomly assigning consecutive patients who meet a set of predefined inclusion and exclusion criteria to receive either an intervention or a control and then measuring one or more important outcomes. Researchers may design RCTs to assess a new therapy (such as video-assisted lobectomy compared with open lobectomy) or a preventive maneuver (such as preoperative beta blockade to prevent atrial tachyarrhythmias). RCTs are time-consuming and expensive to conduct since all data must be collected in a prospective manner (as opposed to being drawn from chart review or an existing database). In most cases, a Data Safety Monitoring Committee must oversee the study to protect the patients from unanticipated side effects of the treatment they receive. Chapter 100 discusses the principles of clinical trials in more detail.
Cohort and case-control studies comprise the other common forms of comparative studies. These two designs are also referred to as observational studies, since patients are not usually assigned to different groups but are simply “observed.” Clinicians should consider evidence from these studies with less confidence than evidence from RCTs, since many factors (other than the intervention or exposure) could account for differences between groups.
A cohort study involves the observation of two or more groups of people over time from the date of exposure or intervention (e.g., start of smoking) to the date of occurrence of the outcome(s) of interest (e.g., development of lung cancer). The investigator decides in advance what he or she will observe or measure in the cohort of patients. Data collection may occur either retrospectively (both exposure and outcomes have occurred) or prospectively (exposure may or may not have occurred but outcome(s) have not); in either case, patients are identified based on the exposure. For example, one retrospective cohort study, using tax return data, identified an excess of lung cancer mortality among asbestos workers.9 A prospective cohort study is expensive and time-consuming to complete and often follows other studies that have generated hypotheses. Interpretation of the results of cohort studies must take into account the number of participants who were lost to follow-up. This study design allows for assessment of the temporal relationship between exposure and the outcome(s) of interest and can also examine multiple outcomes.
In contrast, a case-control study identifies a group of patients who already have an outcome (such as lung cancer) and matches them to a group that is similar in every respect except for the outcome. The investigator then looks backward in time to see what exposure(s) occurred in the group with the outcome versus the group without the outcome. This design is ideally suited for situations in which the disease being studied is rare, has a long latency, or is a new disease or condition. Case-control studies are prone to bias, since both the exposure and the disease have occurred before the study begins, and selection, reporting, or recording bias may affect either of the groups being studied. However, case-control studies are usually the least expensive and least time-consuming of the three comparative studies to complete.
It is becoming increasingly difficult to conduct prospective research owing to increases in physician workload, the expense of conducting such studies, and difficulties in recruiting sufficient numbers of patients within a reasonable time frame. This usually means that a multicenter trial design is required, which adds many layers of administrative and monitoring costs. In part because of these challenges, industry-sponsored trials are more prevalent than ever before. However, caution should be exercised in interpreting the results of these trials owing to their potential biases.
Decision analysis and economic analysis are becoming increasingly utilized in health-care assessment. In a decision analysis, investigators create a model that depicts different treatment options for a particular disease and assign probabilities of events occurring (ideally based upon good evidence). In addition, researchers may include a measure of patients’ outcomes as a result of each treatment, such as health utility (e.g., death would have a utility of zero and perfect health a utility of one). Since decision analysis typically requires investigators to incorporate several assumptions, this type of assessment represents a relatively low level of evidence and is most appropriate when a surgical problem that could not be the subject of a clinical trial is being examined.
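One simple way to picture a decision analysis is as a set of branches for each treatment option, each branch carrying a probability and a health utility, with the options compared by their probability-weighted (expected) utility. The sketch below assumes that approach; the option names, probabilities, and utilities are all invented for illustration, and real models are usually far more elaborate.

```python
def expected_utility(branches):
    """Return the probability-weighted utility for one treatment option.

    `branches` is a list of (probability, utility) pairs, where utility
    runs from 0 (death) to 1 (perfect health), as described in the text.
    """
    total_probability = sum(probability for probability, _ in branches)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("branch probabilities must sum to 1")
    return sum(probability * utility for probability, utility in branches)


# Hypothetical model: every number below is an assumption, not study data.
options = {
    "option_a": [
        (0.90, 0.80),  # uncomplicated recovery
        (0.08, 0.50),  # major complication
        (0.02, 0.00),  # death
    ],
    "option_b": [
        (0.95, 0.70),  # uncomplicated recovery
        (0.04, 0.50),  # major complication
        (0.01, 0.00),  # death
    ],
}

results = {name: expected_utility(branches) for name, branches in options.items()}
best_option = max(results, key=results.get)

for name, value in results.items():
    print(f"{name}: expected utility = {value:.3f}")
print("Preferred under these assumptions:", best_option)
```

Because the ranking depends entirely on the assumed probabilities and utilities, changing any one of them can reverse the preferred option, which is one reason this type of assessment is considered a relatively low level of evidence.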
An economic analysis assesses the costs and benefits of an intervention (such as a new procedure, drug, or test) compared with the current “gold standard.” Economic analyses may help to determine whether the intervention is “worth it.” Interested readers may consult an introductory textbook for further details on economic and decision analyses.8
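One widely used way to express whether an intervention is “worth it” is the incremental cost-effectiveness ratio: the extra cost of the new intervention divided by the extra benefit it provides over the current standard. The chapter does not name this measure, so its use here is an assumption, and the function name and all figures below are hypothetical.

```python
def incremental_cost_effectiveness_ratio(
    cost_new, effect_new, cost_standard, effect_standard
):
    """Extra cost per extra unit of benefit, new intervention vs. standard.

    Effects must be in the same unit for both arms (for example,
    quality-adjusted life years); costs must be in the same currency.
    """
    extra_effect = effect_new - effect_standard
    if extra_effect == 0:
        raise ValueError("interventions have identical effect; ratio is undefined")
    return (cost_new - cost_standard) / extra_effect


# Hypothetical figures, invented for illustration only.
icer = incremental_cost_effectiveness_ratio(
    cost_new=12_000.0, effect_new=5.5,
    cost_standard=10_000.0, effect_standard=5.0,
)
print(f"Extra cost per extra unit of benefit: {icer:.0f}")
```

The ratio is only straightforward to interpret when the new intervention is both more costly and more effective; other combinations of cost and benefit need to be considered case by case.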