Song, Jae W. M.D.; Chung, Kevin C. M.D., M.S.
Because of the innovative nature of the specialty, plastic surgeons are frequently confronted with a spectrum of clinical questions by patients who inquire about “best practices.” It is thus essential that plastic surgeons know how to critically appraise the literature to understand and practice evidence-based medicine and also contribute to the effort by carrying out high-quality investigations.1 Well-designed randomized controlled trials have held the preeminent position in the hierarchy of evidence-based medicine as level I evidence (Table 1). However, randomized controlled trial methodology, which was first developed for drug trials, can be difficult to conduct for surgical investigations.2 Instead, well-designed observational studies, recognized as level II or III evidence, can play an important role in deriving evidence for plastic surgery. Results from observational studies are often criticized for being vulnerable to the influence of unpredictable confounding factors. However, recent work has challenged this notion, showing comparable results between observational studies and randomized controlled trials.3,4 Observational studies can also complement randomized controlled trials in hypothesis generation, establishing questions for future randomized controlled trials, and defining clinical conditions.
Observational studies fall under the category of analytic study designs, which are subclassified as observational or experimental (Fig. 1). The goal of analytic studies is to identify and evaluate causes or risk factors of diseases or health-related events. The differentiating characteristic between observational and experimental study designs is that in the latter, the presence or absence of an intervention defines the groups. By contrast, in an observational study, the investigator does not intervene but simply “observes” and assesses the strength of the relationship between an exposure and disease variable.5 The three types of observational studies are cohort studies, case-control studies, and cross-sectional studies (Fig. 1). Case-control and cohort studies offer specific advantages because they measure disease occurrence and its association with an exposure along a temporal dimension (i.e., prospective or retrospective study design). Cross-sectional studies, also known as prevalence studies, examine the data on disease and exposure at one particular time point (Fig. 2).5 Because the temporal relationship between disease occurrence and exposure cannot be established, cross-sectional studies cannot assess the cause-and-effect relationship. In this review, we will primarily discuss cohort and case-control study designs and related methodologic issues.
The term “cohort” is derived from the Latin word cohors. Roman legions were composed of 10 cohorts. During battle, each cohort, or military unit, consisting of a specific number of warriors and commanding centurions, was traceable. The word cohort has since been adopted into epidemiology to define a set of people followed over a period of time. W. H. Frost, an epidemiologist from the early 1900s, was the first to use the word cohort in his 1935 publication assessing age-specific mortality rates and tuberculosis.6 The modern epidemiologic definition of the word now means a “group of people with defined characteristics who are followed up to determine incidence of, or mortality from, some specific disease, all causes of death, or some other outcome.”6
A well-designed cohort study can provide powerful results. In a cohort study, an outcome-free or disease-free study population is first identified by the exposure or event of interest and followed in time until the disease or outcome of interest occurs (Fig. 3, above). Because exposure is identified before the outcome, cohort studies have a temporal framework to assess causality and thus have the potential to provide the strongest scientific evidence.7 Advantages and disadvantages of a cohort study are listed in Table 2.8,9 Cohort studies are particularly advantageous for examining rare exposures because subjects are selected by their exposure status. In addition, the investigator can examine multiple outcomes simultaneously. Disadvantages include the need for a large sample size and the potentially long follow-up duration of the study design, resulting in a costly endeavor.
Cohort studies can be prospective or retrospective (Fig. 2). Prospective studies are carried out from the present time into the future. Because prospective studies are designed with specific data collection methods, they have the advantage of being tailored to collect specific exposure data and may be more complete. The disadvantage of a prospective cohort study may be the long follow-up period while waiting for events or diseases to occur. Thus, this study design is inefficient for investigating diseases with long latency periods and is vulnerable to a high rate of loss to follow-up. Although prospective cohort studies are invaluable as exemplified by the landmark Framingham Heart Study, started in 1948 and still ongoing,10 in the plastic surgery literature, this study design is generally seen to be inefficient and impractical. Instead, retrospective cohort studies are better indicated given the timeliness and inexpensive nature of the study design.
Retrospective cohort studies, also known as historical cohort studies, are carried out at the present time and look to the past to examine medical events or outcomes. In other words, a cohort of subjects selected based on exposure status is chosen at the present time, and outcome data (i.e., disease status, event status), which were measured in the past, are reconstructed for analysis. The primary disadvantage of this study design is the limited control the investigator has over data collection. The existing data may be incomplete, inaccurate, or inconsistently measured between subjects.8 However, because of the immediate availability of the data, this study design is comparatively less costly and shorter than prospective cohort studies. For example, Spear and colleagues examined the effect of obesity on complication rates after pedicled transverse rectus abdominis musculocutaneous (TRAM) flap reconstruction by retrospectively reviewing 224 pedicled TRAM flaps in 200 patients over a 10-year period.11 In this example, subjects who underwent pedicled TRAM flap reconstruction were selected and categorized into cohorts by their exposure status: normal/underweight, overweight, or obese. The outcomes of interest were various flap and donor-site complications. The findings revealed that obese patients had a significantly higher incidence of donor-site complications, multiple flap complications, and partial flap necrosis than normal or overweight patients. An advantage of the retrospective study design is the immediate access to the data. A disadvantage is the limited control over the data collection because data were gathered retrospectively over 10 years; for example, a limitation reported by the authors is that mastectomy flap necrosis was not uniformly recorded for all subjects.11
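The comparison of outcome incidence between exposure cohorts described above can be sketched numerically. The following minimal Python example computes the cumulative incidence (risk) in each exposure group and their ratio, a commonly used summary measure for cohort data. The class name, function name, and all counts are hypothetical and chosen only for illustration; they are not taken from the study by Spear and colleagues or from any other published data.

```python
from dataclasses import dataclass


@dataclass
class Cohort:
    """Counts for one exposure group in a cohort study."""
    name: str
    events: int  # subjects who developed the outcome during follow-up
    total: int   # all subjects in this exposure group

    @property
    def risk(self) -> float:
        """Cumulative incidence (risk) of the outcome in this group."""
        if self.total <= 0:
            raise ValueError("total must be positive")
        return self.events / self.total


def risk_ratio(exposed: Cohort, unexposed: Cohort) -> float:
    """Risk in the exposed group divided by risk in the unexposed group."""
    if unexposed.events == 0:
        raise ValueError("risk ratio is undefined when the unexposed risk is zero")
    return exposed.risk / unexposed.risk


# Hypothetical counts for illustration only (not from any published study).
obese = Cohort("obese", events=12, total=40)
normal = Cohort("normal weight", events=6, total=60)

print(f"risk ({obese.name}) = {obese.risk:.2f}")
print(f"risk ({normal.name}) = {normal.risk:.2f}")
print(f"risk ratio = {risk_ratio(obese, normal):.2f}")
```

A risk ratio above 1 indicates that the outcome was more frequent in the exposed cohort. This sketch ignores confounding and differences in follow-up time, both of which a real analysis would need to address.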
An important distinction lies between cohort studies and case series. The distinguishing feature between these two types of studies is the presence of a control, or unexposed, group. Contrasting with epidemiologic cohort studies, case series are descriptive studies following one small group of subjects. In essence, they are extensions of case reports. Usually, the cases are obtained from the authors' experiences, generally involve a small number of patients, and more importantly, lack a control group.12 There is often confusion in designating studies as “cohort studies” when only one group of subjects is examined. Yet, unless a second comparative group serving as a control is present, these studies are defined as case series. The next step in strengthening an observation from a case series is selecting appropriate control groups to conduct a cohort or case-control study, the latter of which is discussed in the Case-Control Studies section later in the article.9
Selection of Subjects in Cohort Studies
The hallmark of a cohort study is defining the selected group of subjects by exposure status at the start of the investigation. A critical characteristic of subject selection is to have both the exposed and unexposed groups be selected from the same source population (Fig. 4).9 Subjects who are not at risk for developing the outcome should be excluded from the study. The source population is determined by practical considerations, such as sampling. Subjects may be effectively sampled from the hospital, be members of a community, or be from a doctor's individual practice. A subset of these subjects will be eligible for the study.
Attrition Bias (Loss to Follow-Up)
Because prospective cohort studies may require long follow-up periods, it is important to minimize loss to follow-up. Loss to follow-up is a situation in which the investigator loses contact with the subject, resulting in missing data. If too many subjects are lost to follow-up, the internal validity of the study is reduced. A general rule of thumb requires that the loss to follow-up rate not exceed 20 percent of the sample.5 Any systematic differences in outcome or in exposure to risk factors between those who drop out and those who stay in the study must be examined, if possible, by comparing individuals who remain in the study with those who were lost to follow-up or dropped out. It is therefore important to select subjects who can be followed for the entire duration of the cohort study. Methods of minimizing loss to follow-up are listed in Table 3.
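The 20 percent rule of thumb can be checked with simple arithmetic. The sketch below is one possible way to do so in Python; the function names and the enrollment figures are hypothetical, and the default threshold simply encodes the rule of thumb quoted above.

```python
def loss_to_follow_up_rate(enrolled: int, completed: int) -> float:
    """Fraction of enrolled subjects who did not complete follow-up."""
    if enrolled <= 0:
        raise ValueError("enrolled must be positive")
    if not 0 <= completed <= enrolled:
        raise ValueError("completed must be between 0 and enrolled")
    return (enrolled - completed) / enrolled


def exceeds_rule_of_thumb(enrolled: int, completed: int,
                          threshold: float = 0.20) -> bool:
    """True when attrition is above the threshold (20 percent by default)."""
    return loss_to_follow_up_rate(enrolled, completed) > threshold


# Hypothetical cohorts for illustration only.
for enrolled, completed in [(250, 210), (250, 190)]:
    rate = loss_to_follow_up_rate(enrolled, completed)
    flag = "exceeds" if exceeds_rule_of_thumb(enrolled, completed) else "within"
    print(f"{enrolled} enrolled, {completed} completed: "
          f"{rate:.0%} lost ({flag} the 20 percent rule of thumb)")
```

Staying under the threshold does not by itself rule out attrition bias; the comparison of subjects who dropped out with those who remained is still needed.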
Case-control studies were historically born out of interest in the cause of disease. The conceptual basis of the case-control study is similar to taking a history and physical examination; the patient with disease is questioned and examined, and elements from this history taking are knitted together to reveal characteristics or factors that predisposed the patient to the disease. In fact, the practice of interviewing patients about behaviors and conditions preceding illness dates back to the Hippocratic writings of the fourth century BC.6
Reasons of practicality and feasibility inherent in the study design typically dictate whether a cohort study or case-control study is appropriate. The case-control study design was first recognized in Janet Lane-Claypon's study of breast cancer in 1926, which found that low fertility rates raise the risk of breast cancer.13,14 In the ensuing decades, case-control study methodology crystallized with the landmark publication linking smoking and lung cancer in the 1950s.15 Since that time, retrospective case-control studies have become more prominent in the biomedical literature with more rigorous methodologic advances in design, execution, and analysis.
Case-control studies identify subjects by outcome status at the outset of the investigation. Outcomes of interest may be whether the subject has undergone a specific type of surgery, experienced a complication, or been diagnosed with a disease (Fig. 3, below). Once outcome status is identified and subjects are categorized as cases, controls (subjects without the outcome but from the same source population) are selected. Data regarding exposure to a risk factor or several risk factors are then collected retrospectively, typically by interview, abstraction from records, or survey. Case-control studies are well suited to investigate rare outcomes or outcomes with a long latency period because subjects are selected from the outset by their outcome status. Thus, in comparison with cohort studies, case-control studies are quick and relatively inexpensive to implement, require comparatively fewer subjects, and allow for multiple exposures or risk factors to be assessed for one outcome (Table 4).8,9
An example of a case-control investigation is the study by Zhang and colleagues, who examined environmental and genetic factors associated with congenital microtia,16 a rare condition with an estimated prevalence of 0.83 to 17.4 in 10,000.17 They selected 121 congenital microtia cases based on clinical phenotype, and 152 unaffected controls, matched by age and sex in the same hospital and same period. Controls were of Han Chinese origin from Jiangsu, China, the same area from which the cases were selected. This allowed both the controls and cases to have the same genetic background, which is important to note given the investigated association between genetic factors and congenital microtia. To examine environmental factors, a questionnaire was administered to the mothers of both cases and controls. The authors concluded that adverse maternal health was among the main risk factors for congenital microtia, specifically, maternal disease during pregnancy (odds ratio, 5.89; 95 percent confidence interval, 2.36 to 14.72), maternal toxicity exposure during pregnancy (odds ratio, 4.76; 95 percent confidence interval, 1.66 to 13.68), and resident area, such as living near industries associated with air pollution (odds ratio, 7.00; 95 percent confidence interval, 2.09 to 23.47).16 A case-control study design is most efficient for this investigation, given the rarity of the disease outcome. Because congenital microtia is thought to have multifactorial causes, an additional advantage of the case-control study design in this example is the ability to examine multiple exposures and risk factors.
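Odds ratios and confidence intervals such as those quoted above are derived from a two-by-two table of exposure by case status. The following Python sketch shows one common textbook approach, the crude odds ratio with a Woolf (log-based) 95 percent confidence interval. It is not the analysis used by Zhang and colleagues, whose matched design and adjusted estimates would call for other methods; the function name and the counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(exposed_cases: int, unexposed_cases: int,
                  exposed_controls: int, unexposed_controls: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    z = 1.96 gives an approximate 95 percent interval.
    """
    cells = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for this method")
    a, b, c, d = cells
    estimate = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only (not from any published study).
estimate, lower, upper = odds_ratio_ci(
    exposed_cases=30, unexposed_cases=70,
    exposed_controls=15, unexposed_controls=85,
)
print(f"odds ratio {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

An interval that excludes 1 suggests an association between exposure and outcome at the conventional significance level, and the point estimate always lies inside the interval produced by this method.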
Selection of Cases
Sampling in a case-control study design begins with selecting the cases. In a case-control study, it is imperative that the investigator has explicitly defined inclusion and exclusion criteria before the selection of cases. For example, if the outcome is having a disease, specific diagnostic criteria, disease subtype, stage of disease, or degree of severity should be defined. Such criteria ensure that all the cases are homogeneous. Second, cases may be selected from a variety of sources, including hospital patients, clinic patients, or community subjects. Many communities maintain registries of patients with certain diseases, which can serve as a valuable source of cases. However, despite the methodologic convenience of this method, validity issues may arise. For example, if cases are selected from one hospital, identified risk factors may be unique to that single hospital. This methodologic choice may weaken the generalizability of the study findings. Another example is choosing cases from the hospital versus the community; most likely, cases from the hospital sample will represent a more severe form of the disease than those in the community.8 Finally, it is also important to select cases that are representative of cases in the target population to strengthen the study's external validity (Fig. 4). Potential reasons why cases from the original target population eventually filter through and are available as cases (study participants) for a case-control study are illustrated in Figure 5.
Selection of Controls
Selecting the appropriate group of controls can be one of the most demanding aspects of a case-control study. An important principle is that the distribution of exposure should be the same among cases and controls; in other words, both cases and controls should stem from the same source population. The investigator may also consider the control group to be an at-risk population, with the potential to develop the outcome. Because the validity of the study depends on the comparability of these two groups, cases and controls should otherwise meet the same inclusion criteria in the study.
A case-control study design that exemplifies this methodologic feature is the study by Chung and colleagues, who examined maternal cigarette smoking during pregnancy and the risk of newborns developing cleft lip–cleft palate.18 A salient feature of this study is the use of the 1996 U.S. Natality database, a population database from which both cases and controls were selected. This database provides a large sample size to assess newborn development of cleft lip–cleft palate (outcome), which has a reported incidence of one in 1000 live births,19 and also enabled the investigators to choose controls (i.e., healthy newborns) that were generalizable to the general population to strengthen the study's external validity. A significant relationship with maternal cigarette smoking and cleft lip–cleft palate in the newborn was reported in this study (adjusted odds ratio, 1.34; 95 percent confidence interval, 1.36 to 1.76).18
Matching is a method used in an attempt to ensure comparability between cases and controls; it reduces variability and systematic differences attributable to background variables that are not of interest to the investigator.7 Each case is typically paired individually with a control subject with respect to the background variables. The exposure to the risk factor of interest is then compared between the cases and the controls. This matching strategy is called individual matching. Age, sex, and race are often used to match cases and controls because they are typically strong confounders of disease.20 Confounders are variables associated with the risk factor and may potentially be a cause of the outcome.7 Table 5 lists several advantages and disadvantages of a matching design.
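Individual matching can be illustrated with a short Python sketch. The example below pairs each case with one unused control of the same sex whose age is within a chosen tolerance, using a simple greedy rule. This is only one of several possible matching procedures; the class, the function, the two-year age tolerance, and the subjects are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Subject:
    subject_id: str
    age: int
    sex: str


def match_controls(cases: List[Subject], controls: List[Subject],
                   max_age_gap: int = 2) -> Dict[str, Optional[str]]:
    """Greedy 1:1 matching on sex (exact) and age (within max_age_gap years).

    Returns a mapping from case ID to the matched control ID, or None when
    no eligible control remains. Each control is used at most once.
    """
    available = list(controls)
    pairs: Dict[str, Optional[str]] = {}
    for case in cases:
        candidates = [
            c for c in available
            if c.sex == case.sex and abs(c.age - case.age) <= max_age_gap
        ]
        if not candidates:
            pairs[case.subject_id] = None
            continue
        # Pick the eligible control closest in age to the case.
        best = min(candidates, key=lambda c: abs(c.age - case.age))
        pairs[case.subject_id] = best.subject_id
        available.remove(best)
    return pairs


# Hypothetical subjects for illustration only.
cases = [Subject("A1", 30, "F"), Subject("A2", 45, "M"), Subject("A3", 60, "F")]
controls = [Subject("C1", 31, "F"), Subject("C2", 44, "M"),
            Subject("C3", 29, "F"), Subject("C4", 70, "M")]

pairs = match_controls(cases, controls)
print(pairs)
```

In this example the third case has no eligible control, which illustrates a practical cost of matching: cases without a suitable match may have to be dropped or the matching criteria relaxed.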
Investigations examining rare outcomes may have a limited number of cases from which to select, whereas the source population from which controls can be selected is much larger. In such scenarios, the study may be able to provide more information if multiple controls per case are selected. This method increases the “statistical power” of the investigation by increasing the sample size. The precision of the findings may improve by having up to approximately three or four controls per case.21–23
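The diminishing return from adding controls can be made concrete. Under a standard large-sample approximation associated with the work of Ury (reference 21), the efficiency of using k controls per case, relative to an unlimited supply of controls, is approximately k/(k + 1). The Python sketch below tabulates this ratio; the function name is hypothetical, and the formula is an approximation rather than a guarantee for any particular study.

```python
from fractions import Fraction


def relative_efficiency(controls_per_case: int) -> Fraction:
    """Approximate efficiency of k controls per case, relative to
    unlimited controls, using the k / (k + 1) approximation."""
    if controls_per_case < 1:
        raise ValueError("at least one control per case is required")
    return Fraction(controls_per_case, controls_per_case + 1)


previous = Fraction(0)
for k in range(1, 7):
    efficiency = relative_efficiency(k)
    gain = efficiency - previous
    print(f"{k} control(s) per case: efficiency {float(efficiency):.3f} "
          f"(gain {float(gain):.3f})")
    previous = efficiency
```

Under this approximation, moving from one to four controls per case raises the relative efficiency from 1/2 to 4/5, whereas a fifth control adds only 1/30, which is consistent with the practical limit of roughly three or four controls per case.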
Bias in Case-Control Studies
Evaluating exposure status can be the Achilles heel of case-control studies. Because information about exposure is typically collected by self-report, interview, or from recorded information, it is susceptible to recall bias, to interviewer bias, or to the incompleteness or inaccuracy of recorded information, respectively. These biases decrease the internal validity of the investigation and should be carefully addressed and reduced in the study design. Recall bias occurs when cases and controls differ in how they remember and report past exposures. The common scenario is when a subject with disease (case) unconsciously recalls and reports an exposure with better clarity because of the disease experience. Interviewer bias occurs when the interviewer asks leading questions or has an inconsistent interview approach between cases and controls. A good study design will implement a standardized interview in a nonjudgmental atmosphere with well-trained interviewers to reduce interviewer bias.9
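The effect of recall bias on the odds ratio can be shown with a small calculation. The Python sketch below assumes a simplified model in which truly exposed subjects report their exposure with some probability and unexposed subjects never report a false exposure; it then computes the odds ratio expected from the reported data. The function name, the model, and every probability used are hypothetical assumptions for illustration.

```python
def expected_reported_odds_ratio(true_exposure_cases: float,
                                 true_exposure_controls: float,
                                 recall_cases: float,
                                 recall_controls: float) -> float:
    """Odds ratio expected from reported exposure under a simple model.

    Exposed subjects report exposure with the given recall probability;
    unexposed subjects are assumed never to report exposure.
    """
    for p in (true_exposure_cases, true_exposure_controls,
              recall_cases, recall_controls):
        if not 0 < p <= 1:
            raise ValueError("probabilities must be in (0, 1]")
    reported_cases = true_exposure_cases * recall_cases
    reported_controls = true_exposure_controls * recall_controls
    if reported_cases >= 1 or reported_controls >= 1:
        raise ValueError("reported exposure must be below 1 to form odds")
    odds_cases = reported_cases / (1 - reported_cases)
    odds_controls = reported_controls / (1 - reported_controls)
    return odds_cases / odds_controls


# Hypothetical scenario: exposure is equally common (30 percent) in cases
# and controls, so the true odds ratio is 1.
equal_recall = expected_reported_odds_ratio(0.30, 0.30, 0.90, 0.90)
differential_recall = expected_reported_odds_ratio(0.30, 0.30, 0.90, 0.60)

print(f"equal recall:        odds ratio {equal_recall:.2f}")
print(f"differential recall: odds ratio {differential_recall:.2f}")
```

When both groups recall equally well, the reported odds ratio stays at 1, but when cases recall their exposure more completely than controls, a spurious association appears even though none exists.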
THE STRENGTHENING THE REPORTING OF OBSERVATIONAL STUDIES IN EPIDEMIOLOGY STATEMENT
In 2004, the first meeting of the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) group took place in Bristol, United Kingdom.24 The aim of the group was to establish guidelines on reporting observational research to improve the transparency of the methods, thereby facilitating the critical appraisal of a study's findings. A well-designed but poorly reported study is disadvantaged in contributing to the literature because the results and generalizability of the findings may be difficult to assess. Thus, a 22-item checklist was generated to enhance the reporting of observational studies across disciplines.25,26 This checklist is also located at the following Web site: http://www.strobe-statement.org. This statement is applicable to cohort studies, case-control studies, and cross-sectional studies. In fact, 18 of the checklist items are common to all three types of observational studies, and four items are specific to each of the three specific study designs. In an effort to provide specific guidance to go along with this checklist, an “explanation and elaboration” article was published for users to better appreciate each item on the checklist.27 Plastic surgery investigators should peruse this checklist before designing their study and when they are writing up the report for publication. In fact, some journals now require authors to follow the STROBE Statement. A list of participating journals can be found on this Web site: http://www.strobe-statement.org./index.php?id=strobe-endorsement.
Because of the limitations in carrying out randomized controlled trials in surgical investigations, observational studies are becoming more popular for investigating the relationship between exposures, such as risk factors or surgical interventions, and outcomes, such as disease states or complications. Recognizing that well-designed observational studies can provide valid results is important in the plastic surgery community, so that investigators can both critically appraise and appropriately design observational studies to address important clinical research questions. The investigator planning an observational study can certainly use the STROBE statement as a tool to outline key features of a study and come back to it again at the end to enhance transparency in methodology reporting.
This work was supported in part by a Midcareer Investigator Award in Patient-Oriented Research (K24 AR053120) from the National Institute of Arthritis and Musculoskeletal and Skin Diseases (to K.C.C.).
1. Chung KC, Swanson JA, Schmitz D, Sullivan D, Rohrich RJ. Introducing evidence-based medicine to plastic and reconstructive surgery. Plast Reconstr Surg. 2009;123:1385–1389.
2. Hulley SB, Cummings SR, Browner WS, et al. Designing Clinical Research: An Epidemiologic Approach. 2nd ed. Philadelphia: Lippincott Williams & Wilkins; 2001:1–336.
3. Chung KC, Burns PB. A guide to planning and executing a surgical randomized controlled trial. J Hand Surg Am. 2008;33:407–412.
4. Benson K, Hartz AJ. A comparison of observational studies and randomized, controlled trials. N Engl J Med. 2000;342:1878–1886.
5. Concato J, Shah N, Horwitz RI. Randomized, controlled trials, observational studies, and the hierarchy of research designs. N Engl J Med. 2000;342:1887–1892.
6. Merrill RM, Timmreck TC. Introduction to Epidemiology. 4th ed. Sudbury, Mass: Jones and Bartlett Publishers; 2006:1–352.
7. Morabia A. A History of Epidemiologic Methods and Concepts. Basel: Birkhaeuser Verlag; 2004:1–405.
8. Everitt BS, Palmer CR. Encyclopaedic Companion to Medical Statistics. London: Hodder Arnold; 2005.
9. Elwood M. Critical Appraisal of Epidemiological Studies and Clinical Trials. 3rd ed. Oxford: Oxford University Press; 2007:1–570.
11. Spear SL, Ducic I, Cuoco F, Taylor N. Effect of obesity on flap and donor-site complications in pedicled TRAM flap breast reconstruction. Plast Reconstr Surg. 2007;119:788–795.
12. Jenicek M. Foundations of Evidence-Based Medicine. Boca Raton, Fla: Parthenon; 2003:1–542.
13. Lane-Claypon JE. A Further Report on Cancer of the Breast, with Special Reference to its Associated Antecedent Conditions. London: Her Majesty's Stationery Office; 1926.
14. Cole P. The evolving case-control study. J Chronic Dis. 1979;32:15–27.
15. Doll R, Hill AB. Smoking and carcinoma of the lung; preliminary report. BMJ. 1950;2:739–748.
16. Zhang QG, Zhang J, Yu P, Chen H. Environmental and genetic factors associated with congenital microtia: A case-control study in Jiangsu, China, 2004 to 2007. Plast Reconstr Surg. 2009;124:1157–1164.
17. Suutarla S, Rautio J, Ritvanen A, Ala-Mello S, Jero J, Klockars T. Microtia in Finland: Comparison of characteristics in different populations. Int J Pediatr Otorhinolaryngol. 2007;71:1211–1217.
18. Chung KC, Kowalski CP, Kim HM, Buchman SR. Maternal cigarette smoking during pregnancy and the risk of having a child with cleft lip/palate. Plast Reconstr Surg. 2000;105:485–491.
19. Das SK, Runnels RS Jr, Smith JC, Cohly HH. Epidemiology of cleft lip and cleft palate in Mississippi. South Med J. 1995;88:437–442.
20. Wacholder S, McLaughlin JK, Silverman DT, Mandel JS. Selection of controls in case-control studies: I. Principles. Am J Epidemiol. 1992;135:1019–1028.
21. Ury HK. Efficiency of case-control studies with multiple controls per case: Continuous or dichotomous data. Biometrics. 1975;31:643–649.
22. Bruce N, Pope D, Stanistreet D. Quantitative Methods for Health Research. West Sussex, England: Wiley; 2008:1–529.
23. Woodward M. Epidemiology: Study Design and Data Analysis. 2nd ed. London: Chapman & Hall/CRC; 2005:1–849.
24. Vandenbroucke JP. The making of STROBE. Epidemiology. 2007;18:797–799.
25. von Elm E, Altman DG, Egger M, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. J Clin Epidemiol. 2008;61:344–349.
26. von Elm E, Altman DG, Egger M, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. Lancet. 2007;370:1453–1457.
27. Vandenbroucke JP, von Elm E, Altman DG, et al. Strengthening the Reporting of Observational Studies in Epidemiology (STROBE): Explanation and elaboration. PLoS Med. 2007;4:e297.