The nested case-control design is well established as an epidemiologic study design. Nonetheless, a number of articles and letters have appeared recently asserting that the nested case-control study design is susceptible to a form of study design bias.1–5 Given the theoretical understanding of the validity of the standard nested case-control design, in which case-control sets consist of the case and a simple random sample of controls from the risk set6–9 (and which we will henceforth refer to as the “simple random sampling case-control” [SRS] design), such contentions might be dismissed as untenable. However, the repeated assertion of bias in the SRS design has raised concerns in the occupational health community that “tried and true” case-control methods may be flawed.10 Noting that methods to empirically evaluate potential study design biases within the context of existing cohort data have not been described, Deubner et al4 proposed an approach, which they then applied to a cohort of beryllium workers to illustrate problems that they contend arise in SRS studies.
We have 3 goals in this paper. First, we give a heuristic description of the nested case-control methodology and provide some intuition about why the approach is valid for assessing exposure–disease associations. Second, we examine the evidence presented by Deubner et al that the SRS design can be biased when evaluating lagged exposure variables. In particular, we review their method of empirical investigation and show why it is not a valid way to evaluate bias questions in SRS designs. Third, we provide a valid approach to empirical evaluation of nested case-control study designs and analysis methods. This approach can be used to evaluate study design validity, investigate the behavior of estimates under model misspecification, and provide sample size and power calculations.
As pointed out by Deubner et al, an important advantage of an empirical analysis is that the results are specific to the cohort that serves as the basis for the case-control study.4 Thus, although a flawed study design can in general result in biased estimation, with an empirical evaluation, one can explore the magnitude of such potential bias in the particular study in question. Throughout, we will illustrate the concepts and methods using the Colorado Plateau uranium miners cohort.11,12
NESTED CASE-CONTROL STUDIES: A BRIEF OVERVIEW
The Colorado Plateau uranium miners cohort formed the basis of a study of radon exposure and lung cancer mortality and has been used extensively both to characterize the radon–lung cancer association and as a methodologic example.13–15 Although radon exposure has been estimated for all miners in this cohort, we consider a hypothetical situation in which the only radon exposure–related information available on cohort members is the dates of start and end of mining.
Risk Set Representation of Cohort Data
Figure 1A depicts basic information for miners in the underlying cohort. Each horizontal line represents the ages during which the given miner was under observation or at risk for lung cancer death. In other words, at each age on the line, the miner meets the eligibility requirements for cohort membership (eg, uranium miner in the 4-state mining area during 1950–1960 enrolled by a United States Public Health Services researcher) and lung cancer death status is known. Thus, the line starts at the age the miner was enrolled and ends at the age that the worker is known to have died or at age of last contact if alive. Death due to lung cancer (as noted on the death certificate) is the outcome of interest. Now, with the goal of assessing the impact of radon exposure on risk of lung cancer, the question becomes, “what are reasonable comparisons to make from data of this type?” The nested case-control approach is based on risk sets defined by the ages of death due to lung cancer and is illustrated in Figure 1A. The risk set at a given lung cancer death age consists of the case (the miner who died of lung cancer at the risk set age) and controls (miners who were alive and being observed in the study at the risk set age). Comparison of the lung cancer case exposure to that of the controls in the risk set provides a reasonable and intuitive basis for estimation of radon exposure–lung cancer risk relationships. The risk set defines a population of miners of the same age from which any risk set member could have been the lung cancer case. Higher exposure in the cases than the controls in their respective (age-defined) risk sets is evidence of a positive exposure–lung cancer association.
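The risk set construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Miner` record, its field names, and the toy cohort are hypothetical, and we adopt the common convention that a subject is at risk at age t when entry age < t ≤ exit age.

```python
from dataclasses import dataclass


@dataclass
class Miner:
    """Hypothetical cohort record; field names are illustrative only."""
    id: int
    entry_age: float          # age at enrollment
    exit_age: float           # age at death or last contact
    lung_cancer_death: bool   # True if follow-up ended in lung cancer death


def risk_sets(cohort):
    """Return one (risk_set_age, case_id, control_ids) tuple per lung cancer death.

    Controls are all other miners under observation at the case's age at death,
    using the assumed convention entry_age < t <= exit_age.
    """
    sets = []
    for case in cohort:
        if not case.lung_cancer_death:
            continue
        t = case.exit_age
        controls = [m.id for m in cohort
                    if m.id != case.id and m.entry_age < t <= m.exit_age]
        sets.append((t, case.id, controls))
    return sorted(sets)


# Toy cohort (invented values, not data from the miners study).
toy_cohort = [
    Miner(1, 30.0, 48.0, True),
    Miner(2, 35.0, 60.0, False),
    Miner(3, 40.0, 45.0, False),
    Miner(4, 50.0, 70.0, True),
    Miner(5, 20.0, 70.0, False),
]

for age, case_id, control_ids in risk_sets(toy_cohort):
    print(f"risk set at age {age}: case {case_id}, controls {control_ids}")
```

Note that a miner who later becomes a case remains eligible as a control in earlier risk sets, which is what allows any risk set member to have been the case.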
Exposure Summary Construction
Although the general idea of comparing exposure in risk set cases and controls is eminently reasonable, we have not described how radon exposure should be quantified for such a comparison. Miners in the cohort are exposed to varying levels of radon depending on the time and mine worked. Exposure is inferred based on mine measurements made by federal and state agencies. An analytic challenge is how to summarize these exposure histories into meaningful exposure summaries that capture aspects of the history that are relevant to lung cancer risk. The risk set approach again provides some intuition. With the risk set defined by age at lung cancer death, it is natural to compare exposures that are experienced only up to that age, that is, to compare case-control exposure summaries based on the exposure history up to the risk set age. This simply implements the principle of temporality, that only exposure experienced prior to disease occurrence can be involved in causing the disease.16–18
Of course, further restrictions on exposure summaries may be dictated by the specific study situation. For instance, radon exposure affects lung cancer incidence but in the miners study only information on lung cancer death is available. Thus, we can restrict exposure summaries to be functions of radon histories up to 2 years prior to the risk set age to (approximately) account for the time from lung cancer diagnosis to death.15 Once reasonable restrictions on the exposure history are established, risk set case-control comparisons of any radon exposure summary provide a valid basis for assessing the association between that radon exposure summary and lung cancer risk.
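As a minimal sketch of such a restricted exposure summary, the function below accumulates exposure only up to a cutoff placed a fixed lag before the risk set age. The representation of an exposure history as (start age, end age, exposure rate per year) segments is our assumption for illustration, as are the function name and the toy values.

```python
def lagged_cumulative_exposure(history, risk_set_age, lag=2.0):
    """Cumulative exposure accrued before (risk_set_age - lag).

    history: iterable of (start_age, end_age, rate_per_year) segments
             (a hypothetical representation of an exposure history).
    lag:     years before the risk set age that are ignored; 2 years
             mirrors the diagnosis-to-death adjustment in the text.
    """
    cutoff = risk_set_age - lag
    total = 0.0
    for start, end, rate in history:
        # Only the part of the segment that falls before the cutoff counts.
        years_counted = min(end, cutoff) - start
        if years_counted > 0:
            total += rate * years_counted
    return total


# Invented history: 10 units/year from age 30 to 40, then 5 units/year to age 50.
toy_history = [(30.0, 40.0, 10.0), (40.0, 50.0, 5.0)]
print(lagged_cumulative_exposure(toy_history, risk_set_age=48.0, lag=2.0))
print(lagged_cumulative_exposure(toy_history, risk_set_age=48.0, lag=20.0))
```

Because the same cutoff rule is applied to the case and to every control in the risk set, the comparison remains a comparison of like with like regardless of the lag chosen.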
The appeal of the risk set approach to the miners cohort data is that case-control comparisons are between miners of the same age. This natural idea of comparing “like with like” can be extended to other factors as well, by restricting the risk set controls to those similar to the case with respect to these factors. For instance, although miners in the cohort risk sets are of the same age, they attained this age at different times. In the risk set at age 48 (Fig. 1A), the case was 48 in 1955, whereas some of the controls were 48 during the 1940s and others were 48 during the 1970s. To make the controls used in the comparison more like the case in terms of date at the risk set age, controls could be matched to the case on, say, year of birth. This is illustrated in Figure 1B, in which the controls for comparison to the case are restricted to those who were in the same 5-year year-of-birth–matched risk set, that is, those born in the same 5-year interval as the case. Intuitively, conclusions drawn from the radon exposure comparisons between cases and the restricted control sets will be more believable than those based on the full risk set controls because the controls are “more like” the case in ways related to time trends (say, smoking behavior) that might obscure (confound) a crude analysis of the relationship between radon exposure and lung cancer. Control for confounding by measured covariates may, of course, be achieved by means other than matching.19 However, matching in a case-control analysis ensures efficiency in the control for matched covariates by achieving balance in the distribution of controls across strata of the matching factor(s). Just as with exposure summary construction, temporality considerations lead to the natural constraint that matching factors should depend on history up to the risk set age.
Simple measures, such as the average case-control differences over risk sets, can give a sense of whether exposure is associated with lung cancer. However, differences in disease occurrence are generally quantified by the rate ratio, the relative change in lung cancer rates per unit of increase in exposure. Inference about the rate ratio from risk set organized cohort data is done using Cox regression, in which the case-control comparison from each risk set is quantified in a conditional logistic likelihood contribution. The partial likelihood analysis of cohort data is based on the product of the conditional logistic contributions from each case-control set. Cox regression has been studied extensively and the validity of the method is well accepted.20,21
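For readers who prefer a formula, the product of conditional logistic contributions can be written in standard Cox-model notation as follows. The symbols are introduced here only for illustration: $t_k$ is the age at the $k$th lung cancer death, $R_k$ the risk set at that age, $i_k$ the case, $Z_j(t_k)$ the exposure summary of miner $j$ at age $t_k$, and $\beta$ the log rate ratio.

```latex
L(\beta) \;=\; \prod_{k}
  \frac{\exp\{ Z_{i_k}(t_k)\,\beta \}}
       {\sum_{j \in R_k} \exp\{ Z_{j}(t_k)\,\beta \}}
```

Each factor is the conditional probability that, among the members of the risk set, the observed case is the one who died; when $\beta = 0$ every member is equally likely to be the case.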
Nested Case-Control Studies
Intuitively, comparison of exposure between the cases and controls in the risk sets does not require all the controls; a representative sample of the controls should be sufficient. The nested case-control study illustrated in Figure 1C is an extension from “full cohort data,” based on the risk set representation, to “case-control data” with the sampled controls represented by the ‘○'s. Only a few risk set controls, matched to the case on 5-year year-of-birth interval, are selected to represent the distribution of exposure in all eligible controls. We will use the term nested case-control studies to mean designs in which controls are sampled (using any of a wide range of sampling methods) from the risk sets, but are sampled independently across risk sets.22 The most common design, and the main focus of this paper, is simple random sampling (SRS design) of controls from the (matched) risk sets. With an SRS design, radon exposure comparisons between the cases and sampled controls will, on average, be representative of those from the risk sets, but with added “sampling” variability. We note that the same intuition applies if one were first to randomly sample the lung cancer cases from the cohort, then randomly sample risk set controls; the case-control relationships are representative of those in the cohort risk sets. This is illustrated in Figure 1D, in which the ‘•'s denote randomly sampled cases while the ‘*'s denote cases not sampled. A single control is randomly sampled (the ‘○'s) from each of the year-of-birth–matched risk sets of the 3 sampled cases. As with the full cohort, the conditional logistic (partial likelihood) analysis of nested case-control data provides valid estimation of the rate ratio and accounts for the sampling variability in the standard errors and confidence intervals.
The analysis is completely analogous to the Cox regression method, except that it uses only the “sampled risk sets.” Again, these methods have strong theoretical justification and have been validated extensively.7,9,23
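Simple random sampling of controls from a (matched) risk set might be sketched as follows. The record layout, the `birth_stratum` helper, and the toy values are hypothetical; the sketch assumes that the list of eligible controls has already been restricted to miners at risk at the case's age.

```python
import random


def birth_stratum(birth_year, width=5):
    """Index of the year-of-birth interval (5-year width by default)."""
    return int(birth_year // width)


def sample_controls(case, eligible, m, rng, match_key=None):
    """Draw up to m controls by simple random sampling without replacement.

    eligible:  risk set members other than the case.
    match_key: optional function mapping a subject to its matching stratum;
               when given, only controls in the case's stratum are eligible.
    """
    if match_key is not None:
        stratum = match_key(case)
        eligible = [c for c in eligible if match_key(c) == stratum]
    return rng.sample(eligible, min(m, len(eligible)))


# Invented risk set: a case born in 1907 and six potential controls.
rng = random.Random(1)
toy_case = {"id": 0, "birth_year": 1907}
toy_pool = [{"id": i, "birth_year": y}
            for i, y in enumerate([1901, 1905, 1906, 1909, 1910, 1908], start=1)]
picked = sample_controls(toy_case, toy_pool, 3, rng,
                         match_key=lambda s: birth_stratum(s["birth_year"]))
print(sorted(c["id"] for c in picked))
```

Calling the function separately for each risk set, with no bookkeeping across calls, is what makes the sampling independent across risk sets: the same miner may be drawn for several cases, and a future case may serve as a control.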
THE ASSERTION THAT NESTED CASE-CONTROL STUDIES ARE METHODOLOGICALLY FLAWED
In a series of articles and letters, Deubner and coworkers have asserted that estimates of effect based on lagged exposure variables in SRS nested case-control studies can be biased. In particular, they claim that associations between a lagged exposure variable and disease can be introduced by the SRS nested case-control study design when none, in fact, exists. Lagging of exposure assignment is done in epidemiologic analyses to account for a period of time from an exposure to an increase in disease risk (an induction and latency period). For instance, the risk of lung cancer does not increase until 5 years after radon exposure.24 To explore Deubner and coworkers’ claim of bias, we need a clear definition of “nested case control bias.” Our definition, which we believe to be the only meaningful one, is that (aside from sampling error) the nested case-control study yields different conclusions from those based on the comparable analysis (ie, using the same statistical model) of the cohort from which it is drawn.25 Thus, in the context described by Deubner et al, the assertion is that disease associations with a lagged exposure can be present in the SRS design data that are not present in the cohort. This claim is based on theoretical and empirical evaluation evidence. We consider each of these in turn.
Theoretical Arguments that SRS Nested Case-Control Studies Can be Biased
The first point made by Deubner et al that we consider is, “the theoretical foundations on which incidence density (nested case-control) sampling is based do not address exposure lagging.”3 This assertion would imply that there is a “hole” in the statistical theory that opens the possibility that the SRS design can introduce spurious associations.3–5 However, lagged exposure is clearly a summary of exposure history and thus is a reasonable and valid variable for analysis for which the statistical theory applies.21,24,26,27 They further prove that lagging exposure assignment in an SRS study disproportionately truncates exposure information for controls (ie, the controls tend to have more exposure information truncated due to lagging exposure assignment than the cases).3 However, although the observation and proof that controls have “greater likelihood than cases of having some or all of their exposure truncated” is true, it is irrelevant. When exploring a latency effect, the exposure experienced during the latency (lag) period should be ignored if one wishes to assess an association under the assumption that exposure experienced during this period is not related to disease risk. Intuitively, the lagged exposures of controls from an SRS study will be representative of those from all controls in the risk set (cohort study). If there is no difference in exposure during the lag period in the cohort data, there will be no difference in the SRS nested case-control data.
Empirical Evaluation Evidence that Nested Case-Control Studies Are Biased
To support their theoretical argument, Deubner, Roth, and Levy4 developed an approach to evaluate empirically bias in nested case-control studies by using available cohort data and then showed, in a cohort of beryllium workers, that the lagged exposure was associated with mortality in SRS samples, when it was not associated in the cohort. Starting with a cohort of 3569 beryllium workers followed for (lung cancer) mortality through 1988, 142 workers were randomly sampled from the cohort and designated as “probands” with exit age taken as the worker's age at last observation (age at death if deceased, and age at end of follow-up if not deceased). A risk set was formed at each of these proband exit times that included the proband and all workers alive and under study at the proband exit age. Five controls were then sampled from these risk sets. Because the probands were a random sample from the cohort, one might reasonably believe, as Deubner and coworkers assert, that there should not be associations between exposure (or any other variables) and “proband-status.” However, it turns out that associations between exposure and proband status found in SRS samples by this method reflect associations present in the cohort. To see this, consider our discussion of the nested case-control design. Figure 2A illustrates a sample by Deubner et al, with ‘x’ denoting a sampled proband and ‘○’ marking controls. Because controls are randomly sampled from the proband-defined risk sets, they are representative of controls in the entire risk set, shown in Figure 2B. Then, because probands are randomly sampled from all members of the cohort, the sampled proband–control comparisons are representative of risk sets in a cohort in which all members are probands, as illustrated in Figure 2C. 
Due to the random sampling, it should be apparent that any case-control differences in exposure are preserved and that any exposure–outcome associations estimated from the case-control sets in Figure 2A are sample approximations of the same associations in the underlying cohort, Figure 2C. Thus, contrary to the presumption that there should be no association between (lagged) exposure and disease using SRS data generated using the empirical evaluation method of Deubner and coworkers, the random sampling of probands preserves any associations between exposure and age at exit that exist in the full cohort. Therefore, the finding that lagged exposure was associated with proband status cannot be used to conclude that nested case-control methods can lead to bias with lagged (or any other summary of) exposure variables.
Exposure may be associated with age at exit for many reasons including actual increases in rates due to exposure for some other causes of death, or more likely, artifactual associations between exposure and age of exit for subjects alive at the end of the study. To further demonstrate that the empirical investigation of Deubner and coworkers does not work, we generated simulated data in which subjects are enrolled during a 1-year accrual period and then followed for 1 additional year. We assigned a dichotomous exposure to each subject with probability dependent on enrollment time, with 20% exposure probability during the first half of the enrollment year, and 80% during the second half. An exit date from the study was the minimum of the enrollment time plus a random (exponentially distributed) time to disease or 2 years, the time at the end of follow-up. The rate of disease was set so that approximately 10% of cohort subjects would develop the disease, and the rest were alive at end of follow-up. The results from an SRS study with 5 controls sampled per case and an empirical evaluation by the method of Deubner et al, as well as the corresponding cohort study analyses described previously, are shown in Table 1. The disease outcome SRS study exposure RR estimate of 0.93 (95% confidence interval [CI] = 0.76–1.13) is consistent with the true RR value of 1 and very close to the cohort estimate of 0.90. The RR estimate from Deubner and coworkers’ empirical evaluation is 2.29 (1.89–2.82), a result of the date of entry–exposure correlation, and is completely consistent with the cohort results with all subjects as probands.
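A rough Python sketch of a simulation of this kind is given below. It is not the program used to produce Table 1, and several details are our assumptions: time since enrollment is taken as the analysis time scale, follow-up ends at calendar time 2 (1 year after accrual closes), the disease rate of 0.07 per year is chosen to give roughly 10% diseased, the cohort size and number of probands are arbitrary, and the rate ratio is estimated by a simple Newton iteration for conditional logistic regression with a single binary covariate.

```python
import math
import random


def simulate_cohort(n, rate, rng):
    """Return (follow_up_time, exposed, diseased) for n subjects."""
    cohort = []
    for _ in range(n):
        u = rng.random()                                  # enrollment time in [0, 1)
        exposed = 1 if rng.random() < (0.2 if u < 0.5 else 0.8) else 0
        t = rng.expovariate(rate)                         # time to disease, unrelated to exposure
        cohort.append((min(t, 2.0 - u), exposed, t < 2.0 - u))
    return cohort


def sample_sets(cohort, index_ids, m, rng):
    """For each index subject, sample up to m controls still under follow-up."""
    sets = []
    for i in index_ids:
        d_i, z_i, _ = cohort[i]
        eligible = [z for j, (d, z, _) in enumerate(cohort) if j != i and d >= d_i]
        if eligible:
            sets.append((z_i, rng.sample(eligible, min(m, len(eligible)))))
    return sets


def fit_rate_ratio(sets, iterations=25):
    """Conditional logistic estimate of the rate ratio for a binary exposure."""
    beta = 0.0
    for _ in range(iterations):
        score = info = 0.0
        for z_case, z_controls in sets:
            n1 = z_case + sum(z_controls)                 # exposed members of the set
            n0 = 1 + len(z_controls) - n1                 # unexposed members of the set
            p = n1 * math.exp(beta) / (n1 * math.exp(beta) + n0)
            score += z_case - p
            info += p * (1.0 - p)
        if info == 0.0:
            break
        beta += score / info                              # Newton step
    return math.exp(beta)


rng = random.Random(2009)
cohort = simulate_cohort(3000, 0.07, rng)
cases = [i for i, (_, _, diseased) in enumerate(cohort) if diseased]
probands = rng.sample(range(len(cohort)), 500)
rr_disease = fit_rate_ratio(sample_sets(cohort, cases, 5, rng))
rr_proband = fit_rate_ratio(sample_sets(cohort, probands, 5, rng))
print(f"disease outcome RR: {rr_disease:.2f}; proband 'outcome' RR: {rr_proband:.2f}")
```

Under these assumptions, a proband who is alive at the end of follow-up is the most recently enrolled member of his or her risk set and is therefore more likely to be exposed than the sampled controls, which is the artifactual association described above.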
EMPIRICAL EVALUATION OF DESIGN AND ANALYSIS QUESTIONS FROM COHORT DATA
Although the empirical evaluation method proposed by Deubner and coworkers fails to provide a technique for assessing nested case-control performance when basic cohort data are available, such a procedure would be of some value in assessing design and analysis questions within the context of specific study situations. In this section, we describe a valid approach to this problem.
Description of the Method
To simplify the discussion, we will describe the methods with respect to a nested case-control study from the miners cohort. To conduct a valid empirical evaluation of the nested case-control design, we generate data in which there is a specified association (RR) between exposure and lung cancer. To do this, plausible radon exposure histories are assigned to each miner in the cohort. Suppose that mine surveys provide estimates of decade-specific average dose rates of about 6.25, 8.30, 5.00, and 0.83 working levels24 for before 1950, the 1950s, the 1960s, and the 1970s or later, respectively. To create a radon exposure history for each miner between start and end of mining employment, we assigned exposure to each 5-year interval of age as 5 years times the decade-specific average dose rate for the decade at the midpoint of the age interval. Risk sets are then formed from the cohort at each of the ages of the lung cancer death. However, we do not identify the lung cancer case. This is done randomly by specifying the rate of lung cancer death for each subject at a given time, which we label λi(t) and is a function of the exposure history of subject i up to time t. For risk set k associated with time tk, to simulate a single case-control study, we first pick exactly 1 case from the risk set, a single draw from a multinomial distribution with probabilities
$$\frac{\lambda_i(t_k)}{\sum_{j} \lambda_j(t_k)} \tag{1}$$

where the sum is over risk set members, and then sample controls according to the design of interest. This is done for each risk set in the cohort to create a simulated case-control study trial.
The case-control data from each trial are analyzed according to the desired analysis method, and the results are tabulated. Although λi(t) can be taken to have any form, in most applications a standard log linear (Cox model) form will be used with λi(t) = λ0(t) exp(Zi(t)β), where λ0(t) is the baseline hazard, Zi(t) the exposure summary (vector) for subject i at time t, and β the log rate ratio parameters.20 The approach is justified implicitly based on the conditional probabilities used in the partial likelihood construction.20,21 We note that one can also simulate new exposure histories for each trial, but this adds to the complexity of the simulation and will rarely make any qualitative (or even quantitative) difference in the conclusions.
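The random assignment of the case within a risk set can be sketched as follows for a scalar exposure summary. Under the log linear form, the baseline hazard λ0(tk) is common to all risk set members and cancels, so the selection probabilities are proportional to exp(Zi(tk)β). The function names and toy values are hypothetical.

```python
import math
import random


def case_probabilities(z_values, beta):
    """Probability that each risk set member is chosen as the case."""
    weights = [math.exp(beta * z) for z in z_values]   # baseline hazard cancels
    total = sum(weights)
    return [w / total for w in weights]


def draw_case(z_values, beta, rng):
    """Pick exactly one case: a single multinomial draw over the risk set."""
    probs = case_probabilities(z_values, beta)
    return rng.choices(range(len(z_values)), weights=probs, k=1)[0]


# Invented risk set with one exposed (Z = 1) and three unexposed (Z = 0) members.
toy_z = [1, 0, 0, 0]
print(case_probabilities(toy_z, beta=math.log(4.0)))
print(draw_case(toy_z, beta=math.log(4.0), rng=random.Random(7)))
```

With a true rate ratio of 4, the exposed member in this toy risk set is 4 times as likely to be chosen as any single unexposed member; with a rate ratio of 1, all members are equally likely.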
Using the risk sets associated with the age at each of the 258 lung cancer deaths, Zi(tk) was set equal to 1 if 20-year lagged cumulative exposure was over 500 working-level months24 and to 0 if under 500 working-level months. We assigned a case in the risk set according to equation (1) under the Cox model with rate ratios exp(β0) = 1, 2, and 4.
We then sampled controls from the risk set according to the sampling design and estimated the rate ratio using conditional logistic regression with the model of interest. From 500 trials, we computed the antilog of the mean log rate ratio (“estimated RR”), the empirical standard error of the estimated log rate ratio (“empirical s.e.”), the average of the estimated standard errors of the log rate ratio (“estimated s.e.”), the power to detect a radon exposure–lung cancer association, and other statistics to characterize the performance of the simulation. The SAS statistical software package (SAS Institute, Cary, NC) was used to perform the simulation. The programs and data are available at: http://hydra.usc.edu/timefactors. We use this approach to address a variety of questions that might arise when considering a nested case-control study from the miners study.
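The summary statistics computed over trials can be illustrated with a short Python sketch (the simulations themselves were run in SAS). The function name and toy inputs are hypothetical, and the use of a 2-sided Wald test at the 5% level to define power is our assumption, since the test is not specified above.

```python
import math
import statistics


def summarize_trials(log_rrs, std_errs, z_crit=1.96):
    """Summarize per-trial log rate ratio estimates and their standard errors."""
    rejections = sum(abs(b / s) > z_crit for b, s in zip(log_rrs, std_errs))
    return {
        "estimated_rr": math.exp(statistics.mean(log_rrs)),  # antilog of mean log RR
        "empirical_se": statistics.stdev(log_rrs),           # spread of estimates across trials
        "estimated_se": statistics.mean(std_errs),           # average model-based s.e.
        "power": rejections / len(log_rrs),                  # share of trials rejecting RR = 1
    }


# Two invented trials, far fewer than the 500 used in the evaluation.
toy_summary = summarize_trials([math.log(2.0), math.log(8.0)], [0.5, 0.5])
print(toy_summary)
```

Close agreement between the empirical and the average estimated standard error indicates that the likelihood-based variance estimates are behaving as expected, and when the true rate ratio is 1 the power entry is an estimate of the test size.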
Validity of the SRS Design for Lagged Radon Exposure
Table 2 gives the results of the empirical evaluation for analyses of the lagged exposure variable in SRS case-control studies with 1 and 3 controls per case. As indicated by the “Estimated RR” columns, for all situations considered, the true rate ratio is well estimated using the SRS sampling design. Furthermore, comparing the empirical standard errors of the 500 estimated log rate ratios to the average of the estimated standard errors, the likelihood-based estimated standard errors perform as predicted by theory.
Comparison of Tests of Radon–Lung Cancer Association by Using Cumulative and Lagged Exposure
Continuing with the SRS design, we performed an empirical evaluation to estimate the power to detect a radon effect when exposure assignment is lagged by 20 years and when total cumulative exposure (up to tk − 2) is used in the conditional logistic regression. The results for 1 and 3 controls per case are given in Table 3. When the rate ratio is 1, the power should correspond to the size of the test, in this case 5%. Estimated test size, for both lagged and cumulative exposure analyses, is very close to 5%, indicating that the design and analysis are behaving as theory predicts. When the rate ratio equals 2 and 4, testing using the lagged exposure (corresponding to the “true” model used to generate the case outcomes) has nearly 100% power, even with a single control. However, tests based on cumulative exposure (a “misspecified” model) have low power to detect a radon effect, with 66% power to detect a rate ratio of 4 or greater with 3 controls per case. This empirical investigation shows that, if one were to perform an SRS study in the miners cohort, it would be important to investigate latency effects as well as cumulative exposure. A cumulative exposure variable will have much less power to detect an association when the association is more accurately described by the lagged exposure.
The Effect of Matching on Age of Hire
When designing the nested case-control study, the choice of matching factors is fundamental. Generally, the goal of matching is to choose controls more “like the case” in ways that we believe increase the validity of case-control exposure comparisons. However, matching on factors correlated with exposure can also reduce the statistical power to detect exposure–disease associations by increasing the concordance of case-control sets with respect to the exposure of interest.19 For example, Deubner et al suggest matching on age of hire so that the case-control comparisons will be among workers who have worked a similar period of time.1,3 Certainly, matching on date of hire when there are disease and exposure-related unmeasured factors will improve the validity of exposure comparisons, but there may be a large cost in terms of statistical power if the matching is not needed. To examine this question in the miners cohort, we performed an empirical investigation of an analysis of lagged exposure from SRS samples from risk set controls matched to the case on 5-year age of first mining interval. The results are shown in Table 4 and, although the average estimated RRs are close to the true values, the standard errors are on the order of 10 times larger than from a study with the same number of controls from the full risk sets. Thus, the power to detect the radon–disease association is quite low (eg, 70% for RR >4 with 3 controls per case). So, although comparing miners hired around the same age would increase the validity of the exposure–disease findings, our empirical investigation indicates that the power to find a true association would be substantially reduced. Therefore, for a miners nested case-control study, we would recommend against matching on age at hire and would instead carefully include in the statistical models any potential nonradon factors related to lung cancer risk that might also be correlated with year of hire.
The Effect of Matching on Age of Exit
Deubner et al2 advocated that SRS design controls should be matched to the case on age of exit from the study. Because age of exit will not be known until a subject exits the study, it is not part of history for any time prior to age of exit and thus could, in principle, lead to biased estimation of radon–lung cancer associations. For instance, if nonlung cancer death reasons for exit are associated with radon exposure, the estimated radon–lung cancer associations will be biased downward. However, if other reasons for exit are not well correlated with radon exposure, there may be little or no bias. Here we evaluate the degree of bias empirically for the miners cohort by sampling from the risk set controls who matched the case by 5-year age-of-exit interval. The average estimated rate ratios are shown in Table 5. For a study of the miners, there is evidence that this “improper” matching will result in notable bias.
Pure Case-Control Comparisons
As an alternative to sampling controls independently across risk sets, consider a design that specifies that cases cannot serve as controls, and that a subject can only serve as a control for a single case. This design might be considered a practical alternative to the standard when inclusion in multiple case-control sets requires multiple interviews with subjects, depletes a biologic sample resource, or complicates the study implementation in some other way. With the standard conditional logistic analysis, this method of sampling can lead to bias.28,29 We investigated the performance of this design for a study from the miners cohort; the results are given in Table 6. There is detectable upward bias that increases with the number of controls sampled. Because the biases in the table are probably not large enough to invalidate the results of a study using the pure case-control design, we conclude that if the pure case-control design is highly desirable for logistical reasons, and only 1 or 2 controls are to be sampled, the design bias is acceptable within this cohort.
It is unfortunate that a series of articles and letters criticizing aspects of a nested case-control study of beryllium exposure and lung cancer mortality have resulted in confusion about the validity of standard nested case-control study methods. Many of the methodologic criticisms were based on the false premise that the theory underlying nested case-control sampling, Cox regression, and (by extension) Poisson regression accommodates only cumulative (and not lagged or other) exposure summaries. These false premises were compounded by a flawed empirical evaluation of the design.
In this paper, we have provided an explanation of why nested case-control sampling is a natural sampling analog of analysis based on case-control exposure comparisons in cohort risk sets, and we have discussed the flaws in the evidence that has been presented to show that nested case-control studies can be misleading for the analysis of lagged exposures. The investigation of design bias was made in the context of a list of criticisms of an SRS nested case-control study of lung cancer risk in a cohort of beryllium workers.30,31 Although no study is perfect and it is important to examine plausible alternative explanations for observed exposure–disease associations, study design bias should not be considered further as a weakness of this study.
We have provided appropriate methods to evaluate and compare case-control study designs within the context of a particular cohort. This approach can be used in study planning, to assess and compare different candidate study designs, to assess the likely magnitude of potential bias, and compute power to detect effects. Of course, assumptions need to be made about covariates not available in the cohort and the hazard model for disease occurrence. To the extent that these assumptions deviate from the actual underlying data structure, the empirical evaluation may be inaccurate. Also, the findings from any empirical evaluation will be specific to the characteristics of the particular cohort and study design under examination. General properties of nested case-control study designs and estimators are best studied using the powerful counting process and martingale theory statistical tools for failure time data.26 The empirical evaluation methods that we have described complement these theoretical tools by tailoring the evaluation to the particular study situation, accounting for the structure of the underlying cohort, and accommodating complex exposure histories and study designs.
1. Deubner DC, Lockey JL, Kotin P, et al. Re: Lung cancer case-control study of beryllium workers [letter]. Am J Ind Med
2. Levy PS, Roth HD, Deubner DC. Exposure to beryllium and occurrence of lung cancer: a reexamination of findings from a nested case-control study. J Occup Environ Med
3. Levy PS, Deubner DC, Roth HD. Re: exposure to beryllium and occurrence of lung cancer: a reexamination of findings from a nested case-control study [letter]. J Occup Environ Med
4. Deubner DC, Roth HD, Levy PS. Empirical evaluation of complex epidemiologic study designs: workplace exposure and cancer. J Occup Environ Med
5. Deubner DC, Roth HD, Levy PS. Re: Empirical evaluation of complex epidemiologic study designs: workplace exposure and cancer [letter]. J Occup Environ Med
6. Liddell F, McDonald J, Thomas D. Methods of cohort analysis: appraisal by application to asbestos miners. J R Stat Soc A
7. Oakes D. Survival times: aspects of partial likelihood (with discussion). Int Stat Rev
8. Breslow NE, Lubin JH, Marek P, Langholz B. Multiplicative models and the analysis of cohort data. J Am Stat Assoc
9. Goldstein L, Langholz B. Asymptotic theory for nested case-control sampling in the Cox regression model. Ann Stat
10. Garabrant DH. Case-control study design: spurious associations between exposure and outcome [Editorial]. J Occup Environ Med
11. Lundin F, Wagoner J, Archer V. Radon Daughter Exposure and Respiratory Cancer, Quantitative and Temporal Aspects. Washington, DC: US Public Health Service; 1971.
12. Hornung RW, Deddens JA, Roscoe RJ. Modifiers of lung cancer risk in uranium miners from the Colorado Plateau. Health Phys
13. Thomas D, Pogoda J, Langholz B, Mack W. Temporal modifiers of the radon-smoking interaction. Health Phys
14. Langholz B, Goldstein L. Risk set sampling in epidemiologic cohort studies. Stat Sci
15. Langholz B, Thomas DC, Xiang A, Stram D. Latency analysis in epidemiologic studies of occupational exposures: application to the Colorado plateau uranium miners cohort. Am J Ind Med
16. Hill BA. The environment and disease: association or causation? Proc R Soc Med
17. Wacholder S, McLaughlin JK, Silverman DT, Mandel JS. Selection of controls in case-control studies, I: principles. Am J Epidemiol
18. Wacholder S, Silverman DT, McLaughlin JK, Mandel JS. Selection of controls in case-control studies, III: design options. Am J Epidemiol
19. Wacholder S. Design issues in case-control studies. Stat Methods Med Res
20. Cox DR. Regression models and life-tables (with discussion). J R Stat Soc B
21. Andersen PK, Gill RD. Cox's regression model for counting processes: a large sample study. Ann Stat
22. Langholz B. Use of cohort information in the design and analysis of case-control studies. Scand J Stat
23. Borgan Ø, Goldstein L, Langholz B. Methods for the analysis of sampled cohort data in the Cox proportional hazards model. Ann Stat
24. Lubin J, Boice J, Edling C, et al. Radon and Lung Cancer Risk: A Joint Analysis of 11 Underground Miners Studies. Bethesda, MD: US Department of Health and Human Services, Public Health Service, National Institutes of Health; 1994. NIH Publication 94-3644.
25. Xiang AH, Langholz B. Comparison of case-control to full cohort analyses under model misspecification. Biometrika
26. Self SG, Prentice RL. Commentary on Andersen and Gill's Cox's regression model for counting processes: a large sample study. Ann Stat
27. Andersen P, Borgan O, Gill R, Keiding N. Statistical Models Based on Counting Processes. New York: Springer Verlag; 1992.
28. Lubin J, Gail M. Biased selection of controls for case-control analysis of cohort studies. Biometrics
29. Robins J, Gail M, Lubin J. More on “Biased selection of controls for case-control analysis cohort studies.” Biometrics
30. Sanderson WT, Ward EM, Steenland K, Petersen MR. Lung cancer case-control study of beryllium workers. Am J Ind Med
31. Sanderson WT, Ward EM, Steenland K. Re: Response to criticisms of “lung cancer case-control study of beryllium workers” [letter]. Am J Ind Med