The Sixteenth Conference of the International Society for Environmental Epidemiology (ISEE): Abstracts
Risk of neurological diseases like Parkinson's disease is modulated in complex ways by the interplay of environmental exposures and genetic susceptibilities. Dissecting contributions to risk by particular exposures or by polymorphisms at candidate loci can be problematic, however, because epidemiologic studies typically require large sample sizes to make reliable statistical inferences about gene-environment interactions. One way to accumulate larger sample sizes than are available with most individual studies is to combine several studies in a single analysis.
We consider a situation where several investigators bring together data from separate studies to improve inferences about risk factors. We review possible data-analytic approaches for this situation that are, in essence, methods for meta-analysis using observations from individual subjects (sometimes called “pooled analysis”), and we focus on issues that arise when the studies involve distinct populations or disparate study designs. Besides discussing practical considerations and underlying assumptions, we study the operating characteristics of various methods, especially the use of hierarchical models for estimation or testing. This class of models allows estimation of a common or average effect of a risk factor across studies while acknowledging that the effect may differ among studies with unique characteristics.
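As a rough illustration of estimating an average effect while allowing for between-study differences, the sketch below pools study-specific estimates with a DerSimonian-Laird random-effects model. This is a two-stage summary-data simplification, not the individual-level hierarchical analysis described above; the function name `dersimonian_laird` and the input values are hypothetical, and the estimates are assumed to be log odds ratios with known within-study variances.

```python
import math


def dersimonian_laird(estimates, variances):
    """Pool study-specific estimates (e.g. log odds ratios).

    Returns (pooled estimate, its standard error, tau^2), where tau^2 is
    the method-of-moments estimate of between-study variance.
    """
    k = len(estimates)
    # Inverse-variance (fixed-effect) weights.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sw
    # Cochran's Q measures dispersion of study estimates around the
    # fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Truncate at zero so the variance estimate is never negative.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sws
    se = math.sqrt(1.0 / sws)
    return pooled, se, tau2


# Hypothetical inputs: two precise studies that disagree with each other.
pooled, se, tau2 = dersimonian_laird([0.0, 1.0], [0.01, 0.01])
print(pooled, se, tau2)
```

When the study estimates agree, the estimated between-study variance is zero and the result reduces to the ordinary inverse-variance average; when they disagree, the standard error of the pooled estimate widens accordingly.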
Precision of estimation and statistical power should increase for a combined analysis compared with any individual study because the sample size is larger. The extent to which such improvements are realized depends, however, on between-study heterogeneity in risk parameters. Besides the potential for improved precision or power, another benefit of combining studies is broadening the inference base. An association gains validity when it persists across an increasingly diverse set of populations. Alternatively, finding that the effect of a risk factor is idiosyncratic to one study may itself provide clues to etiology. Properly combining data across heterogeneous populations may increase precision less than if the same total number of subjects were accrued from a homogeneous population. Nevertheless, combining heterogeneous studies is not only more feasible but also provides broader validity, which compensates for small decrements in precision.
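The dependence of precision on heterogeneity can be made concrete under a standard random-effects model. The notation here is an assumption introduced for illustration and does not appear in the abstract: \hat{\beta}_i is the estimate from study i of k, v_i is its within-study variance, and \tau^2 is the between-study variance, treated as known.

```latex
\hat{\mu} = \frac{\sum_{i=1}^{k} w_i \hat{\beta}_i}{\sum_{i=1}^{k} w_i},
\qquad
w_i = \frac{1}{v_i + \tau^2},
\qquad
\operatorname{Var}(\hat{\mu}) = \frac{1}{\sum_{i=1}^{k} w_i}.

% Special case: equal within-study variances, v_i = v for all i
\operatorname{Var}(\hat{\mu}) = \frac{v + \tau^2}{k}.
```

With \tau^2 = 0 (a homogeneous population) the variance is v/k and shrinks as subjects are added to each study. With \tau^2 > 0 it cannot fall below \tau^2/k however large the individual studies become, so under this model further gains in precision come mainly from adding studies rather than subjects.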