Background: Understanding the health effects of environmental chemicals is challenging when individuals have concentrations at or below the laboratory limits of detection, when values round to zero, or when 0 is recorded as a substitute for missing values; each of these can produce many zeros in the database. Comparing mean concentrations between individuals with and without disease requires estimation procedures that accommodate data with many zero values. The main aim of this article is to propose and examine parametric and distribution-free methods for comparing data sets containing many zero observations. An important application of this approach is the assessment of environmental chemical concentrations in relation to reproductive health.
Methods: We extended the empirical likelihood technique for estimating confidence intervals (CIs) to data sets with many zeros. We evaluated the proposed empirical likelihood interval estimators in a broad Monte Carlo study that compares them with parametric techniques. Certain characteristics of the Monte Carlo simulations were chosen to be close to those of the real data set. We applied the method to a cohort study of 84 women aged 18–40 years who had undergone laparoscopy between 1999 and 2000 and in whom serum concentrations of 2 organochlorine pesticides, Aldrin and beta-Benzene hexachloride (β-BHC), were measured using gas chromatography with electron capture.
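As an illustration of the kind of interval discussed in Methods, the following is a minimal sketch of the standard (Owen-type) empirical likelihood CI for a single mean. It is not the extended method proposed in this article, whose details are not given in the abstract. The function names, the bisection scheme, and the sample values are all assumptions made for illustration; the sample is hypothetical and is not the cohort data.

```python
import math

# 95th percentile of the chi-square distribution with 1 degree of freedom
CHI2_1_95 = 3.841458820694124


def neg2_log_el_ratio(x, mu):
    """-2 log empirical likelihood ratio for the hypothesis mean(x) = mu."""
    d = [xi - mu for xi in x]
    lo_d, hi_d = min(d), max(d)
    if lo_d >= 0 or hi_d <= 0:
        return math.inf  # mu is not strictly inside the range of the data
    n = len(x)
    # The Lagrange multiplier must keep every weight 1/(n(1+lam*d_i)) <= 1.
    lo = (1.0 / n - 1.0) / hi_d
    hi = (1.0 / n - 1.0) / lo_d
    # g(lam) = sum d_i / (1 + lam*d_i) is strictly decreasing; bisect for its root.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        g = sum(di / (1.0 + mid * di) for di in d)
        if g > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * di) for di in d)


def el_mean_ci(x, crit=CHI2_1_95):
    """Endpoints where the EL ratio statistic crosses the chi-square cutoff."""
    xbar = sum(x) / len(x)

    def solve(inside, outside):
        # 'inside' is always accepted by the test, 'outside' always rejected.
        for _ in range(100):
            mid = 0.5 * (inside + outside)
            if neg2_log_el_ratio(x, mid) <= crit:
                inside = mid
            else:
                outside = mid
        return inside

    return solve(xbar, min(x)), solve(xbar, max(x))


# Hypothetical sample with many zeros (illustrative values only).
sample = [0.0] * 12 + [0.001, 0.002, 0.004, 0.003, 0.010, 0.006, 0.002, 0.008]
lower, upper = el_mean_ci(sample)
print(f"EL 95% CI for the mean: ({lower:.6f}, {upper:.6f})")
```

Because the interval is built from the data's own likelihood profile rather than from a symmetric normal approximation, its lower endpoint stays above the smallest observation, which is one reason such intervals are attractive when zeros dominate the sample.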
Results: When applied to the cohort study, the method produced efficient 95% CIs for mean serum Aldrin concentrations: (0.000338, 0.003561) for women with endometriosis and (0.000803, 0.004211) for women without. The corresponding intervals for mean β-BHC concentrations were (0.000493, 0.005869) and (0.000680, 0.003807) for individuals with and without the disease, respectively. The estimated intervals for the differences in mean concentrations were (−0.001563, 0.003025) for Aldrin and (−0.003522, 0.002890) for β-BHC.
Conclusions: We found the empirical likelihood method for estimating CIs to be robust when data sets contain many zeros. Using this method, mean concentrations of neither Aldrin nor β-BHC differed by endometriosis diagnosis.
From the (a) Department of Biostatistics, University at Buffalo, Buffalo, NY; and (b) Epidemiology Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Rockville, MD.
Submitted 16 December 2008; accepted 22 September 2009; posted 7 May 2010.
Supported in part by the American Chemistry Council and the Intramural Research Program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development.
The opinions expressed are those of the authors and not necessarily of the National Institutes of Health.
Supplemental digital content is available through direct URL citations in the HTML and PDF versions of this article (www.epidem.com).
Correspondence: Le Kang, Department of Biostatistics, University at Buffalo, 249 Farber Hall, 3435 Main St., Buffalo, NY 14214–3000. E-mail: firstname.lastname@example.org.