Background: Understanding the health effects associated with environmental chemicals is challenging when individuals have concentrations at or below the laboratory limits of detection, when values round to zero, or when zeros are recorded in place of missing values. Any of these can produce many zeros in the database. Comparing mean concentrations between individuals with and without disease therefore requires estimation procedures that accommodate data with many zero values. The main aim of this article is to propose and examine parametric and distribution-free methods for comparing data sets containing many zero observations. An important application of this approach is the assessment of environmental chemical concentrations in relation to reproductive health.
Methods: We extended the empirical likelihood technique to the estimation of confidence intervals (CIs) in data sets with many zeros. We evaluated the proposed empirical likelihood interval estimators in a broad Monte Carlo study that compared them with parametric techniques. Certain characteristics of the Monte Carlo simulations were chosen to be close to those of the real data set. We applied the method to a cohort study of 84 women aged 18–40 years who had undergone laparoscopy between 1999 and 2000 and in whom serum concentrations of 2 organochlorine pesticides, Aldrin and beta-Benzene hexachloride (β-BHC), were measured using gas chromatography with electron capture detection.
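As a rough illustration of the kind of interval described in the Methods, the sketch below computes a standard empirical likelihood CI for a single mean on a sample with many zeros. It is not the extended method proposed in this article: it assumes the classical one-sample empirical likelihood ratio with a chi-square calibration (1 degree of freedom), and the function names `el_statistic` and `el_mean_ci` and the sample values are hypothetical, chosen only for illustration.

```python
import math

# 0.95 quantile of the chi-square distribution with 1 degree of freedom
CHI2_1DF_95 = 3.841458820694124


def el_statistic(x, mu, iters=100):
    """Return -2 log R(mu), the empirical likelihood ratio statistic for the mean."""
    z = [xi - mu for xi in x]
    z_min, z_max = min(z), max(z)
    if not (z_min < 0.0 < z_max):
        return math.inf  # mu is not strictly inside the range of the data
    # The Lagrange multiplier lam solves sum(z_i / (1 + lam * z_i)) = 0
    # and must keep every 1 + lam * z_i positive.
    lo, hi = -1.0 / z_max, -1.0 / z_min
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        g = sum(zi / (1.0 + lam * zi) for zi in z)
        if g > 0.0:  # g is strictly decreasing in lam
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * zi) for zi in z)


def el_mean_ci(x, crit=CHI2_1DF_95, iters=60):
    """Bisect on mu for the two points where the statistic crosses crit."""
    xbar = sum(x) / len(x)

    def solve(inside, outside):
        # inside: statistic <= crit; outside: statistic > crit
        for _ in range(iters):
            mid = 0.5 * (inside + outside)
            if el_statistic(x, mid) <= crit:
                inside = mid
            else:
                outside = mid
        return inside

    return solve(xbar, min(x)), solve(xbar, max(x))


# Hypothetical zero-heavy sample (made-up values, not the study data):
# 28 zeros and 12 positive concentrations.
sample = [0.0] * 28 + [
    0.0012, 0.0031, 0.0008, 0.0054, 0.0021, 0.0102,
    0.0017, 0.0043, 0.0009, 0.0066, 0.0025, 0.0038,
]
lower, upper = el_mean_ci(sample)
print(f"mean = {sum(sample) / len(sample):.6f}, 95% EL CI = ({lower:.6f}, {upper:.6f})")
```

Because the interval is obtained by inverting the likelihood ratio rather than by adding and subtracting a standard error, it is not forced to be symmetric about the sample mean and cannot extend below the smallest observation (here, zero).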
Results: When applied to the cohort study, the method produced efficient 95% CIs for mean serum concentrations. For Aldrin, the CIs were (0.000338, 0.003561) for women with endometriosis and (0.000803, 0.004211) for women without. For β-BHC, the CIs were (0.000493, 0.005869) for women with the disease and (0.000680, 0.003807) for women without. The corresponding CIs for the difference in mean concentrations were (−0.001563, 0.003025) for Aldrin and (−0.003522, 0.002890) for β-BHC.
Conclusions: We found the empirical likelihood method for estimating CIs to be robust when data sets contain many zeros. In this application, mean concentrations of Aldrin and β-BHC did not differ by endometriosis diagnosis, as the CIs for both differences in means include zero.