Exposure Misclassification in Studies of Agricultural Pesticides: Insights From Biomonitoring

Acquavella, John F.*; Alexander, Bruce H.†; Mandel, Jack S.‡; Burns, Carol J.§; Gustin, Christophe*

doi: 10.1097/01.ede.0000190603.52867.22
Original Article

Background: Epidemiologists often assess lifetime pesticide exposure by questioning participants about use of specific pesticides and associated work practices. Recently, Dosemeci and colleagues proposed an algorithm to estimate lifetime average exposure intensity from questionnaire information. We evaluated this algorithm against measured urinary pesticide concentrations for farmers who applied glyphosate (n = 48), 2,4-D (n = 34), or chlorpyrifos (n = 34).

Methods: Algorithm scores were calculated separately based on trained field observers’ and farmers’ evaluations of application conditions. Statistical analyses included nonparametric correlations, assessment of categorical agreement, and categorical evaluation of exposure distributions.

Results: Based on field observers’ assessments, there were moderate correlations between algorithm scores and urine concentrations for glyphosate (r = 0.47; 95% confidence interval [CI] = 0.21 to 0.66) and 2,4-D (0.45; 0.13 to 0.68). Correlations were lower when algorithm scores were based on participants’ self-reports (for glyphosate, r = 0.23 [CI = −0.07 to 0.48]; for 2,4-D, r = 0.25 [−0.10 to 0.54]). For chlorpyrifos, there were contrasting correlations for liquid (0.42; 0.01 to 0.70) and granular formulations (−0.44; −0.83 to 0.29) based on both observers’ and participants’ inputs. Percent agreement in categorical analyses for the 3 pesticides ranged from 20% to 44%, and there was appreciable overlap in the exposure distributions across categories.

Conclusions: Our results demonstrate the importance of collecting type of pesticide formulation and suggest a generic exposure assessment is likely to result in appreciable exposure misclassification for many pesticides.

From the *Monsanto Company (retired), St. Louis, Missouri; the †University of Minnesota, School of Public Health, Minneapolis, Minnesota; ‡Emory University, Rollins School of Public Health, Atlanta, Georgia; and the §Dow Chemical Company, Midland, Michigan.

Submitted 25 October 2004; accepted 14 July 2005.

The Farm Family Exposure Study was funded by a contract between the Farm Family Exposure Study Taskforce and the University of Minnesota. At the time of this study, Drs. Acquavella and Burns and Mr. Gustin were employed by companies that manufacture the pesticides considered in this article.

Correspondence: Jack Mandel, Department of Epidemiology, Rollins School of Public Health, Emory University, 1518 Clifton Road, Atlanta, GA 30322. E-mail:

Epidemiologic studies of industrial workers often use documentation about past and present workplace conditions to aid in retrospective exposure assessment. This information can include engineering diagrams, process descriptions, purchasing and sales records, job descriptions, information from plant medical records, and area or personal exposure-monitoring data. In contrast, studies of farmers and other pesticide-exposed workers involve research into largely undocumented and idiosyncratic work conditions. Exposure assessment in these types of studies relies on participant self-reports of lifetime pesticide use and work practices.

No method has yet been developed to assess the accuracy of lifetime self-reports of pesticide use, although some limited relevant information is available. In one study, approximately 60% of farmers’ self-reports agreed with suppliers’ records of purchases for specific pesticides.1 In another article, Blair et al2 evaluated the repeatability of pesticide information on enrollment questionnaires for 4,188 pesticide applicators, primarily farmers, who filled out questionnaires in successive years. The year-to-year reliability for reporting any lifetime use of 11 widely used pesticides varied from 79% to 87%; categorical agreement varied from 50% to 59% for typical days of use per year and from 50% to 77% for years of use.

The meaning of a day of pesticide use can vary greatly.3 For example, in a recent biomonitoring study of farmers in 2 states doing typical pesticide applications (the Farm Family Exposure Study), the number of acres treated in a day varied from 10 to 439, the number of pesticide mixing operations varied from one to 14, and the pounds of pesticide active ingredient handled varied from 4 to 351.4 These data suggest that the exposure potential for a day of pesticide use is highly variable.

Dosemeci et al5 recently proposed a generic algorithm for using questionnaire information to develop an average lifetime exposure intensity score for specific pesticides. This score could then be used as a multiplier of days of use to produce an intensity-weighted estimate of cumulative exposure. In the algorithm, weights are assigned according to participants’ recollections of the type of pesticide application equipment and personal protective equipment used, whether they mixed and applied pesticides personally, and whether they repaired equipment used for the application. The weights proposed were based largely on passive dosimetry data available in the Pesticide Handlers Exposure Database6 and in the literature. Passive dosimetry refers to techniques that estimate deposition of pesticides on skin surfaces and potential inhalation exposure. At least 2 recent published studies7,8 have used the average intensity score approach in their analyses.
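
The intensity-weighted cumulative exposure described above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and splitting lifetime days of use into days per year times years of use is an assumption based on the questionnaire items mentioned earlier, not a formula quoted from Dosemeci and colleagues.

```python
def intensity_weighted_days(intensity_score, days_per_year, years_of_use):
    """Cumulative exposure as intensity score x lifetime days of use.

    Assumption: lifetime days of use = typical days per year x years of use.
    """
    if min(intensity_score, days_per_year, years_of_use) < 0:
        raise ValueError("inputs must be non-negative")
    lifetime_days = days_per_year * years_of_use
    return intensity_score * lifetime_days


# Hypothetical applicator: intensity score 7.2, 10 days per year for 20 years.
print(intensity_weighted_days(7.2, 10, 20))
```

Because the intensity score multiplies every reported day of use, any error in the score carries through to the cumulative estimate in proportion.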

In this article, we evaluate the exposure intensity algorithm developed by Dosemeci and colleagues5 using biomonitoring data for 3 pesticides for participants from the Farm Family Exposure Study.4 Biomonitoring refers to the direct measurement of pesticides or their metabolites in bodily fluids or tissues to estimate the actual absorbed dose of pesticides.

The Farm Family Exposure Study has been described previously.4,9,10 Briefly, a sample of farm families with children was identified at random from public lists of licensed pesticide applicators in Minnesota and South Carolina. Eligible farmers had to be planning to personally apply glyphosate (n = 48), 2,4-D (n = 34), or chlorpyrifos (n = 34) to at least 10 acres of cropland as part of the normal farm operation. All members of the family were enrolled in the study, and each member provided all urine voids as 24-hour samples for 5 days, from 1 day before the pesticide application under study through 3 days after the application. Trained field observers recorded application practices and use of personal protective equipment to enable assessment of factors that might be related to the amount of pesticide exposure on the day of application. The participating farmers also filled out questionnaires within 3 days of the application describing their application practices and use of protective equipment. Many of the questions on application practices and use of protective equipment were taken from the questionnaire for which Dosemeci's algorithm was developed,5 augmented with questions to aid in the eventual development of predictive exposure models. All of the glyphosate and 2,4-D applications were liquid formulations, whereas the chlorpyrifos applications included 24 liquid and 10 granular formulations. All participating families were taken through an informed consent process. The Institutional Review Board of the University of Minnesota approved the study protocol.

Glyphosate, 2,4-D, and the primary metabolite of chlorpyrifos (3,5,6-trichloro-2-pyridinol [TCPy]) are excreted fairly completely and rapidly in urine, making analysis of urine samples an appropriate means to assess pesticide dose from an application. Each pesticide was analyzed in daily 24-hour composite urine samples with a 1 part per billion (ppb) limit of detection. More details of the Farm Family Exposure Study, including details about the laboratory analytical methods, are available online.11

We calculated separate intensity scores for Farm Family Exposure Study farmers based on field observers’ reports and self-reports of application practices and use of personal protective equipment (PPE) according to the algorithm proposed by Dosemeci and colleagues5:

Intensity score = (apply + mix + repair) × PPE

As recommended by these authors, we used the following weights: for “apply,” 2 for in-furrow applications, 3 for boom spray applications, and 9 for hand spray applications or row fumigation; for “mix,” 9 for those who mixed pesticides and zero for those who did not mix; for “repair,” 2 for those who repaired their equipment and zero for those who did not repair; and for personal protective equipment, one for those who did not use PPE, 0.6 for those who used chemical resistant gloves only, and lesser values to a minimum of 0.1 for those who used additional personal protective equipment.
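
As a rough illustration, the score can be computed as below. The weights for "apply," "mix," and "repair" are the ones listed above; the dictionary keys and function name are hypothetical. Because the text gives only the endpoints of the PPE scale (1.0 for no PPE, 0.6 for chemical-resistant gloves only, down to 0.1), the PPE factor is passed in directly rather than looked up.

```python
# Weights as stated in the text; key names are illustrative only.
APPLY_WEIGHTS = {
    "in_furrow": 2,
    "boom_spray": 3,
    "hand_spray": 9,
    "row_fumigation": 9,
}
MIX_WEIGHT = 9      # applied when the farmer mixed the pesticide
REPAIR_WEIGHT = 2   # applied when the farmer repaired the equipment


def intensity_score(apply_method, mixed, repaired, ppe_factor):
    """Intensity score = (apply + mix + repair) x PPE."""
    if not 0.1 <= ppe_factor <= 1.0:
        raise ValueError("PPE factor must lie between 0.1 and 1.0")
    apply_weight = APPLY_WEIGHTS[apply_method]
    mix_weight = MIX_WEIGHT if mixed else 0
    repair_weight = REPAIR_WEIGHT if repaired else 0
    return (apply_weight + mix_weight + repair_weight) * ppe_factor


# Boom spray, mixed personally, no repair, gloves only (PPE factor 0.6).
print(round(intensity_score("boom_spray", True, False, 0.6), 2))  # 7.2
```

In this example, gloves reduce the score from 12 to 7.2 regardless of the pesticide or formulation applied.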

We compared the predicted exposure intensity scores to farmers’ 24-hour composite urinary pesticide concentrations (in ppb). For this article, we focus on the day of application for glyphosate and the day after application for 2,4-D and chlorpyrifos because those were the days that applicator urine concentrations peaked for the respective chemicals.9,12 We also compared predicted exposure scores with integrated measures of pesticide exposure—systemic dose12 and micrograms of pesticide excreted—both of which are estimators of pesticide dose commonly used in regulatory risk assessment evaluations and are not subject to variation due to the hydration status of study subjects. With a few exceptions, results were similar for analyses based on urine concentrations and integrated measures of dose, so we focus primarily on the former in this article.

The distributions of urinary concentrations for glyphosate, 2,4-D, and chlorpyrifos were highly skewed (skewness 4.6, 2.8, and 3.7, respectively),9 and so we calculated geometric means and medians as measures of central tendency. We calculated geometric means as the antilog of the average of the natural log-transformed urinary values.
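
The geometric mean calculation described above corresponds to the following Python sketch. The function name is hypothetical, and the example values are invented for illustration. The sketch requires strictly positive inputs; how values below the limit of detection were handled is not stated here, so any substitution for nondetects is left to the caller.

```python
import math


def geometric_mean(values):
    """Antilog of the mean of the natural-log-transformed values."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive; substitute nondetects first")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)


# Invented, skewed example (ppb): the arithmetic mean is 37, the geometric mean 10.
print(round(geometric_mean([1.0, 10.0, 100.0]), 2))  # 10.0
```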

We compared intensity scores and nontransformed pesticide urine concentrations using several methods. First, we computed Spearman correlation coefficients and associated 95% confidence intervals (95% CIs). Second, we crossclassified individuals into categories by intensity score and urinary concentrations and then calculated the percent categorical agreement and weighted kappa13,14 statistics. We allocated participants as evenly as possible across quartiles, allowing for ties. However, for glyphosate, 19 farmers had the same measured urine concentration (lower than the 1 ppb limit of detection) and, based on input from observers, 25 individuals had the same intensity score, leading to an unbalanced 3 × 3 classification. Finally, we calculated geometric means, medians, and ranges for exposure-intensity categories and examined scatterplots of individual values across categories.
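
A minimal Python sketch of the first method follows. Spearman's coefficient is computed as the Pearson correlation of average ranks, which handles ties. The method used for the confidence intervals is not stated in the text; the Fisher z-transformation used here is an assumption, and all function names are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI via Fisher's z (an assumed method); needs n > 3."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


low, high = fisher_ci(0.45, 34)
print(round(low, 2), round(high, 2))
```

With many tied values, such as the glyphosate measurements below the limit of detection, average ranks keep the coefficient defined, although heavy ties limit how much ordering information the data carry.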

In Table 1, we detail Farm Family Exposure Study farmers’ urine concentrations for glyphosate, 2,4-D, and TCPy. Urinary concentrations varied by chemical, being lowest for glyphosate and highest for 2,4-D. The range of values was much broader for 2,4-D than for the other pesticides. Glyphosate was unique in that 19 of 48 (40%) farmers had urine concentrations that were less than the 1 ppb limit of detection, including 9 who treated between 100 and 439 acres. A cumulative probability plot of urinary concentrations by chemical (Fig. 1) indicates that the distributions vary by chemical. For 2,4-D and chlorpyrifos, values up to the 40th percentile were essentially identical. Use of chemical-resistant gloves was associated with a large relative reduction in urine concentration for glyphosate (geometric means 10 ppb without gloves vs 2 ppb with gloves) and 2,4-D (geometric means 206 ppb vs 39 ppb), but glove use was not a predictor of urinary TCPy values (geometric means 18 ppb vs 19 ppb).





Table 2 gives Spearman correlations and 95% CIs for urinary concentrations and estimated intensity scores. There were moderate correlations for glyphosate and 2,4-D when intensity scores were based on field observers’ assessments, but correlations were much reduced when intensity scores were based on participants’ self-reports. Overall, the differences in results based on observers’ versus participating farmers’ inputs were related to participants reporting a higher frequency of repairs and use of personal protective equipment. This augmented reporting of protective gear involved equipment other than chemical resistant gloves (eg, goggles, face shields, disposable outer clothing). The correlation between glyphosate and predicted intensity score based on self-reports essentially disappeared (r = 0.04) when systemic dose was considered instead of urine concentration.



For chlorpyrifos, correlations between intensity scores and urinary TCPy values were negligible based on observations and self-reports (Table 2). We stratified the correlation analysis into liquid (n = 24) and granular (n = 10) formulations and found contrasting correlations (for liquid, r = 0.42 [CI = 0.01 to 0.70] and for granular, r = −0.44 [CI = −0.83 to 0.29]). The results were identical based on observers’ or farmers’ assessments. Geometric mean TCPy urine concentrations were much lower for those who applied granular (10 ppb) versus liquid (24 ppb) formulations.

A common practice in studies of dichotomous outcomes such as cancer is to conduct categorical dose–response analyses.7,8,15 We examined categorical agreement between intensity scores and urinary values by calculating percent agreement and weighted kappa statistics. Distributions of urine concentrations by intensity category are displayed in Figure 2.
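
Both agreement measures can be computed from a square cross-classification table, as in the Python sketch below. The text does not say which disagreement weights were used for the weighted kappa; linear weights are assumed here, and the function names and example table are hypothetical.

```python
def percent_agreement(table):
    """Percentage of individuals on the diagonal of a square k x k table."""
    total = sum(sum(row) for row in table)
    agree = sum(table[i][i] for i in range(len(table)))
    return 100.0 * agree / total


def weighted_kappa(table):
    """Weighted kappa with linear weights (an assumed weighting scheme).

    Rows index intensity-score categories and columns index urinary
    concentration categories; requires at least 2 categories.
    """
    k = len(table)
    total = sum(sum(row) for row in table)
    row_p = [sum(row) / total for row in table]
    col_p = [sum(table[i][j] for i in range(k)) / total for j in range(k)]
    observed = 0.0
    expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = 1 - abs(i - j) / (k - 1)  # 1 on the diagonal, 0 at the corners
            observed += weight * table[i][j] / total
            expected += weight * row_p[i] * col_p[j]
    return (observed - expected) / (1 - expected)


# Invented 3 x 3 table of counts, for illustration only.
example = [[5, 3, 2], [3, 4, 3], [2, 3, 5]]
print(round(percent_agreement(example), 1), round(weighted_kappa(example), 2))
```

Unlike percent agreement, weighted kappa gives partial credit to near misses and subtracts the agreement expected by chance from the marginal totals.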



Percent categorical agreement across chemicals ranged from 20% to 44% (Table 3). The range of agreement values was similar regardless of whether intensity scores were based on field observations or farmers’ self-reports. For glyphosate, however, percent agreement was appreciably lower when intensity scores were based on self-reports. Weighted kappa statistics paralleled the findings for percent agreement for 2,4-D and glyphosate (based on observers’ assessments only), but reflected little agreement beyond chance for TCPy and farmers’ self-reports for glyphosate.



Table 4 presents results for exposure-intensity categories. Geometric mean or median exposures were appreciably higher in the highest-intensity category for all 3 chemicals. However, the pattern was irregular for the lower-intensity categories. In addition, as illustrated in Figure 2, there is substantial overlap of individuals’ urine concentrations across intensity categories for intensity scores that were based on observers’ inputs. The overlap was more marked when intensity scores were based on self-reports (plots available on request from the authors).



Retrospective exposure assessment for pesticides requires that farmers report accurately whether they used specific pesticides and, if so, the duration and frequency of use. Blair and Zahm2 have argued that farmers can report accurately whether they have used specific pesticides, and that seems likely for pesticides that have been used over many years. It is less certain how well farmers can report duration and frequency of use for specific pesticides. As Blair et al2 reported, categorical agreement on duration and frequency of use was as low as 50% for specific pesticides on repeat questionnaires 1 year apart. More research is needed in this area because duration and frequency of use are key components of cumulative exposure.

Another important component of cumulative exposure is the average exposure intensity. Dosemeci and colleagues5 have proposed a generic algorithm for estimating exposure intensity that is based on use of personal protective equipment, type of application equipment, and whether the applicator personally handled the pesticide or made equipment repairs. The clearest finding from our analyses is that exposure prediction differs fundamentally between granular and liquid formulations. This finding is supported by our analysis of the extensive passive dosimetry data available in the Pesticide Handlers Exposure Database,6 which shows that exposure to the hands is much lower with granular than with liquid formulations. Accordingly, exposure prediction algorithms need to be parameterized differently for granular and liquid formulations with respect to glove use and perhaps other factors.

We found moderate correlations between predicted exposure intensity and urinary concentration for liquid formulations of glyphosate, 2,4-D, and chlorpyrifos when assessment of application conditions and PPE were based on field observations. Correlations were lower when study participants provided information about application conditions. Categorical analyses demonstrated considerable overlap across exposure score categories (Fig. 2) instead of the intended pattern of discretely increasing individual urine concentrations across categories.

We used data from a single pesticide application to evaluate an algorithm that is intended to assess individuals’ lifetime average exposure intensity over many applications. However, intraindividual variation may be substantial.16 The logical next step is to evaluate average exposure algorithms using repeated biomonitoring assessments of farmers.

Our evaluation focused on farmers in a limited number of pesticide application scenarios. However, tractor and boom application is a very common scenario,17 especially for large (>10 acres) U.S. agricultural pesticide applications. Careful evaluation and validation of exposure algorithms in other pesticide use situations is advisable.

Our method of categorizing subjects into equal tertiles or quartiles can produce misleading results, especially when data are highly skewed.18 Biomonitoring studies like the Farm Family Exposure Study are too resource-intensive to include enough participants to explore this possibility rigorously. Our results, however, are internally consistent in that the results of the categorical-agreement analyses were supported by the results of the nonparametric correlation analyses and by inspection of plots of exposure intensity scores and biomonitoring values. The fact that pesticide biomonitoring data almost always show markedly skewed distributions would suggest that the practice of categorizing subjects into evenly allocated pesticide intensity quartiles or quintiles deserves reconsideration.

A notable finding from our research is that trained observers and study participants reported appreciably different frequencies of equipment repair and use of personal protective equipment for the same applications. This suggests the need for research on the accuracy of participant reports of protective equipment use and application conditions. It may be that participants report some factors more accurately than observers or that observers missed certain practices during or after the applications. Nonetheless, the correlations between intensity scores and biomonitoring values tended to be lower when based on participant reports of application practices, and retrospective epidemiologic studies of pesticides depend heavily on study participants to report that type of information.

Our continuing analyses of the Farm Family Exposure Study data suggest different predictors of dose for glyphosate, 2,4-D, and chlorpyrifos.12 Of interest in this regard is a recent article by Arbuckle and colleagues,19 who conducted a biomonitoring study of the phenoxy herbicides 2,4-D and 4-chloro-2-methylphenoxy acetic acid (MCPA) to assess the use of questionnaire information for predicting exposure. For MCPA, the type of pesticide formulation (amine salts vs potassium and sodium salts and esters) was a predictor of urinary concentration, but the same was not found for 2,4-D. Tank size was a positive predictor for MCPA urinary levels and a negative predictor for 2,4-D urinary levels. Accordingly, it is questionable whether a generic approach to retrospective exposure assessment is feasible or whether a chemical-specific approach is required. As Arbuckle and colleagues19(p414) concluded: “Our results confirm that farm pesticide applicators are not uniformly exposed to herbicides during a day of application and that the extent of their exposure may not be consistent across similar herbicides, let alone all pesticides.”

Retrospective questionnaire-based pesticide exposure assessment seems to be at an early stage of development. The evidence suggests that users of specific pesticides can be differentiated from nonusers, and it seems likely that very frequent users can be differentiated from infrequent users. However, there is no evidence that retrospective questionnaire information can be used to differentiate gradients of pesticide exposure. The often used exposure unit “day of pesticide use” is not a homogeneous entity, although it is routinely used as such. In our data, a day of use varied greatly in terms of the number of acres treated, the pounds of pesticides handled, and up to 3 orders of magnitude in terms of urine concentration. Accordingly, and given the uncertainty in questionnaire responses about yearly frequency of use and years of use,2 our results suggest that dose–response analyses based on estimated cumulative days of use would have substantial exposure misclassification.

The average exposure intensity algorithm proposed by Dosemeci and colleagues5 is an important start toward improving exposure assessment for epidemiologic studies. The ability to estimate average exposure intensity would provide a basis for improved dose–response analyses. However, this algorithm (and indeed any generic approach to exposure prediction that is based on passive dosimetry) is limited because it ignores important pesticide specific physical/chemical properties that can greatly influence dose such as dermal penetration and vapor pressure. Furthermore, basing algorithm parameters on passive dosimetry findings in the Pesticide Handlers Exposure Database or in the literature means the parameters will reflect geometric mean exposure tendencies, not individual results. Finally, observed or reported use of personal protective equipment does not mean that the equipment was used correctly, maintained properly, or used during all instances of potential contact with pesticides.

Given our findings and those of Arbuckle et al,19 it seems unlikely that a generic approach to exposure estimation will suffice for all pesticides. Consideration should be given to developing algorithms for classes of pesticides with similar physical–chemical properties, formulations, and application practices, and to validating those algorithms with biomonitoring data. In the interim, investigators should recognize the likelihood of substantial exposure misclassification in dose–response analyses that rely on unvalidated exposure metrics. The direction of the resulting bias can be difficult to determine. As Rothman and Greenland18 point out, in any given study, random fluctuations in errors can lead to bias away from the null even if the classification method satisfies all the conditions that guarantee bias toward the null. Accordingly, it would be worthwhile to use quantitative techniques20,21 to estimate the effect of this misclassification on risk estimates and related confidence intervals, assuming both nondifferential and some degree of differential misclassification.
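
One simple form of such a quantitative technique is to back-correct a 2 × 2 table for assumed values of classification sensitivity and specificity. The Python sketch below illustrates that standard correction only; it is not the specific procedure of either cited reference, the function names are hypothetical, and the counts and error rates are invented. Supplying different sensitivity and specificity values for cases and controls represents differential misclassification.

```python
def corrected_exposed(observed_exposed, total, sensitivity, specificity):
    """Back-calculate the true number exposed from the observed number."""
    denom = sensitivity + specificity - 1
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_exposed - (1 - specificity) * total) / denom


def corrected_odds_ratio(a, b, c, d, se_case, sp_case, se_ctrl, sp_ctrl):
    """Odds ratio after correcting exposure counts in cases and controls.

    a, b: observed exposed and unexposed cases
    c, d: observed exposed and unexposed controls
    """
    true_a = corrected_exposed(a, a + b, se_case, sp_case)
    true_c = corrected_exposed(c, c + d, se_ctrl, sp_ctrl)
    true_b = (a + b) - true_a
    true_d = (c + d) - true_c
    if min(true_a, true_b, true_c, true_d) <= 0:
        raise ValueError("assumed error rates are incompatible with the data")
    return (true_a * true_d) / (true_b * true_c)


# Invented counts; the same error rates in cases and controls (nondifferential).
observed_or = (45 * 70) / (55 * 30)
corrected_or = corrected_odds_ratio(45, 55, 30, 70, 0.9, 0.8, 0.9, 0.8)
print(round(observed_or, 2), round(corrected_or, 2))
```

Repeating the calculation over a range of plausible sensitivity and specificity values shows how strongly a reported risk estimate depends on the assumed quality of the exposure classification.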

We appreciate the cooperation of the volunteers who participated in the Farm Family Exposure Study. Mustafa Dosemeci provided advice on the implementation of his exposure intensity algorithm. Tim Lash provided helpful suggestions on an earlier version of the manuscript. Susan Riordan provided programming support.

1. Hoar SK, Blair A, Holmes FF, et al. Agricultural herbicide use and risk of lymphoma and soft tissue sarcoma. JAMA. 1986;256:1141–1147.
2. Blair A, Zahm SH. Patterns of pesticide use among farmers: implications for epidemiologic research. Epidemiology. 1993;4:55–62.
3. Acquavella JF, Doe J, Tomenson J, et al. Epidemiologic studies of occupational pesticide exposure and cancer: regulatory risk assessments and biologic plausibility. Ann Epidemiol. 2003;13:1–7.
4. Mandel JS, Alexander BH, Baker B, et al. Farm Family Exposure Study. Scand J Work Environ Health. 2005;31(suppl 1):98–104.
5. Dosemeci M, Alavanja M, Rowland AS, et al. A quantitative approach for estimating exposure to pesticides in the Agricultural Health Study. Ann Occup Hyg. 2002;46:245–260.
6. PHED Surrogate Exposure Guide: Estimations of Worker Exposure from Pesticide Handler Exposure Database, version 1.1. May 1997.
7. Alavanja MCR, Samanic C, Dosemeci M, et al. Use of agricultural pesticides and prostate cancer risk in the Agricultural Health Study cohort. Am J Epidemiol. 2003;157:800–814.
8. Lee WJ, Hoppin JA, Blair A, et al. Cancer incidence among pesticide applicators exposed to alachlor in the Agricultural Health Study. Am J Epidemiol. 2004;159:373–380.
9. Acquavella JF, Gustin C, Alexander BH, et al. Farm family biomonitoring studies: implications for epidemiologic studies of pesticides. Scand J Work Environ Health. 2005;31(suppl 1):105–109.
10. Baker BA, Alexander BH, Mandel JS, et al. Farm Family Exposure Study: methods and recruitment practices for a biomonitoring study of pesticide exposure. J Expo Anal Environ Epidemiol. Published online 18 May 2005; DOI: 10.1038/sj.jea.7500427.
11. Available at: Accessed September 16, 2005.
12. Acquavella JF, Alexander BH, Mandel JS, et al. Glyphosate biomonitoring for farmer-applicators and their families: results from the Farm Family Exposure Study. Environ Health Perspect. 2004;112:321–326.
13. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull. 1968;70:213–220.
14. Thompson WD, Walter SD. A reappraisal of the kappa coefficient. J Clin Epidemiol. 1988;41:949–958.
15. Checkoway H, Pearce N, Kriebel D. Research Methods in Occupational Epidemiology. Oxford University Press; 2004:280–281.
16. Kromhout H, Heederik D. Effects of errors in measurement of agricultural exposures. Scand J Work Environ Health. 2005;31(suppl 1):33–39.
17. Alavanja MCR, Sandler DP, McDonnell CJ, et al. Characteristics of pesticide use in a pesticide applicator cohort: the Agricultural Health Study. Environ Res. 1999;80:172–179.
18. Rothman KJ, Greenland S. Modern Epidemiology. Philadelphia: Lippincott Williams & Wilkins; 1998:130.
19. Arbuckle TE, Burnett R, Cole D, et al. Predictors of herbicide exposure in farm applicators. Int Arch Occup Environ Health. 2002;75:406–414.
20. Phillips CV. Quantifying and reporting uncertainty from systematic errors. Epidemiology. 2003;14:459–466.
21. Lash TL, Fink AK. Semi-automated sensitivity analysis to assess systematic errors in observational data. Epidemiology. 2003;14:451–458.
© 2006 Lippincott Williams & Wilkins, Inc.