The Agricultural Health Study is a long-term prospective cohort study designed to evaluate cancer and other diseases among farmers and their families in relation to agricultural exposures and life-style factors. 1 Farmers in Iowa and North Carolina, as well as commercial applicators in Iowa, were asked to participate in the study when they sought pesticide licenses or training at county agricultural extension offices. Self-completed questionnaires were used to obtain information on agricultural exposures and other factors necessary to evaluate disease risks.
Although questionnaires have long been used by the Economic Research Service of the U.S. Department of Agriculture (USDA) to obtain information on agricultural practices from farmers, 2,3 and farmers can provide considerable detail about their pesticide practices, 4,5 few studies have evaluated the reliability of information obtained on agricultural practices in epidemiologic investigations. 6 We took advantage of a special situation in Iowa to assess the reporting consistency for agricultural and life-style factors on a sample of the cohort that completed two questionnaires approximately 1 year apart.
Subjects and Methods
Participants in the Agricultural Health Study enrolled by completing a self-administered questionnaire when they came to the county agricultural extension offices to seek pesticide certification and training. At the beginning of the study, applicators in Iowa were required to take an examination in which a passing mark gained them certification for 3 years. Enrollment and completion of the questionnaires occurred from 1994 through 1996. After initiation of the study, the Iowa legislature changed procedures regarding pesticide certification for private applicators by allowing annual training as an alternative to the examination. This change meant that some individuals who chose pesticide training and completed the enrollment questionnaire one year would return for training the following year, which provided us the opportunity to compare information from questionnaires completed 1 year apart. Individuals returning for training who had already enrolled in the Agricultural Health Study were given a questionnaire containing a subset of the questions originally asked on selected pesticide practices and life-style factors.
The abbreviated questionnaire was administered at the county agricultural extension office in the same fashion as the original enrollment version. 1 It was completed by 2,895 applicators (2,842 men and 53 women). In addition, a second group of 1,193 applicators who completed the enrollment questionnaire and returned for training the following year inadvertently completed the full enrollment questionnaire a second time. An analysis of the separate and combined data revealed that these two groups were similar with respect to age, gender, marital status, education, and other factors (data not shown). We combined these two groups for analysis, which resulted in 4,088 respondents (4,008 men and 80 women). Most of these were private applicators (3,829), but 259 were commercial applicators. We compared responses on the first and second questionnaires by calculating percentage exact agreement, percentage agreement within one category (for quantitative or ordinal categories), and the kappa statistic. 7 We calculated weighted kappas for multiple-response questions such as years or days per year of pesticide application.
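The agreement measures named above can be illustrated with a short sketch. This is not the authors' analysis code: the function name `agreement_summary`, the toy responses, and the choice of linear disagreement weights for the weighted kappa are assumptions made for illustration only (Cohen's weighted kappa allows other weighting schemes, and the text does not state which one was used).

```python
from typing import Callable, List, Sequence, Tuple


def agreement_summary(first: Sequence[int], second: Sequence[int],
                      n_categories: int) -> Tuple[float, float, float, float]:
    """Summarize test-retest agreement for paired ordinal codes 0..n_categories-1.

    Returns (exact agreement, agreement within one category,
    unweighted kappa, linearly weighted kappa).
    """
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("need two non-empty response lists of equal length")
    if n_categories < 2:
        raise ValueError("need at least two response categories")
    k, n = n_categories, len(first)

    # Cross-tabulate first-questionnaire codes (rows) against second (columns).
    table: List[List[int]] = [[0] * k for _ in range(k)]
    for a, b in zip(first, second):
        table[a][b] += 1

    # Cell proportions and marginal proportions.
    p = [[table[i][j] / n for j in range(k)] for i in range(k)]
    row = [sum(p[i]) for i in range(k)]
    col = [sum(p[i][j] for i in range(k)) for j in range(k)]

    exact = sum(p[i][i] for i in range(k))
    within_one = sum(p[i][j] for i in range(k) for j in range(k)
                     if abs(i - j) <= 1)

    def kappa(weight: Callable[[int, int], float]) -> float:
        # kappa = (observed - chance-expected) / (1 - chance-expected)
        observed = sum(weight(i, j) * p[i][j]
                       for i in range(k) for j in range(k))
        expected = sum(weight(i, j) * row[i] * col[j]
                       for i in range(k) for j in range(k))
        if expected == 1.0:
            return float("nan")  # undefined when everyone gives one response
        return (observed - expected) / (1.0 - expected)

    unweighted = kappa(lambda i, j: 1.0 if i == j else 0.0)
    # Assumed linear weights: full credit on the diagonal, none at the extremes.
    weighted = kappa(lambda i, j: 1.0 - abs(i - j) / (k - 1))
    return exact, within_one, unweighted, weighted


# Hypothetical paired responses on a three-level ordinal question.
first = [0, 0, 1, 1, 2, 2, 0, 1, 2, 2]
second = [0, 1, 1, 1, 2, 1, 0, 2, 2, 0]
exact, within_one, kappa_value, weighted_kappa = agreement_summary(first, second, 3)
print(f"exact={exact:.2f} within_one={within_one:.2f} "
      f"kappa={kappa_value:.2f} weighted_kappa={weighted_kappa:.2f}")
```

In this toy example the weighted kappa exceeds the unweighted kappa because most disagreements are off by only one category and therefore receive partial credit.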
Results
The reliability subgroup was very similar to the full cohort for various factors including age, gender, marital status, and farm size (data not shown).
Comparability of reported use of pesticides and method of application is shown in Table 1. Although agreement on ever mixed or applied any pesticide was high (95%), kappa was considerably lower (0.15) because the proportion of subjects who had never used pesticides was very small. The reliability of reported use for specific chemicals was high and quite consistent (from 79% for carbaryl to 88% for permethrin) with no apparent differences by pest class, chemical class, or prevalence of use. Kappas ranged from 0.48 to 0.71. For the method of pesticide application, agreement of responses from the two questionnaires ranged from 72% to 99%. The range of values for kappa was from 0.11 to 0.56.
We evaluated the number of paired responses falling above and below the diagonal of exact agreement, ie, individuals providing different responses on the two questionnaires. For ever mixed or applied specific pesticides, the numbers were typically equally distributed above and below the diagonal of agreement, with the possible exception of malathion, carbaryl, and dichlorodiphenyltrichloroethane, for which considerably more positive responses were found in the original than in the second questionnaire (either follow-up abbreviated questionnaire or duplicated enrollment questionnaire).
The numbers of pairs falling above and below the diagonal were also similar for method of pesticide application. Sign tests of distributions were not statistically significant for ever-use of pesticides or method of application. Thus, there was no obvious tendency to get “yes” or “no” responses from one or the other of the questionnaires.
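A sign test on the discordant pairs can be sketched as an exact two-sided binomial test with success probability 0.5. The function name `sign_test_p_value` and the counts in the usage lines are hypothetical; the paper does not report the discordant counts here, and the exact form of the test the authors ran is not stated.

```python
from math import comb


def sign_test_p_value(n_above: int, n_below: int) -> float:
    """Two-sided exact sign test for discordant pairs.

    n_above and n_below are the numbers of pairs falling above and below
    the diagonal of exact agreement. Under the null hypothesis of no
    directional tendency, each discordant pair is equally likely to fall
    on either side.
    """
    if n_above < 0 or n_below < 0:
        raise ValueError("counts must be non-negative")
    n = n_above + n_below
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of asymmetry
    smaller = min(n_above, n_below)
    # Probability of a split at least as lopsided in one direction.
    tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


# Hypothetical discordant counts, for illustration only.
balanced = sign_test_p_value(48, 52)
lopsided = sign_test_p_value(1, 9)
print(f"balanced split p={balanced:.3f}, lopsided split p={lopsided:.3f}")
```

A roughly even split of discordant pairs gives a large p-value, which is the pattern the text describes for ever-use of pesticides and method of application.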
For years and days per year of mixing or applying any pesticides, exact agreement percentages were 55% and 45%, respectively, and weighted kappas were 0.56 and 0.45, respectively (Table 2). For these two factors, 89% and 90% of the subjects, respectively, were within one category of agreement.
Exact agreement for the estimated percentage of time that subjects mixed or applied pesticides was 72% and 79%, respectively, with kappas of 0.39 and 0.47 (Table 2). Agreement within one category of exact agreement was 98% and 99%. In addition, exact agreement for years, days per year, and decade of use of specific pesticides was generally in the 50–70% range, which was lower than for dichotomous outcomes such as ever-/never-use (Table 1). Ninety percent of the subjects gave responses within one category of agreement on the two questionnaires. Kappas were between 0.37 and 0.63.
Responses regarding frequency of reported symptoms after using pesticides were similar in the two questionnaires, as shown in Table 3. Exact agreement ranged from 76% to 92% with kappas from 0.31 to 0.45. Agreement within one response category was nearly 100%.
We evaluated level of agreement by age, education, and farm size. For ever-handled any pesticide, the level of exact agreement was 94% or higher, with kappas ranging from 0.10 to 0.18. Exact agreement for ever-handled specific pesticides was between 80% and 90%, with kappas ranging from 0.42 to 0.73 (Table 4). Level of agreement for methods of application did not differ by age, amount of education, or farm size (data not shown).
We also compared responses for tobacco use and reported disease histories. Agreement was very high (over 90%) for smoking cigarettes and quite high for number of cigarettes per day (76%). Kappas were also high (0.71 for number of cigarettes per day and 0.87 for ever-smoked cigarettes). Percentage agreement for diseases the subjects reported for themselves and their relatives was more than 90%. Kappas were more variable, with values of 0.71 for asthma, 0.65 for pneumonia, 0.34 for kidney disease, and 0.10 for Parkinson’s disease. The kappa for Parkinson’s disease was low, despite high exact agreement, because relatively few persons reported this disease. Kappas for cancers among relatives were 0.64 for lung, 0.70 for breast, and 0.41 for the lymphatic and hematopoietic system.
Percentage agreement and kappas calculated for agricultural and life-style factors for commercial applicators separately were essentially identical to those of private applicators (data not shown).
About one-third of the study group completed the full enrollment questionnaire twice (N = 1,193). We used these data to compare reliability of responses to questions not included in the abbreviated follow-up questionnaire, including alcohol drinks per day (71% exact agreement; kappa = 0.63), vegetable servings per day (35% exact agreement; kappa = 0.43), and fruit servings per day (40% exact agreement; kappa = 0.49).
Discussion
The USDA has used questionnaires to assemble information on pesticide use by farmers for many years. 2,3 Use of interviews to obtain information on pesticide use and exposure in epidemiologic research is a more recent phenomenon. 4 There are differences between the approach used by the USDA and the approach required for epidemiologic research. The USDA typically obtains information on the past year. The long latency associated with most chronic disease, however, requires that epidemiologic studies obtain data on pesticide use from several years in the past, underscoring the need to evaluate the reliability and validity of information on pesticides obtained by interview. 8
In the present analyses, the time frame for questions on pesticide use and other factors covered the subject’s entire farming history of use up to the interview, so for many this required quite lengthy recall. Several general patterns were observed. First, agreement for self-reported smoking, selected diseases, and other factors in this population was consistent with other reports in the literature, ie, in the 90% range. 9–11,14,19–21 Second, the reported agreement for ever-/never-use of specific pesticides is also quite high, ie, mostly in the 70–90% range; these compare favorably with the reliability reported in other studies for factors such as smoking and alcohol use and are better than those reported for diet, physical activity, and health conditions. Third, the level of agreement on pesticide reporting decreased as the amount of detail sought increased, such as the number of years a person applied specific pesticides instead of ever-/never-use. This is similar to other factors such as tobacco use, in which agreement for the number of cigarettes smoked per day is lower than reporting of ever having smoked. Fourth, for pesticide factors, as well as for life-style factors and disease, the disagreements between the enrollment and follow-up interviews were symmetrical, ie, there was no general tendency for a higher prevalence of positive reporting in one or the other questionnaire administration. For example, in situations in which subjects reported at one interview that they used a particular pesticide but not at the other, the number of positive reports was about equivalent for the enrollment and follow-up interviews. This suggests that the additional year of farming experience before the completion of the follow-up questionnaire had little impact on the amount of disagreement. 
If dramatic changes occurred in the farming operation from one year to the next, we might have expected a disproportionate number of positive or negative responses to specific questions on the follow-up questionnaire. Still, some of the disagreements could be due to changes in pesticide application activities by study participants between the first and second questionnaires. Fifth, for questions with quantitative or ordinal responses, percentage agreement within one response category was quite high, typically 80% or higher. This is especially important for epidemiologic studies because responses are often grouped into a few categories. Sixth, percentage agreement did not differ by age, level of education, or farm size, which suggests a relatively consistent reliability of reporting among the various subgroups of the cohort. Finally, it is interesting that the values for number of cigarettes per day, years of pesticide use, and days per year of pesticide use from the abbreviated questionnaire were virtually identical to those on the original enrollment questionnaire, which indicates that these reliability results are applicable to the entire cohort.
The interview-reinterview kappas for pesticides generally ranged from 0.20 to 0.50. Although perfect agreement would generate a value of 1.0, much lower values can represent good agreement. This is because kappas are highly dependent on the prevalence of the characteristic in the population, as well as on the sensitivity and specificity of the measure. 12 Thompson and Walter 12 have shown that for factors with a true prevalence of 0.2 to 0.8 and sensitivities and specificities that are quite high (in the 70–90% range), kappas fall into the range 0.3–0.6. Most values we observed are in this range. The few kappas outside this range are for reporting on rare diseases or activities performed by nearly all applicators. Kappas can be quite low even when the level of exact agreement is high if the factor is very prevalent or extremely rare. This was the situation when we observed low kappas.
The dependence of kappa on response prevalence is most obvious for questions with dichotomous response options to which few subjects gave a particular response (eg, “Did you pour fumigants?” or “Have you been diagnosed with Parkinson’s disease?”) or questions to which almost all subjects gave a particular response (eg, “Have you ever personally handled pesticides?”). In such situations the percentage exact agreement will always be near 100% (ie, almost all subjects give the majority response on both questionnaires), but kappas may be quite low (eg, if the few minority responses tend to come from different subjects for the second questionnaire than for the first questionnaire).
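The dependence of kappa on prevalence can be made concrete with a simple model. The sketch below assumes a dichotomous factor with true prevalence `prevalence`, reported on two occasions with the same sensitivity and specificity, and with reporting errors independent across occasions given the true status. These assumptions, the function name `expected_agreement_and_kappa`, and the numerical values are illustrative; they are not the exact formulation of Thompson and Walter and were not part of the original analysis.

```python
from typing import Tuple


def expected_agreement_and_kappa(prevalence: float, sensitivity: float,
                                 specificity: float) -> Tuple[float, float]:
    """Expected exact agreement and kappa for two repeated yes/no reports.

    Assumes both reports share the same sensitivity and specificity and
    that their errors are independent given the true status.
    """
    for value in (prevalence, sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("all arguments must lie between 0 and 1")
    p, se, sp = prevalence, sensitivity, specificity

    both_yes = p * se ** 2 + (1 - p) * (1 - sp) ** 2
    both_no = p * (1 - se) ** 2 + (1 - p) * sp ** 2
    observed = both_yes + both_no

    # Probability of a "yes" report on either occasion.
    q = p * se + (1 - p) * (1 - sp)
    chance = q ** 2 + (1 - q) ** 2
    if chance == 1.0:
        return observed, float("nan")  # kappa undefined with no variation
    return observed, (observed - chance) / (1 - chance)


# Same reporting accuracy, three different (hypothetical) true prevalences.
for prev in (0.50, 0.20, 0.02):
    agreement, kappa = expected_agreement_and_kappa(prev, 0.9, 0.9)
    print(f"prevalence={prev:.2f} agreement={agreement:.2f} kappa={kappa:.2f}")
```

Under these assumptions, with equal sensitivity and specificity the expected exact agreement is the same at every prevalence, while kappa falls sharply as the factor becomes rare, which mirrors the pattern described for Parkinson’s disease and ever-handling of pesticides.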
Few reports are available on the reliability of reported pesticide use specifically among farmers. Van Der Gulden et al 13 found 82% agreement and a kappa of 0.55 for reported occupational exposure to pesticides in a reliability study in the Netherlands. Farrow et al 14 found kappas of about 0.29 for weed killers and 0.53 for pesticides/insecticides in general for women completing self-reported questionnaires before and after a miscarriage. These are somewhat lower than we found for specific pesticides (range of 0.48–0.70), but this might be expected because women in the miscarriage study were not from farms where pesticide use had a very important economic component, which might facilitate recall.
Several reports have compared reported pesticide use among farmers and surrogates, 4,5,15–18 and these provide a framework for considering results in this study. Agreement between farmers and pesticide suppliers regarding farmers’ use of pesticides was about 50–60%. 4 Agreement between farmers and surrogates (primarily wives) on reported use of specific pesticides was about 50–70%. 5,17 A comparison of self-assessed and expert-assessed exposure to pesticides and fertilizers found an agreement of 91% with a kappa of 0.53 in a case-control study from Montreal. 18 The level of agreement we observed from repeat interviews is generally better than that from comparisons between subjects and surrogates.
Although the reliability of reported pesticide use among Iowa farmers is as good as for many other factors assessed by questionnaires in epidemiologic research and better than for some variables, 9–11,14,19–21 it is important to assess effects of potential misclassification on estimates of relative risk. If the level of agreement between the first and second interviews is considered a measure of nondifferential exposure misclassification, we can calculate effects on relative risks. 22 For example, if the true relative risk was 4.0 and nondifferential misclassification for ever-/never-handled individual pesticides is as in Table 1 (from 79% to 88% agreement), the calculated relative risks would range from 2.0 to 2.6. If the true relative risk was 2.0, calculated relative risks for individual pesticides would be from 1.1 to 1.6. Even though the level of agreement is quite high, the impact of misclassification in this range on the relative risks can be substantial and diminish the opportunity to detect real associations. It is important to note that nondifferential misclassification, ie, misclassification that does not differ by presence or absence of disease, would only diminish estimates of relative risk for dichotomous classifications in a prospective investigation such as the Agricultural Health Study. It could, however, result in an increase or decrease in calculated relative risks in multiple response situations for the middle exposure categories, but not for the upper exposure category. In the upper exposure category, nondifferential misclassification would always diminish the relative risk. 23 Although these data suggest that pesticide use is reliably reported by farmers in this cohort, it is important to underscore that they do not provide information on the validity of these reports.
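The attenuation of relative risk under nondifferential misclassification of a dichotomous exposure can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the calculation behind the figures quoted above: the exposure prevalence, sensitivity, and specificity used here are hypothetical, and the text does not say how percentage agreement was translated into misclassification rates. The function name `observed_relative_risk` is also an assumption.

```python
def observed_relative_risk(true_rr: float, exposure_prevalence: float,
                           sensitivity: float, specificity: float) -> float:
    """Expected relative risk in a cohort after nondifferential
    misclassification of a yes/no exposure.

    Sensitivity and specificity describe exposure classification and are
    assumed not to depend on disease status. Baseline risk cancels out,
    so it is set to 1 for the unexposed.
    """
    pe, se, sp = exposure_prevalence, sensitivity, specificity

    # Composition of the group classified as exposed.
    true_exposed_in_exposed = pe * se
    true_unexposed_in_exposed = (1 - pe) * (1 - sp)
    # Composition of the group classified as unexposed.
    true_exposed_in_unexposed = pe * (1 - se)
    true_unexposed_in_unexposed = (1 - pe) * sp

    classified_exposed = true_exposed_in_exposed + true_unexposed_in_exposed
    classified_unexposed = true_exposed_in_unexposed + true_unexposed_in_unexposed
    if classified_exposed == 0 or classified_unexposed == 0:
        raise ValueError("one classified exposure group is empty")

    risk_exposed = (true_exposed_in_exposed * true_rr
                    + true_unexposed_in_exposed) / classified_exposed
    risk_unexposed = (true_exposed_in_unexposed * true_rr
                      + true_unexposed_in_unexposed) / classified_unexposed
    return risk_exposed / risk_unexposed


# Hypothetical scenario: half the cohort exposed, 90% sensitivity and specificity.
for rr in (4.0, 2.0):
    attenuated = observed_relative_risk(rr, 0.5, 0.9, 0.9)
    print(f"true RR={rr:.1f} expected observed RR={attenuated:.2f}")
```

Even with fairly accurate classification in this hypothetical scenario, the expected relative risk moves noticeably toward 1.0, consistent with the direction of bias described in the text for dichotomous exposures.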
In summary, agreement for self-reported use of pesticides by farmers is similar to that found for other factors routinely evaluated by questionnaire in epidemiology studies such as smoking and alcohol reporting, and better than others such as consumption of fruits and vegetables and physical activity. Because epidemiologic studies have successfully related disease risk to these factors, it seems likely that information on pesticide use from interviews can also be used successfully to address exposure-disease relationships.
References
1. Alavanja MCR, Sandler DP, McMaster SB, et al. The Agricultural Health Study. Environ Health Perspect 1996; 104: 362–369.
2. Blake HT, Andrilenas PA, Jenkins RP, Eichers TR, Fox AS. Farmers’ Pesticide Expenditures in 1966. Agricultural Economic Report No. 192. Washington DC: Economic Research Service, 1970.
3. Blake HT, Andrilenas PA. Farmers’ Use of Pesticides in 1971. Agricultural Economic Report No. 296. Washington DC: Economic Research Service, U.S. Department of Agriculture, 1975.
4. Blair A, Zahm SH. Patterns of pesticide use among farmers: implications for epidemiologic research. Epidemiology 1993; 4: 55–62.
5. Blair A, Stewart PA, Kross B, et al. Comparison of two techniques to obtain information on pesticide use from Iowa farmers by interview. J Agric Safety Health 1997; 3: 229–236.
6. Blair A, Zahm SH. Methodologic issues in exposure assessment for case-control studies of cancer and herbicides. Am J Ind Med 1990; 18: 285–293.
7. Cohen J. Weighted kappa: nominal scale agreement with provision for scaled disagreement or partial credit. Psychol Bull 1968; 70: 213–220.
8. Blair A, Zahm SH, Cantor KP, Stewart PA. Estimating exposure to pesticides in epidemiological studies of cancer. In: Wang RGM, Franklin CA, Honeycutt RC, Reinert JC, eds. Biologic Monitoring for Pesticide Exposure. ACS Symposium Series, No. 382. Washington DC: American Chemical Society, 1989: 38–46.
9. Persson P-G, Norell SE. Retrospective vs original information on cigarette smoking. Am J Epidemiol 1989; 130: 705–712.
10. Kelly JP, Rosenberg L, Kaufman DW, Shapiro S. Reliability of personal interview data in a hospital-based case-control study. Am J Epidemiol 1990; 131: 79–90.
11. Jain M, Howe GR, Rohan T. Dietary assessment in epidemiology: comparison of a food frequency and diet history questionnaire with a 7-day food record. Am J Epidemiol 1996; 143: 953–960.
12. Thompson WD, Walter SD. A reappraisal of the kappa coefficient. J Clin Epidemiol 1988; 41: 949–958.
13. Van Der Gulden JWJ, Jansen IW, Verbeek ALM, Kolk JJ. Repeatability of self-reported data on occupational exposure to particular compounds. Int J Epidemiol 1993; 22: 284–287.
14. Farrow A, Farrow SC, Little R, Golding J, ALSPAC Study Team. The repeatability of self-reported exposure after miscarriage. Int J Epidemiol 1996; 25: 797–806.
15. Brown LM, Dosemeci M, Blair A, Burmeister L. Comparability of data obtained from farmers and surrogate respondents on use of agricultural pesticides. Am J Epidemiol 1991; 134: 348–355.
16. Boyle CA, Brann EA, and the Selected Cancers Cooperative Study Group. Proxy respondents and the validity of occupational and other exposure data. Am J Epidemiol 1992; 136: 712–721.
17. Johnson RA, Mandel JS, Gibson RW, et al. Data on prior pesticide use collected for self and proxy respondents. Epidemiology 1993; 4: 157–164.
18. Fritschi L, Siemiatycki J, Richardson L. Self-assessed vs expert-assessed occupational exposures. Am J Epidemiol 1996; 144: 521–527.
19. Herrmann N. Retrospective information from questionnaires. II. Intrarater reliability and comparison of questionnaire types. Am J Epidemiol 1985; 121: 948–953.
20. Blair SN, Dowda M, Pate RR, et al. Reliability of long-term recall of participation in physical activity by middle-aged men and women. Am J Epidemiol 1991; 133: 266–275.
21. Hahn RA, Truman BI, Barker ND. Identifying ancestry: the reliability of ancestral identification in the United States by self, proxy, interviewer, and funeral director. Epidemiology 1996; 7: 75–80.
22. Copeland KT, Checkoway H, McMichael AJ, Holbrook RH. Bias due to misclassification in the estimation of relative risk. Am J Epidemiol 1977; 105: 488–495.
23. Dosemeci M, Wacholder S, Lubin JH. Does nondifferential misclassification of exposure always bias a true effect toward the null value? Am J Epidemiol 1990; 132: 746–748.