
Job Titles and Work Areas as Surrogate Indicators of Occupational Exposure

Sun, Yi1 2; Taeger, Dirk1; Weiland, Stephan K.1 3; Keil, Ulrich1; Straif, Kurt1

doi: 10.1097/01.EDE.0000047890.35190.F3
Original Articles

Background.  Job titles or work areas are often used as surrogate indicators of exposure in occupational epidemiological studies. In this article, we assess the validity and comparability of commonly used surrogate indicators.

Methods.  We analyzed lung cancer mortality among a hypothetical and an actual cohort of rubber workers. Surrogate indicators of exposure were defined according to jobs in which workers were “only,” “ever,” “longest” or “last” employed, or in which they were employed at the “census” of the study. Occupational risks were estimated using standardized mortality ratios. Validity of surrogate indicators was assessed in the simulated data by comparison between estimated effects and the known underlying associations. Comparisons of surrogate indicators were conducted in both simulated and empirical data.

Results.  Use of the definition “only” as the surrogate indicator gave valid but imprecise results. For all other definitions, we observed a moderate overestimation of risks in no-risk or low-risk jobs and attenuation of underlying dose-response relationships, without substantial differences among the applied definitions.

Conclusions.  Our results demonstrate a limitation of using surrogate indicators of exposure in occupational epidemiological studies. However, they suggest that the inconsistencies of published study findings in the rubber industry are unlikely to be attributable to the use of different surrogate indicators.

From the 1Institute of Epidemiology and Social Medicine, University of Münster, Münster;

2BG-Institute of Occupational Safety and Health-BIA, Sankt Augustin;

and 3Department of Epidemiology, University of Ulm, Germany.

Address correspondence to: Dirk Taeger, Institute of Epidemiology and Social Medicine, University of Münster, Domagkstr. 3, D-48129 Münster, Germany;

This study was supported by Berufsgenossenschaft der Chemischen Industrie, Heidelberg, Germany and Wirtschaftsverband der Deutschen Kautschukindustrie, Frankfurt, Germany.

Submitted 22 February 2002; final version accepted 5 November 2002.

There is an excess of certain cancers among specific occupational groups or industries (eg, the rubber industry and the boot and shoe manufacturing industry). 1,2 Their work environments include exposure to agents (eg, aromatic amines, nitrosamines, polycyclic aromatic hydrocarbons, solvents and asbestos) shown to be carcinogens. However, lack of reliable exposure information is often a problem when studying potential carcinogenic exposures in the work environment.

In the rubber industry, for example, cancer risks have been assessed in a large number of cohort and case-control studies in Europe, the United States and Asia. 1,3 However, in most published studies, exposure assessment was limited mainly to job titles or work areas. Therefore, it has generally not been possible to identify the specific carcinogens responsible for the observed excess of malignant neoplasms (eg, lung or stomach cancer). The only established causal associations between chemicals used in the rubber industry and certain cancers (such as β-naphthylamine and bladder cancer, and benzene and leukemia) were obtained primarily from studies that were not conducted in the rubber industry. 4–8

For other cancer sites, analysis by job title or work area in the rubber industry often showed inconsistent results. 1,3,9,10 This may partly be attributable to different types or levels of exposures to chemical agents across time, plants or countries. Confounding factors or chance findings because of small numbers may have also played a role. Moreover, the use of different definitions of job titles or work areas across epidemiological studies could have resulted in the observed inconsistencies. Job titles or work areas were often used as surrogate indicators of occupational exposures, and they were frequently defined as a work area in which workers have “ever” been employed for at least 12 months. 9–12 However, in some studies, the “last job title,”13 the work area in which workers have been employed “at census” of the study (at the beginning of follow-up) 14,15 or the work area in which workers have been employed the “longest”16,17 were used. These different definitions of workplace exposures may influence both the validity and comparability of results among studies.

To assess the validity and comparability of various definitions of these surrogate indicators of exposures, we performed analyses on both hypothetical data and empirical data from a cohort of rubber workers. 9,10

Methods


Simulation of the Hypothetical Cohort

We assumed that a cohort study was used to assess the workplace-related cancer risks in three work areas (work areas I, II and III) of an industry. Work area I carries the highest occupational risk, work area II has an intermediate risk and work area III has no risk. A hypothetical exposure-response relationship of cancer risks in the three work areas is shown in Table 1.



To produce simulated data for analysis, we needed to make assumptions about the sample size, the follow-up time, the work history, the vital status and the cause of death of each cohort member. We simulated a cohort of 30,000 newly hired workers, each with a follow-up time of 30 years. The work history of each cohort member can be divided into three parts because there are at most three work areas in which a cohort member may have been employed. We simulated the work history of each worker as three random employment periods (as shown in Fig. 1), with durations independently and uniformly distributed from 1 to 10 years. Thus, the total years of employment (YOE) of each cohort member range from 3 to 30 years. We assumed that, in each employment period, each worker has the same probability (1/3) of being employed in work area I, II or III. Under this assumption, only about 11% of workers (N = 3,331) never changed jobs during the whole employment period (ie, they were employed in the same work area in all three employment periods).
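The work-history assumptions above can be illustrated with a minimal sketch. It assumes whole-year durations (the text says only "uniformly distributed from 1 to 10 years"), and the function names and random seed are hypothetical choices for this illustration, not the authors' implementation.

```python
import random

AREAS = ("I", "II", "III")

def simulate_work_history(rng, probs=(1/3, 1/3, 1/3)):
    """Return three (work_area, years) employment periods for one worker."""
    history = []
    for _ in range(3):
        area = rng.choices(AREAS, weights=probs, k=1)[0]
        years = rng.randint(1, 10)  # assumption: whole years, 1 to 10 inclusive
        history.append((area, years))
    return history

def never_changed(history):
    """True if all three periods were spent in the same work area."""
    return len({area for area, _ in history}) == 1

rng = random.Random(2003)  # arbitrary seed, chosen only for reproducibility
cohort = [simulate_work_history(rng) for _ in range(30_000)]

share_same_area = sum(never_changed(h) for h in cohort) / len(cohort)
total_yoe = [sum(years for _, years in h) for h in cohort]

print(f"share who never changed work area: {share_same_area:.3f}")
print(f"total YOE range: {min(total_yoe)} to {max(total_yoe)}")
```

With three independent periods and equal probabilities, the theoretical share of workers who stay in one area is 3 × (1/3)^3 = 1/9, which matches the roughly 11% reported in the text.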



To simulate the vital status of the hypothetical cohort, we assumed that the cancer risk of each cohort member was determined only by the work area with the highest occupational risk. That is, we assumed neither an additive nor a multiplicative effect among the exposures in different work areas. For example, if a worker has ever been employed in both work areas I and II, the occupational risk of this person is determined only by work area I. Using the age-standardized mortality rate of lung cancer in Germany in 1990 (48.5 per 100,000 person years) 18 as a reference, we calculated the probability of lung cancer death for each cohort member according to the following equation: $P = t \cdot R_0 \cdot RR$, where $P$ = probability of lung cancer death of each cohort member; $t$ = years of follow-up; $R_0$ = mortality rate of lung cancer in the reference population; and $RR$ = assumed relative risk in the work area with the highest occupational risk. On the basis of the calculated probability of lung cancer death, the vital status of the study cohort was randomly simulated.
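As a rough sketch of this step, the probability of lung cancer death can be computed from the follow-up time, the reference rate and the relative risk of the highest-risk work area. The relative risks below are placeholders chosen only for illustration (Table 1 is not reproduced here), and the function names are hypothetical.

```python
import random

R0 = 48.5 / 100_000    # reference lung cancer mortality rate per person year (from the text)
FOLLOW_UP_YEARS = 30   # equal follow-up for every cohort member (from the text)

# Placeholder relative risks; NOT the values of Table 1.
ASSUMED_RR = {"I": 1.5, "II": 1.2, "III": 1.0}

def death_probability(areas_worked, rr=ASSUMED_RR, t=FOLLOW_UP_YEARS, r0=R0):
    """P = t * R0 * RR, with RR taken from the highest-risk area ever worked in."""
    highest_rr = max(rr[area] for area in areas_worked)
    return t * r0 * highest_rr

def simulate_death(areas_worked, rng):
    """Draw a lung cancer death (True/False) with probability P."""
    return rng.random() < death_probability(areas_worked)

rng = random.Random(1990)  # arbitrary seed
print(death_probability({"III"}))        # no-risk area only: t * R0
print(death_probability({"I", "III"}))   # work area I determines the risk
print(simulate_death({"I", "II"}, rng))
```

Because only the maximum relative risk enters the formula, a worker employed in areas I and III receives the same probability as a worker employed only in area I, which is the "no additive or multiplicative effect" assumption stated above.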


Empirical Data of a Cohort of Rubber Workers

Details of the study design and methods of the cohort study of rubber workers have been described previously. 19 Briefly, active or retired male German blue collar workers who had been employed for at least 1 year in any of five study plants were followed for mortality from 1 January 1981 until 31 December 1991. Vital status and cause of death were assessed for 99.7% and 96.8% of the study cohort, respectively.

To assess the workplace exposure, complete individual work histories of all cohort members were reconstructed using routinely documented and archived cost center codes (“Kostenstellen”). These codes allow assignment of employment periods to specific work areas. All cost center codes were classified into six work areas. 9


Data Analysis

We used standardized mortality ratios (SMRs) as the measure of effect in analyses of both the hypothetical and empirical data. We focused our analyses on the risk of lung cancer deaths because lung cancer has the highest cancer mortality rate and thus may be expected to provide the most precise risk estimate, especially for the empirical data. Surrogate indicators of occupational exposure were defined as work areas in which workers had “only,” “ever,” “longest” or “last” been employed or in which workers were employed at “census” of the study. As in a real-life situation, definitions “last” and “census” are made as if the individual work history of the cohort members was unknown. In such a situation, the total YOE are taken as the YOE of the “last” or “census” employed work area.
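The five surrogate definitions can be made concrete with a small sketch that classifies one work history. The data layout (a chronological list of work area and years), the treatment of the first period as the "census" job, and the tie-breaking rule for "longest" are assumptions made for this illustration.

```python
from collections import defaultdict

def classify(history):
    """Classify a chronological list of (work_area, years) employment periods."""
    years_by_area = defaultdict(int)
    for area, years in history:
        years_by_area[area] += years
    areas = set(years_by_area)
    return {
        # "only": defined solely for workers who never changed work area
        "only": next(iter(areas)) if len(areas) == 1 else None,
        # "ever": every work area with any employment
        "ever": areas,
        # "longest": area with the most cumulative years (ties: first area encountered)
        "longest": max(years_by_area, key=years_by_area.get),
        "last": history[-1][0],
        # assumption: the first period stands in for the job held at census
        "census": history[0][0],
        # with unknown work history, total YOE is attributed to the last/census area
        "yoe_last_or_census": sum(years_by_area.values()),
    }

example = [("III", 4), ("I", 2), ("II", 9)]
print(classify(example))
```

In this example the worker counts toward all three areas under "ever", toward area II under "longest" and "last", and toward area III under "census", with all 15 years attributed to the "last" or "census" area although only 9 and 4 years, respectively, were spent there.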

In the simulated data, the age-standardized mortality rate of lung cancer in Germany in 1990 (48.5 per 100,000 person years) was used as the reference for calculating the expected number of lung cancer deaths; this reference rate was also used for the simulation of the vital status of the study cohort. To achieve stable effect estimates in the hypothetical data, all computer simulations were conducted 500 times. The mean of the SMRs ($\overline{\mathrm{SMR}}$) derived from the 500 random simulations is used as the effect estimate. The measurement error of this effect estimate is presented as the mean squared error ($\mathrm{MSE} = \text{variance of estimate} + \text{bias}^2$). Because the first (“census”) and the last parts of the work history of each cohort member are simulated randomly, no difference is expected between the “last” and “census” work areas, aside from random error. The results of these two classifications are, therefore, presented together.
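The summary over repeated simulations can be sketched as follows. The sketch assumes the population variance across runs (the text does not say whether a population or sample variance was used), and the SMR values and true relative risk in the example are invented for illustration.

```python
from statistics import mean, pvariance

def summarize_simulations(smrs, true_rr):
    """Return (mean SMR, MSE) where MSE = variance of estimate + bias**2."""
    mean_smr = mean(smrs)
    variance = pvariance(smrs, mu=mean_smr)  # assumption: population variance across runs
    bias = mean_smr - true_rr
    return mean_smr, variance + bias ** 2

# Invented SMRs from four hypothetical simulation runs; the study used 500 runs.
smrs = [1.8, 2.0, 2.2, 2.4]
mean_smr, mse = summarize_simulations(smrs, true_rr=2.0)
print(f"mean SMR = {mean_smr:.2f}, MSE = {mse:.3f}")
```

With the population variance, this MSE equals the average squared distance between each simulated SMR and the true relative risk, so it captures both the imprecision and the systematic error of a surrogate definition.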

Sensitivity analysis was conducted by using alternative employment probabilities of the cohort members to examine the stability of the results of this simulation study. Because the probability of employment in high-risk jobs may generally be lower than in lower-risk jobs, we assumed alternatively that the probability of employment in high- (work area I), intermediate- (work area II) and no- (work area III) risk jobs is 15%, 25% and 60%, respectively.
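One consequence of the alternative employment probabilities can be worked out directly: with three independent periods, the probability of staying in one work area is the sum of the cubed area probabilities. The sketch below is an illustration of that arithmetic under the stated assumptions, with hypothetical names.

```python
def share_never_changed(probs):
    """Probability that all three independent periods fall in the same work area."""
    return sum(p ** 3 for p in probs.values())

EQUAL = {"I": 1 / 3, "II": 1 / 3, "III": 1 / 3}
ALTERNATIVE = {"I": 0.15, "II": 0.25, "III": 0.60}
COHORT_SIZE = 30_000

for label, probs in (("equal", EQUAL), ("alternative", ALTERNATIVE)):
    share = share_never_changed(probs)
    print(f"{label}: share {share:.3f}, about {share * COHORT_SIZE:.0f} workers")
```

Under equal probabilities the expected share is 1/9, or about 3,333 of 30,000 workers, in line with the 3,331 observed in the simulated cohort. Under the alternative probabilities the share rises to 23.5%, so fewer workers carry additional exposures from other work areas.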

In the empirical data, the population of former West Germany was again used as the reference. All SMRs were standardized for calendar year and 5-year age groups and were lagged for 10 years to account for latency.
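For readers less familiar with the calculation, an SMR standardized for age group and calendar period can be sketched as observed deaths divided by the expected deaths summed over strata. The strata, person years and reference rates below are invented, and the 10-year lag used in the study is not implemented in this sketch.

```python
def expected_deaths(person_years, reference_rates):
    """Sum over strata of cohort person years times the reference mortality rate."""
    return sum(py * reference_rates[stratum] for stratum, py in person_years.items())

def smr(observed, person_years, reference_rates):
    """Standardized mortality ratio: observed deaths / expected deaths."""
    return observed / expected_deaths(person_years, reference_rates)

# Invented strata keyed by (5-year age group, calendar period).
person_years = {("50-54", "1981-1985"): 1000.0, ("55-59", "1981-1985"): 2000.0}
reference_rates = {("50-54", "1981-1985"): 0.001, ("55-59", "1981-1985"): 0.002}

print(smr(10, person_years, reference_rates))
```

In a lagged analysis, person years and deaths in the first 10 years after the start of exposure would be excluded or reassigned before this calculation; how that was done in the study is described only briefly in the text.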

Results


Simulation Study

Table 2 gives the results of the computer simulations, showing the mean SMR and the related measurement error (MSE) for each of the three work areas.



In comparison with the true RR, a valid assessment is found for all three work areas if the surrogate indicator of exposure is defined as “only” employed work area (Table 2). However, this definition can only be applied to persons who have never changed jobs. This limitation may, therefore, substantially reduce the precision of the estimates.

If the workplace exposures are defined as “ever,” “longest,” “last” or “census” employed work area, our analyses show a consistent overestimation of the cancer risk in work areas II and III and an attenuation of underlying dose-response relationships. The overestimation of the effect estimates (SMRs) in work areas II and III can easily be explained by the fact that many employees in work areas II or III had also been employed in work area I. Additional exposure in work area I will result in more cases than expected, even without any additive or multiplicative effect among risks in different work areas, because the exposure in work area I carries the highest occupational risk. In contrast, additional exposure (in work areas II or III) among persons who have been employed in work area I will not result in additional cases because of our conservative assumption of no additive or multiplicative effect between risks in different work areas. Therefore, no overestimation was observed in work area I. However, slight underestimation in work area I was observed among persons employed more than 15 years (YOE ≥ 15) if exposure was classified as “last” or “census” employed work area. This underestimation results from misclassification of exposure duration because total YOE are taken as the YOE of the “last” or “census” employed work area.
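The mechanism behind this overestimation can be reproduced with a small enumeration. Because every simulated worker has the same follow-up time and death probability t · R0 · RR, the expected SMR of a group equals the average of its members' highest-area relative risks. The sketch below applies this to the "ever" definition; the relative risks are placeholders (not the values of Table 1), duration of employment is ignored, and the function name is hypothetical.

```python
from itertools import product

# Placeholder relative risks; NOT the values of Table 1.
ASSUMED_RR = {"I": 1.5, "II": 1.2, "III": 1.0}

def expected_smr_ever(probs, rr=ASSUMED_RR):
    """Expected SMR per work area when exposure is defined as 'ever' employed."""
    weight = {area: 0.0 for area in rr}  # probability of ever working in the area
    risk = {area: 0.0 for area in rr}    # probability-weighted sum of highest RR
    for sequence in product(rr, repeat=3):  # all 27 three-period work histories
        p = 1.0
        for area in sequence:
            p *= probs[area]
        highest_rr = max(rr[area] for area in sequence)
        for area in set(sequence):
            weight[area] += p
            risk[area] += p * highest_rr
    return {area: risk[area] / weight[area] for area in rr}

equal = {"I": 1 / 3, "II": 1 / 3, "III": 1 / 3}
for area, value in expected_smr_ever(equal).items():
    print(f"ever in work area {area}: expected SMR {value:.2f} (assumed RR {ASSUMED_RR[area]})")
```

Under these placeholder values, work area I retains its assumed relative risk, whereas the expected SMRs of work areas II and III are pulled upward toward it, which flattens the gradient across the three areas in the way described above.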

The overestimation of effect estimates in low-risk jobs described above is based on the assumption that the probability of employment in high-, intermediate- and no-risk jobs is equal. A sensitivity analysis based on alternative employment probabilities (15%, 25% and 60% in high-, intermediate- and no-risk jobs, respectively) indicates that the results of this simulation study are fairly stable (Table 3). Although the magnitude of overestimation decreased slightly, the analysis using alternative employment probabilities still shows a considerable overestimation of effect estimates in work areas II and III. In comparison with other definitions, the total measurement error of the effect estimates (MSE) for the definition “only” is higher for higher-risk jobs but lower for low-risk jobs.



Other than for the definition “only,” we observed nearly the same magnitude of overestimation by using the various classifications of surrogate indicators, ie, “ever,” “longest,” “last” or “census” (Tables 2 and 3).


Empirical Study

The empirical cohort consisted of 11,663 blue collar workers (79,252 person years) who had been employed in a median number of two work areas (range: 1–6). Median employment in each work area was 13.7 years (range: 10 days–52.1 years). Because no additional exposure was involved, estimates derived from workers who had been employed in “only” one work area were expected to be more valid (as shown in the simulation study). There were 5,745 cohort members (48,300 person years) who had been employed in “only” one work area, with a median employment of 24.6 years (range: 1–51 years). Table 4 shows the results of observed lung cancer deaths and SMRs by employment categories.



In comparison with the estimates derived from persons who had been employed in “only” one work area, we observed consistently higher lung cancer risks by using the other definitions of surrogate indicators of exposure (“ever,” “longest,” “last” or “census”). The observed estimates were 10–40% higher than for the “only” criterion. One exception was found in the results for work area IV, which may be a result of imprecise estimates because of the small number of cases.

When the cancer risk in work area I was stratified by YOE (Table 5), an exposure-response relationship was observed only if the workplace exposure was defined on the basis of the “only” criterion. One reason for this may be that, under the other definitions, persons who had been employed in work area I for less than 10 years were more likely (95%) to have had additional exposures (employment in other work areas) than persons who had been employed in work area I for at least 10 years (52%). An effect caused by chance, confounding or effect modification, however, cannot be ruled out.



As in the simulated data, the analysis of the empirical data showed considerable agreement among the lung cancer mortality risks estimated with the various definitions of surrogate indicators, particularly when the sample size and the number of cases were large enough.

Discussion


Reliable and valid assessment of exposure is a key factor in occupational epidemiological studies. However, retrospective exposure assessment is often hampered because of limited exposure information. A common approach is to use surrogate indicators of exposure, such as job titles or work areas, to represent homogeneous exposure groups. 20,21 These surrogates of exposure were commonly used in the past and now are primarily used in hypothesis-generating studies, in industries in which occupational exposures are related mainly to one chemical agent (such as in the carbon black industry) or when there is no other choice. Limitations of this approach have been described previously. 22 In particular, the assumption of homogeneous exposure groups may not be warranted, especially in industries with highly variable exposures or broadly defined job titles. Furthermore, the exposure variability within groups may sometimes be too large for an effective classification of exposure. 22 Our analysis suggests that, even when exposure groups are homogeneous, some overestimation of risks may be introduced, especially for low-risk jobs or work areas. Overestimation of risks resulted from cross-contamination of various workplace exposures; this is analogous to cross-confounding of occupational exposure in various work areas and job titles that is introduced when investigators use various definitions of exposure.

Job-specific analysis is often used to identify jobs with elevated health risks. Therefore, it is of special interest to have the general population as the reference group in the analysis. SMRs are still one of the most commonly used analytic approaches for this purpose. Limitations of SMRs are well known and include their limited possibilities for modelling and for adjustment for potential confounders. Furthermore, if the age structure differs among analyses, SMRs may not always be comparable with each other because SMRs are standardized to the distribution of the exposed population. However, this problem may be relevant only if the age structures differ a great deal. In our empirical analysis, similar age structures of person years have been reported among workers employed in various work areas. 9 Furthermore, we found very similar results using SMRs or Cox proportional hazards models in the calculation of internally standardized rate ratios with work area III as the reference (results not shown). The use of Cox proportional hazards models may improve the dose-response relationship for the effect estimates by allowing adjustment for potential confounders. However, this model can be used only for internal comparisons and is less suitable for use in job-specific analyses.

Simulation studies need to be interpreted with caution. 23 In this article, the assumption of only three work areas in the simulation study covers a relatively complete situation, in which bias may be introduced if workers have additional exposures; if a worker changes jobs, he could either change to a job with a higher risk or with a lower risk or he could stay in the same job.

The simulated data offer a direct comparison between the estimates and the known true risks of the simulated data, which are not available in empirical studies. Thus, simulation studies may provide useful insight into the relevant mechanisms of the problem. In empirical studies, such problems are often obscured by other factors including chance, nonoccupational confounding, consideration of time periods and duration of exposure. To reduce the complexity of the simulation scenario, some assumptions may be arbitrary. For example, the assumption that a worker has three independent employment periods may not be realistic. In real-life situations, workers may be more likely to stay in the same job or change to a job with a similar activity. This leads to a lower proportion of workers who have never changed jobs in the simulation study (about 11%) than in the empirical study (about 50%). Thus, in real-life situations, the magnitude of overestimation may be expected to be slightly lower.

In the simulation study, we made the conservative assumption that only the work area with the highest risk contributes to risk. If we assume any additive or multiplicative effect among exposures in the different work areas, this will lead to a higher combined effect and thus even higher overestimation of the actual risk.

Although the use of simplified scenarios provides useful insight into the initial research problem, it may not demonstrate problems in a more complicated situation. For instance, in our simulation study, duration of exposure and the proportion of workers who had additional exposures (employment in other work areas) were treated as two independent variables. However, in practical situations, these two variables may be closely related. In the empirical study, workers who had been employed in work area I for less than 10 years were more likely to have had additional exposures (95%) than those who had been employed for 10 years or more (52%) (Table 5). Therefore, a potential exposure-response relationship between exposure duration in work area I and lung cancer risk is missed when using criteria other than “only.” However, there seems to be an exposure-response relationship if the workplace exposure is classified as the “only” employed work area (Table 5).

Our empirical analysis shows similar results for the four commonly used surrogate indicators of occupational exposures. Usually, the “last” or “census” job will be used as surrogate for all jobs only if individual work histories of the cohort members are not available. Previous studies have shown that the use of “last job” as surrogate for all jobs may lead to a misclassification of exposure of up to 51%. 24 Our analysis of the simulated data demonstrated that, in comparison with other classification criteria, the use of “last” job may lead not only to overestimation for low-risk jobs but also underestimation for high-risk jobs. The similarity of effect estimates derived from the various classifications in our empirical analysis may be a result of confounding, chance and low true occupational risks (usually RR < 2) being assessed in current occupational epidemiological studies.

The simulation produced valid estimates when workplace exposure was classified as the “only” employed work area. However, such a definition has rarely been used in published occupational studies because it can be applied only to persons who have never changed workplaces. Under real-life conditions, many workers move among jobs or departments. Limiting the analysis to workers who have not changed work areas may substantially decrease the power of the study. Our simulation study indicates that the use of the “only” criterion may increase the total measurement error of the effect estimates. Furthermore, workers who stay in one job might have different health risk profiles from workers who change positions more often. Thus, the definition “only” was used in our analysis to illustrate the main problem rather than as a general recommendation for future studies.

To solve the methodological problem addressed in this article, alternative analytic approaches should be explored in future studies. One possible alternative may be the use of Poisson regression analysis with external comparisons. 25 This approach allows both the use of external reference and adjustment for potential confounders in the analysis. However, strengths and limitations of using this model in occupational cohort studies need to be explored.
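As a rough illustration of what such a model looks like, the sketch below fits a Poisson regression in which the log of the externally derived expected deaths enters as an offset, so that exp(b0) is the SMR in the reference group and exp(b1) is the ratio of SMRs for a binary covariate. The Newton-Raphson fit, the single covariate and the data are assumptions made for this illustration; it is not the method of reference 25 in full, nor an analysis performed in this study.

```python
import math

def fit_poisson_offset(observed, expected, x, iterations=25):
    """Fit observed ~ Poisson(expected * exp(b0 + b1 * x)) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        mu = [e * math.exp(b0 + b1 * xi) for e, xi in zip(expected, x)]
        # Score vector (gradient of the log-likelihood)
        g0 = sum(o - m for o, m in zip(observed, mu))
        g1 = sum((o - m) * xi for o, m, xi in zip(observed, mu, x))
        # Information matrix (negative Hessian)
        h00 = sum(mu)
        h01 = sum(m * xi for m, xi in zip(mu, x))
        h11 = sum(m * xi * xi for m, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton step: beta += inverse(information) @ score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Invented data: two groups with expected deaths taken from external reference rates.
observed = [10, 30]
expected = [10.0, 15.0]
exposed = [0, 1]  # hypothetical binary covariate

b0, b1 = fit_poisson_offset(observed, expected, exposed)
print(f"SMR in reference group: {math.exp(b0):.2f}")
print(f"ratio of SMRs (exposed vs reference): {math.exp(b1):.2f}")
```

With one binary covariate the model is saturated, so the fitted values reproduce the group-specific SMRs; additional covariates (for example, potential confounders) would be added as further terms in the linear predictor.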

In summary, the commonly used surrogate indicators of exposure in occupational cohort studies can lead to an overestimation of risks, especially for low-risk jobs and work areas. This bias results from cross-contamination of various workplace exposures, which is analogous to investigator-introduced cross-confounding of various workplace exposures. This error may also attenuate dose-response relationships between occupational exposures and the risks under investigation. However, the use of different surrogate indicators of exposure is unlikely to explain the observed inconsistencies in published cancer risks for different job titles or work areas in the same industries.

References


1. IARC monographs on the carcinogenic risk of chemicals to humans. Vol 28. The Rubber Industry. Lyon, France: IARC, 1982.
2. IARC monographs on the carcinogenic risk of chemicals to humans. Vol 25. Wood, Leather and Some Associated Industries. Lyon: IARC, 1981.
3. Kogevinas M, Sala M, Boffetta P, et al. Cancer risk in the rubber industry: a review of the recent epidemiological evidence. Occup Environ Med 1998; 55: 1–12.
4. Case RAM, Hosker ME, McDonald DB, et al. Tumours of the urinary bladder in workmen engaged in the manufacture and use of certain dyestuff intermediates in the British chemical industry. I. The role of aniline, benzidine, alpha-naphthylamine and beta-naphthylamine. Brit J Industr Med 1954; 11: 75.
5. Ishimaru T, Okada H, Tomiyasu T, Tsuchimoto T, Hoshino T, Ichimaru M. Occupational factors in the epidemiology of leukemia in Hiroshima and Nagasaki. Am J Epidemiol 1971; 93: 157–165.
6. Thorpe JJ. Epidemiologic survey of leukemia in persons potentially exposed to benzene. J Occup Med 1974; 17: 5–6.
7. Vigliani EC. Leukemia associated with benzene exposure. Ann NY Acad Sci 1976; 271: 143–151.
8. Aksoy M. Leukemia in workers due to occupational exposure to benzene. New Istanbul Contrib Clin Sci 1977; 12: 3–14.
9. Weiland SK, Straif K, Chambless L, et al. Workplace risk factors for cancer in the German rubber industry. Part I: mortality from respiratory cancers. Occup Environ Med 1998; 55: 317–324.
10. Straif K, Weiland SK, Werner B, et al. Workplace risk factors for cancer in the German rubber industry. Part II: mortality from non-respiratory cancers. Occup Environ Med 1998; 55: 325–332.
11. Gustavsson P, Hogstedt C, Holmberg B. Mortality and incidence of cancer among Swedish rubber workers, 1952–1981. Scand J Work Environ Health 1986; 12: 538–544.
12. Negri E, Piolatto G, Pira E, et al. Cancer mortality in a northern Italian cohort of rubber workers. Br J Ind Med 1989; 46: 624–628.
13. Wang HW, You XJ, Qu YH, et al. Investigation of cancer epidemiology and study of carcinogenic agents in the Shanghai Rubber Industry. Cancer Res 1984; 44: 3101–3105.
14. Fox AJ, Collier PF. A survey of occupational cancer in the rubber and cablemaking industries: analysis of deaths occurring in 1972–1974. Br J Ind Med 1976; 33: 249–264.
15. Andjelkovich D, Taulbee J, Symons M, et al. Mortality of rubber workers with reference to work experience. J Occup Med 1977; 19: 397–405.
16. Baxter PJ, Werner JB. Mortality in the British Rubber Industries 1967–1976. London: Her Majesty’s Stationery Office, 1980.
17. Norseth T, Andersen A, Giltvedt J. Cancer incidence in the rubber industry in Norway. Scand J Work Environ Health 1983;suppl 2:69–71.
18. Segi M. Cancer Mortality for Selected Sites in 24 Countries (1950–1957). Sendai, Tohoku: University School of Public Health, 1960.
19. Weiland SK, Mundt KA, Keil U, et al. Cancer mortality among workers in the German rubber industry: 1981–1991. Occup Environ Med 1996; 53: 289–298.
20. Kauppinen T. Exposure assessment - a challenge for occupational epidemiology. Scand J Work Environ Health 1996; 22: 401–403.
21. Stewart PA, Lees PSJ, Francis M. Quantification of historical exposure in occupational cohort studies. Scand J Work Environ Health 1996; 22: 405–414.
22. Kromhout H, Heederik D. Occupational epidemiology in the rubber industry: implications of exposure variability. Am J Ind Med 1995; 27: 171–185.
23. Maldonado G, Greenland S. The importance of critically interpreting simulation studies. Epidemiology 1997; 8: 453–456.
24. Sim M, Fritschi L, Benke G, et al. Last job as surrogate for all jobs in retrospective studies [abstract]. Harare Zimbabwe: 12th International Symposium ISEOH 1997, 16–19 September 1997.
25. Breslow NE, Day NE. Statistical Methods in Cancer Research. Vol. 2. Lyon, France: IARC, 1987.

Keywords: cohort study; job titles; exposure assessment; validity; surrogate indicators of exposure

© 2003 Lippincott Williams & Wilkins, Inc.