A Stand-Alone Windows Application for Computing Exact Person-Years, Standardized Mortality Ratios and Confidence Intervals in Epidemiological Studies
Taeger, Dirk; Sun, Yi; Keil, Ulrich; Straif, Kurt
From the Institute of Epidemiology and Social Medicine, University of Muenster, Germany.
Submitted December 13, 1999; final version accepted April 6, 2000.
Address correspondence to: Dirk Taeger, Institute of Epidemiology and Social Medicine, University of Muenster, Domagkstrasse 3, D-48129 Muenster, Germany.
We introduce a stand-alone, user-friendly person-years and mortality computation program (PAMCOMP), running under Windows® 95/98 and NT, for calculating exact person-years and standardized mortality and incidence ratios. The calculation of person-years allows flexible stratification by self-defined and unrestricted categories of age and calendar years. Furthermore, it is possible to lag person-years to account for latency periods. The standardized mortality ratio computation includes calculation of 90%, 95%, and 99% confidence intervals. Import and export filters for standard personal computer file formats are available. The software is free of charge and can be downloaded from: http://medweb.uni-muenster.de/institute/epi/pamcomp/pamcomp.html
Standardized mortality ratio (SMR) estimation is the most commonly used method of effect estimation in occupational cohort studies. 1 It is usually applied to compare mortality in specific industries with mortality in national or regional reference populations, especially when the numbers of observed and expected deaths are small or when detailed exposure data are unavailable. Furthermore, even if more sophisticated modeling is performed, the calculation of SMRs for major cause groups is often the first step of the analysis. Although the principle of calculating an SMR is simple, its implementation is not always straightforward. The main problem is the need to calculate the person-year distribution of the study cohort by age and calendar year. Several computer algorithms for calculating person-years have been published, 2–6 and the increasing number of algorithms in recent years may indicate a need for stable and ready-to-use person-years computer programs. Nevertheless, most of these published algorithms 3–6 were written in the SAS language, 7 and using them requires that the cohort data and the algorithm be adjusted to fit each other. This task may be difficult for researchers who are not familiar with the SAS language. Furthermore, some researchers may routinely use other statistical packages and, therefore, may not have an SAS license.
The person-years and mortality computation program (PAMCOMP) is a stand-alone, user-friendly, and flexible program for calculating person-year distributions and standardized mortality ratios in epidemiological research. Table 1 presents the basic variables needed to perform a person-years and SMR computation using PAMCOMP. Date variables indicate the date of birth, the entry point of study (EPS), ie, the individual start of follow-up, and the termination point of study, ie, the individual end of follow-up, which may be defined as censoring due to loss to follow-up, attainment of an upper age limit, death, or end of the observation period. These values may be different for each individual cohort member and must be provided in a four-digit year format to ensure year 2000 compliance.
For inception cohorts the entry point of study is usually equal to the date of hire (DOH) of the individual cohort member. Alternatively, according to the cohort definition, the entry point of study may also be 1 year (or any other specified period) after the date of hire of each cohort member. Specifying two variables, EPS and DOH, makes it possible to handle start of follow-up and date of hire independently, as may be necessary for specific time-related analyses of census cohorts. Optionally, a numeric variable can be used to stratify by sex. This variable can also be used for other non-time-related dichotomous variables such as race, nationality, or socioeconomic status. It is also possible to lag person-years from the date of hire to account for different latency periods, eg, if the considered outcome is cancer. Lagging is implemented such that a person's person-years at risk are counted only from x years after the date of hire. 8 The optional lag may be specified between 1 and 100 years.
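The lagging rule described above can be illustrated with a minimal Python sketch. It assumes one reading of the text, namely that follow-up effectively starts at the later of the EPS and the date of hire plus the lag. The function names lagged_follow_up and add_years are hypothetical and are not taken from PAMCOMP.

```python
from datetime import date


def add_years(d, years):
    """Return the anniversary of d after `years` years.

    Assumption: a Feb 29 date maps to Mar 1 in non-leap years.
    """
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return date(d.year + years, 3, 1)


def lagged_follow_up(eps, doh, end, lag_years):
    """Return the (start, end) of follow-up after applying a lag.

    eps: entry point of study, doh: date of hire, end: termination point.
    Assumed rule: person-time counts from max(eps, doh + lag).
    Returns None if no person-time remains after lagging.
    """
    start = max(eps, add_years(doh, lag_years))
    if start >= end:
        return None
    return start, end
```

Under this reading, a cohort member hired and entered on 1 January 1970 and followed to 1 January 1990 would contribute person-years only from 1 January 1980 onward with a 10-year lag, and none at all with a 25-year lag.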
The program will compute the exact person-year distribution of all cohort members, ie, person-days divided by 365.25, and report them in a matrix where rows represent the age categories and columns represent the calendar year groups.
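As an illustration of exact stratification, the following Python sketch splits one person's follow-up at every birthday that crosses an age-category boundary and at every calendar-period boundary, then credits person-days divided by 365.25 to the matching cell. It is a sketch under stated assumptions, not PAMCOMP's source code: the function names, the representation of categories by their lower bounds, the half-open follow-up interval (the end date itself is not counted), and the handling of Feb 29 birthdays are all assumptions made for this example.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date


def add_years(d, years):
    """Anniversary of d after `years` years (Feb 29 maps to Mar 1)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return date(d.year + years, 3, 1)


def person_years(birth, start, end, age_cuts, year_cuts):
    """Distribute follow-up [start, end) over (age_index, period_index) cells.

    age_cuts:  ascending lower bounds of age categories in whole years.
    year_cuts: ascending lower bounds of calendar periods (years).
    Categories may have unequal widths; the last one is open-ended.
    Time below the first bound of either axis is not counted.
    """
    # Every date at which the stratum can change, plus start and end.
    cuts = {start, end}
    cuts.update(add_years(birth, a) for a in age_cuts)
    cuts.update(date(y, 1, 1) for y in year_cuts)
    points = sorted(c for c in cuts if start <= c <= end)

    cells = defaultdict(float)
    for lo, hi in zip(points, points[1:]):
        # Completed years of age at the beginning of the segment.
        age = lo.year - birth.year - ((lo.month, lo.day) < (birth.month, birth.day))
        age_idx = bisect_right(age_cuts, age) - 1
        period_idx = bisect_right(year_cuts, lo.year) - 1
        if age_idx < 0 or period_idx < 0:
            continue
        cells[(age_idx, period_idx)] += (hi - lo).days / 365.25
    return dict(cells)
```

Summing the returned dictionaries over all cohort members gives the matrix described above, with the first index selecting the row (age category) and the second the column (calendar period).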
Once the person-year distribution is established, the SMR can be calculated by providing a file with the death rates for the respective age and calendar year categories. In addition, the ICD (International Classification of Diseases) variable is needed to indicate whether a cohort member died from a specific cause of interest. If incidence and incidence reference rates are substituted for mortality and mortality reference rates, respectively, the program will calculate standardized incidence ratios (SIR). The calculation of 90%, 95%, and 99% confidence intervals is based on approximations of the chi-square percentiles. 9,10 Furthermore, matrices of the distribution of deaths are provided.
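The SMR is the ratio of observed deaths to expected deaths, where the expected number is the sum over all age and calendar cells of person-years multiplied by the reference rate. The Python sketch below illustrates this together with one widely used interval of the kind mentioned above, Byar's approximation, which follows from the Wilson-Hilferty approximation of the chi-square percentiles. Whether PAMCOMP uses exactly this formula is not stated in the text, so it is an assumption here, as are the function names and the dictionary layout.

```python
from math import sqrt
from statistics import NormalDist


def expected_deaths(person_years, rates):
    """Sum person-years x reference rate over all (age, period) cells.

    Both arguments are dicts keyed by the same cell identifiers;
    rates are deaths per person-year.
    """
    return sum(py * rates[cell] for cell, py in person_years.items())


def smr_with_ci(observed, expected, level=0.95):
    """Return (SMR, lower, upper) using Byar's approximation.

    Assumption: this stands in for the chi-square approximation
    referred to in the text; it is not confirmed as PAMCOMP's method.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    smr = observed / expected
    if observed > 0:
        inner = 1 - 1 / (9 * observed) - z / (3 * sqrt(observed))
        lower = max(0.0, observed * inner ** 3) / expected
    else:
        lower = 0.0
    o1 = observed + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * sqrt(o1))) ** 3 / expected
    return smr, lower, upper
```

Calling smr_with_ci with level set to 0.90, 0.95, or 0.99 gives the three interval widths reported by the program. For very small observed counts, exact Poisson limits are generally preferable to any approximation.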
PAMCOMP is written in Visual Basic® 6.0 and Visual C++® 6.0 and provides a user-friendly interface. All files necessary to run the software are supplied; no additional run-time environment or software is required. It comes with documentation integrated as a Windows help file. Major strengths of PAMCOMP in comparison with previously published algorithms and programs 2–6,11 are that it runs under Windows® 95/98 and NT and supports ASCII, dBase®, Paradox®, MS-Excel®, and MS-ACCESS® file formats to import cohort and reference data and to export person-year data, death distributions, and SMR or SIR results. Stratification by age and calendar year is flexible and is not restricted to categories of equal interval length. A weakness of PAMCOMP in comparison with published algorithms and programs 3,4,6,11 is that time-related exposure data cannot be considered in the current version; this feature will be implemented in an updated version.
PAMCOMP is under further development and can be used free of charge. Exact confidence intervals for sparse data (fewer than 5 cases), weighted SMRs, and standardized rate ratios (SRR) are in preparation. For interested programmers the source code is also supplied. The program is available at http://medweb.uni-muenster.de/institute/epi/pamcomp/pamcomp.html
1. Callas PW, Pastides H, Hosmer DW. Survey of methods and statistical models used in the analysis of occupational cohort studies. Occup Environ Med 1994; 51: 649–655.
2. Monson RR. Analysis of relative survival and proportional mortality. Comput Biomed Res 1974; 7: 325–332.
3. Pearce N, Checkoway H. A simple computer program for generating person-time data in cohort studies involving time-related factors. Am J Epidemiol 1987; 125: 1085–1091.
4. Macaluso M. Exact stratification of person-years. Epidemiology 1992; 3: 441–448.
5. Sun J, Shibata E, Kamijima M, Toida M, Takeuchi Y. An efficient SAS program for exact stratification of person-years. Comput Biol Med 1996; 27: 49–53.
6. Wood J, Richardson D, Wing S. A simple program to create exact person-time data in cohort analyses. Int J Epidemiol 1997; 26: 395–399.
7. SAS Institute. SAS Language: Reference. Version 6. Cary, NC: SAS Institute, 1990.
8. Checkoway H, Pearce NE, Crawford-Brown DJ. Research Methods in Occupational Epidemiology. New York: Oxford University Press, Inc., 1989.
9. Breslow NE, Day NE. Statistical Methods in Cancer Research. vol. 2. The Design and Analysis of Cohort Studies. IARC Scientific Pub. No. 82. Lyon: International Agency for Research on Cancer, 1987.
10. Sahai H, Khurshid A. Statistics in Epidemiology. Methods, Techniques, and Applications. Boca Raton: CRC Press, 1996.
11. Steenland K, Beaumont J, Spaeth S, Brown D, Okun A, Jurcenko L, Ryan B, Phillips S, Roscoe R, Stayner L, Morris J. New developments in the Life Table Analysis System of the National Institute for Occupational Safety and Health. J Occup Med 1990; 32: 1091–1098.
epidemiologic methods; epidemiologic measurements; incidence; mortality; software; cohort studies; computation
© 2000 Lippincott Williams & Wilkins, Inc.