Mikulski, Marek A. MD, PhD, MPH; Gerke, Alicia K. MD, MBA; Lourens, Spencer BS; Czeczok, Thomas BS; Sprince, Nancy L. MD, MPH; Laney, Anthony S. PhD; Fuortes, Laurence J. MD, MS
The current American College of Occupational and Environmental Medicine–recommended algorithm for interpretation of spirometry results in medical screening and surveillance programs characterizes values below the lower 5th percentile, that is, the lower limit of normal (LLN), as abnormal.1 In clinical settings, the fixed 70% value for the ratio of forced expiratory volume in the first second to forced vital capacity (FEV1/FVC%), combined with 80% of predicted (%Pred) values for FVC and FEV1 (Fixed-ratio), is still commonly used as the cutoff for defining and characterizing functional abnormalities.2–5 The fixed 70% ratio approach has been reported to overdiagnose airway obstruction, compared with LLN, especially in older populations.6–8 How the Fixed-ratio algorithm with %Pred values compares with LLN in classifying restrictive physiology pattern and mixed airways physiology has not been thoroughly studied.
The American Thoracic Society (ATS) recommends that equations based on the Third National Health and Nutrition Examination Survey (NHANES III) be used to calculate predicted (Pred) and LLN reference values for the US population.9 The NHANES III–derived values are based on individuals aged 8 through 80 years with few individuals at the extremes of age.10 Pulmonary physiology declines with age and longitudinal studies confirm a linear drop in FVC, FEV1, and FEV1/FVC% with acceleration of the rate of decline after the age of 70 years.11,12 Limited data are available on the effect of age on the comparison between the Fixed-ratio and LLN algorithms, especially in individuals older than 60 years.
This study compares the prevalence of obstruction, restrictive physiology pattern, and mixed spirometry abnormalities identified using the Fixed-ratio algorithm (fixed 70% cutoff for FEV1/FVC% with 80%Pred for FVC and FEV1) versus the LLN criterion. It also investigates the differences between these two algorithms in the prevalence and characterization of abnormalities in an older cohort.
A total of 2536 former nuclear weapons workers from two Department of Energy (DoE) sites in the Midwest were recruited by mail or phone to participate in medical screenings. Methods of participant identification and recruitment have been described in detail in previous manuscripts.13,14 A panel of screening tests including spirometry, chest radiograph, and blood draw was offered to participants every three to five years.
State driver's license, credit bureau records, and Internet searches were used to obtain workers' contact information. A modified ATS adult respiratory questionnaire15 was used to gather basic demographic, height, weight, and smoking histories. Information about the screenings was distributed through local media and the project's Web site. There were no restrictions on age, employment duration, health status, or residence that would prevent workers from participating in the program.
All study participants gave informed consent before spirometry testing. Spirometry was performed without bronchodilator by trained personnel according to ATS guidelines.16,17 Testing equipment was calibrated on a daily basis. An effort was made to obtain at least three acceptable and repeatable results, but no test was rejected on the basis of the lack of three results.18
The most recent spirometry result was selected from each participant for analysis. Predicted and LLN values for FVC, FEV1, and FEV1/FVC% were calculated using reference equations for the US population derived from NHANES III.10 For Asian Americans, a correction factor of 0.88 was used for LLN and predicted FVC and FEV1 values.19
Results were interpreted using the Fixed-ratio criteria (Table 1) 20,21 and the LLN algorithm (Table 2),1,9 respectively. For comparison purposes, the Fixed-ratio algorithm was modified to match the LLN method. Low FEV1/FVC% (<70% or <LLN), low FEV1 (<80% or <LLN), and normal FVC (≥80% or ≥LLN) were characterized as obstruction. Restrictive pattern physiology was identified as normal FEV1/FVC%, low FVC, and normal or low FEV1. All three metrics below the reference values were interpreted as possible mixed obstructive and restrictive pattern physiology. A low FEV1/FVC ratio, combined with a normal FEV1, was considered a variant of normal physiology under either protocol with recommendation for a follow-up evaluation for obstruction.
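The harmonized interpretation rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's software: the `Spirometry` container, the function names, and the example values are hypothetical, and the handling of a normal ratio with normal FVC but low FEV1 (a combination the text does not address) is an assumption flagged in the code.

```python
from dataclasses import dataclass


@dataclass
class Spirometry:
    """Measured and reference values for one test (hypothetical container)."""
    fev1: float       # measured FEV1, litres
    fvc: float        # measured FVC, litres
    fev1_pred: float  # predicted FEV1, litres
    fvc_pred: float   # predicted FVC, litres
    fev1_lln: float   # LLN for FEV1, litres
    fvc_lln: float    # LLN for FVC, litres
    ratio_lln: float  # LLN for FEV1/FVC, percent


def pattern(low_ratio: bool, low_fev1: bool, low_fvc: bool) -> str:
    """Map the three low/normal flags to the patterns described in the text."""
    if low_ratio:
        if not low_fev1:
            return "normal variant"  # low ratio with normal FEV1
        return "mixed" if low_fvc else "obstruction"
    if low_fvc:
        return "restrictive"  # normal ratio, low FVC, FEV1 normal or low
    # Assumption: normal ratio with normal FVC is reported as normal even
    # when FEV1 is low; the text does not cover this combination.
    return "normal"


def classify_fixed_ratio(s: Spirometry) -> str:
    """Fixed-ratio algorithm: 70% ratio cutoff, 80% of predicted for FEV1 and FVC."""
    ratio = 100.0 * s.fev1 / s.fvc
    return pattern(ratio < 70.0,
                   s.fev1 < 0.80 * s.fev1_pred,
                   s.fvc < 0.80 * s.fvc_pred)


def classify_lln(s: Spirometry) -> str:
    """LLN algorithm: every metric is compared with its lower limit of normal."""
    ratio = 100.0 * s.fev1 / s.fvc
    return pattern(ratio < s.ratio_lln, s.fev1 < s.fev1_lln, s.fvc < s.fvc_lln)


# Hypothetical values chosen only to illustrate a discordant result.
example = Spirometry(fev1=2.0, fvc=3.0, fev1_pred=2.6, fvc_pred=3.5,
                     fev1_lln=1.9, fvc_lln=2.7, ratio_lln=63.0)
print(classify_fixed_ratio(example), "|", classify_lln(example))
```

With these made-up inputs the ratio is about 66.7%, which falls below the fixed 70% cutoff but above the assumed LLN of 63%, so the same test is labeled obstruction by one algorithm and normal by the other.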
All spirometries in this study were performed within the framework of a medical surveillance program for former nuclear weapons workers from two sites in the Midwest that was mandated by the US Congress (section 3162 to Public Law 102-484).
Demographics and Predictors of Abnormal Spirometry
Participants' age was recorded as of the date of their most recent spirometry. Height and weight were self-reported twice: once in the questionnaire administered before screening and again at the time of the screening. Height reported at screening was compared with that in the questionnaire for any major discrepancies.
Body mass index (BMI) was calculated on the basis of the published formula.22–26 Smoking status was defined categorically as never, ex- and current smoker, and continuously as pack years.27,28
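The cited formulas are not reproduced in the text, so the following sketch assumes the standard definitions: BMI as weight in kilograms divided by height in metres squared (or the equivalent US-unit form with the conventional factor of 703), and pack years as packs smoked per day multiplied by years of smoking, with 20 cigarettes per pack. The function names are illustrative.

```python
def bmi_metric(weight_kg: float, height_m: float) -> float:
    """Body mass index from metric units (kg / m^2)."""
    return weight_kg / height_m ** 2


def bmi_us(weight_lb: float, height_in: float) -> float:
    """Body mass index from pounds and inches, using the conventional 703 factor."""
    return 703.0 * weight_lb / height_in ** 2


def pack_years(cigarettes_per_day: float, years_smoked: float) -> float:
    """Pack years, assuming 20 cigarettes per pack."""
    return cigarettes_per_day / 20.0 * years_smoked
```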
Statistical analyses were performed using SAS 9.2 software.29 Means, standard deviations, and ranges were computed for continuously distributed variables. The Wilcoxon rank-sum test was used to evaluate differences in medians of nonnormally distributed covariates between genders. Differences in gender distribution by age, height, weight, BMI, and pack years of smoking were tested using the Cochran–Armitage chi-square test, whereas the Pearson chi-square test was used for race.
Concordance between spirometry algorithms was evaluated for normal versus combined abnormal results with simple kappa statistics,30 and with weighted kappa statistics to accommodate multiple spirometry outcomes.31 Weighting was done according to Cicchetti and Allison's scheme32 with scores assigned as follows: 1.0 for normal results, 5.5 for obstructive airways physiology, 6.0 for restrictive impairment, and 6.5 for mixed results. Concordance was also assessed by age strata with tests of equivalence of kappa statistics between age categories as recommended by Schuirman.33 Category-specific agreement was calculated for abnormal spirometries using the generalized kappa statistic as proposed by Fleiss et al.34 All kappa values were interpreted according to Landis and Koch,35 with values between 0 and 0.20 interpreted as slight, 0.21 to 0.40 as fair, 0.41 to 0.60 as moderate, 0.61 to 0.80 as substantial, and 0.81 to 1.0 as almost perfect agreement.
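To make the weighting concrete, the sketch below builds Cicchetti–Allison agreement weights from the category scores given above and computes a weighted kappa from a square cross-classification table. It is a plain-Python illustration under the usual definition of these weights, w_ij = 1 - |C_i - C_j| / (C_max - C_min); the function names are hypothetical and this is not the SAS procedure used in the study.

```python
def cicchetti_allison_weights(scores):
    """Agreement weights: 1 on the diagonal, falling linearly with score distance."""
    span = max(scores) - min(scores)
    return [[1.0 - abs(a - b) / span for b in scores] for a in scores]


def weighted_kappa(table, weights):
    """Weighted kappa for a square table of counts (rows: method A, columns: method B)."""
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(table[i]) for i in range(k)]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = sum(weights[i][j] * table[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(weights[i][j] * row_tot[i] * col_tot[j]
                   for i in range(k) for j in range(k)) / (n * n)
    return (observed - expected) / (1.0 - expected)


def landis_koch(kappa):
    """Verbal label for a kappa value, using the bands quoted in the text."""
    if kappa < 0.0:
        return "poor"  # Landis and Koch's label for values below zero
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Scores from the text: normal, obstructive, restrictive, mixed.
scores = [1.0, 5.5, 6.0, 6.5]
weights = cicchetti_allison_weights(scores)
```

With these scores, a normal-versus-mixed disagreement receives a weight of 0, whereas an obstructive-versus-restrictive disagreement receives a weight of about 0.91, so disagreement among the abnormal categories is penalized far less than disagreement between normal and abnormal.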
Multivariable generalized logit and multinomial regression models were built using known predictors to assess the validity of each algorithm (Fixed-ratio separately from LLN) in predicting abnormal spirometry. Discordant spirometry results were identified between two algorithms and compared with concordant pairs through multivariable logistic regression modeling and controlling for age, gender, and BMI. All models were built using forward selection criteria. Akaike Information Criterion was used to compare the goodness of fit of the regression models.36 A P value of <0.05 was considered statistically significant.
Approval for this study was obtained from the University of Iowa institutional review board (ID 200008081 and 200509719) and the DoE Central Beryllium institutional review board (ID 209956).
Of the 2358 participants tested with spirometry, 39 (1.7%) were removed from statistical analyses because of missing questionnaire information on height (n = 27), discrepancy (>4 inches) between self-reported height and height used by technicians (n = 5), or unreliable spirometry readings (n = 7). The final number of participants with available test results was 2319.
Table 3 presents characteristics of the studied group by gender. Three of every four tested individuals were men and nearly 70% of subjects were 60 years and older, with 10.5% older than 80 years.
Of the 2319 test results, the Fixed-ratio algorithm identified 184 (7.9%) as normal variant physiology. Sixty-three of those (2.7%) were characterized as normal variant by the LLN criteria, with the remaining 121 (5.2%) labeled as normal spirometry under the LLN method. Conversely, application of the LLN protocol resulted in 91 (3.9%) normal variant results; 63 of those were concordant with the Fixed-ratio interpretation whereas the remaining 28 (1.2%) were split between obstructive (n = 21), normal (n = 6), and mixed (n = 1) results under the Fixed-ratio criteria. Finally, all normal variant spirometries were combined with normal results under each protocol for concordance analyses and logistic regression modeling.
Of 1630 spirometry results identified as normal by the LLN algorithm, 1502 (92.1%) were labeled as normal by the Fixed-ratio method. The remaining 128 normal LLN results were classified as abnormal by the Fixed-ratio criteria, 63 as obstructive, 59 as restrictive, and 6 as mixed. Conversely, of the 1518 normal results with the Fixed-ratio algorithm, only 16 (1.1%) were characterized differently by the LLN criteria including 15 as restrictive pattern and 1 as obstructive airways. The distribution of results by algorithm and age and the observed agreement between the two methods are presented in Table 4.
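As a worked check, the normal-versus-abnormal cross-classification can be reconstructed from the counts in this paragraph and a simple (unweighted) kappa computed from it. The sketch assumes that these counts already pool normal variant results with normal results, and it derives the both-abnormal cell by subtraction from the 2319 analyzed tests; the variable names are illustrative.

```python
n_total = 2319                    # tests included in the analysis
both_normal = 1502                # normal under both algorithms
lln_normal_fixed_abnormal = 128   # normal by LLN, abnormal by Fixed-ratio
fixed_normal_lln_abnormal = 16    # normal by Fixed-ratio, abnormal by LLN

# Derived by subtraction, not reported directly in the paragraph.
both_abnormal = (n_total - both_normal
                 - lln_normal_fixed_abnormal - fixed_normal_lln_abnormal)

# Marginal totals for each algorithm.
lln_normal = both_normal + lln_normal_fixed_abnormal      # 1630
fixed_normal = both_normal + fixed_normal_lln_abnormal    # 1518
lln_abnormal = n_total - lln_normal
fixed_abnormal = n_total - fixed_normal

observed = (both_normal + both_abnormal) / n_total
expected = (lln_normal * fixed_normal
            + lln_abnormal * fixed_abnormal) / n_total ** 2
kappa = (observed - expected) / (1.0 - expected)
print(f"observed agreement = {observed:.3f}, kappa = {kappa:.3f}")
```

Under these assumptions the observed agreement is about 94% and kappa works out to roughly 0.86, which falls in the "almost perfect" band of the Landis and Koch scale.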
Age-dependent agreement between the two algorithms was “almost perfect” for individuals younger than 60 years but dropped to “substantial” in those older than 80 years (Table 5). For normal results, the agreement between algorithms was “almost perfect” and significantly different from zero but dropped to “moderate” and “substantial” for categories of abnormal spirometries.
Both methods showed minimal but statistically significant increases in obstructive airways associated with smoking (Fixed-ratio, OR = 1.01, 95% CI, 1.01 to 1.02; and LLN, OR = 1.02, 95% CI, 1.01 to 1.03). Both also revealed statistically significant associations between age and odds of abnormal spirometry (Table 6).
Age had the strongest association with discrepancy between the two algorithms. The oldest individuals were nearly five times more likely than those younger than 60 years to have their spirometry categorized differently by the two methods (OR = 4.85, 95% CI, 3.07 to 7.66) (Table 7).
The LLN algorithm is proposed as an improvement to the Fixed-ratio method in characterizing spirometry results.9,37 Most studies reviewing the differences between these two protocols have stressed the overdiagnosis of obstruction in the elderly, specifically related to the use of the fixed 70% FEV1/FVC ratio.38,39 An increase in the risk of mortality has been found in individuals identified with obstructive airways or restrictive physiology pattern using the Fixed-ratio criteria but normal airways under the LLN protocol, calling into question the significance of age-associated changes in lung function.40 This study found 128 individuals with such normal spirometries under the LLN method but a discrepant Fixed-ratio result. These individuals were on average 10 years older (73 ± 9 vs 63 ± 13; P < 0.001) and were significantly more likely to have ever smoked than those with concordant normal results between the two protocols (OR = 1.45, 95% CI, 1.01 to 2.08). Furthermore, all of those normal by the LLN method but obstructive under the Fixed-ratio criteria (n = 63) had results classified as Global Initiative for Chronic Obstructive Lung Disease (GOLD) stage 2 severity with low FEV1/FVC% and FEV1 between 50% and 80% of the predicted.41,42 The correct interpretation of results suggestive of restriction by either protocol requires follow-up clinical testing with plethysmography regardless of the individual's age, yet the validity of characterization of normality and obstruction is still an issue, particularly among the elderly. Collection of new normative data for the elderly should allow for more valid interpretation strategies. A major unresolved issue is the ability of spirometry to accurately characterize abnormal physiology suggestive of either obstruction or restriction, with the caveat that elderly individuals in particular may, in fact, have both airways and interstitial diseases.
This study found an expected increase in the prevalence of obstructive and mixed airways under the Fixed-ratio protocol compared with the LLN method.7,39,42 The rates of restrictive physiology pattern between the two protocols were virtually the same. These results indicate discrepancies in defining lung physiology, depending on the interpretation algorithm, especially among the elderly. The validity of interpretation of spirometries among the elderly by either protocol remains problematic, as there have been insufficient numbers of elderly subjects tested by lung volumes and spirometry. Two decades after a change to the LLN was recommended,37 many, if not most, clinical laboratories continue using the Fixed-ratio based algorithm, and LLN reference values are not consistently reported in commercially used spirometers. There are also other protocols, including those that combine both methods to address the severity of abnormalities, available for interpretation of spirometry results.43–46 Clearer criteria should be established to interpret spirometry results, including validation by physiologic testing in age-appropriate populations.
Almost 11% of participants in this study were 81 years of age and older. The NHANES III equations used as a reference for this population were based on a sample of individuals aged 8 through 80 years, and age showed a statistically significant association with spirometric abnormalities under both algorithms. Applying NHANES III-based equations to all tests in this study resulted in physiologically impossible low values of predicted FVC, FEV1, and FEF25-75 in shorter individuals at the upper extremes of age. These findings warrant further population-based studies to accurately determine age-specific reference standards for spirometry results in older individuals.
Application of either algorithm in this study resulted in up to 8% of individuals identified with low FEV1/FVC% but normal FEV1 values. Those tests could be interpreted as a variant of normal or borderline mild obstructive airways physiology. The mean age of this group was 70 (±12, range, 24 to 91) years for the LLN protocol and 66 (±16, range, 24 to 89) years for the Fixed-ratio results. More than half (56.0%, n = 103) of those with Fixed-ratio normal variant results and 71.4% (n = 65) of those characterized by the LLN method had FEF25%-75%Pred values less than 70%, suggestive of obstructive airways physiology. The significance of a low ratio with an intact FEV1 under either algorithm remains in question, although the mid-flow reductions suggest that the majority of such cases are, in fact, obstructed.
Discrepancy between self-reported and measured height and weight information may be a limitation of this study.47–49 To minimize potential bias, this study compared two instances of self-reported height information and those with greatest discrepancy were eventually excluded from analyses. Spirometry testing in a surveillance setting is done typically with self-reported information, so the issue of accuracy of height needs to be considered.
In summary, this study found discrepancies in rates and characterization of abnormal spirometries between the LLN and the Fixed-ratio with percent predicted algorithms. The LLN method identified obstructive and mixed airways at slightly over half the rate of the Fixed-ratio method, but the rates of restrictive physiology pattern were virtually the same. The discrepancies between the algorithms were more pronounced in older individuals, which seems to be related to the lack of age-specific reference equations and standards for individuals older than 80 years. As life expectancy of the general population continues to rise, the diagnostic accuracy and clinical significance of spirometry results in the elderly are becoming more important and warrant further population-based studies to accurately determine age-specific reference standards and algorithms for characterizing spirometry results.
This study would not have been possible without the participation of former workers from both DoE sites under study. We thank those workers and DoE staff, Mary Fields, Greg Lewis, Isaf Al-Nabulsi, Moriah Ferullo, Regina Cano, Libby White, and Dr Patricia Worthington, for their support of this program. We thank the University of Iowa team: Jill Welch, Dr Valentina Clottey, Christina Nichols, Nicholas Hoeger, and Carmen Smith for their ongoing contributions to the screening program. We also thank Dr Eva Hnizdo for reviewing this manuscript. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the National Institute for Occupational Safety and Health.