Physiological evidence demonstrates that the heart's chronotropic response (including its maximal pulse rate) declines with age, partly because of the heart's decreasing sensitivity to beta-adrenergic stimulation, lessened calcium flux, changes in pacemaker tissue, and the effect of prolonged diastolic filling (^{11,14,22,37}). Therefore, maximal heart rate (HR_{max})-prediction equations based on a person's age are frequently used in prescribing exercise intensity and for other clinical applications. The customary formula for predicting HR_{max} (220 − age in years) is acknowledged to be quite variable, with estimates having a standard deviation of 10-12 bpm (^{1}). This variability reflects individual differences in observed HR_{max}: several studies have shown that age alone accounts for anywhere from 35 to 80% of HR_{max} variation (^{7,9,14,25,36,38}). Moreover, despite clinicians' acceptance of this simple formula as a standard way to gauge exercise intensity, it was derived merely by a superficial linear best fit to a series of data culled from approximately 10 published studies examining the relationship between age and exercise HR_{max}. The original equation, based on data compiled in 1971 by Fox et al. (^{12}), was intended to be only a rough formulation based on the apparent decline of HR_{max} with age observed in those studies. Furthermore, the individuals examined in the research referenced by Fox and colleagues were never meant to be representative of the population: they were all male, most were under age 55, and the data were from studies conducted under varying conditions and with different criteria for having elicited an HR_{max} (^{12,36}). Even Fox et al. speculated that the actual rate of decline in HR_{max} with age for the population would likely prove to be somewhat less than that suggested by their heuristic equation (^{12}).
Thus, although the HR_{max}-prediction model of 220 − age has become well established in the medical literature and is used widely in clinical and fitness settings, its validity is uncertain.

Other HR_{max}-prediction formulae have been proposed over the years, many with a more rigorous foundation than the equation of 220 − age. Various meta-analyses (^{16,25,36}) and laboratory-based studies (^{2,7,9,36,38}) have generally yielded univariate equations featuring a lower intercept and smaller age coefficient or slope than the standard formula. Besides suggesting that HR_{max} drops by less than one beat per year as people age, these data indicate that the traditional equation, 220 − age, *overestimates* HR_{max} in young adults and *underestimates* it in older people. Even without considering individual differences, these observations call into question the accuracy of gauging exercise intensity relative to a predicted HR_{max} determined by the benchmark of 220 − age. Although results from various cross-sectional studies (^{7,9,10,13,14,17,24,28,29,33,36,38,39}) have shown a linear decrease in HR_{max} with increasing age, it is less well established that *longitudinal* tracking of the same subjects' HR_{max} as they age over several years exhibits an identical linear relationship.

Accordingly, the aim of the present retrospective study was to examine, on a longitudinal basis and employing more suitable statistical analyses, the relationship between age and HR_{max} during exercise to devise (or verify existing) HR_{max}-prediction equations used in prescribing exercise intensity and for other clinical applications.

#### RESEARCH DESIGN AND METHODS

##### Study participants.

The data from which this study is derived are physiological and maximal stress-testing measures obtained from members of a health-assessment/fitness center (located on the campus of Oakland University in Rochester, MI) between the years 1978 and 2003. The total member population of 4666 comprised southeast Michigan residents who responded to the center's outreach efforts that sought adults who wished to maintain or improve their health status. These members underwent one or more comprehensive wellness assessments at this health-enhancement institute, and approximately 12,000 medical physical exams, including more than 7000 graded exercise tests (GXT), were administered during a 26-yr period. All participants provided written informed consent for their data to be included in subsequent research. The present study of this archived clinical information was conducted in 2006 and was approved by the university's institutional review board.

From this membership population, a subset was created of those individuals who had multiple annual health assessments and GXT performed during a span of years at that institute. These tests were conducted in a consistent manner according to standardized laboratory techniques, and the GXT used a modified Balke treadmill protocol. Individuals were instructed to refrain from exercise during the day preceding their exam, and they typically reported to the institute in the morning after a 12-h fast. Each person's records contained pretest measures of age, height, weight, resting heart rate, blood pressure, and standard 12-lead electrocardiogram (ECG), along with medical history information, including prescription drug use. Exercise heart rate, blood pressure, and ECG responses were likewise collected during administration of the stress test. Because the GXT were primarily conducted as part of a fitness assessment and as a preamble to prescribing exercise rather than for purposes of coronary artery disease evaluation, they were designed to achieve a maximal exertion. Accordingly, only exercise tests that were terminated at the point of each individual's volitional fatigue and that elicited an exercise HR_{max} that exceeded 85% of the individual's age-predicted maximum (based on the HR_{max} estimate of 220 − age) were included in this study. People were not excluded on the basis of their smoking or alcohol consumption history or body mass index (BMI). Likewise, habitual caffeine intake was not a disqualifier, because participants were tested in a fasted state, and caffeine has been shown to not have an effect on HR_{max} (^{17}). However, participants taking medications known to affect the heart rate response to exercise (such as beta-blockers, calcium channel blockers, or other heart and antihypertensive medications, stimulants, or antidepressants), or with any previous medical history of heart disease, were excluded. 
Similarly, cases were excluded where the GXT exhibited an abnormal, sign/symptom-limited response (e.g., angina, ST-segment depression, significant dysrhythmia, or abnormal blood pressure response) leading to an early test end point and a "positive" interpretation suggesting the presence of ischemic heart disease.
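The effort-based inclusion rule described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the `stopped_at_volitional_fatigue` flag are hypothetical, and the medication, medical history, and sign/symptom exclusions are not modeled.

```python
def age_predicted_hr_max(age_years):
    """Traditional estimate of maximal heart rate in bpm: 220 - age."""
    return 220 - age_years


def meets_effort_criterion(observed_hr_max, age_years,
                           stopped_at_volitional_fatigue=True,
                           threshold=0.85):
    """Return True when a test satisfies the two effort conditions in the text.

    The test must have ended at volitional fatigue, and the observed peak
    heart rate must exceed 85% of the 220 - age prediction.
    (Hypothetical helper; parameter names are illustrative.)
    """
    cutoff = threshold * age_predicted_hr_max(age_years)
    return stopped_at_volitional_fatigue and observed_hr_max > cutoff
```

For a 50-yr-old, the cutoff is 0.85 × 170 = 144.5 bpm, so a test peaking at 160 bpm would qualify and one peaking at 140 bpm would not.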

Only those individuals who had recorded six or more annual GXT during their tenure at the fitness center were included in this study. These cases were further refined by omitting members' *first* GXT, because empirical evidence indicated that the initial test elicited a somewhat low HR_{max} and, consequently, could be considered a learning experience. Other researchers (^{14}) have observed a similar learning effect, especially among older individuals, whereby people undergoing a GXT were less apprehensive and achieved higher HR_{max} on subsequent testing. In total, these criteria yielded 132 people and 908 GXT administered during a 25-yr span between 1979 and 2003 that were included in this analysis. There was a 3:1 ratio of males (*N* = 100) to females (*N* = 32) in the final sample, and over 90% of these participants were non-Hispanic whites. After eliminating each participant's first GXT, the range in the number of test years/GXT tracked was 5-17, and both sexes recorded an average of roughly 7 ± 2.3 yearly GXT (males = 697; females = 211) during their history at the institute. Although some members' GXT history may have occurred in a string of consecutive test years, consecutive testing was not a requirement in this study. In fact, the average of seven GXT per person examined in this research occurred during a mean span of approximately 9 ± 3.7 yr, after elimination of the first-year GXT for both men and women.
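The selection step described in this paragraph (six or more GXT per member, first test dropped) might be expressed as follows. The record layout, a dictionary mapping a member ID to a list of `{"year": ..., "hr_max": ...}` entries, is an assumption made for illustration and is not the study's actual data format.

```python
def select_tests(tests_by_member, min_tests=6):
    """Keep members with at least `min_tests` GXT, then drop each member's first test.

    `tests_by_member` maps a member ID to a list of dicts with a "year" key
    (hypothetical layout). Tests are ordered by year before the earliest one
    is removed, so every retained member contributes at least five tests.
    """
    selected = {}
    for member_id, tests in tests_by_member.items():
        if len(tests) >= min_tests:
            ordered = sorted(tests, key=lambda test: test["year"])
            selected[member_id] = ordered[1:]
    return selected
```

With a minimum of six recorded tests, the smallest retained history is five tests, which matches the lower end of the 5-17 range reported above.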

##### Test protocol.

The GXT was a modified Balke treadmill exercise protocol performed without interruption to individually determined limits of maximal performance. The test started at a 0% grade and 88.5 m·min^{−1} (3.3 mph) for 3 min, followed by progressive increments of 3% in grade every 3 min to a peak elevation of 24% while maintaining that speed. Beyond that point, if the individual was able to continue, the treadmill grade was held constant as the speed increased by 5.4 m·min^{−1} (0.2 mph) each minute thereafter until the person reached exhaustion. Heart rate was monitored continuously by electrocardiography using the 12-lead Mason-Likar electrode placements typical for exercise testing, and relied on the lead II R-R wave recording for heart rate calculation. HR_{max} was defined as the highest value recorded on the ECG during the final stage of the exercise test.
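As a rough sketch, the protocol above can be written as a function that returns the treadmill setting at a given elapsed time. It assumes that the 24% stage also lasts 3 min (so the grade phase ends at 27 min) and that the first 0.2-mph speed increase takes effect at that moment; the text does not state either detail explicitly, and the names used here are illustrative.

```python
BASE_SPEED_MPH = 3.3    # 88.5 m/min, held constant during the grade phase
STAGE_MIN = 3           # minutes per grade stage
GRADE_STEP = 3          # percent grade added at each stage
PEAK_GRADE = 24         # percent
SPEED_STEP_MPH = 0.2    # 5.4 m/min, added each minute after the grade phase
# Assumption: nine 3-min stages (0% through 24%) before speed increases begin.
GRADE_PHASE_END_MIN = STAGE_MIN * (PEAK_GRADE // GRADE_STEP + 1)  # 27


def treadmill_setting(elapsed_min):
    """Return (grade_percent, speed_mph) at `elapsed_min` minutes into the test."""
    if elapsed_min < 0:
        raise ValueError("elapsed_min must be non-negative")
    if elapsed_min < GRADE_PHASE_END_MIN:
        stage = int(elapsed_min // STAGE_MIN)
        return min(stage * GRADE_STEP, PEAK_GRADE), BASE_SPEED_MPH
    # Speed phase: grade stays at its peak, speed rises once per minute.
    increments = int(elapsed_min - GRADE_PHASE_END_MIN) + 1
    return PEAK_GRADE, round(BASE_SPEED_MPH + SPEED_STEP_MPH * increments, 1)
```

Under these assumptions, a participant 4 min into the test would be walking at a 3% grade and 3.3 mph.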

The GXT procedure was explained to participants before commencing the test, and each participant understood he or she was to continue exercising (without rail holding) as long as he or she could tolerate doing so. To ensure that participants achieved maximal exertion, they were encouraged to exercise well beyond a heart rate diagnostic threshold of 85% of their HR_{max} predicted by 220 − age, and to a rating of perceived exertion of at least 18 units on the 6-20 Borg scale (^{1,18}). Observations of profound dyspnea and obvious pallor were accepted as external cues that a participant had reached his or her subjective limit of fatigue. Because computer-assisted, open-circuit spirometry was not consistently applied during administration of these GXT, objective respiratory/metabolic criteria for achievement of maximal exertion were not available.

##### Statistical modeling.

When analyzing longitudinal data, the statistical model must sufficiently represent the mean value of the response over time in terms of explanatory variables, but it is also necessary to take account of the correlations between the repeated measurements on each participant. Equally important, measurements taken close to one another in time will likely exhibit a different degree of correlation or covariance compared with measurements made at more widely spaced time intervals. The linear mixed-models procedure applied in this study, using SPSS version 14.0 (SPSS, Inc., Chicago, IL, 2005), permits the data to exhibit correlated and nonconstant variability. In other words, unlike conventional fixed effects-only models (e.g., linear regression), mixed models can handle data in which the observations are *not* independent (^{4,34}). In this study, the observed pattern of dependencies in the series of repeated HR_{max} values recorded for each participant was modeled by a first-order autoregressive covariance structure. This structure assumes that the covariances between HR_{max} values recorded during GXT made closer together in time are likely to be higher than those made at more widely separated time intervals, and it is a natural model from a time-series viewpoint (^{4}). Furthermore, mixed models accommodate specification of *fixed effects*, such as different treatments or explanatory variables applied in a study whose values of interest are all represented in the data file, as well as *random effects*, such as different people or centers participating in a trial that are considered a random sample from a larger population of values. Random effects are useful for explaining excess variability in the dependent variable attributable to the natural heterogeneity across individuals in their response over time, especially variability resulting from unmeasured characteristics in the data (^{23}). 
In the current investigation, independent fixed-effects variables including age, sex, BMI, and resting heart rate were evaluated as predictors of HR_{max} in successively more complex models. In addition to subject ID, specifying the test year as a random effect allowed those values to be viewed almost as different fitness centers, implying a belief that GXT administration may have differed slightly between test years. Test conditions, including treadmill and monitoring equipment and environmental setting, were relatively constant throughout administration of these GXT, and all devices were calibrated regularly. Nevertheless, because of frequent staff changes at the institute, this possible source of variation was included in the model. By fitting certain effects as random along with specifying the correct covariance pattern and linking each subject's observations, more appropriate standard errors are produced for fixed effects, and the model's conclusions can be generalized beyond the sample data to a larger population (^{4,23}). Finally, the presence of randomly missing data (such as gaps resulting from participants not all recording the same number of annual GXT or completing them in a sequence of nonconsecutive years) poses a less serious analytical problem in mixed modeling than occurs with traditional statistical techniques (^{4}). Recognizing the member identification, covariance structure, and random effects in the data, the mixed-model procedure essentially created a separate regression line for each participant's history and then, weighing both intra- and intersubject variability, calculated an average line of best fit from the pooled data (^{4,34}).
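To make the first-order autoregressive structure concrete, the sketch below builds the covariance matrix it implies for one participant's repeated HR_{max} measurements. The variance and correlation values in the usage note are invented for illustration and are not estimates from this study; the sketch also indexes tests by their order, ignoring unequal gaps between test years.

```python
def ar1_covariance(n_tests, variance, rho):
    """Build an n_tests x n_tests first-order autoregressive covariance matrix.

    Entry (i, j) equals variance * rho ** |i - j|, so the covariance between
    two tests shrinks geometrically as they move further apart in the series.
    """
    if not -1 < rho < 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [
        [variance * rho ** abs(i - j) for j in range(n_tests)]
        for i in range(n_tests)
    ]
```

With a hypothetical variance of 100 bpm² and correlation of 0.5, adjacent tests would have a covariance of 50, whereas tests three positions apart would have a covariance of 12.5.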

#### RESULTS

Descriptive characteristics, by sex, for the sample of members and their entire history of GXT included in the current study are displayed in Table 1. The sample comprised a broad range of age and fitness groups; however, aerobic capacity (V˙O_{2max}) estimated by standard metabolic calculations from the progression of the GXT was above average (> 70th percentile) for both men and women according to normative tables (^{1}), as would be expected for long-time members of a fitness center. Independent *t*-tests confirmed that there were no significant differences (all *P* > 0.20; data not shown) between men and women across the GXT examined in this study on the metrics of average age, maximal observed exercise heart rate, and number of GXT recorded. However, males exhibited a lower (*P* < 0.001 for all differences) average resting heart rate, greater mean BMI, and higher average V˙O_{2max} level than females. Though not shown in Table 1, men and women had an average age of approximately 44 ± 9.6 yr at their first GXT examined in this research. Statistics indicate that all data adhered to normality assumptions. Using the sample size-estimation formula suggested by Brown and Prescott (^{4}), the 132 members and their repeated GXT measures investigated in the current study were sufficient to detect a heart rate difference of less than 3 bpm at the 5% significance level (two tailed) with 80% power.

Output estimates for a prediction model that included age as the only independent fixed-effects variable, with exercise HR_{max} as the dependent variable, are displayed in the top section of Table 2 (parameters identified as *intercept* and *age*). The resulting linear mixed model predicted HR_{max} as 207 − 0.7 × age. The *t*-values calculated for the terms in this model were highly statistically significant (*P* < 0.001). Likewise, statistical tests of the exercise heart rate repeated-measures covariance parameters, along with subject ID and test year random effects, were also highly significant (*P* < 0.001; data not shown). Ultimately, a mixed model is fitted by maximizing the so-called likelihood function, which measures the likelihood of obtaining the model parameters given values in the data (^{4}). One measure of model fit, the Bayesian information criterion statistic (BIC; based on the model's −2 × restricted log likelihood adjusted for the number of fixed and covariance parameters and observations), is included in Table 2, and smaller values denote better-fitting equations (^{34}).

Because the relationship of age and HR_{max} may be nonlinear, models that included polynomials of increasing order of age also were considered. As shown in Table 2, equations describing curvilinear patterns of change in HR_{max} using simply a nonlinear predictor (age^{2}) as well as a quadratic function consisting of age + age^{2} both proved to be significant. However, only the model of HR_{max} = 192 − 0.007 × age^{2} featured lower standard errors for its parameter estimates compared with the linear model of HR_{max} = 207 − 0.7 × age. Although the nonlinear predictor model, 192 − 0.007 × age^{2}, was slightly more accurate than the linear equation, 207 − 0.7 × age, it may be harder to apply. Because the BIC goodness-of-fit statistics for both nonlinear equations showed negligible improvement over the linear model's fit, the linear equation (207 − 0.7 × age) based on age alone was deemed sufficient to describe the relationship between age and HR_{max}. Strictly speaking, BIC measures are primarily useful for comparing models that have equivalent fixed effects and cases but different covariance patterns, because that statistic is expected to improve (become smaller) as more parameters are included in the model (^{4,23,34}). Successive bivariate and multivariate models that featured combinations of sex, BMI, and resting heart rate along with age and age^{2} as independent variables were similarly evaluated; pertinent results for one such model representative of those analyses are shown in Table 2 (bottom section). Although those intuitively appealing covariates might be anticipated to have an effect, additional explanatory variables besides age were nonsignificant in predicting HR_{max}. Consequently, from the standpoints of accuracy and practicality, the parsimonious age-only linear model (HR_{max} = 207 − 0.7 × age) represents the preferred equation derived in this investigation.

The HR_{max} values calculated by the standard formula, 220 − age, versus those predicted by the linear and nonlinear age models resulting from this research are displayed in Figure 1. Compared with this study's linear equation of 207 − 0.7 × age, the standard formula tends to *over*predict HR_{max} for younger persons and *under*predict for older individuals. As a result, there are noticeable differences, especially at the ends of the age spectrum, where the linear formula, 207 − 0.7 × age, predicts an HR_{max} that is 3 bpm *lower* for 30-yr-olds and 12 bpm *higher* for 75-yr-olds than the standard formula would calculate. Although the two equations resulted in identical predictions at age 40, paired *t*-test statistics revealed that the average differences in predicted HR_{max} yielded by the two formulae across each 5-yr age range from 30 to 75 were significant (all *P* < 0.03; data not shown). The nonlinear models developed in this study are depicted in Figure 1 by curves describing a rate of change in HR_{max} that vary in different parts of the age range. The statistical significance and fit of the curvilinear models are marginally different from the linear equation of 207 − 0.7 × age; furthermore, they may simply exemplify anomalies in the data. Inspection of HR_{max} trends for individual participants revealed cases where the member's HR_{max} increased consistently over several annual GXT before demonstrating the expected decline with advancing age. Those observations are likely attributable to random variation in stress test performance, or they may reflect an extended "learning" response to which the quadratic component in the nonlinear models was sensitive.
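The comparison in this paragraph can be tabulated with a short script. It uses the rounded coefficients as printed in the text, so its differences need not match the quoted 3- and 12-bpm figures exactly, which presumably rest on unrounded estimates. The function names are illustrative.

```python
def standard_hr_max(age):
    """Traditional formula: 220 - age."""
    return 220 - age


def linear_hr_max(age):
    """Linear model from this study, rounded coefficients: 207 - 0.7 * age."""
    return 207 - 0.7 * age


def nonlinear_hr_max(age):
    """Nonlinear model from this study, rounded coefficients: 192 - 0.007 * age^2."""
    return 192 - 0.007 * age ** 2


def comparison_table(ages):
    """Return (age, standard, linear, nonlinear, linear - standard) rows."""
    return [
        (age, standard_hr_max(age), linear_hr_max(age), nonlinear_hr_max(age),
         linear_hr_max(age) - standard_hr_max(age))
        for age in ages
    ]


for row in comparison_table(range(30, 80, 5)):
    print("age %d: standard %d, linear %.1f, nonlinear %.1f, diff %+.1f" % row)
```

A negative difference means that the linear model predicts a lower HR_{max} than 220 − age, as happens at the younger end of the range.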

#### DISCUSSION

The HR_{max}-prediction equation, 220 − age, has been the rule-of-thumb method for gauging exercise intensity for many years. The formula actually originated in a 1971 review (^{12}) of data concerning physical activity and the prevention of coronary heart disease that attempted to provide practical guidelines for exercise prescription. Although the equation was only intended to be an approximation of HR_{max}, it likely became the *de facto* standard because the formula of 220 − age appeared close to research findings and would be a simple method to use (^{38}). However, many rigorous studies have reported that HR_{max} declines at a rate of approximately 3-5% per decade (^{2,8,13,16,19,20,29,31,38}), independently of sex or fitness level, whereas the equation of 220 − age implies a decay of 5-7% per decade, so the standard formula's "closeness" or validity is questionable. Although underestimating HR_{max} in some older persons may be preferable from a safety perspective, it would be desirable to more accurately predict HR_{max} at all ages (^{7}).
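The per-decade figures can be checked with simple arithmetic on any linear equation of the form intercept − slope × age. The helper below is a sketch with an illustrative name; it expresses the 10-yr drop as a percentage of the predicted HR_{max} at the starting age.

```python
def percent_decline_per_decade(intercept, slope, age):
    """Percentage drop in predicted HR_max over the 10 yr following `age`.

    The prediction is intercept - slope * age, so the absolute drop over a
    decade is 10 * slope bpm; dividing by the starting prediction gives the
    relative decline.
    """
    hr_now = intercept - slope * age
    hr_later = intercept - slope * (age + 10)
    return 100 * (hr_now - hr_later) / hr_now
```

For 220 − age, this gives 10/200 = 5% at age 20 and 10/145, or about 6.9%, at age 75, consistent with the 5-7% range stated above. For an equation with a slope of 0.7 beats per year, such as 207 − 0.7 × age, the same calculation gives roughly 3.6% at age 20.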

One of the most accurate general HR_{max}-prediction equations was presented by Tanaka et al. (^{36}). They published cross-confirmatory findings in 2001 from an extensive meta-analysis of available research on HR_{max}-prediction equations, along with their own well-controlled, laboratory-based complementary prospective study. Their research used data from 351 published studies involving 18,712 healthy people, along with observations from 514 healthy individuals recruited for their laboratory research. Both approaches were cross-sectional and yielded virtually identical regression equations for HR_{max} (208 − 0.7 × age), with the age variable alone explaining approximately 80% of the variance in predicted HR_{max}. Furthermore, their model shows no difference in HR_{max} between men and women, and exercise training did not influence it. Their results demonstrate that the traditional equation (220 − age) overestimates HR_{max} in young adults and underestimates it in older individuals, with the original and new heart rate curves starting to diverge at age 40. As early as 1982, Londeree and Moeschberger (^{25}) pooled data from available studies to derive a univariate generalized population equation for HR_{max} (206 − 0.7 × age) that closely resembled the findings of Tanaka et al. (^{36}). In 1992, Whaley et al. (^{38}) published sex-specific HR_{max}-prediction regression equations using discriminant function analyses of retrospective data from participants in a university-based adult exercise program. Their cross-sectional research estimated that men's HR_{max} = 214 − 0.8 × age and that women's HR_{max} = 209 − 0.7 × age.
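The age at which two linear prediction equations give the same HR_{max} follows from setting them equal and solving for age. The sketch below applies that algebra to the equations quoted in this paragraph; the function name is illustrative.

```python
def crossover_age(intercept_a, slope_a, intercept_b, slope_b):
    """Age at which intercept_a - slope_a * age equals intercept_b - slope_b * age."""
    if slope_a == slope_b:
        raise ValueError("parallel lines never cross")
    return (intercept_a - intercept_b) / (slope_a - slope_b)


# Traditional formula (220 - 1.0 * age) versus Tanaka et al. (208 - 0.7 * age).
print(round(crossover_age(220, 1.0, 208, 0.7), 1))
```

For 220 − age and 208 − 0.7 × age, the crossover is (220 − 208)/(1 − 0.7) = 40 yr. Below that age the traditional formula gives the higher prediction, and above it the lower one.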

However, as Stathokostas et al. (^{35}) argue, aging is not uniform between individuals, and cross-sectional studies do not capture individual changes over time, whereas research using longitudinal data can. Moreover, as Dehn and Bruce (^{6}) have suggested, "Cross-sectional studies are vulnerable to an undefined amount of truncation of the population … valid changes with aging can only be established by paired observations in the same person" (p. 807). Fleg (^{11}) points out that the cross-sectional approach of most studies of aging implies that only survivors are studied. Of course, by design, the present longitudinal study guards against dropouts by only tracking those participants with a known series of observations. Nevertheless, selective survivorship can still influence the composition of such samples, because the research will necessarily focus on changes over time in those who remain ambulatory and healthy (^{35}). Notably, results from the present study confirm the findings of several cross-sectional investigations of HR_{max} and age, especially the work of Tanaka and associates (^{36}).

Various authors (^{3,29,39}) have stated that longitudinal investigations continuing over several decades are necessary before definitive conclusions may be drawn about the effect of aging on cardiac function during exercise. Although such designs may be ideal, they are less practical, and most other longitudinal studies of the relationship between age and HR_{max} have not included as many repeated observations as the present research. In fact, many longitudinal HR_{max} investigations (^{2,15,19-21,25,30,31}) have included just two to three repeated measures on each person and have featured durations that were either less than 10 yr or greater than 20 yr. Plowman et al. (^{30}) acknowledge that longitudinal studies with only *two* measurement periods "force" a linear relationship between age and physiological measures. They suggest that multiple measurements or tests on the same individuals as they age are required to accurately describe the aging process, especially if the true relationship is curvilinear. Furthermore, as Marti and Howald (^{26}) reason, longitudinal studies where participants are reexamined at a fixed interval, such as 15 yr later, make it difficult to evaluate the independent effect of aging. Because all study members are exactly 15 yr older at follow-up, the age variable has changed by the same value; lacking a range of distribution, it cannot be entered as a predictor variable in regression analyses. Nevertheless, these longitudinal studies have observed, on average, a 0.7- to 0.9-beat-per-year decline in HR_{max} during their time frame. Those findings seem concordant with published results from cross-sectional HR_{max} research, and they again question the validity of the one-beat-per-year decline in HR_{max} implied by the formula of 220 − age.

In the current study, sex was not a significant explanatory variable in predicting HR_{max}. That result is consistent with the findings of many other published HR_{max}-prediction studies (^{7,15,25,28,29,36}), though some investigations (^{2,9,19,20,33,38}) have detected an HR_{max} difference between men and women. Various cross-sectional studies (^{7,14,24,33}) have found that neither body weight nor BMI had a significant effect on the HR_{max} of the participants they measured, supporting the findings of the present research. Most studies also have found no evidence to suggest that exercise may blunt the age-related loss in HR_{max} (^{8,15,31,33,36,37}). Nevertheless, it should be noted that the higher-than-average cardiovascular fitness of the participants in the present study might have enhanced their capacity to reach a true maximal exertion level during a GXT (^{5,22}); unfit individuals may be less able or less motivated to attain age-predicted HR_{max} because of their exaggerated perception of exhaustion or other reluctance to put forth a strenuous effort (^{10,14}). As a matter of fact, the HR_{max} values attained in nearly two thirds of the GXT reviewed in this study *exceeded* the participants' HR_{max} predicted by 220 − age.

Some published HR_{max}-prediction models have featured nonlinear age factors or have specified separate equations by age groups on the basis of analyses that have shown that the decline in HR_{max} was not constant across time. Complex interaction equations developed by Londeree and Moeschberger (^{25}) include negative quadratic and quartic age terms to improve HR_{max}-curve fitting. Plowman et al. (^{30}) and Profant et al. (^{32}) found that HR_{max} remained fairly steady across the middle years and then decreased at a faster rate after age 60, with the pattern of change described more accurately by a cubic rather than a linear equation. In the current study, a quadratic component of age was identified as significant. All of these results could reflect a true phenomenon, or they may well be attributable to older participants having failed to attain a true HR_{max} during the GXT. Despite the above-average aerobic capacity of the members in this study, stress test personnel might have been less willing to push older participants to exhaustion, or, perhaps, orthopedic limitations or local muscle fatigue might have led older members to provide submaximal exercise efforts (^{7,14,25}). Nonetheless, after investigating curvilinear age effects or other physiologic and anthropometric factors, most studies (^{7,9,24,25,36,38}) have found that the linear age parameter alone accounts for the majority of variability and provides the most accurate univariate prediction of HR_{max}.

A limitation of the current study is its reliance on data from GXT that were all performed on a treadmill. Also, the participants consisted of predominantly non-Hispanic white adult members of a single health-assessment/fitness center who exhibited above-average physical condition and were unmedicated; this may reduce the generalizability of the findings. However, including male and female participants with a wide range of ages and physical characteristics enhanced both the external and internal validity of the study. Furthermore, eliminating the initial-visit GXT helped ensure that subsequent yearly tests reflected more consistent test and activity patterns. Most notably, the longitudinal design, in which multiple GXT results for each individual were tracked for many years and were analyzed with a statistical technique that modeled not only the means of the data but their variance and covariance structures as well as random effects, is a particular strength of this research. The decision not to exclude smokers from the present study sample was inconsequential. Only one individual, a male in his mid-30s, currently smoked cigarettes (and only during two of his five GXT test years that were included in this research), and a total of just 21 people had a previous history of some smoking. A longitudinal study of 13- to 36-yr-olds by Bernaards et al. (^{3}) has shown that even heavy smoking reduced HR_{max} in both men and women by less than 3 bpm.

In summary, it is reassuring that this detailed longitudinal investigation into predicting HR_{max} has produced results that match (and validate) the thorough cross-sectional research of Tanaka and colleagues (^{36}). The final linear prediction equation developed in the current study, HR_{max} = 207 − 0.7 × age, has confidence interval bounds corresponding to ± 1 SD of the population's mean HR_{max} (for people roughly 30-75 yr of age) that reflect a range of ± 5-8 bpm. Alternatively, the nonlinear equation detected in this research, HR_{max} = 192 − 0.007 × age^{2}, though perhaps less desirable from a usability point of view, has yielded even tighter corresponding confidence interval bounds of ± 2-5 bpm. These ranges are narrower than the variations inherent in some other HR_{max}-prediction equations, including (probably) the formula of 220 − age. Despite this assurance, prescriptions of exercise intensity should be based on direct measurements of HR_{max} if possible, because an equation may not predict the true HR_{max} in some individuals or for specific populations and modes of exercise (^{36}). Moreover, because percentages of HR_{max} vary considerably in relation to individual anaerobic threshold levels, reliance on standard heart rate-intensity guidelines can result in differing levels of metabolic stress across individuals (^{17,27}). Accordingly, subjective ratings of perceived exertion during physical activity should always be used as a complementary way of monitoring exercise intensity along with checking heart rate.

The authors express their appreciation to Fred Stransky, PhD, director of the health enhancement institute referenced in this study, for devising the data collection instruments and supervising data collection.