Cardiorespiratory fitness reflects the ability of the lungs and cardiovascular system to transport oxygen and the ability of the tissues and organs to extract and use oxygen during sustained exercise. Cardiorespiratory fitness is a strong, independent predictor of morbidity and mortality when measured during a maximal exercise test (4,13). However, the measurement of cardiorespiratory fitness is not deemed feasible in many health care settings because it is expensive, time-consuming, and requires trained personnel (12,16,20). Cardiorespiratory fitness can be estimated using equations containing variables that might be readily available, such as age, body mass index (BMI), resting HR, and physical activity (12,15). When cardiorespiratory fitness is estimated from nonexercise testing equations, it is associated with all-cause and cardiovascular disease mortality (16,20). The existing nonexercise testing equations were developed in samples of predominantly white men and women and may not be generalizable to others (12,15). Cardiorespiratory fitness is significantly different when measured during a maximal exercise test in white men and men of South Asian descent matched for age and BMI, and the lower cardiorespiratory fitness of South Asian men cannot be explained by their lower physical activity (7). The purpose of this study was to develop and validate equations to estimate cardiorespiratory fitness in white men and South Asian men.
Nonexercise testing cardiorespiratory fitness models were developed using data from a cross-sectional study of 100 white European and 100 South Asian men matched for age and BMI living in Glasgow, UK (6). Volunteers were without coronary heart disease, cerebrovascular disease, peripheral vascular disease, or known diabetes. Whites reported having two parents of white European origin and South Asians reported having two parents of Indian, Pakistani, Bangladeshi, or Sri Lankan origin. Age and smoking status were reported by the participant, BMI was determined by a trained investigator, resting HR was assessed by ECG, and physical activity was assessed by accelerometer (GT3X+ or ActiTrainer; ActiGraph, Pensacola, FL). Cardiorespiratory fitness was assessed during a maximal exercise test, which used a continuous incremental uphill walking protocol (21). Respiratory gases were measured using the Douglas bag technique, and cardiorespiratory fitness was expressed as maximal oxygen consumption (V˙O2max, mL·kg−1·min−1) (6). The study was approved by the West of Scotland Research Ethics Committee, and all participants gave written informed consent.
A listwise deletion approach was used for missing data. Multiple linear regression models were created using variables that are thought to be important in predicting cardiorespiratory fitness (6,16,20): age (yr), BMI (kg·m−2), resting HR (bpm), smoking status (0, never smoked; 1, ex or current smoker), physical activity expressed as quintiles (0, quintile 1; 1, quintile 2; 2, quintile 3; 3, quintile 4; 4, quintile 5), categories of moderate-to-vigorous physical activity (MVPA) (0, <75 min·wk−1; 1, 75–150 min·wk−1; 2, >150–225 min·wk−1; 3, >225–300 min·wk−1; 4, >300 min·wk−1), or minutes of MVPA (min·wk−1), and ethnicity (0, South Asian; 1, white). The outcome variable was V˙O2max (mL·kg−1·min−1). Three models were created without an ethnicity variable: V˙O2max = age + BMI + HR + smoking status + activity quintile (model 1); V˙O2max = age + BMI + HR + smoking status + activity category (model 2); V˙O2max = age + BMI + HR + smoking status + activity minutes (model 3). Three models were created with an ethnicity variable: V˙O2max = age + BMI + HR + smoking status + activity quintile + ethnicity (model 4); V˙O2max = age + BMI + HR + smoking status + activity category + ethnicity (model 5); V˙O2max = age + BMI + HR + smoking status + activity minutes + ethnicity (model 6). Physical activity was expressed as a categorical variable or a continuous variable in the equations because physical activity is usually expressed as a categorical variable or a continuous variable in observational and experimental studies (10). The regression coefficients are reported along with their corresponding standard errors, 95% confidence intervals, and P values. The R2 statistic and the adjusted R2 value (which accounts for the number of parameters in the model) are reported to indicate the proportion of total variance in V˙O2max explained by the model. The root mean squared error (RMSE) is also reported because it provides a measure of the error between the observed values and the predicted values.
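The model structure and fit statistics described above can be illustrated with a minimal Python sketch. It is not the authors' Stata code: the data are synthetic, the coefficients used to generate them are invented, the helper name `fit_ols` is hypothetical, the activity quintile is assumed to enter as a single 0–4 score, and the RMSE is assumed to use the residual degrees of freedom.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns coefficients and fit statistics."""
    n, p = X.shape                                   # p = number of predictors (no intercept)
    A = np.column_stack([np.ones(n), X])             # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)  # penalizes extra parameters
    rmse = float(np.sqrt(ss_res / (n - p - 1)))         # assumption: residual df in the denominator
    return {"beta": beta, "r2": r2, "adj_r2": adj_r2, "rmse": rmse}

# Synthetic, purely illustrative data (NOT the study data).
rng = np.random.default_rng(0)
n = 168
age = rng.uniform(40, 69, n)                       # yr
bmi = rng.normal(27, 4, n)                         # kg/m^2
hr = rng.normal(64, 9, n)                          # bpm
smoker = rng.integers(0, 2, n).astype(float)       # 0 never, 1 ex or current
quintile = rng.integers(0, 5, n).astype(float)     # 0..4 activity quintile score
ethnicity = rng.integers(0, 2, n).astype(float)    # 0 South Asian, 1 white
# Invented generating coefficients, chosen only so the example has signal.
vo2max = (70 - 0.3 * age - 0.5 * bmi - 0.05 * hr - 1.0 * smoker
          + 1.0 * quintile + 8.0 * ethnicity + rng.normal(0, 4, n))

X_model1 = np.column_stack([age, bmi, hr, smoker, quintile])  # model 1 structure
X_model4 = np.column_stack([X_model1, ethnicity])             # model 4 adds ethnicity
fit1 = fit_ols(X_model1, vo2max)
fit4 = fit_ols(X_model4, vo2max)
for name, fit in (("model 1", fit1), ("model 4", fit4)):
    print(f"{name}: R2={fit['r2']:.2f}, adjusted R2={fit['adj_r2']:.2f}, RMSE={fit['rmse']:.2f}")
```

Because model 4 nests model 1, its in-sample R2 can never be lower; the adjusted R2 and the validation procedures are what indicate whether the extra variable is worth keeping.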
Checking assumptions, collinearity, and goodness-of-fit.
The assumption of constancy of variance was checked by plotting the standardized residuals against the predicted values. Constancy of variance, or homoscedasticity, was assumed to exist if the spread of residuals was constant. The assumption of normality was checked using normal probability plots. Normality was assumed to exist if the standardized residuals were normally distributed. The assumption of linearity was checked by plotting the standardized residuals against each of the covariates, and polynomial terms up to the order of 3 were manually added to the models and tested for significance following the principles of parsimony (data and graphs not shown). The variance inflation factor was used to check for collinearity between any of the covariates in a model. Goodness-of-fit of the model was checked by plotting the observed V˙O2max values against the predicted ones.
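The variance inflation factor check can be sketched as follows. This is an illustrative Python example on synthetic data rather than the authors' Stata procedure; the function name `variance_inflation_factors` is hypothetical. For covariate j, the VIF is 1 / (1 − R2_j), where R2_j comes from regressing covariate j on the remaining covariates.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (columns are covariates, no intercept column)."""
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])     # regress column j on the other columns
        beta, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ beta
        r2_j = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2_j)
    return vifs

# Synthetic illustration: x3 is almost a copy of x1, so both should show a large VIF.
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)
x3 = x1 + 0.05 * rng.normal(size=500)
vif = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(np.round(vif, 1))
```

A VIF close to 1 indicates that a covariate is nearly uncorrelated with the others; common rules of thumb treat values above roughly 5 to 10 as a sign of problematic collinearity, although the threshold used in the present study is not stated.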
Cross-validated models tend to exhibit far better generalizability (out-of-sample performance) than conventionally fitted models (18). In the present study, the leave-one-out cross-validation (LOOCV) procedure was used to assess the predictive performance of the models on “unseen” data: each model was fitted to n − 1 observations and tested on the one observation left out, and this was repeated n times with a different observation left out each time. The R2 and RMSE of the original models were considered alongside the LOOCV R2 and RMSE. Bootstrapping is an appropriate way to validate a model in the absence of a large second data set (5). In the present study, the bootstrap and jackknife resampling techniques were used to estimate the variance and bias of the models (5). The models were bootstrapped with the use of Monte Carlo simulations. A bootstrap program was written to resample observations with replacement from the data. The number of samples was increased until convergence was seen. The regression coefficients, R2 statistic, and RMSE of the original models were considered alongside the corresponding bootstrapped standard errors and nonparametric 95% confidence intervals based on percentiles. The regression coefficients, standard errors, and 95% confidence intervals of the original models were considered alongside the corresponding jackknife standard errors and confidence intervals. Stata (version 13.1; StataCorp) was used for model development, model checking, and model validation.
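The three validation procedures can be sketched in Python as follows. This is a minimal illustration on synthetic data, not the authors' Stata program: all function names are hypothetical, the LOOCV R2 is assumed to be 1 − PRESS / total sum of squares, and the RMSE here is the plain root mean squared prediction error with no degrees-of-freedom correction.

```python
import numpy as np

def ols_beta(X, y):
    """Least-squares coefficients, intercept first."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    return beta[0] + X @ beta[1:]

def r2_rmse(y, y_hat):
    err = y - y_hat
    r2 = 1.0 - (err @ err) / np.sum((y - y.mean()) ** 2)
    return float(r2), float(np.sqrt(np.mean(err ** 2)))

def loocv(X, y):
    """Fit on n - 1 observations, predict the one left out, repeat n times."""
    n = len(y)
    y_hat = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        y_hat[i] = predict(ols_beta(X[keep], y[keep]), X[i:i + 1])[0]
    return r2_rmse(y, y_hat)

def bootstrap_r2(X, y, n_boot, rng):
    """Resample rows with replacement; return the SE and percentile 95% CI of R2."""
    n = len(y)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)                  # row indices drawn with replacement
        stats[b] = r2_rmse(y[idx], predict(ols_beta(X[idx], y[idx]), X[idx]))[0]
    lo, hi = np.percentile(stats, [2.5, 97.5])
    return float(stats.std(ddof=1)), (float(lo), float(hi))

def jackknife_se(X, y):
    """Jackknife standard errors of the coefficients (delete one observation at a time)."""
    n = len(y)
    betas = np.array([ols_beta(np.delete(X, i, axis=0), np.delete(y, i)) for i in range(n)])
    return np.sqrt((n - 1) / n * np.sum((betas - betas.mean(axis=0)) ** 2, axis=0))

# Synthetic, purely illustrative data (NOT the study data).
rng = np.random.default_rng(2)
n = 168
X = rng.normal(size=(n, 3))
y = 35 + X @ np.array([-2.0, 1.5, 4.0]) + rng.normal(0, 4, n)

fit_r2, fit_rmse = r2_rmse(y, predict(ols_beta(X, y), X))
cv_r2, cv_rmse = loocv(X, y)
boot_se, boot_ci = bootstrap_r2(X, y, n_boot=500, rng=rng)
jk_se = jackknife_se(X, y)
print(f"R2 {fit_r2:.3f} vs LOOCV R2 {cv_r2:.3f}; RMSE {fit_rmse:.2f} vs LOOCV RMSE {cv_rmse:.2f}")
print(f"bootstrap SE of R2 {boot_se:.3f}, 95% CI {boot_ci[0]:.3f} to {boot_ci[1]:.3f}")
```

Rerunning `bootstrap_r2` with a larger `n_boot` and comparing the standard error and interval is one way to look for the convergence with replication described in the text; a small gap between the fitted and LOOCV statistics is the pattern interpreted as generalizability.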
Physical activity data were missing for 31 men, and cardiorespiratory fitness data were missing for one man. Therefore, participants in the present study were 83 white European men and 85 South Asian men. Age (mean ± SD (range)) was 50 ± 7 (40–69) yr in the present sample and the original sample. BMI was 27.1 ± 4.0 (17.7–46.7) kg·m−2 in the present sample and 27.0 ± 4.1 (17.8–46.7) kg·m−2 in the original sample. Resting HR was 64 ± 9 (45–94) bpm in the present sample and 64 ± 9 (41–94) bpm in the original sample. Sixty-eight percent reported never smoking in the present sample and the original sample. MVPA was 238 ± 160 (14–817) min·wk−1 in the present sample and the original sample. Table 1 shows the characteristics of participants in the present sample by ethnicity.
No assumptions were violated, there was no evidence of collinearity, and all models were a good fit. Figure 1 shows the fit of a model without an ethnicity variable, and Figure 2 shows the fit of the same model with an ethnicity variable. Around fifty percent of the variance in cardiorespiratory fitness was explained by the models without an ethnicity variable (n = 168 in all models). Fifty-four percent of the variance in cardiorespiratory fitness was explained by model 1, the model including age, BMI, smoking status, resting HR, and physical activity quintile (Table 2). Fifty-four percent of the variance in cardiorespiratory fitness was explained by model 2, a similar model including physical activity category (see Table S1, Supplemental Digital Content 1, nonexercise testing regression model for estimating cardiorespiratory fitness in whites and South Asians using physical activity categories without an ethnicity variable, http://links.lww.com/MSS/A609). Fifty-three percent of the variance in cardiorespiratory fitness was explained by model 3, a similar model in which physical activity was expressed as minutes (see Table S2, Supplemental Digital Content 2, nonexercise testing regression model for estimating cardiorespiratory fitness in whites and South Asians using physical activity minutes per week without an ethnicity variable, http://links.lww.com/MSS/A610).
Around 70% of the variance in cardiorespiratory fitness was explained by the models with an ethnicity variable (n = 168 in all models). Seventy-one percent of the variance in cardiorespiratory fitness was explained by model 4, the model including age, BMI, smoking status, resting HR, physical activity quintile, and ethnicity (Table 3). Similar amounts of variance were explained in models 5 and 6, similar models in which physical activity was expressed as categories or minutes (see Table S3, Supplemental Digital Content 3, nonexercise testing regression model for estimating cardiorespiratory fitness in whites and South Asians using physical activity categories with an ethnicity variable, http://links.lww.com/MSS/A611; see Table S4, Supplemental Digital Content 4, nonexercise testing regression model for estimating cardiorespiratory fitness in whites and South Asians using physical activity minutes per week with an ethnicity variable, http://links.lww.com/MSS/A612).
The LOOCV procedure showed small differences between the observed and cross-validated sets in models 4, 5, and 6, suggesting that these models are generalizable (Table 4). For example, the observed R2 value was 0.71, the LOOCV R2 value was 0.68, and the R2 difference was −5.4% in model 4. The bootstrapping program showed that there was low variance and low bias in model 4, the model including age, BMI, smoking status, resting HR, physical activity quintile, and ethnicity (see Table S5, Supplemental Digital Content 5, bootstrapping data for the model including age, BMI, smoking status, resting HR, physical activity quintile, and ethnicity, http://links.lww.com/MSS/A613). There was evidence of convergence with replication. For example, the R2 value was 0.71 in the original model (Table 3), and the standard error was 0.035 (95% confidence interval, 0.661–0.792) with 500 replications, 0.036 (0.654–0.795) with 10,000 replications, and 0.036 (0.654–0.795) with 40,000 replications (see Table S5, Supplemental Digital Content 5, bootstrapping data for the model including age, BMI, smoking status, resting HR, physical activity quintile, and ethnicity, http://links.lww.com/MSS/A613). The bootstrapping program showed that there was also low variance and low bias in models 5 and 6, similar models in which physical activity was expressed as categories or minutes (data not shown). The jackknife standard errors were similar to the original standard errors, and the jackknife 95% confidence intervals did not change the importance or statistical significance of any of the variables in the models, which suggests that there was no overfitting of the data and that the models with an ethnicity variable are reliable (see Table S6, Supplemental Digital Content 6, jackknife data for the model including age, BMI, smoking status, resting HR, physical activity quintile, and ethnicity, http://links.lww.com/MSS/A614).
The present study demonstrates the importance of incorporating ethnicity in nonexercise equations to estimate cardiorespiratory fitness in multiethnic populations. The equations including age, BMI, smoking status, resting HR, physical activity, and ethnicity explain much of the variance in cardiorespiratory fitness and are generalizable and reliable.
Existing nonexercise testing equations were developed in samples of predominantly white men and women and may not be generalizable to others (12,15). The equations in the present study were developed using data from a study of white European and South Asian men living in the United Kingdom (6). In that study, Ghouri and colleagues (6) measured a difference in V˙O2max in whites and South Asians of 8.24 mL·kg−1·min−1 (95% confidence interval, 6.33–10.15). Ghouri and colleagues (6) concluded that the lower V˙O2max values of South Asians could not be explained by their lower physical activity levels. Ghouri and colleagues (6) noted that South Asians had lower fitness levels at all activity levels, and they suggested that there might be innate differences in fitness between whites and South Asians. In the present study, equations including age, BMI, smoking status, resting HR, and physical activity explained around 50% of the variance in V˙O2max and similar equations with an ethnicity variable explained around 70% of the variance in V˙O2max. These data demonstrate the importance of using an ethnicity variable to help estimate V˙O2max in nonexercise testing equations.
The nonexercise testing equations with an ethnicity variable perform favorably in comparison to existing equations. Consider model 4 in the present study, for example. R2 was 0.72 in model 4, 0.65 in the National Aeronautics and Space Administration (NASA) equation, 0.60 in the Aerobics Centre Longitudinal Study (ACLS) equation, and 0.58 in the Allied Dunbar National Fitness Survey (ADNFS) equation (12). The cross-validated R2 value indicates how well an equation might predict fitness in an unseen data set. Importantly, the cross-validated R2 was 0.67 in model 4 in the present study, 0.58 and 0.56 in the NASA equation, 0.64 and 0.55 in the ACLS equation, and 0.58 and 0.52 in the ADNFS equation (these values were obtained by squaring the cross-validity correlations reported in Table 6 of Jurca and colleagues’ article (12)). These data suggest that the equations with an ethnicity variable would better predict cardiorespiratory fitness in an unseen data set than existing equations.
Wier and colleagues (22) added an ethnicity variable to an early version of the NASA equation. They found that the addition of a nominal ethnicity variable raised R2 by 0.008 in a model including age, sex, physical activity, and waist girth; by 0.006 in a model including age, sex, physical activity, and percent body fat; and by 0.004 in a model including age, sex, physical activity, and BMI (all P < 0.001). Wier and colleagues (22) suggested that ethnicity explained only a little more of the variance in cardiorespiratory fitness because the sample of 140 nonwhites was considerably smaller than the sample of 2417 whites. It is also possible that the blacks, Hispanics, and Asian-Pacific Islanders who made up the nonwhites were different ethnic groups. The families of the nonwhites in the present study were all from South Asia, a region of high diabetes prevalence (11).
It is clear that the best performing of the equations in the present study were those that include an ethnicity variable. These equations contain other variables that might also be readily available in many health care settings: age, BMI, smoking status, resting HR, and physical activity. There are many ways of expressing physical activity outcomes in observational and experimental studies, and it is noteworthy that these equations explained much of the variance in cardiorespiratory fitness, whether physical activity was expressed as quintiles, categories, or minutes. The LOOCV procedure showed that these equations are generalizable, and the bootstrap and jackknife resampling techniques showed that there is low variance and bias in these equations.
It has been argued that models should include variables that are thought to be important from the literature, whether or not they reach statistical significance in a particular data set (2). The literature suggests age, BMI, smoking status, resting HR, physical activity, and ethnicity are important predictors of cardiorespiratory fitness (6,16,20). Accordingly, these variables were retained in models in the present study. Resting HR was a statistically significant predictor of cardiorespiratory fitness in model 1, model 2, model 3, and model 6 (all P < 0.05). Resting HR was not a statistically significant predictor in model 4 (P = 0.091) or model 5 (P = 0.064). Resting HR was not removed from any model because the removal of important variables might lead to unreliable models (2).
Nonexercise testing equations developed in white men and women (12,15) have been shown to predict cardiovascular and all-cause mortality in whites (16,20). In a study of 32,319 adults age 35 to 70 yr at baseline, Stamatakis and colleagues (20) found that nonexercise testing cardiorespiratory fitness predicted cardiovascular and all-cause mortality during 9 yr of follow-up after adjustment for potential confounders. In a study of 20,112 adults age 20 to 60 yr at baseline, Nes and colleagues (16) found that nonexercise testing cardiorespiratory fitness predicted cardiovascular and all-cause mortality during 24 yr of follow-up after adjustment for potential confounders. Further research is required to determine whether nonexercise testing cardiorespiratory fitness predicts mortality in multiethnic populations, such as South Asians. Insulin resistance is higher in South Asian men than white men, and it was recently reported that directly measured cardiorespiratory fitness explained more than two thirds of the ethnic difference in insulin resistance (lower cardiorespiratory fitness explained 68%, lower physical activity explained 29%, and greater adiposity explained 52% of the ethnic variance in the homeostasis model assessment of insulin resistance) (6). These data suggest that it may be particularly important to measure or to estimate cardiorespiratory fitness in South Asian men.
This study has some limitations. The development and validation data set was drawn from a convenience sample that may not be representative (6); however, physical activity and cardiorespiratory fitness levels were similar in other studies of white European and South Asian men (7). Questionnaires and accelerometers have their advantages and disadvantages (17), but questionnaires were not used in the original study of Ghouri and colleagues (6). Questionnaires might be more feasible in many health care settings and model 1 (without an ethnicity variable) and model 4 (with an ethnicity variable) might be used with questionnaires that provide a summary measure of physical activity that can be divided into quintiles, such as the Baecke questionnaire (kcal·wk−1) (1), the Five-City Project 7-Day Physical Activity Recall Questionnaire (kcal·d−1) (19), or the ACLS 7-Day Recall questionnaire (h·wk−1) (14). The data set was made up of men and more research is required to develop and validate equations for women. The nonexercise testing cardiorespiratory fitness equations were developed and validated in the same data set; however, bootstrapping is an appropriate way to validate a model in the absence of a large second data set (5). The cardiorespiratory fitness of white and South Asian men has only been reported in three other studies, in which sample size ranged from 40 to 92 (3,8,9).
To our knowledge, this is the first study to develop and validate equations to estimate cardiorespiratory fitness in white men and South Asian men. This study shows the importance of incorporating ethnicity in nonexercise equations to estimate cardiorespiratory fitness. The equations contain readily available variables, they explain much of the variance in cardiorespiratory fitness, and they are generalizable and reliable.
Ghouri was supported by a Fellowship from Chest, Heart and Stroke Scotland. The research was supported by The National Institute for Health Research Collaboration for Leadership in Applied Health Research and Care-East Midlands (NIHR CLAHRC-EM), the Leicester Clinical Trials Unit and the NIHR Leicester-Loughborough Diet, Lifestyle and Physical Activity Biomedical Research Unit which is a partnership between University Hospitals of Leicester NHS Trust, Loughborough University and the University of Leicester. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.
The authors have no conflicts of interest. The results of the present study do not constitute endorsement by ACSM.
1. Baecke JAH, Burema J, Frijters JER. A short questionnaire for the measurement of habitual physical activity in epidemiological studies. Am J Clin Nutr. 1982; 36: 936–42.
2. Collins GS, Mallett S, Omar O, Yu LM. Developing risk prediction models for type 2 diabetes: a systematic review of methodology and reporting. BMC Med. 2011; 9: 103.
3. Davey G, Roberts J, Patel S, et al. Effects of exercise on insulin resistance in South Asians and Europeans. J Exerc Physiol. 2000; 3(2): 6–11.
4. Fogelholm M. Physical activity, fitness and fatness: relations to mortality, morbidity and disease risk factors. A systematic review. Obes Rev. 2010; 11(3): 202–21.
5. Fox J. Bootstrapping regression models. In: Applied regression analysis and generalized linear models. London: Sage; 2015. pp. 587–606.
6. Ghouri N, Purves D, McConnachie A, Wilson J, Gill JM, Sattar N. Lower cardiorespiratory fitness contributes to increased insulin resistance and fasting glycaemia in middle-aged South Asian compared with European men living in the UK. Diabetologia. 2013; 56(10): 2238–49.
7. Gill JM, Celis-Morales CA, Ghouri N. Physical activity, ethnicity and cardio-metabolic health: does one size fit all? Atherosclerosis. 2014; 232(2): 319–33.
8. Hall LM, Moran CN, Milne GR, et al. Fat oxidation, fitness and skeletal muscle expression of oxidative/lipid metabolism genes in South Asians: implications for insulin resistance? PLoS One. 2010; 5(12): e14197.
9. Hardy CP, Eston RG. Aerobic fitness of Anglo-Saxon and Indian students. Br J Sports Med. 1985; 19(4): 217–8.
10. Helmerhorst HJ, Brage S, Warren J, Besson H, Ekelund U. A systematic review of reliability and objective criterion-related validity of physical activity questionnaires. Int J Behav Nutr Phys Act. 2012; 9: 103.
11. International Diabetes Federation. IDF Diabetes Atlas. Brussels, Belgium: International Diabetes Federation; 2013. Available from: International Diabetes Federation.
12. Jurca R, Jackson AS, LaMonte MJ, et al. Assessing cardiorespiratory fitness without performing exercise testing. Am J Prev Med. 2005; 29(3): 185–93.
13. Kodama S, Saito K, Tanaka S, et al. Cardiorespiratory fitness as a quantitative predictor of all-cause mortality and cardiovascular events in healthy men and women: a meta-analysis. JAMA. 2009; 301(19): 2024–35.
14. Kohl HW, Blair SN, Paffenbarger RS Jr, Macera CA, Kronenfeld JJ. A mail survey of physical activity habits as related to measured physical fitness. Am J Epidemiol. 1988; 127(6): 1228–39.
15. Nes BM, Janszky I, Vatten LJ, Nilsen TI, Aspenes ST, Wisloff U. Estimating V˙O2peak from a nonexercise prediction model: the HUNT Study, Norway. Med Sci Sports Exerc. 2011; 43(11): 2024–30.
16. Nes BM, Vatten LJ, Nauman J, Janszky I, Wisloff U. A simple nonexercise model of cardiorespiratory fitness predicts long-term mortality. Med Sci Sports Exerc. 2014; 46(6): 1159–65.
17. Pedisic Z, Bauman A. Accelerometer-based measures in physical activity surveillance: current practices and issues. Br J Sports Med. 2015; 49(4): 219–23.
18. Porta M. A Dictionary of Epidemiology. 5th ed. Oxford: Oxford University Press; 2008.
19. Sallis JF, Haskell WL, Wood PD, et al. Physical activity assessment methodology in the Five-City Project. Am J Epidemiol. 1985; 121(1): 91–106.
20. Stamatakis E, Hamer M, O’Donovan G, Batty GD, Kivimaki M. A non-exercise testing method for estimating cardiorespiratory fitness: associations with all-cause and cardiovascular mortality in a pooled analysis of eight population-based cohorts. Eur Heart J. 2013; 34(10): 750–8.
21. Taylor HL, Buskirk E, Henschel A. Maximal oxygen intake as an objective measure of cardio-respiratory performance. J Appl Physiol. 1955; 8(1): 73–80.
22. Wier LT, Jackson AS, Ayers GW, Arenare B. Nonexercise models for estimating V˙O2max with waist girth, percent fat, or BMI. Med Sci Sports Exerc. 2006; 38(3): 555–61.