The usual population strategy for prevention emphasizes the importance of shifting the population distribution of risk factors, as opposed to targeting prevention efforts to the high risk “tail” of the risk distribution.1 However, population strategies that disproportionately benefit socially advantaged low-risk persons could shift the low-risk tail of the distribution, while leaving the high-risk segments of the population behind, widening the distribution of risk.2 By simultaneously decreasing the average level of risk in the population and increasing the variance in risk, a population strategy may widen socioeconomic disparities in disease risk. Whether a population strategy results in the widening or shrinking of health disparities may depend on the specific nature of the intervention and how risk is initially distributed in the population.3,4 Public health strategies addressing structural or institutional conditions are likely to have a different impact on health disparities than public health strategies that target behavioral changes.4 Previous research in this debate has drawn primarily on effect estimates, using conventional regression methods.5,6
Although the potential importance of nonlinear relations between risk factors and mean outcomes is increasingly recognized,7 less is understood about possible heterogeneity in effects among persons with underlying high versus low risk. If the effect of a possible intervention varies across percentiles of the outcome, this could provide insight into who benefits most from such interventions, and also help address the persistence of health inequities even with improvements in average population health. In some settings, public health interventions may have differential effects across the outcome distribution, and the average effect would not adequately describe the effect of the intervention on population health. Previous research has examined nonuniform effects of socioeconomic status across the distribution of health status, by comparing simple differences in percentiles of health status scores according to income level.8 However, with such methods, differential treatment effects can be modeled explicitly only when modifiers of the treatment effect are known. Conditional and marginal quantile regression approaches provide a useful framework to assess the population impact of a hypothetical public health intervention on the complete outcome distribution, even when factors that modify the effect of intervention are not measured.9,10
We propose that evaluations of the population health consequences of proposed interventions should assess whether treatment effects are homogeneous across the outcome distribution. To illustrate how treatment effects may differ, we compare the association of education with 10-year risk of coronary heart disease (CHD) and with body mass index (BMI), using conventional linear regression models and quantile regression models. Both BMI and the Framingham CHD risk score have been recommended for use in clinical settings, to identify patients at high risk for CHD and other adverse health outcomes.11 – 14
CHD is the leading cause of death in the United States for both men and women, and obesity is a major risk factor for cardiovascular disease (CVD) and premature death.15 – 18 Both BMI and the Framingham CHD risk score have strong inverse associations with educational attainment; lower educational level is associated with higher prevalence of obesity and CHD risk factors.19 – 22 Under a “social determinants of health” framework, educational policy would be an important population strategy in addressing health disparities in the United States. Education has long been considered an investment in human capital that benefits both the individual and society. Similarly, changes in educational levels can lead to important changes in individual and population-level health. Our goal is to evaluate whether the health advantage associated with education disproportionately benefits high-risk populations, and to show how conventional regression methods may be limited in addressing this question.
We analyzed data from 2 nationally representative, publicly available datasets: the 2006 wave of the Health and Retirement Study and the 2003–2008 National Health and Nutrition Examination Survey (NHANES). Details regarding the Health and Retirement Study and NHANES are available in previously published reports.23,24
The Health and Retirement Study is funded by the National Institute on Aging and coordinated and managed by the Institute for Social Research at the University of Michigan. The 2006 sample is nationally representative of birth cohorts born before 1954. Half of the households in the sample were randomly selected to complete an enhanced in-person interview that includes a set of physical performance measures. Among 7,940 age-eligible respondents randomly selected for the physical measures component, we excluded 17 (<1%) respondents who were missing information on degree credentials. Health and Retirement Study models with CHD risk as the outcome also excluded 3,363 respondents (42%) with missing/invalid/noncompliant measurements necessary to calculate the Framingham Risk Score (eg, total cholesterol reading). Another 1,140 (14%) who reported having a doctor diagnosis of a heart attack, CHD, angina, congestive heart failure, or other heart problem were also excluded, resulting in a sample of 1,353 men and 2,050 women. The Health and Retirement Study models with BMI as the outcome excluded 1,210 (15%) for missing/invalid height or weight measurements, leading to a sample of 2,853 men and 3,860 women.
The NHANES sample consisted of 13,080 men and women aged 30 years or older, of whom 95% (n = 12,461) were interviewed and examined between 2003 and 2008. Respondents who were missing education information (n = 3,882) or who were pregnant (n = 118) were excluded from all analyses. Analyses with CHD risk as the outcome were conducted on 3,531 men and 3,648 women with complete Framingham risk score component information, after excluding respondents previously diagnosed with CHD (n = 380). The NHANES sample for the BMI analyses included 4,155 men and 4,142 women.
We used respondents' self-reported highest educational degree completed at the time of the survey as a categorical variable (less than high school; high school degree/high school equivalency diploma; or BA/BS/graduate/professional degree).
Although targeting individual CVD risk factors is important, public health usually aims to reduce overall risk. For that reason, we focus on the 10-year Framingham CHD risk score, which can be interpreted as a person's chances of developing CHD in the next 10 years. To illustrate findings for a single risk factor, we also show results for BMI, which was calculated according to the standard formula (weight in kg divided by the square of height in m). Outcome variables were log-transformed (natural log). Following convention, we added a constant of 1 to the CHD risk score before transformation, because the logarithm of any number strictly between 0 and 1 is negative.
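The two outcome transformations described above can be sketched as follows. This is a minimal illustration, not the study's code; the function names and the example respondent are hypothetical, and the units of the risk score are assumed to be whatever the Framingham calculation returns.

```python
import math


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def log_chd_risk(risk_score: float) -> float:
    """Natural log of the 10-year CHD risk score after adding a constant of 1.

    Adding 1 keeps the transformed value non-negative for any score >= 0,
    whereas log(score) alone is negative for scores strictly between 0 and 1.
    """
    return math.log(risk_score + 1.0)


# Hypothetical respondent (not study data): 80 kg, 1.75 m, risk score of 0.5.
example_bmi = bmi(80.0, 1.75)
example_log_bmi = math.log(example_bmi)
example_log_risk = log_chd_risk(0.5)
```

Without the added constant, the example score of 0.5 would map to a negative value on the log scale; with it, the transformed score stays positive.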
Estimates from ordinary least squares were compared with estimates from conditional and marginal quantile regression models, to examine average and distributional education-outcome associations. Linear regression provides an estimate of the conditional mean effect. The coefficients from an ordinary least squares regression model can be interpreted as the difference in the mean of the dependent variable associated with a unit difference in the independent variable, conditional on the covariates. Quantile regression provides a flexible framework to model potentially heterogeneous effects of the risk factor across the entire distribution of the outcome. This allows us to investigate whether education induces a more complex change in the health-outcome distribution. Coefficients from a quantile regression model are interpreted in a manner similar to ordinary least squares, but focusing on differences in a particular quantile of the distribution. For example, in an ordinary least squares regression model predicting BMI, the regression coefficient for years of schooling is interpreted as the difference in mean value of BMI, per additional year of schooling. The coefficient in a quantile regression model for the 50th percentile is interpreted as the difference in the median value of BMI, per additional year of schooling, and the coefficient for the 90th percentile is interpreted as the difference in the 90th percentile of BMI, per additional year of schooling. If education shifts the entire outcome distribution (ie, a location shift in the outcome distribution), quantile regression estimates will be constant across the percentiles and also similar to ordinary least squares regression estimates. However, if education affects both the location and shape of the outcome distribution, we expect to observe systematic differences in the quantile regression estimates.
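The contrast between a pure location shift and a change in both location and shape can be illustrated with simulated data. The sketch below relies on a simplification: when the only covariate is a binary group indicator, the quantile regression coefficient at a given percentile reduces to the difference between the two groups' sample percentiles (up to the interpolation convention), and the ordinary least squares coefficient reduces to the difference in means. All values are hypothetical log-scale outcomes, not study data, and no covariate adjustment is made.

```python
import random
import statistics

random.seed(0)
n = 5000

# Simulated log-outcomes (hypothetical values, not study data).
reference = [random.gauss(3.0, 0.40) for _ in range(n)]
# Pure location shift: same spread, mean lower by 0.20.
shifted = [random.gauss(2.8, 0.40) for _ in range(n)]
# Location and scale change: mean lower by 0.20 and a narrower spread.
shifted_and_narrowed = [random.gauss(2.8, 0.25) for _ in range(n)]


def percentile(values, p):
    """p-th percentile (1-99) taken from the statistics module's cut points."""
    return statistics.quantiles(values, n=100)[p - 1]


def group_differences(comparison, baseline, percentiles=(10, 50, 90)):
    """Mean difference plus percentile differences between two groups."""
    out = {"mean": statistics.fmean(comparison) - statistics.fmean(baseline)}
    for p in percentiles:
        out[p] = percentile(comparison, p) - percentile(baseline, p)
    return out


location_only = group_differences(shifted, reference)
location_and_scale = group_differences(shifted_and_narrowed, reference)
```

Under the location shift, the differences at the 10th, 50th, and 90th percentiles all sit close to the mean difference. Under the location and scale change, the mean difference is about the same, but the percentile differences grow in magnitude toward the upper end, which is the pattern a mean-only comparison would miss.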
Conditional quantile regression describes the associations between the corresponding characteristics and CHD risk and BMI at specific percentiles of the conditional outcome distribution. We also present estimates from the marginal quantile regression in the electronic appendix, using the Firpo, Fortin, and Lemieux25 method and their accompanying Stata ado program, which calculates the quantile marginal effect by estimating and rescaling the influence function for specific percentiles, and then using the recentered influence function as the outcome.
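For reference, the recentered influence function for the τth quantile of an outcome Y can be written as below. The notation is ours rather than the source's: q_τ denotes the τth quantile of the marginal distribution of Y, f_Y its density (which in practice must be estimated, for example by kernel methods), and X the covariates. The second line states the regression step, in which the recentered influence function is modeled as a linear function of the covariates.

```latex
\mathrm{RIF}(y; q_\tau) = q_\tau + \frac{\tau - \mathbf{1}\{y \le q_\tau\}}{f_Y(q_\tau)}
\qquad
\mathrm{E}\left[\mathrm{RIF}(Y; q_\tau) \mid X\right] = X^{\top}\beta_\tau
```

Because the expectation of the recentered influence function equals q_τ, the coefficients β_τ describe how the marginal (unconditional) quantile changes with the covariates.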
All analyses were sex-stratified to address previously reported differences in effect sizes in men and women; they were performed using log-transformed calculated 10-year CHD risk values and BMI. This monotonic transformation will not affect the pattern of results because quantiles are order statistics.26 However, log transformation of the outcomes changes the interpretation of the regression model. The quantile regression models with a log-transformed outcome variable now quantify whether degree credentials exert a constant percentage change across the outcome distribution; the actual change in the value of the CHD risk score (or BMI) can differ from one value of y to another. We also present the percentage change in the outcome associated with the comparison versus reference group, estimated as 100 × (exp(b) − 1), where b is the point estimate on the log scale. Standard errors for all regression analyses were computed by bootstrapping the results 500 times, accounting for the complex survey design of the Health and Retirement Study and NHANES. Bootstrapped standard errors were then used to construct normal-based 95% confidence intervals. Adjusted analyses included race (non-Hispanic white, non-Hispanic black, Hispanic, and Other), nativity (US born vs. non-US born), and linear and quadratic age terms. Data manipulation, descriptive analysis, and marginal quantile regression analyses were completed in Stata (version 11), and conditional quantile regression analyses were conducted using R (R Project for Statistical Computing; version 2.14).
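The conversion to a percentage change and the normal-based bootstrap interval can be sketched as follows. This is a simplified illustration under stated assumptions: the data are simulated, the estimate is an unadjusted difference in medians between two groups, and the bootstrap resamples respondents independently within each group, whereas the study's bootstrap accounted for the complex survey design. All function and variable names are hypothetical.

```python
import math
import random
import statistics


def percent_change(log_estimate: float) -> float:
    """Convert a point estimate on the log scale to a percentage change."""
    return 100.0 * (math.exp(log_estimate) - 1.0)


def bootstrap_se(values_a, values_b, stat, reps=500, seed=0):
    """Bootstrap standard error of stat(values_a) - stat(values_b).

    Simplification: simple random resampling with replacement within each
    group; survey weights, strata, and clusters are ignored here.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        a = rng.choices(values_a, k=len(values_a))
        b = rng.choices(values_b, k=len(values_b))
        diffs.append(stat(a) - stat(b))
    return statistics.stdev(diffs)


def normal_ci(estimate, se, z=1.96):
    """Normal-based 95% confidence interval from a bootstrap standard error."""
    return estimate - z * se, estimate + z * se


# Simulated log-scale outcomes for two groups (hypothetical, not study data).
rng = random.Random(1)
college = [rng.gauss(3.2, 0.2) for _ in range(300)]
no_degree = [rng.gauss(3.3, 0.2) for _ in range(300)]

estimate = statistics.median(college) - statistics.median(no_degree)
se = bootstrap_se(college, no_degree, statistics.median)
ci_low, ci_high = normal_ci(estimate, se)

# Express the estimate and interval limits as percentage changes.
pct = percent_change(estimate)
pct_ci = (percent_change(ci_low), percent_change(ci_high))
```

The interval is built on the log scale and its limits are then exponentiated, so the resulting percentage interval is not symmetric around the point estimate.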
Respondents in the Health and Retirement Study sample were mostly non-Hispanic white women with an average age of 68 years. The NHANES sample was also largely non-Hispanic white, with an average age of 55 years (Table 1). Average 10-year CHD risk was inversely associated with educational attainment for both sexes, with women generally having lower risk. Less variation among educational groups and between sexes was noted for BMI.
The sex- and sample-specific distributions of log-transformed CHD risk and BMI are shown in Figure 1. The log CHD risk distribution for college degree holders is characterized by both a location shift (ie, change in the mean) and a change in the distribution shape, in both the NHANES and Health and Retirement Study samples. The distribution of log CHD risk for college graduates was farthest to the left (lowest risk), whereas the distribution for those with no degree was farthest to the right (highest risk). Moreover, Kolmogorov-Smirnov tests for equality of distributions suggest that the differences in the distributions of log CHD risk by degree attainment are unlikely to be attributable strictly to sampling variability (Figs. 1, 2). Among women, the log BMI distribution for college graduates was noticeably right-skewed compared with the BMI distribution for high school graduates and those with no degree. There was little difference in BMI distribution by degree credential among men.
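The two-sample Kolmogorov-Smirnov statistic referred to above is the largest vertical gap between the empirical cumulative distribution functions of two groups. A minimal sketch of the statistic follows; it does not compute a P value, ignores survey weights, and uses a hypothetical function name and toy inputs.

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: max gap between empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    d = 0.0
    # The largest gap between two step functions occurs at an observed value.
    for x in a + b:
        # Empirical CDF at x: share of observations <= x in each sample.
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


# Toy inputs (not study data): the second sample is shifted upward by 2.
toy_d = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
```

A statistic of 0 means the two empirical distributions coincide, and a statistic of 1 means the samples do not overlap at all.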
Table 2 presents associations of educational attainment with the calculated 10-year risk for CHD. In NHANES, higher degree credentials were associated with lower mean CHD risk, especially among women. A college degree was associated with 27% (95% confidence interval = −29% to −25%) lower average CHD risk among women and 19% (−23% to −14%) lower average CHD risk for men. Conditional quantile regression estimates associated with degree credentials were uniformly negative and systematically increased in magnitude, moving from women with low to high CHD risk. On average, female high school graduates had 12% lower CHD risk than women with no degree (−15% to −8%). At the 10th percentile, female high school graduates had a 4% lower CHD risk (−9% to 1%) than women who had not completed high school. At the 90th percentile, female high school graduates had a 17% lower CHD risk (−24% to −8%) than women with less than a high school degree. Similarly, among female college graduates, the reduction in CHD risk was larger at the upper end of the distribution (−15% at the 10th percentile vs. −30% at the 90th percentile).
The disproportionate difference in the regression coefficients at the extremes of the distribution led to a reduction in the spread or scale of the CHD distribution, as well as possible changes in the skewness of the CHD distribution. For example, the interquartile range (75th percentile minus the 25th percentile) in the CHD risk distribution was 8 percentage points lower for NHANES female high school graduates than for women with less than a high school degree. In the Health and Retirement Study sample, women had a comparable but less marked pattern, with education associated with systematically lower CHD risk in the upper end of the distribution (Table 2). This pattern of larger relative effect estimates at higher quantiles of risk necessarily implies that absolute effect estimates are also larger at higher quantiles of risk than at lower quantiles. Among men, estimates from the ordinary least squares regression and conditional quantile regression models were similar, suggesting a uniform decrease in CHD risk throughout the distribution.
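The step from relative to absolute differences can be made concrete with a short calculation. The relative estimates below (4% lower at the 10th percentile and 17% lower at the 90th percentile) are those reported for NHANES female high school graduates; the baseline risk values of 2 and 20 are hypothetical, chosen only to represent a low and a high point of a risk distribution, and the constant of 1 added before log transformation is ignored for simplicity.

```python
def absolute_difference(baseline_value: float, percent_change: float) -> float:
    """Absolute change implied by a percentage change from a baseline value."""
    return baseline_value * percent_change / 100.0


# Hypothetical baseline risk scores at a low and a high percentile.
low_baseline, high_baseline = 2.0, 20.0

# Relative estimates reported for the 10th and 90th percentiles.
abs_low = absolute_difference(low_baseline, -4.0)
abs_high = absolute_difference(high_baseline, -17.0)

# How many times larger the absolute difference is at the high percentile.
ratio = abs_high / abs_low
```

Because the baseline value is larger at higher percentiles by construction, a relative estimate that is at least as large there always translates into a larger absolute difference.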
Female high school graduates in the NHANES sample had mean BMI similar to respondents who had not completed high school (0% [95% confidence interval = −2% to 2%]; Table 3). At the mean and at specific percentiles of the BMI distribution, female college graduates had a BMI approximately 8% lower (−10% to −5%) than those with less than a high school degree. Among NHANES men, a high school degree was not associated with any difference in BMI, compared with those who had less than a high school degree. However, a college degree for men was associated with different directions of effects depending on BMI. Male college graduates at the 10th percentile of the BMI distribution (ie, underweight/normal) had a BMI that was 6% greater (2% to 11%) than men who had not graduated from high school. Conversely, male college graduates who were at the 90th percentile of the distribution (ie, overweight/obese) had a BMI 7% lower than men with less than a high school degree (−10% to −3%). The interquartile range of the BMI distribution for male college graduates was reduced by 8 percentage points. Estimates from the Health and Retirement Study sample were comparable with those from the NHANES sample.
Marginal quantile regression estimates (eTables 1 and 2, http://links.lww.com/EDE/A598) showed patterns similar to conditional quantile regression estimates. Among women, the benefits of a college degree were generally larger at the upper end of the CHD risk distribution (90th percentile) than at the low end (10th percentile). Among men, marginal quantile regression estimates associated with a college degree were fairly uniform throughout the CHD risk distribution. Marginal quantile regression estimates associated with a college degree were constant throughout the BMI distribution for women in both the NHANES and Health and Retirement Study samples. Among men, marginal quantile regression estimates associated with a college degree varied across the BMI distribution, ranging from no difference among the normal weight to a lower BMI at the upper end of the BMI distribution. This trend among men was especially evident in the NHANES sample.
Consistent with a growing body of research, we found educational attainment was associated with lower average 10-year CHD risk and lower BMI. Moreover, these associations varied across the outcome distribution, especially among women. Among women, college degrees were associated with lower CHD risk at all points of the distribution, but effect estimates were largest at the upper end of the distribution. In such instances, conventional linear regression models may underestimate the benefits of education for those at the highest risk. By allowing the effects to vary depending on a person's position on the outcome distribution, quantile regression models present a more comprehensive picture of the covariate effects on the whole distribution. The larger decreases at the upper end of the distribution suggest that education is associated with important changes in the spread (shape) of the distribution. Differential associations showing larger estimated benefits for CHD risk among well-educated women imply that education decreases the within-level inequality for CHD risk and BMI. Similar patterns in our marginal quantile regression estimates suggest that increasing the proportion of persons with college degrees may affect the distribution of CHD risk and BMI in the population.
In his classic example, Geoffrey Rose showed that shifting the distribution curve of a single risk factor (eg, blood pressure or cholesterol) by a small amount has a greater effect on mortality rates than does treating only the people with high levels of that risk factor.27 Rose's population strategy of prevention hinges on the assumption that the entire distribution would shift to the left, ie, a location shift. Alternatively, prevention strategies may exacerbate health inequalities if the risk levels for the most advantaged groups are reduced while the risk levels of the most disadvantaged groups remain unchanged. In other words, inequalities could grow if an intervention affected the shape of the distribution but left the risk level of high-risk (and disadvantaged) groups unchanged. Our results suggest that college education may be associated both with decreases in the population mean and with changes in the shape of the risk distribution. Generally, we found larger changes associated with high levels of education among those at high risk (ie, in the upper end of the distribution), especially among women. Education may have a larger impact on groups who are disadvantaged and have the highest health risk because these groups may depend more on education to achieve well-being.28 This impact applies to both the relative and absolute scale. The finding of larger relative effect estimates at higher than at lower quantiles necessarily implies larger absolute effects at higher than at lower quantiles. Many of our findings would therefore be even more marked if expressed in absolute terms. Because absolute effect estimates are often preferred when evaluating the total public health impact of an intervention, this highlights the importance of considering effects across the outcome distribution when considering public health impact.
Finally, similarities in the results from models using NHANES and the Health and Retirement Study suggest that the associations between educational attainment and health persist in differing social and historical contexts. Although the Health and Retirement Study respondents' educational experience most likely differed from that of the majority of the NHANES respondents, patterns of the associations between college education and CHD risk and BMI were similar.
Our study has several limitations. First, educational choices reflect complex decision making, and these choices are likely to be strongly correlated with unobserved factors such as cognitive ability and time preference.29 Second, self-selection into higher education, which is not addressed in this study, would also bias the results. Third, because the less educated group may have a higher mortality rate, leading to right-truncated distributions of BMI and CHD risk, and because the Framingham score may also underestimate CHD risk in socioeconomically disadvantaged groups,30 our results may underestimate effects of higher education in the high-risk population. Finally, any association may have limited generalizability because educational effects are context dependent.
CVD remains a leading cause of morbidity and mortality in the United States,15 and there are persistent socioeconomic disparities in CVD burden and mortality.31 Reducing health disparities and improving overall population health are 2 major components of the Healthy People 2020 objectives.32 To reach these goals, it will be necessary to understand variability in health outcomes both between and within subpopulations at various levels of risk. Substantively, quantile regression results may be more helpful for policy makers than results from standard regression models because people in the upper tail of the risk distributions bear the greatest burden of health risk. Evaluations of the population health consequences of proposed interventions or policy changes should assess whether there are differences in how various parts of the distribution respond to proposed interventions.
1. Rose G, Khaw K, Marmot M. Rose's Strategy of Preventive Medicine. Oxford: Oxford University Press; 2008.
2. Frohlich KL, Potvin L. The inequality paradox: the population approach and vulnerable populations. Am J Public Health. 2008; 98: 216–221.
3. Manuel DG, Rosella LC. Commentary: assessing population (baseline) risk is a cornerstone of population health planning–looking forward to address new challenges. Int J Epidemiol. 2010; 39: 380–382.
4. McLaren L, McIntyre L, Kirkpatrick S. Rose's population strategy of prevention need not increase social inequalities in health. Int J Epidemiol. 2010; 39: 372–377.
5. Emberson J, Whincup P, Morris R, Walker M, Ebrahim S. Evaluating the impact of population and high-risk strategies for the primary prevention of cardiovascular disease. Eur Heart J. 2004; 25: 484–491.
6. Zulman DM, Vijan S, Omenn GS, Hayward RA. The relative merits of population-based and targeted prevention strategies. Milbank Q. 2008; 86: 557–580.
7. Rehkopf DH, Krieger N, Coull B, Berkman LF. Biologic risk markers for coronary heart disease: nonlinear associations with income. Epidemiology. 2010; 21: 38–46.
8. Ferrer RL, Palmer R. Variations in health status within and between socioeconomic strata. J Epidemiol Community Health. 2004; 58: 381–387.
9. Ahern J, Hubbard A, Galea S. Estimating the effects of potential public health interventions on population disease burden: a step-by-step illustration of causal inference methods. Am J Epidemiol. 2009; 169: 1140–1147.
10. Arnold BF, Khush RS, Ramaswamy P, et al. Causal inference methods to study nonrandomized, preexisting development interventions. Proc Natl Acad Sci. 2010; 107: 22605–22610.
11. Grundy SM, Pasternak R, Greenland P, Smith S Jr, Fuster V. Assessment of cardiovascular risk by use of multiple-risk-factor assessment equations: a statement for healthcare professionals from the American Heart Association and the American College of Cardiology. Circulation. 1999; 100: 1481–1492.
12. Berger JS, Jordan CO, Lloyd-Jones D, Blumenthal RS. Screening for cardiovascular risk in asymptomatic patients. J Am Coll Cardiol. 2010; 55: 1169–1177.
13. Barlow SE, Expert Committee. Expert committee recommendations regarding the prevention, assessment, and treatment of child and adolescent overweight and obesity: summary report. Pediatrics. 2007; 120: S164–S192.
14. National Heart, Lung, and Blood Institute and The National Institute of Diabetes and Digestive and Kidney Diseases. The Clinical Guidelines on the Identification, Evaluation, and Treatment of Overweight and Obesity in Adults: The Evidence Report. Washington, DC: National Institutes of Health; 1998. NIH Publication No. 98–4083.
15. Heron M, Hoyert D, Murphy S, Xu J, Kochanek K, Tejada-Vera B. Deaths: final data for 2006. National Vital Statistics Reports. Vol. 57, No. 14. Hyattsville, MD: National Center for Health Statistics; 2009.
16. Wilson PWF, D'Agostino RB, Sullivan L, Parise H, Kannel WB. Overweight and obesity as determinants of cardiovascular risk: the Framingham experience. Arch Intern Med. 2002; 162: 1867–1872.
17. Hubert H, Feinleib M, McNamara P, Castelli W. Obesity as an independent risk factor for cardiovascular disease: a 26-year follow-up of participants in the Framingham heart study. Circulation. 1983; 67: 968–977.
18. Franks PW, Hanson RL, Knowler WC, Sievers ML, Bennett PH, Looker HC. Childhood obesity, other cardiovascular risk factors, and premature death. N Engl J Med. 2010; 362: 485–493.
19. Roskam A-JR, Kunst AE, Van Oyen H, et al. Comparative appraisal of educational inequalities in overweight and obesity among adults in 19 European countries. Int J Epidemiol. 2010; 39: 392–404.
20. McLaren L. Socioeconomic status and obesity. Epidemiol Rev. 2007; 29: 29–48.
21. Strand B, Tverdal A. Trends in educational inequalities in cardiovascular risk factors: a longitudinal study among 48,000 middle-aged Norwegian men and women. Eur J Epidemiol. 2006; 21: 731–739.
22. Loucks E, Abrahamowicz M, Xiao Y, Lynch J. Associations of education with 30 year life course blood pressure trajectories: Framingham offspring study. BMC Public Health. 2011; 11: 139.
23. Juster F, Suzman R. An overview of the health and retirement study. J Hum Resour. 1995; 30: S7–S56.
25. Firpo S, Fortin NM, Lemieux T. Unconditional quantile regressions. Econometrica. 2009; 77: 953–973.
26. Hao L, Naiman D. Quantile Regression. Thousand Oaks: Sage Publications; 2007.
27. Rose G, Khaw K, Marmot M. Rose's Strategy of Preventive Medicine. Oxford: Oxford University Press; 2008.
28. Ross C, Mirowsky J. Sex differences in the effect of education on depression: resource multiplication or resource substitution? Soc Sci Med. 2006; 63: 1400–1413.
29. Farrell P, Fuchs VR. Schooling and health: the cigarette connection. J Health Econ. 1982; 1: 217–230.
30. Brindle PM, McConnachie A, Upton MN, Hart CL, Davey Smith G, Watt GC. The accuracy of the Framingham risk-score in different socioeconomic groups: a prospective study. Br J Gen Pract. 2005; 55: 838–845.
31. Braveman PA, Cubbin C, Egerter S, Williams DR, Pamuk E. Socioeconomic disparities in health in the United States: what the patterns tell us. Am J Public Health. 2010; 100: S186–S196.
© 2012 Lippincott Williams & Wilkins, Inc.