Nyman, John A. PhD; Barleen, Nathan A. BA; Dowd, Bryan E. PhD
With the costs of health care rising dramatically, many employers have implemented comprehensive health promotion programs at the worksite to reduce the health expenditures of their employees and therefore their health insurance costs. Health promotion programs often are justified on the basis of studies that show that a significant proportion of health care expenditures are attributable to modifiable lifestyle risks, such as obesity and smoking,1–3 as well as studies that show that an even larger proportion of expenditures are attributable to the management of chronic health conditions and diseases, such as lower back pain and diabetes.4,5 Thus, comprehensive health promotion programs are designed to help employees reduce modifiable risks (eg, quit smoking) and better manage health conditions (eg, achieve acceptable blood glucose levels in diabetics) with the expectation that these actions will result in reduced health care expenditures and insurance costs, and at the same time, improve the health, absenteeism rates and productivity of their employees.
For employers, expenditures on comprehensive worksite health promotion programs are often viewed as an investment. Given this perspective, employers naturally are interested in whether such an investment yields a return. As a result, a number of economic evaluations—return-on-investment and cost-benefit analyses—have been performed on these programs. Over the years, this literature has expanded greatly and now has been reviewed by a number of authors.6–15 These reviews generally conclude that: 1) comprehensive worksite health promotion programs usually, but not always, produce a substantial return-on-investment, and 2) substantial variability exists in the methodological quality of the studies.
The central methodological issue in these studies is selection bias. Selection bias occurs because those who participate in the health promotion programs are often fundamentally different from those who do not. If selection bias is present, simply comparing the health care expenditures of participants to those of non-participants would not produce reliable results because it would capture the effect of the program on expenditures confounded by the effect of any systematic differences in the characteristics of the two groups. For example, if those who participate in a disease management (DM) program have lower costs because they are also better managers of their health condition, the health promotion program would seem to be more effective than it actually is.
One way to address this issue is to assign workers at random to the health promotion program or control group, and then evaluate the difference in spending between the two groups. Randomized data, however, are usually not available because worksite health promotion programs are typically provided to employees on a voluntary, rather than assigned, basis. Moreover, some have questioned whether randomized studies should be regarded as the gold standard in this type of research16,17 and others have actually viewed randomized, controlled trials as inappropriate for the evaluation of worksite health promotion programs.11,18
In the absence of randomized studies, some of the effects of selection can be eliminated by including observed worker characteristics as variables in a regression analysis of health care expenditures. Accordingly, regression analysis has been used in a number of studies, but in a recent article, Ozminkowski and Goetzel19 noted the inability of a single regression equation to deal adequately with selection and recommended instead the two-step propensity score approach.20–22 The propensity score approach, however, may still generate biased results because it does not capture the effect of unobserved differences between the participants and non-participants. For example, the employee’s level of motivation to quit smoking may be a major factor in the success of a lifestyle management (LM) program and may also determine whether the employee participates in the program in the first place, but because the motivational intensity of workers is typically not observed (that is, not measured), the regression equation would not be able to hold this factor constant.
In the present article, a differences-in-differences (DD) model with random effects is used to deal with selection and capture unobserved differences in subjects in the intervention and non-intervention groups that might be correlated with health care expenditures, the dependent variable in our analysis. This approach controls for unobserved sources of variability in the dependent variable by using repeated observations of that variable from the preintervention and intervention periods. This technique has also been shown to compare favorably to the propensity score approach and to produce results that are similar to the instrumental variables approach.23–27
In this article, the health promotion program that was implemented by the University of Minnesota in 2006 is evaluated using DD regression equations with random effects to isolate the effect of the health promotion program on health care expenditures. In the next section, this health promotion program is described, as are the data and the DD methodology. Then, the results of the DD analysis are presented. In the final section, the return-on-investment findings of the study and its limitations are discussed.
Health Promotion Program
The University of Minnesota implemented a health promotion program at the beginning of 2006. Eligibility for this program was determined by whether an employee was enrolled in any of the health plans offered by the University at the time. If so, the employee, employee’s spouse and any dependents age 16 or older would be eligible for the program.
The health promotion program had six main components in 2006. First, a web-based wellness assessment (WA) survey was made available to employees, spouses and dependents over age 16, with a $65 incentive payment for completion of the questionnaire (or any of the other components) by the employee or spouse.
Second, if (as a result of this survey) a respondent was identified as having a modifiable risk factor (such as high blood pressure, high cholesterol, fatigue, improper nutrition, overweight, physical inactivity, stress, or alcohol or tobacco use), a health coach would contact the respondent to determine whether he or she would like to participate in an LM program. If the respondent chose to participate, the coach would provide motivational support and one-on-one guidance on the phone over a period of 6 months or a year with the goal that the participant would make beneficial lifestyle changes.
Third, if, as a result of either the WA survey or a review of the person’s internal health care claims records, the respondent was identified as having a chronic disease (such as asthma/chronic obstructive pulmonary disease/lung disease, arthritis, coronary artery disease, congestive heart failure, depression, diabetes, gastrointestinal disease, low back pain, migraine, osteoporosis, or stroke), the respondent would receive a letter and phone call inviting him or her to team with a health coach (often a registered nurse or dietician) to develop a DM program tailored to his or her illness. In this regard, it is important to note that participant eligibility was not a function of the level of preintervention expenditures, which would have introduced regression to the mean.
Fourth, a 10,000 steps program was implemented to increase a participant’s ambulatory activity during the day, again with a $65 achievement incentive. The 10,000 steps program was not evaluated in this study because it was implemented only during the last 2 months of 2006, as opposed to the entire year for the other programs.
Fifth, a nurse consultation telephone line also was established for anyone at the University who wanted additional health information, regardless of whether they participated in the health promotion program or not. Because it was not specific to the health promotion program, the nurse line was not evaluated as a separate program, either.
Sixth, the University also made available to all employees an internet-based resource for written health and wellness information. Again, because it was not specific to the health promotion program, this component was not evaluated.
The WA was administered by StayWell Health Management of St. Paul, Minnesota and the LM and DM programs were originally administered by Harris HealthTrends of Toledo, Ohio, which became part of Healthways, Inc, of Franklin, Tennessee in 2007. The 10,000 steps program was sponsored in part by HealthPartners, Inc, a local health plan which was one of the choices of health insurers for University employees. HealthPartners also paid for the distribution of free pedometers to participants.
Payments to the firms for these programs were made on a fixed basis for the WA program and on a marginal basis for the LM and DM programs, meaning that the health promotion firm received greater payments for enrolling additional employees. The marginal payment structure may have represented an incentive to enroll employees.
The disaggregated costs of the various components of the University of Minnesota’s program appear in Table 1. Although the effectiveness of the nurse line and internet-based health literature in reducing costs was not evaluated, these components were regarded as such integral parts of LM and DM programs that their costs were included in return-on-investment calculations. The nurse line, internet-based health literature and the WA cost the University $596,400 in 2006. The LM and DM cost the University $724,673 in 2006. In addition, internal programming and administration of the program cost $235,490 and the incentive payments for completing various components totaled $642,358 in 2006. The cost of the entire program totaled $2,198,921.
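The component costs reported above can be checked against the stated program total with a few lines of arithmetic. The following is only a minimal sketch; the dictionary keys are illustrative labels, not names taken from Table 1.

```python
# 2006 program cost components as reported in the text (US dollars).
# The key names are illustrative labels, not the headings of Table 1.
program_costs_2006 = {
    "nurse_line_web_literature_and_wa": 596_400,
    "lm_and_dm": 724_673,
    "internal_programming_and_administration": 235_490,
    "incentive_payments": 642_358,
}

# Summing the four components should reproduce the reported program total.
total_cost_2006 = sum(program_costs_2006.values())
print(total_cost_2006)  # 2198921
```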
Expenditure data were drawn from individual claim level data for inpatient, medical, and prescription drug claims to University-sponsored health plans from 2004 to 2006. Expenditures were summed for each individual within each year. In keeping with our intent to calculate the economic return to the University from this investment, the expenditures included only the amount that was covered by the health plans or paid for out of the University’s self-insurance account. The expenditures, therefore, excluded costs that were paid for by claimants, such as insurance copayments or deductible amounts. Summations of expenditures relied on the date when services were rendered rather than the date when the claim was paid to determine the appropriate time period for each claim amount.
An eligibility file included some demographic information for individuals. It also contained information on whether individuals were covered by a qualifying health plan each month. This eligibility information allowed us to separate individuals who were covered by a qualifying plan and had no qualifying expenditures (zero expenditures) from individuals that were not covered in a given time period and therefore had no qualifying expenditures (missing expenditures). To reduce the variability in the expenditure variable, the analysis included only those who were enrolled and eligible for the entire year 2006. For the same reason, the sample was limited to those who had at least one entire year of expenditures in either 2005 or 2004. A total of 21,124 individuals were included in the analysis.
Program participation information was drawn from Harris HealthTrends data files. Individuals were counted as having participated in the WA if they were included in the Harris HealthTrends wellness information file. The number of people in the sample who completed a WA in 2006 was 5918. Participation in the LM and DM programs was more difficult to define. Individuals with one or more risks (the criterion for eligibility in LM) and one or more diseases/conditions (the criterion for eligibility in DM) were identified, as well as whether the individual enrolled in, completed, or disenrolled from the program, or whether the risk or condition was not addressed. For LM and DM, individuals were counted as having participated if they had participated in the program in any way (enrolled in, completed, or disenrolled from) during 2006 for at least one risk or condition. This definition yielded a total of 1946 individuals in the sample who participated in LM in 2006 and 1044 individuals who participated in DM.
As stated, a DD regression analysis with random effects was used to account for selection. The dependent variable in these regression equations was the individual’s average monthly expenditures in 2004, 2005, and 2006. For those individuals enrolled during an entire year, the aggregated expenditures for that year were divided by 12. For those with partial year enrollments in 2004 or 2005, the expenditures for the year were divided by the number of months of enrollment.
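The construction of the dependent variable can be illustrated with a short sketch. The function name and the dollar amounts below are hypothetical and serve only to show the division rule described above.

```python
def average_monthly_expenditure(annual_total, months_enrolled=12):
    """Average monthly spending for one person-year.

    annual_total: summed plan-paid expenditures for the year (dollars).
    months_enrolled: months of coverage in that year (12 for a full year).
    """
    if not 1 <= months_enrolled <= 12:
        raise ValueError("months_enrolled must be between 1 and 12")
    return annual_total / months_enrolled


# Hypothetical person-years: a full year and a six-month partial enrollment.
full_year = average_monthly_expenditure(3600.0)
partial_year = average_monthly_expenditure(1800.0, months_enrolled=6)
print(full_year, partial_year)  # 300.0 300.0
```

Dividing by months of enrollment puts partial-year and full-year observations on the same monthly scale.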
Explanatory variables, such as gender, age group, presence of specific risk factors, overall risk score and presence of specific chronic diseases, were also included in the equations. The DD specification included qualitative (dummy) variables to account for 1) whether the program was in place at the time—a variable representing the year 2006 compared with years 2005 and 2004, 2) whether the enrollee had participated in the WA, LM or DM programs, and 3) the interaction between the two. The dummy variables representing the year accounted for any time trend on expenditures. The coefficient on the program participation variable captures the effect of any unobserved differences between participants and non-participants in their health care expenditures. The interaction term between participation in the program and the year 2006 captures the difference in effect of the health promotion program on expenditures for participants versus non-participants. This is the effect of the health promotion program intervention. Theoretically, a remaining source of selection bias still might exist if those who participated had a greater response to the program. If such a response bias existed, this statistical approach would not eliminate it.
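The logic of the interaction term can be seen in a stripped-down setting. The sketch below is not the model estimated in this study: it omits the covariates and the random effect and simply computes the difference in differences from four group means, which is what the interaction coefficient equals in a saturated two-group, two-period regression. All data values and names are hypothetical.

```python
from statistics import mean


def did_estimate(records):
    """Difference-in-differences from raw group means.

    records: iterable of (participant, post_period, expenditure) tuples,
    where participant and post_period are booleans.
    """
    cells = {}
    for participant, post_period, spend in records:
        cells.setdefault((participant, post_period), []).append(spend)
    m = {cell: mean(values) for cell, values in cells.items()}
    change_participants = m[(True, True)] - m[(True, False)]
    change_nonparticipants = m[(False, True)] - m[(False, False)]
    return change_participants - change_nonparticipants


# Hypothetical monthly expenditures (dollars).
hypothetical_records = [
    (True, False, 400.0), (True, False, 420.0),    # participants, pre-period
    (True, True, 380.0), (True, True, 400.0),      # participants, 2006
    (False, False, 300.0), (False, False, 320.0),  # non-participants, pre-period
    (False, True, 330.0), (False, True, 350.0),    # non-participants, 2006
]
print(did_estimate(hypothetical_records))  # -50.0
```

In this toy example participants spend more than non-participants in both periods, a level difference that the participation dummy would absorb, while the estimate of -50 reflects only the differential change over time.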
Three separate analyses were performed, corresponding to the three components—WA, LM, and DM—of the health promotion program. Those individuals who would have been eligible to participate in a given intervention, but did not participate, were used as the comparison groups for each analysis.
There are well-known complexities associated with analyzing medical expenditure data. Specifically, the distribution of medical expenditures frequently is heavily skewed and typically includes a large number of zeros. Two-part models often are used to deal with the large number of zero observations. In a two-part model, a probit regression first is run to estimate the effect of the intervention on the probability of having medical expenditures greater than zero. Then a second regression model is run to estimate the effect of the intervention on the magnitude of medical expenditures, given that expenditures are greater than zero. Within a two-part model, the analyst still must correctly model the skewed distribution of medical expenditures. Manning and Mullahy address the best way to model skewed medical expenditure data and the necessity of using two-part models to deal with zero observations.28 We relied on the recommendations in this article in selecting appropriate models.
A random effect was used to capture variation in medical expenditures that was specific to the individual. The random effect is modeled as a random intercept, meaning that it modifies the constant term to provide an individual-specific intercept. These random effects are assumed to be normally distributed with a mean of zero and variance of $\sigma_B^2$. Use of such a model requires the assumption that the random effect is uncorrelated with any of the independent variables in the model.29 A Hausman test of the independence of random effects and the regressors in the model failed to reject the random effects specification.
The Hausman test and all random effects regressions were performed using Stata 9.2. The specific analytical approach used for each of the three interventions is discussed below.
Wellness Assessment Methods
The first analysis was of the effect of the WA on monthly expenditures, specifically, whether and by how much those who filled out the WA in 2006 had lower monthly health care expenditures than those who did not fill out the assessment. The WA was primarily designed to identify those who would be invited to join the LM or DM programs by virtue of the presence of a modifiable risk factor or chronic disease. As a result, the WA in itself was unlikely to reduce health care costs. Nevertheless, it is possible that a beneficial effect on health care costs might have occurred if an employee realized through the WA that he or she had an improvable lifestyle condition or DM issue and undertook measures that reduced costs outside of participation in the LM or DM programs. The effect of the WA on costs was evaluated because of this possibility.
As mentioned earlier, the WA was completed by 5918 individuals from the sample in 2006. The 15,009 individuals in the sample who were eligible to complete a WA but did not were used as the comparison group. The probit equation used to predict the effect of completing the WA evaluation on the probability of having positive medical expenditures is specified as:
where $Z_{it}^{*}$ is an unobserved index variable, corresponding to the first part of the two-part model described above. In the data, the observed value of $Z_{it}^{*}$ (that is, $Z_{it}$) takes on the value 1 if individual $i$ had positive medical expenditures in year $t$, and 0 if he or she had no medical expenditures. $X$ is a vector of $k$ dummy variables for gender and for the age group that the individual falls into in each of the 3 years, with age 16 to 24 being the excluded category. The variables T5 and T6 are indicators for the years 2005 and 2006, respectively, with 2004 being the excluded category. DM, LM and WA represent whether the individual ever participated in each of the three main components of the health promotion program. The model also includes interactions between each of the participation variables and T6, the intervention-period variable. DM and LM variables are included as a control because many of the individuals that completed a WA also participated in either DM or LM programs. Including these other interventions would isolate the effect of completing a WA on health care expenditures from effects of the other components. Finally, $\varepsilon_i$ is individual $i$’s specific random intercept and $u$ is the normally distributed error term.
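Written out from the variable definitions above, Equation 1 would take a form along the following lines. The coefficient symbols ($\beta$, $\gamma$, $\delta$, $\theta$) and the ordering of terms are assumed notation chosen for this sketch; only the variables themselves come from the description in the text.

```latex
Z_{it}^{*} = \beta_0 + \sum_{k} \beta_k X_{kit}
           + \gamma_5\, T5_t + \gamma_6\, T6_t
           + \delta_1\, DM_i + \delta_2\, LM_i + \delta_3\, WA_i
           + \theta_1\, (DM_i \times T6_t)
           + \theta_2\, (LM_i \times T6_t)
           + \theta_3\, (WA_i \times T6_t)
           + \varepsilon_i + u_{it}
\tag{1}
```

In this notation, $\theta_3$ is the coefficient of interest for the WA analysis, while $\theta_1$ and $\theta_2$ serve as controls for concurrent DM and LM participation.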
The second part of the two-part model estimates the effect of the WA on the magnitude of medical expenditures given that expenditures are positive. Manning and Mullahy28 provide a useful algorithm for choosing an appropriate procedure for estimating the positive medical expenditures. Following this algorithm, a preliminary generalized linear model (GLM) was fitted and residuals were obtained in both the log scale and the raw dollar scale. Examining the kurtosis of the log scale residuals showed that the tails of the distribution were somewhat thicker than the normal distribution, but not dramatically so. Manning and Mullahy28 found that thick tails result in a loss of efficiency in GLM models. A modified Park test showed the variance function to be quadratic in the raw scale prediction, indicating the appropriateness of both 1) a GLM using a log link function and the gamma family and 2) an ordinary least squares (OLS) regression on the log of expenditures.30 OLS regression on the log of expenditures, however, requires a retransformation to dollars which may result in bias in the presence of heteroscedasticity.28 Both the GLM and OLS model using the log of expenditures were estimated.
Equation 2 shows the model estimated for the second part of the two-part model.
Equation 2 is identical to Equation 1, except that here E is either the natural log of average monthly expenditures in the case of the OLS model or average monthly expenditures in dollars for the GLM estimations.
In this two-part model, the marginal effect of the program would be calculated as the mean difference between 1) predicted expenses when the interaction term is set to zero and 2) predicted expenses when the interaction term is set to one. Predicted expenditures would be calculated by multiplying the predicted probability that an individual has positive expenditures by the predicted expenditures given that they are positive.
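The calculation described in this paragraph can be sketched as follows, assuming a probit first part and a log-link second part. The linear-index values and coefficients are hypothetical placeholders rather than estimates from this study, and the sign convention (interaction on minus interaction off) is a choice made for the sketch.

```python
import math


def normal_cdf(x):
    """Standard normal CDF, used to turn a probit index into a probability."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_expenditure(probit_index, log_mean_if_positive):
    """P(spending > 0) multiplied by E[spending | spending > 0]."""
    return normal_cdf(probit_index) * math.exp(log_mean_if_positive)


def marginal_effect(people, probit_interaction, second_part_interaction):
    """Mean change in predicted spending when the interaction term is
    switched from 0 to 1, holding everything else fixed.

    people: list of (probit_index, log_mean_if_positive) pairs evaluated
    with the interaction term set to zero.
    """
    diffs = []
    for probit_base, log_mean_base in people:
        off = expected_expenditure(probit_base, log_mean_base)
        on = expected_expenditure(probit_base + probit_interaction,
                                  log_mean_base + second_part_interaction)
        diffs.append(on - off)
    return sum(diffs) / len(diffs)


# Hypothetical linear indexes for three individuals (interaction set to zero).
hypothetical_people = [(1.2, 5.5), (0.8, 6.0), (1.5, 5.0)]
effect = marginal_effect(hypothetical_people, 0.0, -0.1)
print(round(effect, 2))
```

Because both parts are nonlinear, the effect is computed person by person and then averaged, rather than read directly from a single coefficient.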
Lifestyle Management Methods
A total of 1946 people in the sample participated in the LM intervention during 2006. The comparison group consisted of 77 individuals who were identified as having at least one health risk but did not participate at all in the LM program. The proportion of observations with zero expenditures, approximately 6%, was lower than in the first analysis. Buntin and Zaslavsky31 suggest that GLMs account well for this proportion of observations with zero expenditures, making a two-part model unnecessary. The Manning and Mullahy algorithm28 was applied and it suggested that the most appropriate model would be a GLM with a log link function using the gamma family to describe the variance function. This model was fitted, as well as an OLS model on log expenditures, for observations with positive expenditures.
For those invited to participate in LM, the equation estimated was
where $E_{it}$ in Equation 3 is average monthly health care spending by LM program invitee $i$ in year $t$, and $X$ represents $k$ observed explanatory variables including gender, the series of age range categories, and dummy variables for each of the risk factors identified for that individual in the WA survey in 2006. The coefficients on the T’s capture the time trend in expenditures, the coefficient on LM reflects the effect that participation in this program had on expenditures over all 3 years, and the coefficient on the interaction term captures the effect of participation in the LM program in the year it was implemented on spending, which is the result of interest. The individual’s specific random intercept is $\varepsilon_i$ and $u$ is the error term. It should be noted that because changing one’s lifestyle is unlikely to occur immediately or to generate immediate health effects when it does occur, participation in the LM program is unlikely to have a noticeable effect on health care expenditures in the initial year.
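Written out from these definitions, Equation 3 would take roughly the following form. The coefficient symbols are assumed notation for this sketch, and the left-hand side should be read as entering through a log link in the GLM and as the natural log of $E_{it}$ in the OLS version.

```latex
E_{it} = \beta_0 + \sum_{k} \beta_k X_{kit}
       + \gamma_5\, T5_t + \gamma_6\, T6_t
       + \delta\, LM_i
       + \theta\, (LM_i \times T6_t)
       + \varepsilon_i + u_{it}
\tag{3}
```

Here $\delta$ absorbs time-invariant differences between eventual LM participants and non-participants, and $\theta$ is the differences-in-differences estimate of the program effect in 2006.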
Disease Management Methods
A total of 1044 individuals participated in some fashion in the DM program. The comparison group for this analysis consisted of 333 individuals identified as having at least one disease or condition that qualified for DM but who did not enroll at any time in that program. Less than 4% of observations had zero expenditures. Again, the model selection algorithm28 suggested that the most appropriate model would be a GLM with a log link and using the gamma family. This model was fitted, as was the OLS model on the natural log of expenditures.
For those who were invited to participate in DM, the regression equation was specified as:
The variables in Equation 4 correspond to those in Equation 3, except that DM participation is substituted for LM participation and the various disease/condition categories are included (diabetes being the excluded category) as qualitative explanatory variables in addition to the gender and age categories. The only component of the health promotion program that appears to be capable of effective health care cost reduction in the initial year is the DM component. This is because better management of a disease or condition often is focused on reducing management errors that would directly lead to additional health care use.
As mentioned, for all three of these analyses, the predicted expenditures could be calculated directly in dollars from the GLM model, but the OLS model on log expenditures first requires a retransformation to dollars. This retransformation necessitates the calculation of a smearing factor, which is consistent if the error term is homoscedastic but biased in the presence of heteroscedasticity.28 A two-smearing-factor retransformation was employed in our analyses, as suggested by Buntin and Zaslavsky,31 with one smearing factor for individuals in the top decile of expenditures and a second smearing factor for the remainder of individuals. Because heteroscedasticity was present, the two-smearing-factor retransformation was expected to improve the robustness of the estimates of the marginal effect of health promotion when the OLS model was used.
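A minimal sketch of the retransformation follows. It uses the standard smearing estimator (the mean of the exponentiated log-scale residuals) computed separately for two groups. How individuals are assigned to the top decile is left to the caller and is an assumption of this sketch; the function names are hypothetical.

```python
import math


def smearing_factor(log_residuals):
    """Smearing factor: the mean of exp(residual) on the log scale."""
    return sum(math.exp(r) for r in log_residuals) / len(log_residuals)


def retransform_two_factor(log_predictions, log_residuals, in_top_decile):
    """Convert log-scale OLS predictions to dollars using one smearing
    factor for the top-decile group and another for everyone else.

    in_top_decile: list of booleans flagging top-decile individuals
    (the flagging rule itself is supplied by the analyst).
    """
    top = [r for r, flag in zip(log_residuals, in_top_decile) if flag]
    rest = [r for r, flag in zip(log_residuals, in_top_decile) if not flag]
    if not top or not rest:
        raise ValueError("both groups must contain at least one residual")
    s_top, s_rest = smearing_factor(top), smearing_factor(rest)
    return [math.exp(p) * (s_top if flag else s_rest)
            for p, flag in zip(log_predictions, in_top_decile)]


# With zero residuals both factors equal 1, so predictions are simply exp(p).
dollars = retransform_two_factor([0.0, 1.0], [0.0, 0.0], [False, True])
```

Allowing the factor to differ between groups is one way to accommodate residual variance that rises with the level of expenditures, which a single factor cannot capture.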
Wellness Assessment Results
Table 2 shows the descriptive statistics for the sample of those who were employed and enrolled in a health plan during all of 2006 and at least one entire previous year. About 28% of those in the sample completed the WA, and these participants represented only employees (no spouses or dependents) in 2006. Of those who completed the WA, one-third participated in the LM program. The 1044 who participated in the DM program were partially identified by the WA and partially by claims data. Those who participated in the LM and DM programs tended to be older. Being stressed was the most common health risk and diabetes was the least. Health risks generally were more prevalent for those who participated in the programs compared with those who simply completed the WA, and their overall risk scores were worse.
The effect of completing the WA on whether or not the employee had any health care expenditure at all is presented in Table 3. The probit equation shows that those who eventually chose to complete the WA were significantly more likely to have any expenditure at all, but that the effect of the WA (in the year it was completed) on the likelihood of any expenditure was not significantly different from zero. Those who eventually participated in the DM and LM programs also had a significantly greater likelihood of any expenditure at all, as did females and those in the higher age groups.
Table 4 shows the random effects DD results for the amount of the expenditure, for anyone with any expenditure at all. The choice to participate eventually in the WA was significantly associated with lower expenditures in the GLM equation, but the effect of the WA program itself on expenditures was not significantly different from zero. Females, those advanced in age, and eventual participants of the DM program had significantly higher expenditures, consistent across the two equations.
Lifestyle Management Results
Table 5 shows the descriptive statistics for those who were invited to join the LM program. Those who participated were more likely to be male than those who did not participate, although females dominated both groups. Those who participated were more likely not to have diabetes and not to smoke than those who did not participate, and those who participated had better total risk scores.
Table 6 presents the DD regression results, again for both OLS and GLM estimation procedures. Eventual participation in an LM program was associated with significantly lower health care expenditures in the OLS equation. The LM program itself seemed to increase expenditures, but the coefficients were insignificant in both equations. Table 6 also shows that for those invited to join the LM program, higher expenditures were associated with being female, advancing age, expenditures in 2005 rather than 2004, and the health risks of asthma, high cholesterol, diabetes, poor nutrition, and lack of physical activity.
Disease Management Results
Table 7 shows the descriptive statistics for those who were identified either through the WA or by claims data as possessing any of 33 specific chronic diseases or conditions. About three-fourths of those identified as having a chronic disease or condition participated in the DM program.
Of the 33 diseases or conditions, information on only 11 (diabetes, asthma/pulmonary disease, coronary artery/cardiovascular disease, congestive heart failure, arthritis, depression, osteoporosis, musculo-skeletal diseases, low-back pain, migraines and gastro-intestinal disease) was collected both from those whose diseases/conditions were identified by the WA and from those identified by a review of their medical claims. Whether the claims-identified DM invitees had each of the other 22 diseases or conditions is not clear and, as a result, the percentages of DM invitees in our sample who had each of these diseases/conditions (as listed in Table 7) are suspect. For this reason, these 22 disease/condition variables were excluded from the regression analyses.
The participation rate in the DM program was greater for those who were identified as having a chronic disease as a result of completing the voluntary WA, compared with those who were identified by a review of claims data. Anecdotal evidence suggests that some of those who were identified by claims data complained about it as an invasion of privacy and may not have participated in the DM program because of anger over the identification process. Those who participated tended to be younger and female.
Table 8 shows the DD results. Those who eventually chose to participate in DM had insignificantly lower health care expenditures, and the effect of the DM itself was to reduce expenditures significantly in the OLS equation. Calculating the effect of DM on costs from the significant coefficient, participation in the DM program would be expected to reduce health care expenditures by $1391 a year for each DM participant. If the insignificant GLM coefficient were used in the calculations, only about $1375 would be saved per participant per year. Table 8 also shows greater expenditures were significantly associated with being female, advancing age, and expenditures in 2006 as opposed to 2004. A number of diseases were found to have significant expenditure effects.
The health promotion program at the University of Minnesota is a multi-component program that was designed as an ongoing program for which 2006 was the initial year. Of the components evaluated, we found that only DM significantly reduced costs, and only according to the OLS model. The OLS estimators are more efficient than the GLM estimators, which may explain the significance. The issue with the OLS results, however, is the accuracy of the retransformation from log dollars to dollars. Using a single smearing factor to retransform is unbiased when the error is homoskedastic, but we found heteroskedasticity in the error. Although our two-smearing-factor retransformation likely reduced the bias, it may not have eliminated it. Therefore, even though the GLM coefficient was not significant, the GLM-based estimate of the cost savings is probably more accurate than the OLS-based estimate, and so the $1375 per enrollee per year from the GLM equation was used to calculate the return-on-investment rather than the $1391 per year from the OLS equation.
Although only 1044 employees were used in the analysis under the more restrictive data requirements for inclusion, in all there were 1253 employees who participated in the DM program in 2006. If DM saved $1375 in expenditure for each participant in 2006, it would have resulted in an estimated cost savings of about $1,722,875. Comparing the cost savings from those who actually participated to the $2,198,921 cost of the health promotion program in 2006, the program would have generated a loss of about $476,046, based on these point estimates. There are, however, a number of factors that would qualify these conclusions.
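The point estimates in this paragraph follow from simple arithmetic, sketched below with the figures reported in the text. The variable names are illustrative, and the ratio in the last line is a derived quantity that is not reported in the text.

```python
# Figures reported in the text.
savings_per_dm_participant = 1375   # GLM-based estimate, dollars per year
dm_participants_2006 = 1253         # all DM participants in 2006
program_cost_2006 = 2_198_921       # total cost of the program, dollars

estimated_savings = savings_per_dm_participant * dm_participants_2006
net_return = estimated_savings - program_cost_2006

print(estimated_savings)  # 1722875
print(net_return)         # -476046

# Savings recovered per dollar spent (derived here, not reported in the text).
savings_per_dollar = estimated_savings / program_cost_2006
print(round(savings_per_dollar, 2))  # 0.78
```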
First, DM may have had a positive effect on the University of Minnesota in ways that are independent of the reduction of health care expenditures. If the DM program reduced absenteeism or increased productivity, and these effects were valued in dollars, their inclusion might have produced a more positive return-on-investment for the University. An investigation of these effects was, however, beyond the scope of the present study.
Second, DM also may have increased the health of participants. The increase in health may have value to the individual that is independent of the health care cost savings and increased productivity that would be part of the University’s return-on-investment calculations. Again, the effect of DM on health was beyond the scope of this study.
Third, the LM program at the University of Minnesota may not have been as effective, not only because insufficient time has passed but also because of the population being studied. Minnesota often ranks at or near the top of state comparisons of measures of health and healthy lifestyle, and anecdotally, there is perhaps no more health-conscious subgroup in the state than University employees. Therefore, the University of Minnesota may not have been as fertile an environment for LM as other worksite settings, helping to explain the lack of significant LM results.
Fourth, some of those who joined the DM program did so in March, April or May of 2006, when a portion of the intervention year had already passed. For this reason, the results may understate the impact of a program implemented over the entire year.
Fifth, Table 7 showed that 18% of the participants in the DM program were identified as being eligible by claims data, whereas 54% of the non-participants were identified as being eligible through review of claims. This evidence is consistent with anecdotes suggesting that some of those who were identified as eligible for DM by a review of claims data regarded it as an invasion of privacy and, because of their anger at this identification method, chose not to participate in DM. With $1375 in cost savings at stake for each eligible employee who becomes a DM participant, rethinking the process by which employees are identified and notified of their DM-eligible status may result in substantial additional cost savings. For example, if all the non-participants who had been invited to join the DM program had participated (and had complied with the program as well as those who chose to participate did), the additional 434 participants would have increased the cost savings by $596,750. An increase in participation of this magnitude would have resulted in the University realizing a positive return on investment.
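The participation scenario can be checked with a brief sketch that uses the figures reported in the text. The break-even count of additional participants is derived here for illustration and is not reported in the study. It assumes that each additional participant would generate the same $1375 in savings at no additional program cost.

```python
import math

# Figures reported in the text (2006 dollars).
SAVINGS_PER_PARTICIPANT = 1375
SHORTFALL = 476_046               # estimated loss at actual participation
INVITED_NON_PARTICIPANTS = 434

extra_savings = SAVINGS_PER_PARTICIPANT * INVITED_NON_PARTICIPANTS
net_with_full_participation = extra_savings - SHORTFALL

# Illustrative derivation: smallest number of additional participants whose
# savings would cover the shortfall, under the assumptions stated above.
break_even_additional = math.ceil(SHORTFALL / SAVINGS_PER_PARTICIPANT)

print(f"Extra savings from 434 more participants: ${extra_savings:,}")
print(f"Net return with full participation: ${net_with_full_participation:,}")
print(f"Additional participants needed to break even: {break_even_additional}")
```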
Sixth, some of the costs attributed to the health promotion program in Table 1 are likely overstated. For example, the cost of the nurse line was $360,893. While a portion of the costs of the nurse line could legitimately be allocated to the health promotion programs because the nurse line helped to generate the change in behavior that resulted in a reduction of health care expenditures, a portion of the nurse line costs covered services that were available to employees who did not participate in the health promotion program. No data were collected on whether the callers were health promotion program participants, so the portion of costs attributable to non-participants is not known. The same is true for other line items in Table 1, including the internet-based health literature ($84,928), the management of the programs ($36,850), the other miscellaneous category ($5144), and the internal programming and administration ($235,490). As a result, the return-on-investment analysis is likely to be a conservative estimate of the portion of program costs that were returned by a reduction in expenditures generated by the effectiveness of the DM program.
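A rough sensitivity check on this point can be sketched as follows. It uses the line-item costs and the $476,046 shortfall reported in the text, and asks what share of these shared costs would have to be attributable to non-participants for the program to break even. The calculation is an illustration, not part of the study. It assumes that the five line items do not overlap and that costs could be reallocated proportionally.

```python
# Line items from Table 1 that served participants and non-participants alike
# (2006 dollars, as reported in the text).
shared_costs = {
    "nurse line": 360_893,
    "internet-based health literature": 84_928,
    "management of the programs": 36_850,
    "other miscellaneous": 5_144,
    "internal programming and administration": 235_490,
}
SHORTFALL = 476_046  # estimated loss when all costs are charged to the program

total_shared = sum(shared_costs.values())

# Share of the shared costs that would have to be attributed to
# non-participants for the remaining cost to equal the estimated savings.
break_even_share = SHORTFALL / total_shared

print(f"Total shared costs: ${total_shared:,}")
print(f"Break-even share attributable to non-participants: {break_even_share:.1%}")
```

Under these assumptions, roughly two-thirds of the shared costs would need to be attributable to non-participants before the point estimate reached break-even.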
Beyond determining whether a return-on-investment could be demonstrated for the University of Minnesota’s worksite health promotion program, this article also demonstrated the usefulness of the random effects DD estimation procedure in eliminating selection bias. This procedure seems to be superior to the propensity score matching procedure and as effective as the instrumental variable approach.23–27
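The logic of a DD comparison can be shown with a minimal two-group, two-period sketch on group means. This is a simplification and not the random effects regression model used in the study. The function name and all expenditure values below are made up for illustration.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Two-group, two-period difference-in-differences on group means."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# Hypothetical mean annual expenditures per employee (not study data).
participants_pre, participants_post = 8000.0, 8200.0
non_participants_pre, non_participants_post = 6000.0, 6900.0

# A naive post-period comparison mixes the program effect with the
# pre-existing difference between the two groups.
naive_difference = participants_post - non_participants_post

# DD subtracts each group's own baseline, removing differences between
# the groups that do not change over time.
dd_estimate = diff_in_diff(
    participants_pre, participants_post,
    non_participants_pre, non_participants_post,
)

print(f"Naive post-period difference: {naive_difference:+.0f}")
print(f"Difference-in-differences:    {dd_estimate:+.0f}")
```

In this made-up example the naive comparison suggests that participants cost more, whereas the DD estimate indicates that their expenditures grew by less than those of non-participants.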
Reviews of the health promotion literature suggest that such programs generally result in a positive return-on-investment. A central issue in this literature is whether researchers have effectively accounted for selection bias in those studies that use data from programs where participation is voluntary. The reviewers of this literature, however, have characterized the studies as being of uneven methodological quality with regard to how successfully they deal with selection.
This article sought to estimate the effect on health care expenditures of the health promotion program adopted by the University of Minnesota in 2006. A DD regression equation model with random effects was used to investigate the impact of this health promotion program on health care expenditure and account for possible selection bias. It was found that, although the DM program reduced health care expenditures, the program did not reduce them sufficiently to yield a positive return-on-investment. A number of factors, however, qualify this conclusion.
The study was reviewed for human subjects content by the University of Minnesota’s Institutional Review Board as study number 08805E32782 and was found to be exempt from review under guidelines 45 CFR Part 46.101(b) category 4, Existing Data; Records Review; Pathological Specimens.
This paper has benefitted from the comments of Dann Chapman, Karen Chapin, Ted Butler, an anonymous referee, John Harris and other participants of a meeting and conference call with Healthways staff, and the participants of the Administrative Working Group seminar at the University of Minnesota.
Any remaining errors or oversights are the responsibility of the authors alone.
This study was funded by Employee Benefits, Office of the Vice President for Human Resources, University of Minnesota.
1. Goetzel RZ, Anderson DR, Whitmer RW, et al. The relationship between modifiable health risks and health expenditures. J Occup Environ Med. 1998;40:843–854.
2. Pronk NP, Goodman MJ, O’Connor PJ, Martinson BC. Relationship between modifiable health risks and short-term health care charges. JAMA. 1999;282:2235–2239.
3. Leutzinger JA, Ozminkowski RJ, Dunn RL, et al. Projecting future medical care costs using four scenarios of lifestyle risk rates. Am J Health Promot. 2000;15:35–44.
4. Goetzel RZ, Hawkins K, Ozminkowski RJ, Wang S. The health and productivity cost burden of the “Top 10” physical and mental health conditions affecting six large U.S. employers in 1999. J Occup Environ Med. 2003;45:5–14.
5. Goetzel RZ, Long SR, Ozminkowski RJ, Hawkins K, Wang S, Lynch W. Health, absence, disability, and presenteeism cost estimates of certain physical and mental health conditions affecting U.S. employers. J Occup Environ Med. 2004;46:398–412.
6. Pelletier KR. A review and analysis of the health and cost-effectiveness outcome studies of comprehensive health promotion and disease management programs. Am J Health Promot. 1991;5:311–315.
7. Pelletier KR. A review and analysis of the health and cost-effectiveness outcome studies of comprehensive health promotion and disease management programs at the worksite: 1991–1993 update. Am J Health Promot. 1993;8:50–61.
8. Pelletier KR. A review and analysis of the health and cost-effectiveness outcome studies of comprehensive health promotion and disease management programs at the worksite: 1993–1995 update. Am J Health Promot. 1996;10:380–388.
9. Pelletier KR. A review and analysis of the clinical and cost-effectiveness studies of comprehensive health promotion and disease management programs at the worksite: 1995–1998 update (IV). Am J Health Promot. 1999;13:333–345.
10. Pelletier KR. A review and analysis of the clinical- and cost-effectiveness studies of comprehensive health promotion and disease management programs at the worksite: 1998–2000 update. Am J Health Promot. 2001;16:107–116.
11. Pelletier KR. A review and analysis of the clinical and cost-effectiveness studies of comprehensive health promotion and disease management programs at the worksite: update VI 2000–2004. J Occup Environ Med. 2005;47:1051–1058.
12. Aldana SG. Financial impact of health promotion programs: a comprehensive review of the literature. Am J Health Promot. 2001;15:296–320.
13. Chapman LS. Meta-evaluation of worksite health promotion economic return studies: 2005 update. Am J Health Promot. 2005;19:1–11.
14. Goetzel RZ, Ozminkowski RJ, Villagra VG, Duffy J. Return on investment in disease management: a review. Health Care Financ Rev. 2005;25:1–19.
15. Goetzel RZ, Ozminkowski RJ. The health and cost benefits of work site health-promotion programs. Annu Rev Public Health. 2008;29:303–323.
16. Concato J, Shah N, Horwitz RI. Randomized controlled trials, observational studies and the hierarchy of research designs. N Engl J Med. 2000;342:1887–1892.
17. Benson K, Hartz A. A comparison of observational studies and randomized, controlled trials. N Engl J Med. 2000;342:1878–1886.
18. World Health Organization. European Working Group on Health Promotion Evaluation, Health Promotion Evaluation: Recommendations to Policymakers. Geneva: World Health Organization; 1998. Cited in Pelletier K. J Occup Environ Med. 2005;47:1051–1058.
19. Ozminkowski RJ, Goetzel RZ. Getting closer to the truth: overcoming research challenges when estimating the financial impact of worksite health promotion programs. Am J Health Promot. 2001;15:289–295.
20. Rosenbaum P, Rubin D. Reducing bias in observational studies using subclassification on the propensity score. J Am Stat Assoc. 1984;79:516–524.
21. Robins J, Mark S, Newey W. Estimating exposure effects by modeling the expectation of exposure conditional on confounders. Biometrics. 1992;48:479–495.
22. Drake C, Fisher L. Prognostic models and the propensity score. Int J Epidemiol. 1995;24:183–187.
23. Heckman J. Shadow prices, market wages, and labor supply. Econometrica. 1974;42:679–694.
24. Heckman JJ. Sample selection bias as a specification error. Econometrica. 1979;47:153–160.
25. Heckman J, Ichimura H, Smith J, Todd P. Characterizing selection bias using experimental data. Econometrica. 1998;66:1017–1098.
26. Stukel TA, Fisher ES, Wennberg DE, Alter DA, Gottlieb D, Vermeulen MJ. Analysis of observational studies in the presence of treatment selection bias: effects of invasive cardiac management on AMI survival using propensity score and instrumental variable methods. JAMA. 2007;297:278–285.
27. Bertrand M, Duflo E, Mullainathan S. How much should we trust difference-in-differences estimates? Q J Econ. 2004;119:249–275.
28. Manning WG, Mullahy J. Estimating log models: to transform or not to transform? J Health Econ. 2001;20:461–494.
29. Greene WH. Econometric Analysis. 5th ed. Upper Saddle River, New Jersey: Pearson Education Inc; 2003.
30. Park R. Estimation with heteroscedastic error terms. Econometrica. 1966;34:888.
31. Buntin MB, Zaslavsky AM. Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. J Health Econ. 2004;23:525–542.