APPENDIX C

Regression Models for Indianapolis, Indiana; Rochester, New York; and Louisville, Kentucky

C.1 Limitations of Regression Analysis

The results of the Indianapolis Public Transportation Corporation (IndyGo), Rochester Regional Transit Service (RTS), and Transit Authority of River City (TARC) health promotion program evaluations, combined with employee data, could not substantiate any statistically significant improvements in health outcomes for participants in health promotion programs as compared with non-participants when it comes to data-driven benefits, measured in this study as changes in absenteeism, workers' compensation, and turnover. The results of the Des Moines Area Regional Transit Authority (DART) health promotion program did show a statistically significant effect associating participation in the program with reduced absenteeism.

The majority of studies have not shown that wellness programs affect the bottom line, as measured by absenteeism and other metrics. A study conducted by the National Bureau of Economic Research of 4,800 university employees showed statistically insignificant differences between those who participated in an incentivized wellness program and those who did not, as measured by sick days and healthcare spending (Jones, Molitor, and Reif 2019). Other studies have also found behavioral changes among participants in healthcare (e.g., rate of hospitalization) but no reduction in overall costs to the employer (Gowrisankaran, et al. 2013). Other analyses reviewing ROIs have found mixed results among health promotion programs (Baxter, Sanderson, Blizzard, and Palmer et al. 2014).

The lack of statistical significance may be due to several factors. First, variables measuring individual health improvements may not be appropriate to obtain a representative measure of program impacts.
Tracking sick days is an obvious potential metric, but it may be that all workers take their allotted sick time, regardless of their health. If employees use all of their days, the metric measures the number of sick days an employee was granted rather than the employee's actual health or the number of days the employee was sick. Possible improvements to this metric, such as adding paid personal days and/or unpaid days to sick days, do not necessarily yield a reliable measure. Employees have every incentive to take any kind of paid leave (e.g., personal leave in addition to sick leave), and unpaid leave could be the result of a one-off period used for a family emergency not related to the employee's personal health.

Workers' compensation could be another promising potential metric, but it has done a poor job for many reasons, including the fact that only a small percentage of the population receives it (5%–8% annually, from IndyGo data). The infrequency of use of workers' compensation can create fluctuations from year to year in the aggregate compensation, particularly if the population is not large enough to smooth this sum out; were there a downward trend in compensation over time, infrequency coupled with a small population would create high variability from year to year and the downward trend might not be detected.

Improving the Health and Safety of Transit Workers with Corresponding Impacts on the Bottom Line

It is possible that an agency's wellness program may have been substantially beneficial, but in addition to the benefits not being picked up by the metrics, the effects of the program may have been diffused. For example, the benefits may include improvements in sleep, lower levels of pain when walking or conducting other activities, less fatigue during the day, and more overall energy; that is, benefits that are spread out to different areas but do not register enough in any one metric to be detected by observable variables. Also, the program may not have been in place long enough to pick up these diffuse effects.

The programs in these case studies were not designed to measure changes in the risk factors of smoking and weight loss, which exact a toll on healthcare costs. Currently, these are intermediate variables that may be measured in outcome variables from annual health insurance claim statements (e.g., those related to respiratory illness or cardiovascular illness).

The benefit of an agency program may be an infrequent event with a large payoff. For example, if a blood pressure screening prevents a potential death, then even one prevention every 20 years would be extremely valuable; however, such an event may not happen during the observed period (in the case of the 5–6 years of the IndyGo and RTS programs). Thus, the duration of the study period was inadequate to capture all the benefits that might accrue to the program.

A program may have been substantially positive, but taken individually, the measurable effects may not have been strong enough to show a significant result, even using ideal health promotion metrics (if they were available). In other words, the participants may have become healthier, but the improvements may not have manifested in ways measured (or easily measurable) by the program. The program may have had positive spillover effects outside the workplace (for example, by improving the quality of leisure time), or it may have contributed to better health habits of family members. This might explain self-reported results, such as surveys of employers and employees that report positive outcomes from the wellness programs when the data-driven metrics do not reflect this.

Self-selection into a program also may affect the estimated effects. Statisticians are always on the lookout for potential self-selection issues. The hope is that unhealthy workers would select into a program at a higher rate than healthier workers, meaning that the program would get more positive results for a given expense of resources than would otherwise be expected if participants were randomly chosen to participate. The healthier workers might feel healthy enough to forgo any health promotion program; therefore, participants would have large gains on a per-employee basis. Instead, it seems that the reverse might happen: employees who are already healthier may have taken advantage of the program, whereas less-healthy employees failed to participate or may even have resisted participating. The reasons for employee participation or non-participation in voluntary programs are variable, and a detailed analysis of those motivations was not within the scope of this research project. This will be a continuing research question to explore in future case studies where participation in the health promotion programs is voluntary.
There are self-selection issues specific to the populations at the different agencies. For IndyGo, an employee receives access to the health promotion program if he or she opts in for insurance. Those employees who do not opt in for insurance do not receive access to the program and act as the control group. As Bushnell notes, those who obtain insurance through their spouse may work for employers offering less generous health benefits and may have different working conditions and health characteristics than those who have spouses with more generous plans (Bushnell, Li, and Landen 2011). Therefore, it is not necessarily the case that those employees who could best take advantage of the programs are the ones who actually do so. If opting for insurance is not correlated with initiative, drive, and enthusiasm (which it likely is not), then the IndyGo non-participants are closer to a classic control group. The analysis would pick up the pure effect of the program. This is good for gauging the effect of the program compared with programs in other localities, but it does not necessarily indicate that the employees who are most likely to benefit from the program participate in it. Any estimated effects are less likely to be detected as statistically significant.

For the RTS groups examined, the participants were the onsite employees who had access to the programs and the gym at the headquarters location, whereas the control group was made up of offsite employees, based on the assumption that they had less access to those options at headquarters. However, offsite employees may have higher rates of utilization of health and wellness activities than was assumed. The higher the offsite groups' utilization, the less likely it is that the true effects of the health promotion programs can be isolated, measured, and deemed a statistically significant improvement.

For TARC and DART, participation in the wellness program appeared to be correlated with enthusiasm for the program's activities, which could mean that the employees who benefited from the program were not the ones who would have received the greatest marginal gain from participation. In other words, self-selected participants were likely more fit to begin with. Such self-selection can be interpreted as a contributing factor for participants in workshops like those at DART or in more rigorous activities at TARC, like boot camps. An analysis by RAND found some meaningful improvements in exercise frequency, smoking, weight loss, and cholesterol control; however, the analyses could not "account for unobservable differences between participants and non-participants such as differential motivation to change behavior" (Mattke et al. 2013). Thus, self-selection could play a role in a result indicating that program participants were more likely to improve their health if these participants were the ones who were more motivated to change behavior. Even if self-selection is related to participation in programs, the gains in outcomes that the programs achieve are material only for those participants. These gains may not, therefore, be associated with the targeted populations most in need of marginal gains (i.e., the population with the most at-risk factors or prevalence of conditions).

C.2 Regression Modeling

Linear regression models designed to test whether a factor changes an outcome (here, whether wellness programs improve health) follow a similar construction. That construction was applied to four of the five metropolitan transit agencies analyzed in this study. The aggregated data from California was examined using a different approach, so it is not addressed in this appendix.
Three key features of the analysis were:

1. The observations that make up the data were at the employee level. That is, each "observation" represented an employee of the transit agency, regardless of whether that employee did or did not participate in a wellness program.
2. Following standard practice, the dependent variable was a measure of the health differences before and after the program was initiated (again, regardless of whether the employee did or did not participate in any of the programs).
3. The key independent variable in the model indicated if (and/or the extent to which) the employee participated in the available program(s). If the coefficient on that variable was statistically significantly different from zero, then that was an indication that the wellness programs benefited employee health.

Factors that are not included will be unobserved by the regression model, which can distort results. Therefore, a carefully designed model includes (that is, controls for) as many factors as possible. In the case of this project, the regression model included as many demographic, geographic, and economic factors as feasible so that effects on health that were due to these factors were not incorrectly credited (for good or bad) to whether or not the employee participated in the wellness program. For example, if younger employees are more likely to participate in a wellness program and young people's outcomes also remain stable for approximately 10 years, then if age is not controlled for (i.e., if age is not included as a variable in the regression), the program will appear less effective than if there had been a more representative mix of people by age. In this research, if a wellness program's effects were expected to differ by age, then an interaction term of the indicator variable times age was added to the model as an independent variable to account for that effect.

The all-new primary source data collected on employee wellness program participation, health conditions, and demographic characteristics were relatively similar across the four major metropolitan transit agencies examined. Information was available on sick days (and other absentee days) and on workers' compensation, which allowed the project team to construct a dependent variable for changes in health before and after the programs were initiated. Information on the employees who did and did not participate in the programs also was collected for use in constructing the key independent variable on program participation. Information collected on age, gender, and race of each employee served as controls: independent variables that allowed the true program effect on health to be isolated.

C.3 Indianapolis (IndyGo)

The model controlled for demographic factors that affect health to isolate the effect due to the wellness program. The model also employed a dependent variable that measured a potential health improvement due to the wellness program: the change in absent hours before and after the initiation of the program. Many potential independent variables were considered for inclusion in the regressions presented, as were interactions with the key independent variable measuring program participation. After considering various possible linear regression equations, two representative equations were selected to control for (1) age and gender and (2) age, gender, and race. (The second equation included an additional variable to capture effects of race.) Table C-1 presents the results of the two regressions.

Table C-1. Effect of health promotion program on absent hours, IndyGo.
Ordinary Least Squares (OLS): Change in Hours

                     Equation (1)          Equation (2)
                   Estimate  t-Stat      Estimate  t-Stat
Intercept           -40.6    -0.59        -44      -0.64
Ever in Program      40.5     1.77         37.9     1.65
Age                   0.6     0.53          0.9     0.76
Female               15.1     0.64          7.2     0.29
White                                     -35      -1.21
Observations        252                   252
R²                    0.013                 0.027
Adjusted R²           0.002                 0.016
F-Statistic           1.131                 2.445

In both equations, the dependent variable was the change in absent hours from 2011 (the year before the program was instituted) to the most recent full year that the worker was employed (through 2017). The key independent variable in the equation was whether the employee was ever in the program. The interaction terms were designed to detect effects due to the program, controlling for age, gender, and (in the second regression presented) race. In this model, the base consisted of workers of African-American descent (making up 77% of the population), and workers of Asian descent, Native American descent, or "two or more" races/ethnicities (1%). "White" was used as the 0–1 race variable. "Male" (65% of the population) was used as the base, and "Female" was used as the 0–1 gender variable.

No variable in the model, including the intercept, was statistically significantly different from zero in the theorized direction at the 10% level (the critical t-stat for a one-sided 90% confidence level is 1.28). The key variable had t-stats of 1.77 and 1.65 (large enough in magnitude to be significant at even the one-sided 95% level), but the coefficients had the incorrect sign, indicating that participation in the program was associated with increased absentee hours. Even the F-Statistic, which measures the significance of the regression in its entirety (meaning that if even one of the independent variables is significant, the regression will be significant) was insignificant (1.13).
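The reading of the "Ever in Program" t-stats combines two checks: whether the magnitude clears a one-sided critical value, and whether the sign matches the theorized direction. The following is a minimal sketch of that logic, not the project team's procedure. It assumes a normal approximation for the critical values (reasonable with roughly 250 observations), and the function name `assess` is hypothetical.

```python
from statistics import NormalDist

# One-sided critical values under a normal approximation (an assumption;
# the appendix quotes 1.28 for the one-sided 90% level).
CRIT_90 = NormalDist().inv_cdf(0.90)  # about 1.28
CRIT_95 = NormalDist().inv_cdf(0.95)  # about 1.645


def assess(t_stat, expected_sign):
    """Return (confidence level cleared, whether the sign is as theorized).

    expected_sign is -1 when theory predicts a negative coefficient,
    e.g. program participation should reduce absent hours.
    """
    magnitude = abs(t_stat)
    if magnitude >= CRIT_95:
        level = "95%"
    elif magnitude >= CRIT_90:
        level = "90%"
    else:
        level = None
    correct_sign = (t_stat * expected_sign) > 0
    return level, correct_sign


# t-stats for "Ever in Program" from Table C-1.
for label, t in [("Equation (1)", 1.77), ("Equation (2)", 1.65)]:
    level, ok = assess(t, expected_sign=-1)
    print(label, level, "correct sign" if ok else "incorrect sign")
```

Under these assumptions both t-stats clear the one-sided 95% threshold in magnitude but carry the incorrect sign, which is why they do not count as evidence of a program benefit.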
One other variation on the above pair of regressions was presented. The two runs shown in Table C-2 have the same independent variables as those listed above, but the dependent variable is slightly altered. In Table C-1, the dependent variable was the change in absent hours from 2010 to the most recent year. In Table C-2, it is the change from 2010 to 2013 (the first full year after the program was established). The rationale was to try to isolate the change that happened soon after establishment of the program but before all "unobservables" about an employee (the error term) became notably different from 2010. As many other unobserved factors may have come into play from 2011 to 2017, using absentee hours from 5 years later introduced uncertainty into the estimate, meaning the estimate was much less likely to be statistically significant.
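The two dependent variables differ only in the end year used for the comparison. The sketch below illustrates that difference for a single employee. The record layout (a dictionary of year to absent hours), the function name, and the sample values are all hypothetical, not taken from the IndyGo data.

```python
def change_in_absent_hours(hours_by_year, baseline_year, end_year=None):
    """Return end-year hours minus baseline-year hours for one employee.

    hours_by_year: dict mapping year -> total absent hours (assumed layout).
    end_year: fixed comparison year; when None, the most recent year on
        record after the baseline is used.
    Returns None when a required year is missing, so the employee can be
    excluded from the regression sample.
    """
    if baseline_year not in hours_by_year:
        return None
    if end_year is None:
        later_years = [y for y in hours_by_year if y > baseline_year]
        if not later_years:
            return None
        end_year = max(later_years)
    if end_year not in hours_by_year:
        return None
    return hours_by_year[end_year] - hours_by_year[baseline_year]


# Made-up record, for illustration only.
employee = {2010: 48.0, 2013: 40.0, 2016: 72.0}

# Long window: baseline to the most recent year on record.
long_window = change_in_absent_hours(employee, baseline_year=2010)
# Short window: baseline to the first full year after the program began.
short_window = change_in_absent_hours(employee, baseline_year=2010, end_year=2013)
print(long_window, short_window)
```

In this made-up case the short window shows a decrease while the long window shows an increase, which illustrates how factors accumulating over later years can mask an early change.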
Table C-2. Effect of health promotion program on absent hours, 2010–2013.
OLS: Change in Hours

                     Equation (1)          Equation (2)
                   Estimate  t-Stat      Estimate  t-Stat
Intercept            38.2     0.59         35.2     0.55
Ever in Program      17.0     0.79         14.8     0.69
Age                   0.7     0.68          0.9     0.88
Female               29.3     1.33         22.6     0.99
White                                     -29.7    -1.09
Observations        252                   252
R²                    0.009                 0.014
Adjusted R²          -0.003                -0.002
F-Statistic           0.771                 0.878

The results were not much different using this variation on the dependent variable. The key variable of whether the employee was in the program was still not statistically significantly different from zero at the 10% level. Only "Female" was significant at the 10% level (the t-stat of 1.33 was greater than the one-sided 90% confidence level cutoff of 1.28). This second set of regressions demonstrated the robustness of the result that none of the independent variables (in particular, the program-participation variables) were statistically significant.

C.4 Rochester (RTS)

The results from Rochester on the detection of a wellness program effect on health were similar to those for Indianapolis. The model controlled for demographic factors (age, gender, race) that affect health to isolate the effect due to the wellness program. The model employed a dependent variable that measured a potential health improvement due to the wellness program: the change in absent hours before and after the initiation of the program. Table C-3 presents the regression.

Table C-3. Effect of health promotion program on absent hours, RTS 2011–2017.
OLS: Change in Hours

                   Estimate  t-Stat
Intercept            28.2     3.63
Onsite               -2.0    -0.73
Age                  -0.5    -4.04
Female               -2.8    -0.97
White                 5.3     1.88
Observations        466
R²                    0.036
Adjusted R²          -0.027

The dependent variable was the change in absent hours from 2013 (the year before the program was instituted) to the most recent full year that the worker was employed (through 2017).
The key independent variable in the model was whether the employee was onsite, because onsite employees had access to the programs and the gym at the headquarters. This variable was designed to pick up any effect due to the program, controlling for age, gender, and race. African-American workers (61% of the population) were kept as the base, and "White" was used as the 0–1 race variable. "Male" (72% of the population) was used as the base, and "Female" was used as the 0–1 gender variable.

The results showed that the coefficient for onsite was negative, indicating that onsite employees (in the model, the participants in the wellness program) were associated with 2.04 fewer hours of absentee leave than offsite employees. The t-statistic for this coefficient was not significant at the 10% level of significance, however, so the results were not substantiated. None of the variables in the model were statistically significantly different from zero except for one, age, for which the estimated effect was opposite to the theorized direction. With a t-statistic of -4.04, the age variable was estimated such that an employee was associated with 0.52 fewer hours of sick leave in a year for each additional year of age.

In this analysis, the F-Statistic was insignificant (0.71). The F-Statistic measures the significance of the regression in its entirety, meaning that if even one of the independent variables is significant, the regression will be significant. The insignificance of any of the independent variables could have been for any combination of the reasons listed in this appendix, especially as the offsite employees may have had a higher rate of program/gym utilization than what was assumed (which was zero).

C.5 Louisville (TARC)

A wealth of descriptive statistics was culled from the primary source data on the wellness program participation, health, and demographics of TARC employees. Beginning with the health information collected, Figures C-1, C-2, and C-3 compare the differences in average annual absenteeism over time by participation (or not) for three wellness programs: boot camp, Humana Go level, and bioscreen attendance. All three graphs present the average annual absentee hours for frontline employees for the four years from 2015 to 2018.

Figure C-1 shows that frontline workers who participated in boot camp had a much lower average total number of hours of absenteeism than nonparticipants. Boot camp attendees averaged approximately 20 hours of annual sick leave, whereas workers who did not attend averaged three times that (60 hours).
Only a small percentage of workers participated in the boot camp, however: the number of participants was 11 in 2015, and 12 in each year from 2016 through 2018, while the number of employees increased from 305 to 365 between 2016 and 2018. Consequently, participation was between 3% and 4% for each of the four years (2015, 2016, 2017, and 2018).

Figure C-1. Average annual total absentee hours, frontline employees: boot camp participation, Louisville, Kentucky, 2015–2018.
[Chart not reproduced. Vertical axis: 0 to 90 hours. Horizontal axis: 2015 to 2018. Series: Boot Camp Participation, No Boot Camp Participation, Grand Total.]
Figure C-2 compares average annual total absentee hours between frontline employees with a high Humana Go level (blue) and employees with a baseline Humana Go level (red). The graph shows little difference in the absentee levels between the two levels.

Figure C-2. Average annual total absentee hours, frontline employees: Humana Go level, Louisville, Kentucky, 2015–2018.
[Chart not reproduced. Vertical axis: 0 to 90 hours. Horizontal axis: 2015 to 2018. Series: High Humana Go Level, Low Humana Go Level, Grand Total.]

Figure C-3 compares the average annual total of absentee hours between frontline employees who attended a bioscreen (blue) and those who did not (red). This graph shows little difference between the two groups until 2018, when the average for those attending (60 hours) is 25% less than for those who did not (80 hours).

Figure C-3. Average annual total absentee hours, frontline employees: bioscreen attendance, Louisville, Kentucky, 2015–2018.
[Chart not reproduced. Vertical axis: 0 to 90 hours. Horizontal axis: 2015 to 2018. Series: Attended Bioscreen, No Bioscreen Attended, Grand Total.]

Three distinct multiple regression equations were developed, each with an independent variable to isolate the possible health effects of the different wellness programs: (1) Humana Go level, comparing frontline employees with the baseline (blue) status and higher level statuses (bronze and silver); (2) boot camp attendance, comparing frontline employees who attended at least one boot camp with those who attended none; and (3) bioscreen attendance, comparing frontline employees who went to at least one bioscreen and those who attended none. The models contained observations at the employee level, used the change in the number of absentee hours from before to after program participation as the dependent variable, and controlled for age, gender, and race.
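The shared form of the three equations described above can be written compactly. The notation below is an illustrative assumption rather than the report's own: $\Delta H_i$ is the change in absent hours for employee $i$, $P_i$ is the program variable for the equation in question (Humana Go level, boot camp attendance, or bioscreen attendance), and the remaining regressors are the age, gender, and race controls.

```latex
\[
\Delta H_i = \beta_0 + \beta_1 P_i + \beta_2\,\mathrm{Age}_i
           + \beta_3\,\mathrm{Male}_i + \beta_4\,\mathrm{White}_i + \varepsilon_i
\]
\[
H_0 : \beta_1 = 0 \qquad \text{versus} \qquad H_1 : \beta_1 < 0
\]
```

Under this reading, a program benefit corresponds to a negative $\beta_1$ (fewer absent hours for participants), so a one-sided test rejects $H_0$ at the 90% level only when the t-statistic for $\beta_1$ falls below $-1.28$.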
The health program was introduced in January 2016. To capture any lagged effects, three time periods were compared in creating the dependent variables for the Humana Go level and boot camp equations. They were (1) 2016–2015, to compare the year of program introduction to the baseline; (2) 2017–2016, to compare the year after the introduction to the year of introduction; and (3) 2017–2015, to compare the year after introduction to the baseline. For the bioscreens, time periods were pre- and post-2017 Quarter 2, because the first bioscreen was introduced at that time. Table C-4, Table C-5, and Table C-6 show the results of these runs. Any factors significantly different than zero at the one-sided 90% level of confidence (t-stats greater than 1.28) that are the correct theorized sign are denoted in bold.

Table C-4. Effect of Humana Go level on absent hours, TARC 2016–2015, 2017–2016, and 2017–2015, OLS.
Dependent: Change in Absent Hours

                       2017–2015            2016–2015            2017–2016
                   Estimate  t-Stat     Estimate  t-Stat     Estimate  t-Stat
Intercept           -38.5    -0.48       118.3     1.2         60.4     0.6
Humana Go Level      63.3     1.55       -51.7    -1           19.9     0.39
Age                   2.4     1.56        -1.1    -0.59         1.3     0.68
Male                -46.6    -1.51       -44.1    -1.15       -74.7    -1.96
White               -42.5    -1.33        -1      -0.02       -41.2    -1.04
Observations        361                  386                  361
R²                    0.026                0.008                0.021
Adjusted R²           0.015               -0.003                0.01

Table C-5. Effect of boot camp participation on absent hours, TARC 2016–2015, 2017–2016, and 2017–2015, OLS.
Dependent: Change in Absent Hours

                            2017–2015            2016–2015            2017–2016
                        Estimate  t-Stat     Estimate  t-Stat     Estimate  t-Stat
Intercept                 -7.7    -0.1         98.6     1.02        72.3     0.74
Boot Camp Participation  -47.2    -0.62        -2.5    -0.03       -43.8    -0.47
Age                        2       1.3         -0.9    -0.46         1.1     0.59
Male                     -48.6    -1.57       -42      -1.09       -74.6    -1.96
White                    -37.6    -1.16        -2.2    -0.06       -38.1    -0.96
Observations             361                  386                  361
R²                         0.021                0.005                0.021
Adjusted R²                0.01                -0.005                0.01

Table C-6. Effect of bioscreens on absent hours, TARC 2016 Quarter 3–2018 Quarter 2.
Dependent: Change in Absent Hours, 2018 Q2–2016 Q3

                   Estimate  t-Stat
Intercept          -174.2    -3.02
Bioscreens           -6.9    -0.23
Age                   1       0.91
Male                 35.6     1.47
White                41.9     1.73
Observations        565
R²                    0.016
Adjusted R²           0.009

Although some of the control variables in the equations presented are the correct sign and are statistically significant at the one-sided 10% level (1.28), none of the wellness participation variables are statistically significant.
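The adjusted R² reported with each regression penalizes R² for the number of regressors, which is why it can turn negative when the explanatory power is very low. The sketch below applies the standard textbook formula to the published (rounded) figures in Table C-6. The function name is hypothetical, and small discrepancies in other tables could arise because the inputs are rounded.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R-squared for a model with an intercept plus n_regressors slopes."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


# Table C-6: R2 = 0.016, 565 observations, four regressors
# (Bioscreens, Age, Male, White).
table_c6 = adjusted_r2(0.016, 565, 4)
print(round(table_c6, 3))
```

With these inputs the result rounds to 0.009, the value reported in Table C-6.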