Read "Understanding and Communicating Reliability of Crash Prediction Models" at NAP.edu


Suggested Citation:"Chapter 7. Reliability Associated with Predicting Outside the Range of Independent Variables: Problem Description and Procedure for Practitioners ." National Academies of Sciences, Engineering, and Medicine. 2021. Understanding and Communicating Reliability of Crash Prediction Models. Washington, DC: The National Academies Press. doi: 10.17226/26440.



Chapter 7. Reliability Associated with Predicting Outside the Range of Independent Variables: Problem Description and Procedure for Practitioners

Background

Crash prediction models (CPMs), or safety performance functions (SPFs), may include only traffic volumes as predictor variables, traffic volumes plus a limited number of geometric and traffic control variables, or traffic volumes and a large number of geometric and traffic control variables. CPMs are considered more reliable within the range of the independent variables present in the data used to estimate them. The 1st edition of the HSM (AASHTO, 2010) indicates that applying the CPMs to "sites with AADTs substantially outside this range may not provide reliable results". However, when practitioners apply these CPMs in their own jurisdiction, the AADT and other characteristics of a particular site may fall outside the range of the data that were used to estimate the CPMs. This chapter discusses this problem and possible solutions.

Maximum AADT Values in Different CPMs

Table 29 shows the maximum AADT values for selected CPMs from the 1st edition of the HSM, the 2nd edition of the HSM (proposed), and SafetyAnalyst.

Table 29. Maximum AADT Values for Selected CPMs

| Roadway Type | Source of CPM | Maximum AADT (veh/day) |
|---|---|---|
| Rural Two-Lane Road Segments | SafetyAnalyst | 30,025 |
| Rural Two-Lane Road Segments | 1st edition of the HSM | 17,800 |
| Rural Two-Lane Road Segments | NCHRP Project 17-62* | 21,622 |
| Rural Multilane Undivided Segments | SafetyAnalyst | 42,638 |
| Rural Multilane Undivided Segments | 1st edition of the HSM | 33,200 |
| Rural Multilane Undivided Segments | NCHRP Project 17-62* | 21,667 |
| Rural Multilane Divided Segments | SafetyAnalyst | 31,188 |
| Rural Multilane Divided Segments | 1st edition of the HSM | 89,300 |
| Rural Multilane Divided Segments | NCHRP Project 17-62* | 66,504 |

Note. *Proposed for the 2nd edition of the HSM.
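As a rough illustration of how Table 29 might be used in practice, the sketch below flags a site whose AADT exceeds the maximum AADT in the estimation data. The dictionary keys, the function name `exceeds_estimation_range`, and the example site AADT are hypothetical; only the maximum AADT values come from Table 29.

```python
# Maximum AADT (veh/day) in the estimation data, copied from Table 29.
# The key labels are illustrative, not an official naming scheme.
MAX_AADT = {
    ("rural two-lane", "SafetyAnalyst"): 30025,
    ("rural two-lane", "HSM 1st edition"): 17800,
    ("rural two-lane", "NCHRP 17-62"): 21622,
    ("rural multilane undivided", "SafetyAnalyst"): 42638,
    ("rural multilane undivided", "HSM 1st edition"): 33200,
    ("rural multilane undivided", "NCHRP 17-62"): 21667,
    ("rural multilane divided", "SafetyAnalyst"): 31188,
    ("rural multilane divided", "HSM 1st edition"): 89300,
    ("rural multilane divided", "NCHRP 17-62"): 66504,
}


def exceeds_estimation_range(roadway_type, source, site_aadt):
    """Return True when the site AADT is above the Table 29 maximum."""
    return site_aadt > MAX_AADT[(roadway_type, source)]


# Hypothetical rural two-lane site with an AADT of 20,000 veh/day.
site_aadt = 20000
for source in ("SafetyAnalyst", "HSM 1st edition", "NCHRP 17-62"):
    flag = exceeds_estimation_range("rural two-lane", source, site_aadt)
    print(f"{source}: outside estimation range = {flag}")
```

The same site can be inside the range for one CPM and outside it for another, which is why the source of the CPM matters when judging reliability.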
There are significant differences in the maximum AADT values among the data sets used to estimate the CPMs in SafetyAnalyst, the 1st edition of the HSM, and the CPMs proposed for the 2nd edition of the HSM (based on NCHRP Project 17-62). There may be multiple reasons for these differences, including the state(s) whose data were used for model estimation and the approach used to select the data sets. SafetyAnalyst used all the available data in a particular state to estimate each CPM, which was based on segment length and AADT. The approach used for estimating the CPMs in the 1st edition of the HSM was different for each roadway type. For example, for rural two-lane roads, a fully specified model was estimated first, and the values corresponding to the base condition were then substituted into this model to derive a base condition model. In NCHRP Project 17-62, CPMs were estimated using only the data for those sites that corresponded to the base condition. Nevertheless, the significant differences in the maximum AADT values for these models illustrate the need for guidance on the reliability of using the CPMs to predict the number of crashes at sites whose characteristics (especially AADT) are outside the range of the data used to estimate the CPMs. Before going further, the functional form of CPMs and SPFs needs to be discussed, since it influences the guidance provided in the subsequent sections.

Functional Form of CPMs and SPFs

When CPMs are used to predict the number of crashes at sites whose characteristics (especially AADT) are outside the range of the data used to estimate the CPMs, the user is implicitly assuming that the functional form of the CPM remains valid outside the range of the original data. In reality, the true functional form of an SPF is not known. To simplify the discussion, since traffic volume is often the most important contributor to crashes, this issue is illustrated below using CPMs for roadway segments with AADT as the only independent variable (Srinivasan and Bauer, 2013). The most common form for a CPM that relates crash frequency and AADT is the following:

Equation 60:
$$N = L \times \exp(a + b \ln \mathrm{AADT}) = L \times e^{a} \times \mathrm{AADT}^{b}$$

Where:
- N is the predicted average number of crashes on a segment,
- L is the length of the segment, and
- a and b are regression coefficients to be estimated.

This type of model is sometimes called a power function. In the power model, it is generally accepted that b is positive, since the number of crashes is expected to increase as traffic volume increases. Note that the form in Equation 60 can also be used for CPMs for specific base conditions, where roadways meet certain conditions. When the site-specific conditions do not meet the base conditions, the 1st edition of the HSM (AASHTO, 2010) recommends adjusting the crash estimate from Equation 60 by applying crash modification factors (CMFs; also called SPF adjustment factors in the upcoming 2nd edition of the HSM), and then a calibration factor (CF) to account for differences between jurisdictions.
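A minimal sketch of Equation 60 follows, to make the extrapolation assumption concrete. The coefficient values and AADTs are hypothetical and chosen only for illustration; the function name `predict_crashes` is likewise an assumption.

```python
import math


def predict_crashes(length_mi, aadt, a, b):
    """Equation 60: N = L * exp(a + b * ln(AADT)) = L * e^a * AADT^b."""
    return length_mi * math.exp(a + b * math.log(aadt))


# Hypothetical coefficients, not taken from any published CPM.
a, b = -7.0, 0.8

n_low = predict_crashes(1.0, 10000, a, b)
n_high = predict_crashes(1.0, 20000, a, b)

# Under the power form, doubling AADT multiplies the prediction by 2**b
# at every volume level, inside or outside the estimation range.
print(n_high / n_low, 2 ** b)
```

Because the ratio depends only on b, applying the model beyond the maximum AADT in the estimation data amounts to assuming that the same exponent continues to hold there.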
The CF can be calculated as follows:

Equation 61:
$$CF = \frac{\sum \text{observed crashes}}{\sum \text{predicted crashes}}$$

Following the procedure illustrated in Part C of the HSM, the computed calibration factor is then applied to the CPMs to predict crashes for each site in the new data. The CPM for the new data becomes:

Equation 62:
$$N_{new} = N \times CF \times CMF_1 \times CMF_2 \times \cdots \times CMF_n$$

Where N is the prediction from Equation 60, N_new is the prediction for a site in the new data, and CMF_1, CMF_2, ..., CMF_n are crash modification factors for local conditions for site characteristic variables 1 through n.

The NCHRP Project 17-45 Final Report (Bonneson et al., 2012) illustrates that the CPMs developed using CA, MI, and WA freeway data for higher and lower AADTs are indeed different. This implies that the CPMs may provide biased estimates of crashes when applied directly to sites where AADTs are outside the range of the original data used to estimate the CPMs. In that study, CMFs for AADT are applied to both multi-vehicle (MV) and single-vehicle (SV) crashes to address high traffic volume effects.

For multi-vehicle crashes:

Equation 63:
$$CMF_{hv,mv} = \exp(a_{hv,mv} \times P_{hv})$$

Where:
- CMF_hv,mv = "high traffic volume" crash modification factor for multi-vehicle crashes,
- a_hv,mv = "high traffic volume" calibration coefficient for multi-vehicle crashes, and
- P_hv = proportion of AADT during hours where volume exceeds 1,000 veh/hour/lane.

For single-vehicle crashes:

Equation 64:
$$CMF_{hv,sv} = \exp(a_{hv,sv} \times P_{hv})$$

Where:
- CMF_hv,sv = "high traffic volume" crash modification factor for single-vehicle crashes, and
- a_hv,sv = "high traffic volume" calibration coefficient for single-vehicle crashes.

The next section provides an overview of the goodness-of-fit measures that can be used to assess the performance of different calibration and re-estimation options for predicting outside the range of independent variables. Following that is a discussion of different options illustrated using HSIS data from California freeways from 2005 to 2014. The final section provides a recommended procedure for practitioners.

Objective of This Chapter

The objective of this chapter is to provide guidance on the potential reliability of using CPMs to predict outside the range of independent variables. The bias, variance, and repeatability associated with this factor are shown in Table 30.

Table 30. Bias, Variance, and Repeatability Associated with Predicting Outside the Range of the Input Variable.

| Influence Category | Factor | Effect on Bias | Effect on Variance | Effect on Repeatability |
|---|---|---|---|---|
| Application-related factors influencing reliability | Application is outside the range of an input variable | Less reliable | Less reliable | Less reliable if error is due to poor description of range associated with each input variable |
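The calibration in Equations 61 and 62 can be sketched as follows. This is a minimal illustration, assuming that the CMFs are applied to the base prediction before the calibration factor is computed; all site values and the function name `calibration_factor` are hypothetical.

```python
import math


def calibration_factor(observed, predicted):
    """Equation 61: CF = sum of observed crashes / sum of predicted crashes."""
    return sum(observed) / sum(predicted)


# Hypothetical sites: base predictions from Equation 60, site CMFs, observed counts.
base_predictions = [2.0, 1.0, 4.0, 1.0]
site_cmfs = [[1.0], [1.1, 0.9], [1.0], [1.25]]
observed = [3, 1, 5, 2]

# Uncalibrated prediction per site: base prediction times the product of its CMFs.
uncalibrated = [n * math.prod(cmfs) for n, cmfs in zip(base_predictions, site_cmfs)]

cf = calibration_factor(observed, uncalibrated)

# Equation 62: calibrated prediction = base prediction x CF x CMFs.
calibrated = [cf * n for n in uncalibrated]
print(round(cf, 3), [round(n, 2) for n in calibrated])
```

A single CF forces the totals of observed and predicted crashes to match, but it cannot correct a trend in the residuals with AADT, which is the concern when sites lie outside the estimation range.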
Goodness-of-Fit Measures

Many goodness-of-fit (GOF) measures have been proposed, including mean absolute deviation (MAD), modified R² value, dispersion parameter (k), coefficient of variation of the calibration factor (CV), cumulative residual (CURE) plots, percent of CURE plot ordinates for fitted values (after calibration) exceeding the 2σ limits, and the maximum absolute deviation from zero. The definitions of these criteria can be found in the User Guide for the FHWA Calibrator Tool (Lyon et al., 2016) and are described below.

Mean Absolute Deviation (MAD)

The mean absolute deviation is the average absolute difference between observed and predicted crashes.

Equation 65:
$$MAD = \frac{\sum_{i=1}^{n} |\hat{y}_i - y_i|}{n}$$

where:
- \hat{y}_i = predicted values from the SPF.
- y_i = observed counts.
- n = validation data sample size.

Modified R²

This GOF measure seeks to measure the amount of systematic variation explained by the SPF. When comparing two or more competing SPFs, larger values indicate a better fit to the data. Values greater than 1.0 indicate that the SPF is over-fit and that some of the expected random variation is incorrectly explained as systematic variation.

Equation 66:
$$R^2 = \frac{\sum_{i}(y_i - \bar{y})^2 - \sum_{i}\hat{\mu}_i^2}{\sum_{i}(y_i - \bar{y})^2 - \sum_{i}\hat{y}_i}$$

where:
- y_i = observed counts.
- \hat{y}_i = predicted values from the SPF.
- \bar{y} = sample average.
- \hat{\mu}_i = y_i - \hat{y}_i.

Dispersion Parameter (k)

The dispersion parameter is a measure of the variability in the data. It can be expressed as follows:

Equation 67:
$$k = \frac{Var\{m\} - E\{m\}}{E\{m\}^2}$$

where:
- k = estimate of the dispersion parameter in the calibration procedure.
- Var{m} = estimated variance of mean crash rate.
- E{m} = estimated mean crash rate.

The estimated variance increases as dispersion increases, and consequently the standard errors of estimates increase as well. As a result, an SPF with lower dispersion parameter estimates (i.e., smaller values of k) is preferred to an SPF with more dispersion. Note that the FHWA Calibrator Tool (Lyon et al., 2016) can provide either a constant dispersion parameter or one that varies by length (for road segments).

Coefficient of Variation of Calibration Factor

The CV of the calibration factor is the standard deviation of the calibration factor divided by the estimate of the calibration factor, as shown in the following equation.

Equation 68:
$$CV = \frac{\sqrt{V(C)}}{C}$$

Where:
- CV = coefficient of variation of the calibration factor.
- C = estimate of the calibration factor.
- V(C) = variance of the calibration factor, which can be calculated as follows:

Equation 69:
$$V(C) = \frac{\sum_{i}(y_i + k \times y_i^2)}{\left(\sum_{i}\hat{y}_i\right)^2}$$

Where:
- y_i = observed counts.
- \hat{y}_i = uncalibrated predicted values from the SPF.
- k = dispersion parameter.

CURE Plots and Related Measures

A CURE plot is a graph of the cumulative residuals (observed minus predicted crashes) against a variable of interest sorted in ascending order (e.g., major road traffic volume). CURE plots provide a visual representation of GOF over the range of a given variable and help to identify potential concerns such as the following:

- Long trends: Long increasing or decreasing trends in the CURE plot indicate regions of bias that analysts should rectify by improving the SPF.
- Percent exceeding the confidence limits (Outside 95% CI (%)): Cumulative residuals outside the 95% confidence limits indicate a poor fit over that range of the variable of interest. Cumulative residuals frequently outside the confidence limits indicate possible bias in the SPF.
- Vertical changes (Max_Cure): Large vertical changes in the CURE plot are potential indicators of outliers, which require further examination. Further information can be found in Chapter 7 of Hauer's book (Hauer, 2015).
- Maximum value exceeding 95% confidence limits (Max_DCure): This measures the distance between the CURE and the 95% confidence limits where the CURE is outside the confidence limits. The larger the value, the poorer the fit.
- Average value exceeding 95% confidence limits (Avg_DCure): While Max_DCure measures the maximum difference between the CURE and the 95% confidence limits, Avg_DCure measures the average distance between the CURE and the 95% confidence limits for those ordinates outside the confidence limits. As with Max_DCure, a smaller value indicates less bias in the SPF.
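The measures above can be sketched in a few lines. The code below implements Equations 65, 66, 68, and 69 as written, plus a simple CURE computation. The 2σ limits use the formula commonly attributed to Hauer, 2σ*(n) = 2·sqrt(σ²(n)·(1 − σ²(n)/σ²(N))), where σ²(n) is the running sum of squared residuals; that formula is not given in this chapter and is an assumption here, as are all function names and the example data.

```python
import math


def mad(observed, predicted):
    """Equation 65: mean absolute deviation."""
    return sum(abs(p - y) for y, p in zip(observed, predicted)) / len(observed)


def modified_r2(observed, predicted):
    """Equation 66: modified R-squared."""
    y_bar = sum(observed) / len(observed)
    total = sum((y - y_bar) ** 2 for y in observed)
    residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return (total - residual) / (total - sum(predicted))


def cv_of_calibration_factor(observed, uncalibrated, k):
    """Equations 68 and 69: CV = sqrt(V(C)) / C."""
    c = sum(observed) / sum(uncalibrated)
    v_c = sum(y + k * y * y for y in observed) / sum(uncalibrated) ** 2
    return math.sqrt(v_c) / c


def cure(observed, predicted, sort_by):
    """Cumulative residuals and assumed 2-sigma limits, sorted by a variable."""
    order = sorted(range(len(observed)), key=lambda i: sort_by[i])
    cumulative, cumulative_sq = [], []
    running = running_sq = 0.0
    for i in order:
        r = observed[i] - predicted[i]
        running += r
        running_sq += r * r
        cumulative.append(running)
        cumulative_sq.append(running_sq)
    total_sq = cumulative_sq[-1]
    limits = [
        2.0 * math.sqrt(max(q * (1.0 - q / total_sq), 0.0)) if total_sq > 0 else 0.0
        for q in cumulative_sq
    ]
    return cumulative, limits


def share_outside(cumulative, limits):
    """Proportion of CURE ordinates outside the limits."""
    return sum(abs(c) > lim for c, lim in zip(cumulative, limits)) / len(cumulative)


# Hypothetical data; predictions are already calibrated (totals match).
observed = [10, 2, 25, 6, 17]
predicted = [7, 5, 18, 10, 20]
aadt = [31000, 33000, 52000, 38000, 45000]

cumulative, limits = cure(observed, predicted, aadt)
print(mad(observed, predicted), round(modified_r2(observed, predicted), 3))
print(round(cv_of_calibration_factor(observed, predicted, 0.25), 3))
print(cumulative, share_outside(cumulative, limits))
```

With calibrated predictions the CURE returns to zero at the last ordinate, so the measures that matter are how far it drifts, and how often it leaves the limits, along the way.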
The FHWA Calibrator Tool (Lyon et al., 2016) provides CURE plots, the percent exceeding the confidence limits, and the maximum vertical change of the CURE plot. The maximum value exceeding the 95% confidence limits and the average value exceeding the 95% confidence limits were added in this study to compare the proposed options.

Examination of Different Options for Predicting Outside the Range of AADT

The following five options were examined:

- Option 1: Perform calibration
- Option 2: Adjust parameter/coefficient for AADT and perform calibration
- Option 3: Estimate calibration function or SPF by modifying the coefficient for AADT and perform calibration
- Option 4: Estimate calibration function or SPF and perform calibration
- Option 5: Estimate calibration function or SPF with different parameters for AADT and the other factors, and perform calibration

Some of these options (e.g., Options 1 and 2) would be easier for practitioners to apply. Estimation of calibration functions would be more involved. However, Srinivasan et al. (2016) provide guidance on using readily available tools such as Microsoft Excel to estimate calibration functions.

Option 1: Perform Calibration

This is probably the most common option used by practitioners because it is relatively straightforward and is discussed in the 1st edition of the HSM (AASHTO, 2010). The practitioner would compile the necessary data and follow Equation 61 to estimate the calibration factor (CF).

Option 2: Adjust Parameter/Coefficient for AADT and Perform Calibration

Because AADTs in the new data are outside the range of the data used to estimate the original CPMs, the coefficient associated with AADT may differ depending on the range of AADT, as previous research indicates (e.g., Bonneson et al., 2012). Essentially, this option tries to identify a more appropriate coefficient for AADT by trial and error. A trial and error approach is more time consuming than estimating the parameter, but it can be implemented by practitioners without knowledge of statistical methods. This option could be implemented if Option 1 does not provide a satisfactory fit to the data based on the FHWA Calibrator Tool. The procedure for this option is as follows:

- Step 1: Assume the parameter/coefficient b for AADT in the new data is b_new = b*A_adj, where A_adj is an adjustment factor. There is no current guideline on what this adjustment factor should be, and a trial and error approach is recommended.
- Step 2: Predict the number of crashes based on b_new.
- Step 3: Calculate CF using Equation 61.
- Step 4: Use the FHWA Calibrator Tool to assess the performance.
- Step 5: If the performance is not satisfactory, modify A_adj and repeat the process.
- Step 6: Select the A_adj that provides the best fit.

Option 3: Estimate Calibration Function or SPF by Modifying the Coefficient for AADT and Perform Calibration

This approach essentially involves estimating a calibration function or SPF by modifying only the coefficient for AADT, and then performing a simple calibration so that the total observed and predicted crashes are equal.

Step 1: Estimate a calibration function or SPF of the following form, where N_SPF is the prediction from the original SPF and β is the parameter to be estimated:

Equation 70:
$$N_{new} = \mathrm{AADT}^{\beta} \times N_{SPF}$$

Step 2: Calculate predicted crashes using the newly developed N_new.

Step 3: Calculate CF (per Equation 61, with N_new as the predicted crashes) and apply it as follows:

Equation 71:
$$N_{cal} = CF \times N_{new}$$

Option 4: Estimate Calibration Function or SPF and Perform Calibration

This option also involves the estimation of a calibration function, but unlike Option 3, the coefficients for all the terms in the SPF/CPM are estimated. If N_SPF includes CMFs (also called SPF adjustment factors), they are also raised to a power, which is a possible source of criticism.

Step 1: Estimate a calibration function of the following form, where α and γ are parameters to be estimated:

Equation 72:
$$N_{new} = \alpha \times (N_{SPF})^{\gamma}$$

Step 2: Calculate predicted crashes using the newly developed N_new.

Step 3: Calculate CF and apply it as follows:

Equation 73:
$$N_{cal} = CF \times N_{new}$$

Option 5: Estimate Calibration Function or SPF with Different Parameters for AADT and the Other Factors, and Perform Calibration

This option can be seen as a combination of Options 3 and 4. A calibration function is estimated, but different coefficients are introduced for AADT and the other parameters.

Step 1: Recalibrate using the SPF prediction and AADT as independent variables. Both variables are assumed to enter the new model as power functions, as shown below.

Equation 74:
$$N_{new} = \alpha \times \mathrm{AADT}^{\beta} \times (N_{SPF})^{\gamma}$$

Steps 2 and 3 are the same as for Options 3 and 4.

Illustration of the Options Using Data

California freeway data from 2005 to 2014 (from HSIS) were used for the illustration. Ramp influence areas (based on 0.3 miles on either side of a ramp) were excluded. Short segments of less than 0.01 miles were also excluded. The data were categorized based on number of lanes, terrain, and area type (rural or urban). The crash types considered were total crashes, single-vehicle crashes, and multi-vehicle crashes. For the different freeway categories, SPFs were estimated using data from segments with lower AADT values and tested using data from segments with higher AADT values. The results for rural 4-lane flat terrain segments are discussed below as an illustration. For estimating the SPFs, rural 4-lane flat terrain freeway segments with AADT below 30,000 were selected.
For testing, rural 4-lane flat terrain freeway segments with AADT between 30,000 and 60,000 were selected. The AADT range for the testing data set was specifically chosen to be higher than that of the data set used for the initial SPF estimation. Summary statistics for these data sets are provided in Table 31.

Table 31. Summary Statistics for the Data Sets Consisting of CA Rural Flat Highway Segments Used in the Illustration.

Data used for SPF development (AADT < 30,000)*

| Variable | Min | Max | Mean | Stdev | Sum |
|---|---|---|---|---|---|
| AADT | 1,590 | 29,909 | 19,806 | 5,697.70 | NA |
| Seg length (mi) | 0.01 | 9.018 | 0.570 | 0.850 | 393.07 |
| Single-vehicle crashes | 0 | 78 | 6.927 | 10.315 | 4,773 |
| Multi-vehicle crashes | 0 | 51 | 4.084 | 6.804 | 2,814 |
| Total crashes | 0 | 107 | 11.012 | 16.346 | 7,587 |

Data used for testing the different options (AADT: 30,000 to 60,000)**

| Variable | Min | Max | Mean | Stdev | Sum |
|---|---|---|---|---|---|
| AADT | 30,100 | 59,706 | 39,096 | 7,636.01 | NA |
| Seg length (mi) | 0.01 | 4.127 | 0.621 | 0.810 | 194.50 |
| Single-vehicle crashes | 0 | 104 | 12.661 | 17.465 | 3,963 |
| Multi-vehicle crashes | 0 | 138 | 13.217 | 17.929 | 4,137 |
| Total crashes | 0 | 231 | 25.879 | 34.344 | 8,100 |

Note: *Number of segments = 689; **Number of segments = 313. NA is not applicable.

Results of the Testing

Multi-Vehicle Crashes

The five options were investigated for total, multi-vehicle (MV), and single-vehicle (SV) crashes. For MV crashes, the GOF measures are shown in Table 32 and the CURE plots are shown in Figure 10 through Figure 14. For Option 2, several adjustments to the AADT coefficient were investigated, and the best one was used for comparison with the other options.

Table 32. Testing Results for Multi-Vehicle Crashes (Rural 4-lane flat terrain freeways).

| Option | Number of crashes | k | Modified R² | CV | MAD | Max_Cure | Max_DCure | Avg_DCure | Outside 95% CI (proportion) |
|---|---|---|---|---|---|---|---|---|---|
| Option 1 | 4,137 | 0.268 | 0.840 | 0.052 | 4.557 | 202.206 | 105.923 | 28.183 | 0.524 |
| Option 2: increase AADT coefficient by 40% | 4,137 | 0.259 | 0.825 | 0.051 | 4.721 | 189.437 | 93.705 | 24.031 | 0.508 |
| Option 3 | 4,137 | 0.259 | 0.820 | 0.051 | 4.758 | 187.201 | 91.281 | 24.193 | 0.514 |
| Option 4 | 4,137 | 0.264 | 0.809 | 0.051 | 4.748 | 193.223 | 92.279 | 25.255 | 0.521 |
| Option 5 | 4,137 | 0.256 | 0.833 | 0.051 | 4.767 | 184.737 | 89.959 | 23.211 | 0.518 |

The best option depends on the GOF measure that is chosen for consideration. For example, Option 1 is the best if MAD or modified R² is used, but it is the worst based on each of the other GOF measures. Except for modified R², the lower the value of each GOF measure, the better the performance of the option. Overall, Option 1 has the worst performance in predicting crashes for the new data, while Option 5 has the best performance. Options 2 and 3 are good candidates for the second-best performance.
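To give a sense of what estimating a calibration function involves, the sketch below fits the Option 3 form, in which the original SPF prediction is multiplied by AADT raised to an estimated power and then by a calibration factor. It is only an illustration under stated assumptions: the exponent (called `beta` here) is found by solving the Poisson likelihood score equation with bisection, whereas the chapter does not specify the estimation method and a negative binomial likelihood is more typical for crash data. The function name `fit_option3`, the search bounds, and the noise-free synthetic data are all hypothetical.

```python
import math


def fit_option3(observed, aadt, n_spf, lo=-3.0, hi=3.0, tol=1e-10):
    """Fit N_cal = CF * AADT**beta * N_SPF under an assumed Poisson likelihood."""
    log_aadt = [math.log(v) for v in aadt]

    def calibrate(beta):
        # Uncalibrated new prediction, then CF so that totals match (Equation 61).
        n_new = [n * math.exp(beta * la) for n, la in zip(n_spf, log_aadt)]
        cf = sum(observed) / sum(n_new)
        return cf, n_new

    def score(beta):
        # Derivative of the Poisson log-likelihood with respect to beta,
        # with CF profiled out; it decreases as beta increases.
        cf, n_new = calibrate(beta)
        return sum(la * (y - cf * n) for la, y, n in zip(log_aadt, observed, n_new))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    cf, _ = calibrate(beta)
    return beta, cf


# Synthetic, noise-free data generated with beta = 0.3 and CF = 2.0.
aadt = [31000, 36000, 42000, 50000, 58000]
n_spf = [1.0, 2.0, 1.5, 3.0, 2.5]
observed = [2.0 * v ** 0.3 * n for v, n in zip(aadt, n_spf)]

beta, cf = fit_option3(observed, aadt, n_spf)
print(round(beta, 4), round(cf, 4))
```

Options 4 and 5 add further parameters and would normally be fitted with a regression tool or a spreadsheet solver rather than a one-dimensional search.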

Figure 10. CURE Plots for MV Crashes - Option 1

Figure 11. CURE Plots for MV Crashes - Option 2

Figure 12. CURE Plots for MV Crashes - Option 3

Figure 13. CURE Plots for MV Crashes - Option 4

Figure 14. CURE Plots for MV Crashes - Option 5

Single-Vehicle Crashes

The evaluation results for single-vehicle crashes are shown in Table 33. Overall, Option 5 has very good performance, and Options 1, 3, and 4 have comparable results and are good candidates for the second-best performance.

Table 33. Results for Single-Vehicle Crashes.

| Option | Number of crashes | k | Modified R² | CV | MAD | Max_Cure | Max_DCure | Avg_DCure | Outside 95% CI (proportion) |
|---|---|---|---|---|---|---|---|---|---|
| Option 1 | 3,963 | 0.223 | 0.843 | 0.048 | 4.540 | 164.208 | 58.179 | 19.131 | 0.597 |
| Option 2: increase AADT coefficient by 50% | 3,963 | 0.225 | 0.835 | 0.048 | 4.594 | 169.912 | 58.177 | 17.024 | 0.604 |
| Option 3 | 3,963 | 0.223 | 0.841 | 0.048 | 4.546 | 170.083 | 56.191 | 18.070 | 0.597 |
| Option 4 | 3,963 | 0.224 | 0.841 | 0.048 | 4.544 | 172.073 | 56.699 | 18.485 | 0.601 |
| Option 5 | 3,963 | 0.219 | 0.836 | 0.048 | 4.615 | 160.145 | 58.148 | 15.749 | 0.594 |

Total Crashes (TOT)

The results for total crashes are shown in Table 34. Not surprisingly, the evaluation results for total crashes are quite similar to those for MV crashes: Option 1 is the worst based on all the criteria except MAD and modified R². Option 5 has the best performance across the majority of the criteria.

Table 34. Results for Total Crashes.

| Option | Number of crashes | k | Modified R² | CV | MAD | Max_Cure | Max_DCure | Avg_DCure | Outside 95% CI (proportion) |
|---|---|---|---|---|---|---|---|---|---|
| Option 1 | 8,100 | 0.245 | 0.866 | 0.048 | 7.943 | 385.276 | 192.196 | 66.998 | 0.649 |
| Option 2: increase AADT coefficient by 50% | 8,100 | 0.237 | 0.860 | 0.047 | 8.050 | 359.618 | 173.546 | 59.974 | 0.629 |
| Option 3 | 8,100 | 0.237 | 0.853 | 0.047 | 8.177 | 354.778 | 174.326 | 57.655 | 0.617 |
| Option 4 | 8,100 | 0.239 | 0.852 | 0.047 | 8.143 | 362.567 | 178.741 | 59.359 | 0.613 |
| Option 5 | 8,100 | 0.233 | 0.858 | 0.047 | 8.296 | 327.990 | 170.257 | 52.475 | 0.623 |

These results indicate that Option 5 generally performs the best, while Option 1 has the worst performance based on all criteria except MAD and modified R². Generally, Option 2 performs reasonably well compared to Option 1. The advantage of Option 2 is that it involves only trial and error with different adjustment factors for AADT and does not require the practitioner to conduct a statistical analysis, as Options 3 through 5 do. Hence, Option 2 is an option that practitioners should consider.

Recommended Option and Procedure for Practitioners

The adjustment factor associated with AADT in Option 2 could be greater or less than 1.0. Although the demonstration was provided only with an adjustment factor for AADT, a similar approach could be adopted for adjusting the coefficients of other variables, depending on the specific SPF. To implement Option 2, the following procedure is recommended.

Step 1: Choose MAD, percent exceeding the confidence limits, and CURE plots as GOF measures.

Step 2: Investigate multiple adjustment factors, e.g., 1.5, 1.25, 0.75, and 0.5.

Step 3: Based on the GOF measures, find the most appropriate adjustment factor.

In some cases, none of the adjustment factors may provide a satisfactory result. In that case, Options 3 through 5 may need to be investigated, but that will require the estimation of calibration functions or SPFs.
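The recommended trial and error procedure can be sketched as a short loop. This is a minimal illustration, assuming a segment CPM of the power form in Equation 60 and using MAD alone to rank the candidates; in practice the percent exceeding the confidence limits and the CURE plots would be reviewed as well. The coefficients, the site data, and the function names are hypothetical, and the factor 1.0 (no adjustment, i.e., Option 1) is included only as a baseline.

```python
import math


def predict(length_mi, aadt, a, b):
    """Power-form CPM: N = L * exp(a + b * ln(AADT))."""
    return length_mi * math.exp(a + b * math.log(aadt))


def mean_absolute_deviation(observed, predicted):
    return sum(abs(p - y) for y, p in zip(observed, predicted)) / len(observed)


def try_adjustment_factors(sites, a, b, factors=(0.5, 0.75, 1.0, 1.25, 1.5)):
    """Apply each A_adj, calibrate, and report MAD for the calibrated predictions."""
    observed = [s["crashes"] for s in sites]
    results = []
    for a_adj in factors:
        b_new = b * a_adj                                   # adjusted AADT coefficient
        raw = [predict(s["length_mi"], s["aadt"], a, b_new) for s in sites]
        cf = sum(observed) / sum(raw)                       # calibration factor
        calibrated = [cf * n for n in raw]
        results.append({
            "a_adj": a_adj,
            "cf": cf,
            "mad": mean_absolute_deviation(observed, calibrated),
        })
    best = min(results, key=lambda r: r["mad"])
    return best, results


# Hypothetical original coefficients and synthetic high-AADT sites whose
# "observed" values follow an AADT exponent 50% larger than the original.
A, B = -12.0, 0.8
layout = [(0.5, 31000), (1.0, 38000), (0.8, 45000), (1.2, 52000), (0.6, 59000)]
sites = [
    {"length_mi": length, "aadt": v, "crashes": 3.0 * predict(length, v, A, B * 1.5)}
    for length, v in layout
]

best, all_results = try_adjustment_factors(sites, A, B)
for r in all_results:
    print(r["a_adj"], round(r["cf"], 4), round(r["mad"], 4))
print("best A_adj:", best["a_adj"])
```

Because the calibration factor is recomputed for every candidate, the comparison isolates how well each adjusted exponent reproduces the trend with AADT rather than the overall crash total.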
