National Academies Press: OpenBook

Understanding and Communicating Reliability of Crash Prediction Models (2021)

Chapter: Chapter 7. Reliability Associated with Predicting Outside the Range of Independent Variables: Problem Description and Procedure for Practitioners

Suggested Citation:"Chapter 7. Reliability Associated with Predicting Outside the Range of Independent Variables: Problem Description and Procedure for Practitioners." National Academies of Sciences, Engineering, and Medicine. 2021. Understanding and Communicating Reliability of Crash Prediction Models. Washington, DC: The National Academies Press. doi: 10.17226/26440.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Background

Crash prediction models (CPMs), or safety performance functions (SPFs), may include only traffic volumes as predictor variables; traffic volumes plus a limited number of geometric and traffic control variables; or traffic volumes and a large number of geometric and traffic control variables. CPMs are considered more reliable within the range of the independent variables available in the data used to estimate them. The 1st edition of the HSM (AASHTO, 2010) indicates that the application of the CPMs to "sites with AADTs substantially outside this range may not provide reliable results". However, when practitioners apply these CPMs in their own jurisdictions, the AADT and other characteristics of a particular site may fall outside the range of the data that were used to estimate the CPMs. This chapter discusses this problem and possible solutions.

Maximum AADT Values in Different CPMs

Table 29 shows the maximum AADT values for selected CPMs from the 1st edition of the HSM, the 2nd edition of the HSM (proposed), and SafetyAnalyst.

Table 29. Maximum AADT Values for Selected CPMs

| Roadway Type | Source of CPM | Maximum AADT (veh/day) |
|---|---|---|
| Rural Two-Lane Road Segments | SafetyAnalyst | 30,025 |
| Rural Two-Lane Road Segments | 1st edition of the HSM | 17,800 |
| Rural Two-Lane Road Segments | NCHRP Project 17-62* | 21,622 |
| Rural Multilane Undivided Segments | SafetyAnalyst | 42,638 |
| Rural Multilane Undivided Segments | 1st edition of the HSM | 33,200 |
| Rural Multilane Undivided Segments | NCHRP Project 17-62* | 21,667 |
| Rural Multilane Divided Segments | SafetyAnalyst | 31,188 |
| Rural Multilane Divided Segments | 1st edition of the HSM | 89,300 |
| Rural Multilane Divided Segments | NCHRP Project 17-62* | 66,504 |

Note. *Proposed for the 2nd edition of the HSM.
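As a minimal sketch of how the maxima in Table 29 might be used, the Python snippet below flags a site whose AADT exceeds the largest AADT in a CPM's estimation data. The dictionary keys and the function name `exceeds_max_aadt` are illustrative assumptions and are not part of the HSM or SafetyAnalyst; the numbers are copied from Table 29. Because the table reports only maxima, the sketch does not check the lower end of the range.

```python
# Maximum AADT (veh/day) in the estimation data, copied from Table 29.
# The key names are hypothetical labels chosen for this sketch.
MAX_AADT = {
    ("rural two-lane", "SafetyAnalyst"): 30025,
    ("rural two-lane", "HSM 1st edition"): 17800,
    ("rural two-lane", "NCHRP 17-62"): 21622,
    ("rural multilane undivided", "SafetyAnalyst"): 42638,
    ("rural multilane undivided", "HSM 1st edition"): 33200,
    ("rural multilane undivided", "NCHRP 17-62"): 21667,
    ("rural multilane divided", "SafetyAnalyst"): 31188,
    ("rural multilane divided", "HSM 1st edition"): 89300,
    ("rural multilane divided", "NCHRP 17-62"): 66504,
}


def exceeds_max_aadt(roadway_type, source, site_aadt):
    """Return True if the site AADT is above the maximum AADT used to estimate the CPM."""
    return site_aadt > MAX_AADT[(roadway_type, source)]


# A rural two-lane site with 20,000 veh/day is outside the HSM 1st edition
# range but inside the SafetyAnalyst range.
print(exceeds_max_aadt("rural two-lane", "HSM 1st edition", 20000))  # True
print(exceeds_max_aadt("rural two-lane", "SafetyAnalyst", 20000))    # False
```

The same site can therefore be within range for one CPM and out of range for another, which is the situation the rest of the chapter addresses.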
There are significant differences in the maximum AADT values between the data sets that were used for estimating the CPMs in SafetyAnalyst, the 1st edition of the HSM, and the CPMs proposed for the 2nd edition of the HSM (based on CPMs from NCHRP Project 17-62). There may be multiple reasons for these differences, including the state(s) whose data were used for the model estimation and the approach used to select the data sets. SafetyAnalyst used all the available data in a particular state to estimate the CPM, which was based on segment length and AADT. The approach used for estimating the CPMs in the 1st edition of the HSM was different for each roadway type. For example, for rural two-lane roads, a fully specified model was first estimated, and the values corresponding to the base condition were then substituted into this model to derive a base condition model. In NCHRP Project 17-62, CPMs were estimated using only the data for those sites that corresponded to the base condition. Nevertheless, the significant differences in the maximum AADT values for these models illustrate the need for guidance regarding the reliability of using the CPMs to predict the number of crashes at sites whose characteristics (especially AADT) are outside the range of the data used to estimate the CPMs. Before going further, the functional form of CPMs and SPFs needs to be discussed, since it influences the guidance provided in the subsequent sections.

Functional Form of CPMs and SPFs

When CPMs are used to predict the number of crashes at sites whose characteristics (especially AADT) are outside the range of the data used to estimate the CPMs, the user is implicitly assuming that the functional form of the CPM is valid outside the range of the original data. In reality, the true functional form of an SPF is not known. To simplify the discussion, and because traffic volume is often the most important contributor to crashes, this issue is illustrated below using CPMs for roadway segments with AADT as the only independent variable (Srinivasan and Bauer, 2013). The most common form for a CPM that relates crash frequency and AADT is the following:

Equation 60:

$$N = L \times \exp\left(a + b \ln(AADT)\right) = L \times e^{a} \times AADT^{b}$$

Where:
- $N$ is the predicted average number of crashes on a segment,
- $L$ is the length of the segment, and
- $a$ and $b$ are regression coefficients to be estimated.

This type of model is sometimes called a power function. In the power model, it is generally accepted that $b$ is positive, since the number of crashes is expected to increase as traffic volume increases. Note that the form in Equation 60 can also be used for CPMs for specific base conditions, where roadways meet certain conditions. When the site-specific conditions do not meet the base conditions, the 1st edition of the HSM (AASHTO, 2010) recommends adjusting the crash estimate from Equation 60 by applying crash modification factors (also called SPF adjustment factors in the upcoming 2nd edition of the HSM), and then a calibration factor (CF) to account for differences between jurisdictions.
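To make Equation 60 concrete, the following Python sketch evaluates both of its algebraically equivalent forms and shows what the power form implies when AADT changes. The coefficient values and the function names are hypothetical, chosen only for illustration; they do not come from any published CPM.

```python
import math


def predict_crashes(length, aadt, a, b):
    """Equation 60, exponential form: N = L * exp(a + b * ln(AADT))."""
    return length * math.exp(a + b * math.log(aadt))


def predict_crashes_power_form(length, aadt, a, b):
    """Equation 60, power form: N = L * e^a * AADT^b."""
    return length * math.exp(a) * aadt ** b


# Hypothetical coefficients (assumptions for this sketch only).
A_COEF, B_COEF = -7.0, 0.9

n_exp = predict_crashes(0.5, 20000, A_COEF, B_COEF)
n_pow = predict_crashes_power_form(0.5, 20000, A_COEF, B_COEF)
print(math.isclose(n_exp, n_pow))  # True: the two forms agree

# Under the power form, doubling AADT multiplies the prediction by 2**b,
# regardless of whether the doubled AADT lies inside the estimation range.
ratio = predict_crashes(0.5, 40000, A_COEF, B_COEF) / n_exp
print(math.isclose(ratio, 2 ** B_COEF))  # True
```

The second check illustrates the implicit assumption discussed above: the model applies the same elasticity `b` at every AADT, including values it was never fitted to.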
The CF can be calculated as follows:

Equation 61:

$$CF = \frac{\sum \text{observed crashes}}{\sum \text{predicted crashes}}$$

Following the procedure illustrated in Part C of the HSM, the computed calibration factor is then applied to the CPMs to predict crashes for each site in the new data. The CPM for the new data becomes:

Equation 62:

$$N_{pred} = N_{SPF} \times CF \times CMF_{1} \times CMF_{2} \times \cdots \times CMF_{n}$$

Where $CMF_{1}, CMF_{2}, \ldots, CMF_{n}$ are crash modification factors for local conditions for site characteristics variables 1 through $n$.

The NCHRP Project 17-45 Final Report (Bonneson et al., 2012) illustrates that the CPMs developed using CA, MI, and WA freeway data for higher and lower AADTs are indeed different. This implies that the CPMs may provide biased estimates of crashes when applied directly to sites where AADTs are outside the range of the original data used to estimate the CPMs. In that study, CMFs for AADT are applied to both multi-vehicle (MV) and single-vehicle (SV) crashes to address high traffic volume effects.

For multi-vehicle crashes:

Equation 63:

$$CMF_{hv,mv} = e^{\,b_{hv,mv} \, P_{hv}}$$

Where:
- $CMF_{hv,mv}$ = high traffic volume crash modification factor for multi-vehicle crashes,
- $b_{hv,mv}$ = high traffic volume calibration coefficient for multi-vehicle crashes, and
- $P_{hv}$ = proportion of AADT during hours where volume exceeds 1,000 veh/hour/lane.

For single-vehicle crashes:

Equation 64:

$$CMF_{hv,sv} = e^{\,b_{hv,sv} \, P_{hv}}$$

Where:
- $CMF_{hv,sv}$ = high traffic volume crash modification factor for single-vehicle crashes, and
- $b_{hv,sv}$ = high traffic volume calibration coefficient for single-vehicle crashes.

The next section provides an overview of the goodness-of-fit measures that can be used to assess the performance of different calibration and re-estimation options for predicting outside the range of independent variables. Following that is a discussion of different options, illustrated using HSIS data from California freeways from 2005 to 2014. The final section provides a recommended procedure for practitioners.

Objective of This Chapter

The objective of this chapter is to provide guidance on the potential reliability of using CPMs to predict outside the range of independent variables. The bias, variance, and repeatability associated with this factor are shown in Table 30.

Table 30. Bias, Variance, and Repeatability Associated with Predicting Outside the Range of the Input Variable.

| Influence Category | Factor | Effect on Reliability of CPM: Bias | Effect on Reliability of CPM: Variance | Effect on Reliability of CPM: Repeatability |
|---|---|---|---|---|
| Application-related factors influencing reliability | Application is outside the range of an input variable | Less reliable | Less reliable | Less reliable if error is due to poor description of range associated with each input variable |

Goodness-of-Fit Measures

Many goodness-of-fit (GOF) measures have been proposed, including mean absolute deviation (MAD), modified R² value, dispersion parameter (k), coefficient of variation of the calibration factor (CV), cumulative residual (CURE) plots, percent of CURE plot ordinates for fitted values (after calibration) exceeding the 2σ limits, and the maximum absolute deviation from zero.
The definitions of these criteria can be found in the User Guide for the FHWA Calibrator Tool (Lyon et al., 2016) and are described below.

Mean Absolute Deviation (MAD)

The mean absolute deviation is a measure of the average value of the absolute difference between observed and predicted crashes.

Equation 65:

$$MAD = \frac{\sum_{i=1}^{n} \left| \hat{y}_i - y_i \right|}{n}$$

Where:
- $\hat{y}_i$ = predicted values from the SPF,
- $y_i$ = observed counts, and
- $n$ = validation data sample size.

Modified R²

This GOF measure seeks to measure the amount of systematic variation explained by the SPF. Larger values indicate a better fit to the data when comparing two or more competing SPFs. Values greater than 1.0 indicate that the SPF is over-fit and that some of the expected random variation is incorrectly explained as systematic variation.

Equation 66:

$$R^{2} = \frac{\sum_{i} (y_i - \bar{y})^{2} - \sum_{i} \mu_i^{2}}{\sum_{i} (y_i - \bar{y})^{2} - \sum_{i} \hat{y}_i}$$

Where:
- $y_i$ = observed counts,
- $\hat{y}_i$ = predicted values from the SPF,
- $\bar{y}$ = sample average, and
- $\mu_i = y_i - \hat{y}_i$.

Dispersion Parameter (k)

The dispersion parameter is a measure of the variability in the data. It can be expressed as follows:

Equation 67:

$$k = \frac{Var\{m\} - E\{m\}}{\left(E\{m\}\right)^{2}}$$

Where:
- $k$ = estimate of the dispersion parameter in the calibration procedure,
- $Var\{m\}$ = estimated variance of mean crash rate, and
- $E\{m\}$ = estimated mean crash rate.

The estimated variance increases as dispersion increases, and consequently the standard errors of estimates increase as well. As a result, an SPF with lower dispersion parameter estimates (i.e., smaller values of k) is preferred to an SPF with more dispersion. Note that the FHWA Calibrator Tool (Lyon et al., 2016) can provide either a constant dispersion parameter or one that varies by length (for road segments).

Coefficient of Variation of Calibration Factor

The CV of the calibration factor is the standard deviation of the calibration factor divided by the estimate of the calibration factor, as shown in the following equation.

Equation 68:

$$CV = \frac{\sqrt{V(C)}}{C}$$

Where:
- $CV$ = coefficient of variation of the calibration factor,
- $C$ = estimate of the calibration factor, and
- $V(C)$ = variance of the calibration factor, which can be calculated as follows:

Equation 69:

$$V(C) = \frac{\sum_{i} \left( y_i + k \, y_i^{2} \right)}{\left( \sum_{i} \hat{y}_i \right)^{2}}$$

Where:
- $y_i$ = observed counts,
- $\hat{y}_i$ = uncalibrated predicted values from the SPF, and
- $k$ = dispersion parameter.

CURE Plots and Related Measures

A CURE plot is a graph of the cumulative residuals (observed minus predicted crashes) against a variable of interest sorted in ascending order (e.g., major road traffic volume). CURE plots provide a visual representation of GOF over the range of a given variable, and help to identify potential concerns such as the following:

- Long trends: Long trends in the CURE plot (increasing or decreasing) indicate regions of bias that analysts should rectify through improvement to the SPF. These trends can be seen directly in the CURE plots.
- Percent exceeding the confidence limits (Outside 95% CI (%)): Cumulative residuals outside the 95% confidence limits indicate a poor fit over that range of the variable of interest. Cumulative residuals that are frequently outside the confidence limits indicate possible bias in the SPF.
- Vertical changes (Max_Cure): Large vertical changes in the CURE plot are potential indicators of outliers, which require further examination. Further information can be found in Chapter 7 of Hauer's book (Hauer, 2015).
- Maximum value exceeding 95% confidence limits (Max_DCure): This measures the distance between the CURE plot and the 95% confidence limits where the CURE plot is outside the confidence limits. The larger the value, the poorer the fit.
- Average value exceeding 95% confidence limits (Avg_DCure): While Max_DCure measures the maximum difference between the CURE plot and the 95% confidence limits, Avg_DCure measures the average distance between the CURE plot and the 95% confidence limits for the points that are outside the limits. As with Max_DCure, a smaller Avg_DCure indicates less bias in the SPF.
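The measures above can be sketched in a few lines of Python. The block below computes MAD (Equation 65) and the ordinates of a CURE plot together with its confidence limits. The chapter does not state the formula for the limits; the sketch assumes the commonly used form attributed to Hauer, in which the limit at point i is 2·σ(i)·sqrt(1 − σ²(i)/σ²(n)) and σ²(i) is the running sum of squared residuals. The function names and the four-site data set are hypothetical.

```python
import math


def mean_absolute_deviation(observed, predicted):
    """Equation 65: mean of |predicted - observed| over all sites."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def cure_points(sort_values, observed, predicted):
    """Return (value, cumulative residual, limit) tuples sorted by sort_values.

    The limit uses the assumed form 2 * sigma(i) * sqrt(1 - sigma(i)^2 / sigma(n)^2),
    where sigma(i)^2 is the cumulative sum of squared residuals.
    """
    rows = sorted(zip(sort_values, observed, predicted))
    residuals = [o - p for _, o, p in rows]  # observed minus predicted
    total_var = sum(r * r for r in residuals)
    points, cum_res, cum_var = [], 0.0, 0.0
    for (value, _, _), r in zip(rows, residuals):
        cum_res += r
        cum_var += r * r
        frac = cum_var / total_var if total_var > 0 else 0.0
        limit = 2.0 * math.sqrt(cum_var * max(0.0, 1.0 - frac))
        points.append((value, cum_res, limit))
    return points


def proportion_outside_limits(points, eps=1e-9):
    """Share of CURE ordinates outside the limits (eps absorbs rounding noise)."""
    return sum(1 for _, c, lim in points if abs(c) > lim + eps) / len(points)


# Hypothetical four-site example, sorted by AADT.
aadt = [1000, 2000, 3000, 4000]
observed = [1, 3, 2, 6]
predicted = [2.0, 2.0, 4.0, 4.0]

points = cure_points(aadt, observed, predicted)
print(mean_absolute_deviation(observed, predicted))  # 1.5
print([c for _, c, _ in points])                     # [-1.0, 0.0, -2.0, 0.0]
print(proportion_outside_limits(points))             # 0.0
```

In this toy case the cumulative residuals oscillate around zero and stay inside the limits; a long upward or downward run in the second printed list would be the "long trend" described in the first bullet.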
The FHWA Calibrator Tool (Lyon et al., 2016) provides CURE plots, the percent exceeding the confidence limits, and the maximum vertical change of the CURE plot. The maximum and average values exceeding the 95% confidence limits (Max_DCure and Avg_DCure) were added in this study to compare the proposed options.

Examination of Different Options for Predicting Outside the Range of AADT

The following five options were examined:

- Option 1: Perform calibration
- Option 2: Adjust parameter/coefficient for AADT and perform calibration
- Option 3: Estimate calibration function or SPF by modifying the coefficient for AADT and perform calibration
- Option 4: Estimate calibration function or SPF and perform calibration
- Option 5: Estimate calibration function or SPF with different parameters for AADT and the other factors, and perform calibration

Some of these options (e.g., Options 1 and 2) would be easier for practitioners to apply. Estimation of calibration functions would be more involved. However, Srinivasan et al. (2016) provide guidance on using readily available tools such as Microsoft Excel to estimate calibration functions.

Option 1: Perform Calibration

This is probably the most common option used by practitioners because it is relatively straightforward and is discussed in the 1st edition of the HSM (AASHTO, 2010). Practitioners compile the necessary data and use Equation 61 to estimate the calibration factor (CF).

Option 2: Adjust Parameter/Coefficient for AADT and Perform Calibration

Because the AADTs in the new data are outside the range of the data used to estimate the original CPMs, and based on previous research (e.g., Bonneson et al., 2012), the coefficient associated with AADT may differ depending on the range of AADT. Essentially, this option tries to identify a more appropriate coefficient for AADT by trial and error. A trial-and-error approach is more time consuming than estimating the parameter statistically, but it can be implemented by practitioners without knowledge of statistical methods. This option could be implemented if Option 1 does not provide a satisfactory fit to the data based on the FHWA Calibrator Tool. The procedure for this option is as follows:

- Step 1: Assume the parameter/coefficient b for AADT in the new data is b_new = b*A_adj, where A_adj is an adjustment factor. There is no current guideline on what this adjustment factor should be, and a trial-and-error approach is recommended.
- Step 2: Predict the number of crashes based on b_new.
- Step 3: Calculate CF using Equation 61.
- Step 4: Use the FHWA Calibrator Tool to assess the performance.
- Step 5: If the performance is not satisfactory, modify A_adj and repeat the process.
- Step 6: Select the A_adj that provides the best fit.

Option 3: Estimate Calibration Function or SPF by Modifying the Coefficient for AADT and Perform Calibration

This approach involves estimating a calibration function or SPF by modifying only the coefficient for AADT, and then performing a simple calibration to ensure that the observed and predicted crashes are equal.

Step 1: Estimate a calibration function or SPF of the following form:

Equation 70:

$$N_{new} = AADT^{b} \times N_{SPF}$$

Step 2: Calculate predicted crashes using the newly developed $N_{new}$.

Step 3: Calculate CF as follows:

Equation 71:

$$N_{pred} = CF \times N_{new}$$

Option 4: Estimate Calibration Function or SPF and Perform Calibration

This option also involves the estimation of a calibration function, but unlike Option 3, the coefficients for all the terms in the SPF/CPM are estimated. If $N_{SPF}$ includes CMFs (also called SPF adjustment factors), they are also raised to a power, which can be a possible source of criticism.

Step 1: Estimate a calibration function of the following form:

Equation 72:

$$N_{new} = a \times \left(N_{SPF}\right)^{b}$$

Step 2: Calculate predicted crashes using the newly developed $N_{new}$.

Step 3: Calculate CF as follows:

Equation 73:

$$N_{pred} = CF \times N_{new}$$

Option 5: Estimate Calibration Function or SPF with Different Parameters for AADT and the Other Factors, and Perform Calibration

This option can be seen as a combination of Options 3 and 4. A calibration function is estimated, but different coefficients are introduced for AADT and the other parameters.

Step 1: Recalibrate using the SPF prediction and AADT as independent variables; both enter the new model as power functions, as shown below.

Equation 74:

$$N_{new} = a \times AADT^{b} \times \left(N_{SPF}\right)^{c}$$

Steps 2 and 3 are the same as for Options 3 and 4.

Illustration of the Options Using Data

California freeway data from 2005 to 2014 (from HSIS) were used for the illustration. Ramp influence areas (based on 0.3 miles on either side of a ramp) were excluded. Short segments of less than 0.01 miles were also excluded. The data were categorized based on number of lanes, terrain, and area type (rural or urban). The crash types considered were total crashes, single-vehicle crashes, and multi-vehicle crashes.

For the different freeway categories, SPFs were estimated using data from segments with lower AADT values, and they were tested using data from segments with higher AADT values. The results for rural 4-lane flat terrain segments are discussed below as an illustration. For estimating the SPFs, rural 4-lane flat terrain freeway segments with AADT < 30,000 were selected. For testing, rural 4-lane flat terrain freeway segments with AADT between 30,000 and 60,000 were selected. The AADT range for the testing data set was specifically chosen to be higher than that of the data set used for the initial SPF estimation. Summary statistics for these data sets are provided in Table 31.

Results of the Testing

Multi-Vehicle Crashes

The five options were investigated for total, multi-vehicle (MV), and single-vehicle (SV) crashes. The GOF measures for MV crashes are shown in Table 32, and the CURE plots are shown in Figure 10 through Figure 14. For Option 2, several values of the adjustment to the AADT coefficient were investigated, and the best one was used for comparison with the other options.

Table 31. Summary Statistics for the Data Sets Consisting of CA Rural Flat Highway Segments Used in the Illustration.

| Variable | Data set | Min | Max | Mean | Stdev | Sum |
|---|---|---|---|---|---|---|
| AADT | SPF development* (AADT < 30,000) | 1,590 | 29,909 | 19,806 | 5,697.70 | NA |
| AADT | Testing** (AADT: 30,000 to 60,000) | 30,100 | 59,706 | 39,096 | 7,636.01 | NA |
| Seg length (mi) | SPF development* | 0.01 | 9.018 | 0.570 | 0.850 | 393.07 |
| Seg length (mi) | Testing** | 0.01 | 4.127 | 0.621 | 0.810 | 194.50 |
| Single-vehicle crashes | SPF development* | 0 | 78 | 6.927 | 10.315 | 4,773 |
| Single-vehicle crashes | Testing** | 0 | 104 | 12.661 | 17.465 | 3,963 |
| Multi-vehicle crashes | SPF development* | 0 | 51 | 4.084 | 6.804 | 2,814 |
| Multi-vehicle crashes | Testing** | 0 | 138 | 13.217 | 17.929 | 4,137 |
| Total crashes | SPF development* | 0 | 107 | 11.012 | 16.346 | 7,587 |
| Total crashes | Testing** | 0 | 231 | 25.879 | 34.344 | 8,100 |

Note: *Number of segments = 689; **Number of segments = 313. NA is not applicable.
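The way the illustration separates estimation data from testing data can be sketched as follows. This is only an assumed reconstruction of the filtering described in the text (segments shorter than 0.01 miles dropped, AADT below 30,000 for estimation, 30,000 to 60,000 for testing); the record layout with `length` and `aadt` keys and the function name are hypothetical, and the exclusion of ramp influence areas is not shown.

```python
def split_by_aadt(segments, estimation_max=30000, testing_max=60000, min_length=0.01):
    """Split segment records into an estimation set and a higher-AADT testing set.

    Each record is assumed to be a dict with 'length' (miles) and 'aadt' keys.
    """
    usable = [s for s in segments if s["length"] >= min_length]
    estimation = [s for s in usable if s["aadt"] < estimation_max]
    testing = [s for s in usable if estimation_max <= s["aadt"] <= testing_max]
    return estimation, testing


# Hypothetical records for illustration.
segments = [
    {"length": 0.570, "aadt": 19806},  # goes to the estimation set
    {"length": 0.621, "aadt": 39096},  # goes to the testing set
    {"length": 0.005, "aadt": 25000},  # dropped: shorter than 0.01 mi
    {"length": 1.200, "aadt": 75000},  # dropped: above the testing range
]

estimation, testing = split_by_aadt(segments)
print(len(estimation), len(testing))  # 1 1
```

Holding out the high-AADT segments in this way is what allows the options to be judged on data that lie outside the range used for estimation.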

Table 32. Testing Results for Multi-Vehicle Crashes (Rural 4-lane flat terrain freeways).

| Option | Number of crashes | k | Modified R² | CV | MAD | Max_Cure | Max_DCure | Avg_DCure | Outside 95% CI (proportion) |
|---|---|---|---|---|---|---|---|---|---|
| Option 1 | 4,137 | 0.268 | 0.840 | 0.052 | 4.557 | 202.206 | 105.923 | 28.183 | 0.524 |
| Option 2: increase AADT coefficient by 40% | 4,137 | 0.259 | 0.825 | 0.051 | 4.721 | 189.437 | 93.705 | 24.031 | 0.508 |
| Option 3 | 4,137 | 0.259 | 0.820 | 0.051 | 4.758 | 187.201 | 91.281 | 24.193 | 0.514 |
| Option 4 | 4,137 | 0.264 | 0.809 | 0.051 | 4.748 | 193.223 | 92.279 | 25.255 | 0.521 |
| Option 5 | 4,137 | 0.256 | 0.833 | 0.051 | 4.767 | 184.737 | 89.959 | 23.211 | 0.518 |

The best option depends on the GOF measure that is chosen. For example, Option 1 is the best if MAD or Modified R² is used, but it is the worst on every other GOF measure. Except for Modified R², the lower the value of each GOF measure, the better the performance of the option. Overall, Option 1 has the worst performance in predicting crashes for the new data, while Option 5 has the best performance. Options 2 and 3 are good candidates for the second-best performance.
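The comparison in Table 32 can be checked mechanically. The short Python sketch below copies the table values and reports, for each GOF measure, which option has the best value (highest for Modified R², lowest for everything else). The data structure and function name are illustrative assumptions; only the numbers come from Table 32.

```python
# GOF values copied from Table 32 (multi-vehicle crashes).
MEASURES = ["k", "Modified R2", "CV", "MAD", "Max_Cure", "Max_DCure",
            "Avg_DCure", "Outside 95% CI"]
TABLE_32 = {
    "Option 1": [0.268, 0.840, 0.052, 4.557, 202.206, 105.923, 28.183, 0.524],
    "Option 2": [0.259, 0.825, 0.051, 4.721, 189.437, 93.705, 24.031, 0.508],
    "Option 3": [0.259, 0.820, 0.051, 4.758, 187.201, 91.281, 24.193, 0.514],
    "Option 4": [0.264, 0.809, 0.051, 4.748, 193.223, 92.279, 25.255, 0.521],
    "Option 5": [0.256, 0.833, 0.051, 4.767, 184.737, 89.959, 23.211, 0.518],
}
HIGHER_IS_BETTER = {"Modified R2"}


def best_options(table):
    """Map each measure to the list of options with the best value (ties kept)."""
    result = {}
    for j, name in enumerate(MEASURES):
        values = {option: row[j] for option, row in table.items()}
        pick = max if name in HIGHER_IS_BETTER else min
        target = pick(values.values())
        result[name] = [option for option, v in values.items() if v == target]
    return result


best = best_options(TABLE_32)
for name, options in best.items():
    print(f"{name}: {', '.join(options)}")
```

Running it shows Option 5 leading on k and on the three CURE distance measures, Option 1 leading on MAD and Modified R², and Option 2 leading on the proportion outside the 95% limits, which is why no single measure should be used on its own.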

Figure 10. CURE Plots for MV Crashes - Option 1

Figure 11. CURE Plots for MV Crashes - Option 2

Figure 12. CURE Plots for MV Crashes - Option 3

Figure 13. CURE Plots for MV Crashes - Option 4

Figure 14. CURE Plots for MV Crashes - Option 5

Single-Vehicle Crashes

The evaluation results for single-vehicle crashes are shown in Table 33. Overall, Option 5 has very good performance, and Options 1, 3, and 4 have comparable results and are good candidates for the second-best performance.

Table 33. Results for Single-Vehicle Crashes.

| Option | Number of crashes | k | Modified R² | CV | MAD | Max_Cure | Max_DCure | Avg_DCure | Outside 95% CI (proportion) |
|---|---|---|---|---|---|---|---|---|---|
| Option 1 | 3,963 | 0.223 | 0.843 | 0.048 | 4.540 | 164.208 | 58.179 | 19.131 | 0.597 |
| Option 2: increase AADT coefficient by 50% | 3,963 | 0.225 | 0.835 | 0.048 | 4.594 | 169.912 | 58.177 | 17.024 | 0.604 |
| Option 3 | 3,963 | 0.223 | 0.841 | 0.048 | 4.546 | 170.083 | 56.191 | 18.070 | 0.597 |
| Option 4 | 3,963 | 0.224 | 0.841 | 0.048 | 4.544 | 172.073 | 56.699 | 18.485 | 0.601 |
| Option 5 | 3,963 | 0.219 | 0.836 | 0.048 | 4.615 | 160.145 | 58.148 | 15.749 | 0.594 |
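Because Option 5 performs well in Tables 32 and 33, a sketch of how its calibration function (Equation 74) might be estimated is given here. The chapter does not say which estimation method was used; this sketch substitutes a Poisson regression with a log link, fitted by iteratively reweighted least squares in plain Python, as a simple stand-in (CPMs are more commonly estimated with negative binomial models in statistical software). The function names and the synthetic, noise-free data are hypothetical.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_option5(observed, aadt, n_spf, n_iter=50, tol=1e-10):
    """Fit N_new = a * AADT^b * N_SPF^c (Equation 74) by Poisson IRLS; return (a, b, c)."""
    X = [[1.0, math.log(q), math.log(p)] for q, p in zip(aadt, n_spf)]
    mu = [y + 0.5 for y in observed]  # starting values that avoid log(0)
    eta = [math.log(m) for m in mu]
    beta = [0.0, 0.0, 0.0]
    for _ in range(n_iter):
        # Working response for the log link; the IRLS weights equal mu.
        z = [e + (y - m) / m for e, y, m in zip(eta, observed, mu)]
        xtwx = [[sum(m * x[r] * x[c] for x, m in zip(X, mu)) for c in range(3)]
                for r in range(3)]
        xtwz = [sum(m * x[r] * zi for x, m, zi in zip(X, mu, z)) for r in range(3)]
        new_beta = solve_linear(xtwx, xtwz)
        eta = [sum(bj * xj for bj, xj in zip(new_beta, x)) for x in X]
        mu = [math.exp(e) for e in eta]
        converged = max(abs(n - o) for n, o in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return math.exp(beta[0]), beta[1], beta[2]


# Synthetic, noise-free data generated from known values a=0.5, b=0.4, c=1.1.
aadt = [30000 + 1000 * i for i in range(30)]
lengths = [0.2 + 0.3 * (i % 5) for i in range(30)]
n_spf = [L * math.exp(-8.0) * q ** 0.9 for L, q in zip(lengths, aadt)]
observed = [0.5 * q ** 0.4 * p ** 1.1 for q, p in zip(aadt, n_spf)]

a, b, c = fit_option5(observed, aadt, n_spf)
print(round(a, 4), round(b, 4), round(c, 4))

# Simple calibration of the fitted function (Equation 61).
n_new = [a * q ** b * p ** c for q, p in zip(aadt, n_spf)]
cf = sum(observed) / sum(n_new)
print(round(cf, 6))
```

With a Poisson fit that includes the constant `a`, the fitted values already sum to the observed total, so the CF in the last step comes out at essentially 1.0; with other estimation methods it need not. Fixing `c = 1` and dropping `a` gives the one-parameter form of Option 3, and dropping the AADT term gives Option 4.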

Total Crashes (TOT)

The results for total crashes are shown in Table 34. Not surprisingly, the evaluation results for total crashes are quite similar to those for MV crashes: Option 1 is the worst based on all the criteria except MAD and Modified R². Option 5 has the best performance across the majority of the criteria.

Table 34. Results for Total Crashes.

| Option | Number of crashes | k | Modified R² | CV | MAD | Max_Cure | Max_DCure | Avg_DCure | Outside 95% CI (proportion) |
|---|---|---|---|---|---|---|---|---|---|
| Option 1 | 8,100 | 0.245 | 0.866 | 0.048 | 7.943 | 385.276 | 192.196 | 66.998 | 0.649 |
| Option 2: increase AADT coefficient by 50% | 8,100 | 0.237 | 0.860 | 0.047 | 8.050 | 359.618 | 173.546 | 59.974 | 0.629 |
| Option 3 | 8,100 | 0.237 | 0.853 | 0.047 | 8.177 | 354.778 | 174.326 | 57.655 | 0.617 |
| Option 4 | 8,100 | 0.239 | 0.852 | 0.047 | 8.143 | 362.567 | 178.741 | 59.359 | 0.613 |
| Option 5 | 8,100 | 0.233 | 0.858 | 0.047 | 8.296 | 327.990 | 170.257 | 52.475 | 0.623 |

These results indicate that Option 5 generally performs the best, while Option 1 has the worst performance based on all criteria except MAD and Modified R². Generally, Option 2 performs reasonably well compared to Option 1. The advantage of Option 2 is that it involves only trial and error with different adjustment factors for AADT and does not require the practitioner to conduct statistical analysis, as Options 3 through 5 do. Hence, Option 2 is an option that practitioners should consider.

Recommended Option and Procedure for Practitioners

The adjustment factor associated with AADT in Option 2 could be greater or less than 1.0. Although the demonstration used an adjustment factor only for AADT, a similar approach could be adopted for adjusting the coefficients of other variables, depending on the specific SPF. To implement Option 2, the following procedure is recommended.

Step 1: Choose MAD, percent exceeding the confidence limits, and CURE plots as GOF measures.

Step 2: Investigate multiple adjustment factors, e.g., 1.5, 1.25, 0.75, and 0.5.

Step 3: Based on the GOF measures, find the most appropriate adjustment factor.

In some cases, none of the adjustment factors may provide a satisfactory result. In that case, Options 3 through 5 may need to be investigated, but that will require the estimation of calibration functions or SPFs.
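The recommended procedure for Option 2 can be sketched as a short search loop. The Python block below is a simplified illustration, not the FHWA Calibrator Tool: it assumes an SPF of the form in Equation 60, tries each adjustment factor, recomputes the calibration factor with Equation 61, and ranks the candidates by MAD alone, whereas the procedure above also calls for the percent exceeding the confidence limits and CURE plots. The coefficients, the record layout, and the inclusion of a factor of 1.0 (which reproduces Option 1 as a baseline) are assumptions made for this sketch.

```python
import math


def option2_search(sites, a, b, factors=(1.5, 1.25, 1.0, 0.75, 0.5)):
    """Try each adjustment factor A_adj; return (best factor, its CF, its MAD)."""
    observed = [s["crashes"] for s in sites]
    best = None
    for a_adj in factors:
        b_new = b * a_adj  # adjusted AADT coefficient
        predicted = [s["length"] * math.exp(a) * s["aadt"] ** b_new for s in sites]
        cf = sum(observed) / sum(predicted)  # Equation 61
        calibrated = [cf * p for p in predicted]
        mad = sum(abs(c - o) for c, o in zip(calibrated, observed)) / len(sites)
        if best is None or mad < best[2]:
            best = (a_adj, cf, mad)
    return best


# Hypothetical SPF coefficients and synthetic sites whose crash counts follow
# an AADT exponent 25% larger than the original coefficient.
A_COEF, B_COEF = -7.0, 0.8
sites = [
    {"length": 0.5, "aadt": q,
     "crashes": 0.5 * math.exp(A_COEF) * q ** (B_COEF * 1.25)}
    for q in range(30000, 60001, 5000)
]

best_factor, best_cf, best_mad = option2_search(sites, A_COEF, B_COEF)
print(best_factor)  # 1.25
```

In practice the candidate that minimizes MAD may not be the one with the best CURE plot, as the results in Tables 32 through 34 show for Option 1, so the remaining GOF measures should be reviewed before an adjustment factor is selected.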


Understanding and communicating consistently reliable crash prediction results are critical to credible analysis and to overcome barriers for some transportation agencies or professionals utilizing these models.

The TRB National Cooperative Highway Research Program's NCHRP Web-Only Document 303: Understanding and Communicating Reliability of Crash Prediction Models provides guidance on being able to assess and understand the reliability of Crash Prediction Models.

This document is supplemental to NCHRP Research Report 983: Reliability of Crash Prediction Models: A Guide for Quantifying and Improving the Reliability of Model Results.
