Reliability of Crash Prediction Models: A Guide for Quantifying and Improving the Reliability of Model Results

CHAPTER 4

Quantifying the Reliability of CPM Estimates for How the Number of Variables in Crash Prediction Models Affects Reliability

Introduction

In some instances, CPMs may include only traffic volumes as predictor variables; traffic volumes plus a limited number of geometric and traffic control variables; or traffic volumes and a large number of geometric and traffic control variables. One question a practitioner may face is whether a CPM with more variables is more reliable than a simpler CPM. Likewise, if a user does not have the information for each variable in a CPM, there is a question as to how reliable its application may be.

For practitioners, the reliability of a CPM concerns its bias and the precision of the estimate, expressed by the estimate variance. Assuming that a CPM was developed with an appropriate functional form and estimation method, additional variables would be expected to reduce bias in its application. However, additional variables may also increase the variance of the estimates, and CPMs with many variables may be overfit to the calibration data and thus not perform as well when applied elsewhere. A CPM that was overfit to the estimation data will produce more bias when applied.

Factors Affecting the Potential Magnitude of Reliability Issues

How significant the addition or absence of a variable is to the reliability of a CPM may depend on the context of its use. Three ways in which reliability is influenced are described as follows:

1. Relative Impact of the Variable. The importance of the variable to the expected number of crashes strongly influences reliability. For example, traffic volumes have been shown to be the most influential predictor of crashes. The absence of appropriate traffic-volume variables (e.g., left-turn volumes in a CPM for intersection left-turn crashes) would result in larger biases in application and a less reliable CPM. On the other hand, there are variables with a relatively low impact on expected crashes, and including them would do little to reduce bias and increase reliability. For example, shoulder type may have little impact on the frequency of total crashes on rural multilane roads.

2. Omitted Variables in the CPM. Reliability is also influenced if the application sites differ from the sites used to develop the CPM with respect to variables that are not included in the model. For example, consider a CPM for two-lane rural roads that does not include horizontal curvature
as an independent variable even though the CPM was developed from a dataset containing 10% curved segments and 90% tangent segments. If the CPM were applied to a group of segments of which 60% contain horizontal curves, bias would almost certainly exist given that expected crash frequencies are higher on curved segments than on tangents.

3. Missing Application Data. If the practitioner does not have data for one or more of the CPM independent variables, then bias may result. Although it would be prudent to acquire those data, there may be instances where this is not possible. For example, a practitioner may be applying the CPM to estimate the safety of a site under design before all design elements have been finalized. In such a case, the practitioner may "remove" the variable with missing data from the CPM by substituting the average value of that variable from the estimation data (if available). The extent of the bias will depend on the importance of the variable to the estimate of expected crashes and the similarity between the application and model estimation sites.

Procedure to Assess Potential Reliability

Because the impact on reliability varies so much with the relative influence of a variable on the overall CPM prediction, it is not possible to give strict guidance on how the number of variables in a CPM will affect its reliability. The guidance developed is a heuristic procedure that practitioners can use to assess how the use or absence of additional variables in a CPM affects reliability. This procedure answers two questions:

1. Which of multiple CPMs should be applied, particularly when the number of variables varies among them?

2. What are the impacts on reliability of using a CPM when not all the variables in the CPM are known?
Procedural Steps and Example Application

The procedural steps lead to an evaluation and rating system for the impact of the number of variables in a CPM on its predictions. The procedure does not aim to develop a CPM; it aims to help practitioners determine which of multiple candidate CPMs to use or, if not all variables are readily available for applying a CPM, how much reliability is lost if those variables are not used. The procedure makes use of FHWA's The Calibrator (https://safety.fhwa.dot.gov/rsdp/toolbox-content.aspx?toolid=150). The following procedural steps outline Scenario 3, Case A, and Scenario 3, Case B:

Procedure, Scenario 3, Case A: Reliability of CPM estimates with a focus on the number of variables in the CPM (for design applications or evaluation of countermeasures).

Procedure, Scenario 3, Case B: Reliability of CPM estimates with a focus on the number of variables in the CPM (for network screening).
The procedure also provides a related rating of reliability that could contribute to an overall rating system of CPM reliability. The procedure combines the steps for both Scenario 3, Case A, and Scenario 3, Case B. There are five steps in the procedure for Scenario 3, Case A, and eight steps in the procedure for Scenario 3, Case B. The initial three steps are the same for both scenarios, and after both procedures conclude (i.e., Step 5A and Step 8B), the results are jointly evaluated.

Step 1A/B. Assemble the data needed to apply the procedure.
Step 2A/B. Select CPMs and their respective variables.
Step 3A/B. Calibrate CPMs.
Step 4A. Estimate the modified R2, MAD, dispersion parameter, CURE plot, and the percent exceeding the confidence limits.
Step 4B. Sensitivity analysis for network screening: compute the EB Expected or EB Excess estimate for each site.
Step 5B. Rank all sites.
Step 6B. Determine Spearman's correlation coefficient and compare the rankings.
Step 7B. Tabulate the top 30, 50, and 100 sites.
Step 5A/8B. Evaluate the alternate CPMs relative to the full CPM.

The following steps describe data needs, equations, variable estimation, selected GOF measures, and outcomes related to the quantitative assessment of the degree of reliability.

Step 1A/B. Assemble the Data Needed to Apply the Procedure

Assemble all data required for applying the CPM. HSM (Part C, Appendix A2) guidance on minimum sample sizes can be used to determine the number of locations required. All variables required for applying the full CPM are collected. This may mean a sub-sample of all sites in a jurisdiction is used to assess the reliability of alternate CPMs.

Step 2A/B. Select CPMs and Their Respective Variables

Decide how many alternate CPMs are to be compared and which variables will be included in each model.
One of the CPMs will be the full CPM with all variables; another should include only traffic-volume variables and, if the site type is a segment, length. The other CPMs of interest are derived from the full CPM by removing one or more variables. For these derived CPMs, the variables excluded may be those that are difficult to obtain for all sites or whose estimated values are suspect. To remove a variable from a CPM, substitute the average value of that variable from the data used to develop the CPM.

Step 3A/B. Calibrate CPMs

Follow the HSM guidance (Part C, Appendix A) for calibrating the CPMs if they were developed in another jurisdiction or for a different time period than the data to which they will be applied.

Step 4A. Estimate the Modified R2, MAD, Dispersion Parameter, CURE Plot, and the Percent Exceeding the Confidence Limits

There are two sub-steps for this computation:

1. For each CPM being considered, estimate the modified R2, MAD, dispersion parameter, and the percent of observations outside of the two-standard-deviation (2σ) limits for the CURE plot for the fitted values.
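As a rough illustration, the four GOF measures in this sub-step can be computed from paired observed and predicted counts. The sketch below is a minimal Python version, not The Calibrator itself: the helper names are our own, the dispersion estimate uses the simple method-of-moments form given later in this step, and the CURE limits follow a common Hauer-style variance formulation (an assumption, since the tool's exact internals are not reproduced here).

```python
import numpy as np

def modified_r2(obs, pred):
    # Share of variance about the sample mean explained by the SPF
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    sst = np.sum((obs - obs.mean()) ** 2)
    sse = np.sum((obs - pred) ** 2)        # sum of mu_hat_i squared
    return (sst - sse) / sst

def mad(obs, pred):
    # Mean absolute deviation between predicted and observed counts
    return float(np.mean(np.abs(np.asarray(pred, float) - np.asarray(obs, float))))

def dispersion_param(obs):
    # Method of moments: f(k) = (Var{m} - E{m}) / E{m}^2
    y = np.asarray(obs, float)
    return (np.var(y, ddof=1) - y.mean()) / y.mean() ** 2

def cure_pct_outside(obs, pred, z=2.0):
    # Percent of cumulative residuals outside +/- z-sigma CURE limits,
    # with sites ordered by fitted value (Hauer-style variance limits)
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    order = np.argsort(pred)
    resid = (obs - pred)[order]
    cum = np.cumsum(resid)
    s2 = np.cumsum(resid ** 2)             # cumulative squared residuals
    limit = z * np.sqrt(s2 * (1.0 - s2 / s2[-1]))
    return 100.0 * float(np.mean(np.abs(cum) > limit))
```

In sub-step 2, each of these values would then be divided by the corresponding value for the full CPM.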
It is recommended to use The Calibrator (https://safety.fhwa.dot.gov/rsdp/toolbox-content.aspx?toolid=150) to perform this sub-step. For ease of comparison, it is recommended to calibrate a constant dispersion parameter.

Modified R2

    R²_modified = [Σᵢ (yᵢ − ȳ)² − Σᵢ μ̂ᵢ²] / Σᵢ (yᵢ − ȳ)²

where
    yᵢ = observed counts
    ŷᵢ = predicted values from the SPF
    ȳ = sample average of the observed counts
    μ̂ᵢ = yᵢ − ŷᵢ

Mean Absolute Deviation

    MAD = (1/n) Σᵢ |ŷᵢ − yᵢ|

where
    ŷᵢ = predicted values from the SPF
    yᵢ = observed counts
    n = validation data sample size

Dispersion Parameter

    Var{m} = E{m} + f(k) · E{m}²

or, solving for the dispersion parameter,

    f(k) = (Var{m} − E{m}) / E{m}²

where
    f(k) = estimate of the dispersion parameter
    Var{m} = estimated variance of the mean crash rate
    E{m} = estimated mean crash rate

CURE Plot and the Percent Exceeding the Confidence Limits

CURE plots enable the computation of the percent of cumulative residuals exceeding the 95% confidence limits.

2. For each of the measures estimated in sub-step 1, divide the value by the corresponding value for the full CPM with all variables. This sub-step is undertaken to assess the changes in each measure compared with the full model, which should (in theory) have the best fit to the data.

Step 4B. Sensitivity Analysis for Network Screening: Compute the EB Expected or EB Excess Estimate for Each Site

The practitioner sets the number of years of observed crash data to be used in the network screening program and whether sites are to be screened by the EB Expected or the EB Excess method. Then, the screening measure is computed for each site (for each CPM applied) as
outlined in the HSM (Chapter 3, Chapter 4, Part C, Appendix A). If sites are road segments, divide the screening measure estimates by segment length to normalize for length.

Note: the EB Excess method requires that a CPM estimate be subtracted from the EB estimate. This CPM should be a simple CPM representing an average site, with only AADT and length (if a segment) as predictor variables.

Step 5B. Rank All Sites

For each CPM applied, rank all sites separately by the network screening measure used (EB Expected or EB Excess).

Step 6B. Determine Spearman's Correlation Coefficient and Compare the Rankings

For each ranked list, determine Spearman's correlation coefficient comparing the ranking from the CPM with all variables against the ranking from each alternate CPM. Note that the same sites must be represented on all ranked lists.

Spearman's Correlation Coefficient

    Rho = 1 − [6 Σᵢ (Rank_full,i − Rank_alt,i)²] / [n (n² − 1)]

where
    Rank_full,i = rank of site i using the full CPM with all variables
    Rank_alt,i = rank of site i using the alternate CPM
    n = number of sites in the ranked list

Step 7B. Tabulate the Top 30, 50, and 100 Sites

For each ranked list of EB Expected or EB Excess estimates, the percentage of false positives is calculated: for the top 30, 50, and 100 sites ranked using the CPM with all variables (full CPM), tabulate the percentage of those sites not included in the corresponding ranked list from each alternate CPM.

Step 5A/8B. Evaluate the Alternate CPMs Relative to the Full CPM

Using the GOF measures calculated in Step 4A, and in Steps 6B and 7B, evaluate the alternate CPMs using the guidance in Tables 13 and 14, respectively.
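The rank comparison in Step 6B follows directly from the formula above. A minimal sketch (the helper name `spearman_rho` is our own, and untied ranks are assumed):

```python
import numpy as np

def spearman_rho(rank_full, rank_alt):
    # Rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)), where d_i is the
    # difference in site i's rank between the full-CPM and alternate-CPM lists
    d = np.asarray(rank_full, float) - np.asarray(rank_alt, float)
    n = len(d)
    return 1.0 - 6.0 * float(np.sum(d ** 2)) / (n * (n ** 2 - 1))
```

Identical rankings give Rho = 1.0; a fully reversed ranking gives Rho = -1.0.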
Table 13. Sensitivity analysis of predicted crash values (GOF measure evaluation guidance).

    Measure (reliability rating)                       | High   | Medium       | Low          | Critically Low
    Modified R2 relative to full CPM                   | >=0.90 | 0.76 to 0.90 | 0.50 to 0.75 | <0.50
    MAD relative to full CPM                           | <1.20  | 1.20 to 1.50 | 1.51 to 2.00 | >2.00
    Overdispersion relative to full CPM                | <1.20  | 1.20 to 1.50 | 1.51 to 2.00 | >2.00
    % Values outside of CURE plot versus fitted values
      relative to full CPM                             | <1.20  | 1.20 to 1.50 | 1.51 to 2.00 | >2.00
Table 14. Sensitivity analysis for network screening evaluation guidance (GOF measure evaluation guidance).

    Measure (reliability rating)             | High         | Medium       | Low          | Critically Low
    Spearman's correlation coefficient (Rho) | 0.90 to 1.00 | 0.70 to 0.89 | 0.40 to 0.69 | <0.40
    % False positives in top 30 sites        | <10%         | 11% to 25%   | 26% to 40%   | >40%
    % False positives in top 50 sites        | <7.5%        | 7.6% to 20%  | 21% to 40%   | >40%
    % False positives in top 100 sites       | <5%          | 6% to 15%    | 16% to 40%   | >40%

Joint Example Application: Scenario 3, Case A, and Scenario 3, Case B

Scenario 3, Case A, addresses the reliability of CPM estimates with a focus on the number of variables in the CPM (for design applications or evaluation of countermeasures), and Scenario 3, Case B, addresses the reliability of CPM estimates with a focus on the number of variables in the CPM (for network screening).

Question: An agency desires to assess the significance of the addition or absence of one or more variables of a CPM developed for run-off-road crashes on two-lane rural roads in California. This CPM (i.e., the full CPM) was sourced from the FHWA study of improvements related to pavement friction projects (Merritt et al. 2015). Further, which CPM (with some but not all variables of the full CPM) is recommended, considering which variables are worth the cost of data collection and the reliability of the resulting estimates?

The full CPM is as follows:

    Crashes/mile-year = exp(−4.3617 + 0.2162 × Urbrur + 0.1872 × Surftype
                            − 0.0448 × Avgshldwid − 0.0852 × Lanewid + Terrain) × AADT^0.5560

where
    AADT = annual average daily traffic
    Urbrur = 0 if rural environment; 1 if urban
    Surftype = 1 if asphalt; 0 if concrete
    Avgshldwid = average of left and right shoulder width in feet
    Lanewid = lane width in feet
    Terrain = −0.3181 if flat, 0.0000 if rolling, 0.3464 if mountainous

Overdispersion parameter k = 0.7667
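For illustration, the full CPM can be evaluated in a few lines of Python. The wrapper `predict_ror_crashes` is our own illustrative code around the published coefficients, not code from the study:

```python
import math

# Terrain indicator coefficients from the full CPM (Merritt et al. 2015)
TERRAIN = {"flat": -0.3181, "rolling": 0.0000, "mountainous": 0.3464}

def predict_ror_crashes(aadt, urbrur, surftype, avgshldwid, lanewid, terrain):
    # Run-off-road crashes per mile-year for a two-lane rural road site
    lin = (-4.3617 + 0.2162 * urbrur + 0.1872 * surftype
           - 0.0448 * avgshldwid - 0.0852 * lanewid + TERRAIN[terrain])
    return math.exp(lin) * aadt ** 0.5560

# Example: rural asphalt site, 6-ft shoulders, 12-ft lanes, rolling terrain
site_estimate = predict_ror_crashes(5000, 0, 1, 6.0, 12.0, "rolling")
```

As the coefficient signs suggest, wider shoulders and lanes reduce the predicted frequency, while an urban environment, asphalt surface, and mountainous terrain increase it.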
Using Table 13, each CPM is evaluated for each GOF measure for design applications or evaluation of countermeasures. Using Table 14, each CPM is evaluated for each GOF measure for network screening applications. For these measures, the CPM is evaluated by examining each GOF measure relative to that for the full CPM with all pertinent variables included. Each guidance table classifies each GOF measure for reliability as High, Medium, Low, or Critically Low. The most reliable rating is High, while the worst is Critically Low. The overall rating for the CPM is defined by the lowest reliability rating among all the measures in each table.

Outline of Solution

Step 1A/B. Assemble the Data Needed to Apply the Procedure

The data used to develop the CPM for run-off-road crashes on two-lane rural roads in California were collected for use in this procedure.

Step 2A/B. Select CPMs and Their Respective Variables

A series of CPMs were applied to the same data, starting with only AADT as a predictor variable and adding one variable at a time. When removing a variable from the full CPM, the average value of that variable from the dataset used to develop the full CPM is applied to all sites for the CPM with fewer variables. For non-continuous variables such as terrain, the average value of that variable multiplied by its parameter estimate was used to remove it from the CPM. The average values were as follows:

    Urbrur = 0.0231
    Surftype = 0.1831
    Avgshldwid = 9.8 feet
    Lanewid = 11.8 feet
    Terrain = 0.0046 (parameter-weighted average)

Five CPMs with fewer variables than the full CPM (CPM 6) were developed.
They are as follows:

    CPM 1: AADT
    CPM 2: AADT, AREA TYPE
    CPM 3: AADT, AREA TYPE, TERRAIN
    CPM 4: AADT, AREA TYPE, TERRAIN, LANEWIDTH
    CPM 5: AADT, AREA TYPE, TERRAIN, LANEWIDTH, SHOULDERWIDTH
    CPM 6: AADT, AREA TYPE, TERRAIN, LANEWIDTH, SHOULDERWIDTH, SURFACE TYPE

Step 3A/B. Calibrate CPMs

The Calibrator (https://safety.fhwa.dot.gov/rsdp/toolbox-content.aspx?toolid=150) was used to calibrate the CPMs.

Step 4A. Estimate the Modified R2, MAD, Dispersion Parameter, CURE Plot, and the Percent Exceeding the Confidence Limits

There are two sub-steps for this computation:

1. For each CPM, the modified R2, MAD, dispersion parameter, and the percent of observations outside of the 2σ limits for the CURE plot for the fitted values were estimated. These results are shown in Table 15, Columns 2, 4, and 6, and Table 16, Column 2, respectively.
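The mean-substitution step can be sketched as follows, reusing the full-CPM coefficients and the average values listed above. The function `predict_with_removed` and its argument names are illustrative; note that the terrain "average" (0.0046) is already parameter-weighted, so it enters the linear predictor directly:

```python
import math

# Full-CPM coefficients and estimation-data averages quoted in the example
COEF = {"Urbrur": 0.2162, "Surftype": 0.1872,
        "Avgshldwid": -0.0448, "Lanewid": -0.0852}
MEANS = {"Urbrur": 0.0231, "Surftype": 0.1831,
         "Avgshldwid": 9.8, "Lanewid": 11.8}
TERRAIN_MEAN = 0.0046  # parameter-weighted average of the terrain term

def predict_with_removed(aadt, site, removed=()):
    # Substitute the estimation-data average for each "removed" variable
    lin = -4.3617
    for var, coef in COEF.items():
        lin += coef * (MEANS[var] if var in removed else site[var])
    # Terrain enters additively (the site value is already its coefficient)
    lin += TERRAIN_MEAN if "Terrain" in removed else site["Terrain"]
    return math.exp(lin) * aadt ** 0.5560
```

Comparing `predict_with_removed(aadt, site)` against the same call with `removed=("Avgshldwid",)` shows how much the estimate shifts when, for example, shoulder width is unavailable for the application sites.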
2. Each of the measures estimated in sub-step 1 was divided by the respective value for the full CPM. These results are shown in Table 15, Columns 3, 5, and 7, and Table 16, Column 3, respectively.

The GOF measures shown in Table 15 and Table 16 indicate a noticeable improvement for CPM 3 (addition of terrain) and CPM 5 (addition of shoulder width). For the other variables, the GOF measures did not indicate much change; thus, their impacts on reliability are smaller.

Table 15. Sensitivity analysis of predicted crash values (GOF measures).

    CPM | Modified R2 | Modified R2 relative   | MAD  | MAD relative | Overdispersion | Overdispersion relative
        |             | to CPM 6 (full model)  |      | to CPM 6     |                | to CPM 6
    1   | 0.30        | 0.57                   | 2.88 | 1.14         | 1.03           | 1.32
    2   | 0.30        | 0.57                   | 2.90 | 1.15         | 1.05           | 1.35
    3   | 0.42        | 0.79                   | 2.70 | 1.07         | 0.89           | 1.14
    4   | 0.44        | 0.83                   | 2.66 | 1.05         | 0.87           | 1.12
    5   | 0.53        | 1.00                   | 2.53 | 1.00         | 0.78           | 1.00
    6   | 0.53        | 1.00                   | 2.53 | 1.00         | 0.78           | 1.00

Table 16. Sensitivity analysis of predicted crash values (GOF measures).

    CPM | % Values outside of CURE plot | Relative to CPM 6 (full model)
        | versus fitted values          |
    1   | 96                            | 2.46
    2   | 97                            | 2.49
    3   | 38                            | 0.97
    4   | 35                            | 0.90
    5   | 39                            | 1.00
    6   | 39                            | 1.00

Note: The Calibrator was used to generate the GOF measures in this step and the subsequent ones.

Step 4B. Sensitivity Analysis for Network Screening: Compute the EB Expected or EB Excess Estimate for Each Site

The practitioner selected the past 3 years of observed run-off-road crash data for two-lane rural roads. This dataset was used to calculate the EB Expected estimates. All estimates were divided by segment length to normalize for length.

Step 5B. Rank All Sites

For each CPM applied, all sites were ranked by their EB Expected estimates.
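The EB Expected screening measure used in this step can be sketched as follows, assuming the standard HSM empirical Bayes weighting; the function name and the per-mile normalization argument are illustrative:

```python
def eb_expected_per_mile(pred_per_year, observed, years, k, length_mi=1.0):
    # Empirical Bayes weight from the overdispersion parameter k:
    # w = 1 / (1 + k * (CPM prediction summed over the analysis period))
    mu = pred_per_year * years
    w = 1.0 / (1.0 + k * mu)
    eb = w * mu + (1.0 - w) * observed     # EB Expected crashes
    return eb / length_mi                  # normalized for segment length
```

With k = 0 the CPM prediction is fully trusted; as k grows, the estimate moves toward the site's observed count. The EB Excess measure would subtract a simple AADT-only CPM estimate from this value.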
Step 6B. Determine Spearman's Correlation Coefficient and Compare the Rankings

For each ranked list, Spearman's correlation coefficient (Rho) was computed between the ranking from each alternate CPM and the ranking from CPM 6 (the full CPM). These results are shown in Table 17, Column 2.

Step 7B. Tabulate the Top 30, 50, and 100 Sites

For each ranked list of EB Expected estimates, the percentage of false positives was calculated: for the top 30, 50, and 100 sites ranked using the full CPM, the percentage of those sites not included in the ranked list from each alternate CPM was tabulated. These results are shown in Table 17, Columns 3, 4, and 5, respectively.

Step 5A/8B. Evaluate the Alternate CPMs Relative to the Full CPM

Using the GOF measures calculated in Step 4A, and in Steps 6B and 7B, the alternate CPMs were evaluated using the guidance in Tables 13 and 14, respectively. The results are shown in Table 18 and Table 19. The worst rating across all GOF measures in each table is used to rate the reliability of each alternate CPM.

Table 17. Sensitivity analysis for network screening evaluation (Rho and the percentage of false positives for the top 30, 50, and 100 ranked sites; 3 years of observed data).

    CPM | Rho  | Top 30 | Top 50 | Top 100
    1   | 0.87 | 33     | 16     | 21
    2   | 0.87 | 27     | 20     | 23
    3   | 0.95 | 23     | 12     | 14
    4   | 0.96 | 17     | 14     | 15
    5   | 1.00 | 0      | 0      | 1
    6   | 1.00 | 0      | 0      | 0

Table 18. Results of the sensitivity analysis of predicted crash values.

    CPM | Modified R2        | MAD relative | Overdispersion     | % Values outside of CURE plot
        | relative to CPM 6  | to CPM 6     | relative to CPM 6  | versus fitted values relative to CPM 6
    1   | Low                | High         | Medium             | Critically Low
    2   | Low                | High         | Medium             | Critically Low
    3   | Medium             | High         | High               | High
    4   | Medium             | High         | High               | High
    5   | High               | High         | High               | High
The results indicate that among the alternate CPMs evaluated, CPM 1 and CPM 2 should not be applied: their reliability ratings are Critically Low for the predicted crash values GOF measures and Low for the network screening GOF measures. CPM 3 and CPM 4 are rated Medium for both the predicted crash values and network screening GOF measures, while CPM 5 is rated High for both sets of measures.

Based on these results, the practitioner decides to investigate the cost of the data collection effort and to consider the importance of accuracy in the applications. Further, the practitioner will decide which CPM (3, 4, or 5) to use based on the viability of collecting the necessary variable data. The practitioner has also decided that since CPM 5 is rated High by both sets of measures, the jurisdiction need not consider the extra data collection effort required to use CPM 6.

Table 19. Results of the sensitivity analysis for network screening evaluation.

    Measure                            | CPM 1  | CPM 2  | CPM 3  | CPM 4  | CPM 5
    Spearman's correlation coefficient | Medium | Medium | High   | High   | High
    % False positives in top 30 sites  | Low    | Low    | Medium | Medium | High
    % False positives in top 50 sites  | Medium | Medium | Medium | Medium | High
    % False positives in top 100 sites | Low    | Low    | Medium | Medium | High
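For completeness, the false-positive tabulation behind Table 17 (Step 7B) reduces to a simple set comparison; the function name below is our own:

```python
def pct_false_positives(full_ranking, alt_ranking, top_n):
    # Percentage of the full CPM's top-n sites that do not appear in the
    # alternate CPM's top-n list (rankings are site IDs, best first)
    top_full = set(full_ranking[:top_n])
    top_alt = set(alt_ranking[:top_n])
    return 100.0 * len(top_full - top_alt) / top_n
```

An alternate CPM that reshuffles ranks only within the top-n list produces no false positives; the measure penalizes sites dropped out of the list entirely.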