CHAPTER 1

Introduction and Background

This Guide presents the results of NCHRP Project 17-78, "Understanding and Communicating Reliability of Crash Prediction Models." The project was conducted by the Highway Safety Research Center at the University of North Carolina; Kittelson and Associates, Inc.; Persaud and Lyon, Inc.; and NAVIGATS Inc.

Objectives

The objectives of NCHRP Project 17-78 were to assist practitioners working in transportation and road safety analysis in the following areas:

• How to quantify the impact of selecting or neglecting certain data parameters in the safety estimate predictions
• How to estimate, interpret, and improve the reliability of predictions and the use of crash modification factors (CMFs) in various analyses (i.e., safety programmatic decisions, or design or operational decisions)
• How to communicate the results to managers, other practitioners, and the general public

Audience

The primary audience for this Guide is practitioners working in transportation and road safety. These practitioners should be familiar with Chapter 3 (Fundamentals) and Part C of the AASHTO Highway Safety Manual (HSM) (2010). NCHRP Web-Only Document 303: Understanding and Communicating Reliability of Crash Prediction Models and this Guide also provide researchers with pertinent information for future research publications on crash prediction models (CPMs) and CMFs. This information is needed to support the required estimation of the reliability of their CPMs, CMFs, and other safety analyses.

How to Use This Guide

NCHRP Web-Only Document 303 includes comprehensive descriptions of several scenarios encountered by practitioners in their analyses using CPMs and CMFs. These scenarios include safety estimate predictions; methods and procedures to quantify the impact of selecting or neglecting certain data parameters in the safety estimate predictions; estimating, interpreting, and improving the reliability of predictions; and the use of CMFs in various analyses.
Reliability of Crash Prediction Models: A Guide for Quantifying and Improving the Reliability of Model Results

In this Guide, each scenario contains an overview of the method and the procedure used to assess the reliability of specific conditions, followed by an example illustrating the procedure and an explanation of how to communicate the results to managers, other practitioners, and the general public. The following scenarios are presented:

• Quantifying the Reliability of CPM Estimates for Mismatches Between Crash Modification Factors and SPF Base Conditions (Chapter 2)
• Quantifying the Reliability of CPM Estimates for Error in Estimated Input Values (Chapter 3)
• Quantifying the Reliability of CPM Estimates for How the Number of Variables in Crash Prediction Models Affects Reliability (Chapter 4)
• Reliability Associated with Using a Crash Prediction Model to Estimate Frequency of Rare Crash Types and Severities (Chapter 5)
• Reliability Associated with Predicting Outside the Range of Independent Variables (Chapter 6)
• Reliability Associated with Crash Prediction Models Estimated for Other Facility Types (Chapter 7)

Definitions

Definitions of the statistical concepts used in this Guide are given here. These concepts are used in the methods and procedures described in the following chapters.

Reliability

The reliability of a CPM can be described in terms of bias, variance, and repeatability as follows:

• Bias represents the difference between the CPM estimate (i.e., the estimated average crash frequency) and the true value.
• Variance describes the extent of uncertainty in the CPM estimate due to unexplained or random influences.
• Repeatability describes the extent to which multiple practitioners using the same CPM with the same training, data sources, and site of interest obtain the same results (as measured by the number of significant figures showing agreement among results).
A more reliable CPM estimate has little bias, a smaller variance, and is likely to have results that show several significant figures in agreement (should there be repeated independent applications).

Goodness-of-Fit Measures

The goodness of fit (GOF) of a statistical model describes how well the model fits a set of observations. GOF measures indicate the discrepancy between the true values and the values predicted using a statistical model. GOF measures allow for a quantitative assessment of the degree of bias and variance introduced by issues such as uncertainty in estimated input values, calibration of safety performance functions (SPFs), and use of CMFs with base conditions different from those of the SPFs. Some GOF measures can be used to compare the relative performance of potential SPFs for a given application, while others can be used to determine acceptable thresholds when assessing the adequacy of a single SPF. Subjective judgment may be required to supplement some of the GOF measures. In this Guide, the following GOF measures are used:

• Overdispersion parameter
• Increased root mean square error
• Coefficient of variation (CV) for the increased root mean square error
• Percent bias
• Root mean square difference (RMSD)
• Mean absolute difference
• Extreme value
• Spearman's correlation coefficient (Rho)
• Percentage of false positives
• Modified R²
• Mean absolute deviation (MAD)
• Coefficient of variation of the calibration factor [CV(C)]
• CURE plots and related measures

Overdispersion Parameter

The overdispersion parameter f(k) in the negative binomial distribution is obtained from the variance equation, expressed as follows:

Var{m} = E{m} + f(k) × E{m}²

or

f(k) = (Var{m} − E{m}) / E{m}²

where
f(k) = estimate of the overdispersion parameter (can be a constant or a function of site characteristics)
Var{m} = estimated variance of average crash frequency
E{m} = estimated average crash frequency

The estimated variance increases as dispersion increases, and, consequently, the standard errors of estimates are inflated. As a result, all else being equal, an SPF with lower overdispersion parameter estimates [i.e., smaller values of f(k)] is preferred to an SPF with more dispersion. Note that f(k) can be specified as a constant or as a function of site characteristics. When f(k) is a constant or a constant per unit length, it may be used to easily compare multiple CPMs.

Increased Root Mean Square Error

The increased root mean square error measure is an indication of the overall reliability of the CPM estimate (i.e., the estimated average crash frequency). It is a measure of the uncertainty added to the estimate due to the use of a biased value of the overdispersion parameter, the predicted crash frequency, or both.

σ_e,I = [e² + σ²_abs]^0.5

with

σ²_abs = |k_reported × N_p² − k_p,true × N_p,true²|

e = N_p − N_p,true
where
σ_e,I = increased root mean square error
e = error in predicted crash frequency
σ²_abs = absolute difference of the change in variance of the predicted value
k_reported = reported overdispersion parameter for the CPM
k_p,true = predicted true overdispersion parameter
N_p = predicted crash frequency from the CPM, crashes/year
N_p,true = predicted true crash frequency, crashes/year

Coefficient of Variation for the Increased Root Mean Square Error

The increased root mean square error can be normalized by dividing it by the predicted true crash frequency. This division produces a coefficient of variation (CV) that facilitates the relative comparison of alternative CPMs and alternative applications of a given CPM.

CV_I = σ_e,I / N_p,true

where
CV_I = coefficient of variation for the increased root mean square error
σ_e,I = increased root mean square error
N_p,true = predicted true crash frequency, crashes/year

A CV value of zero indicates that there is no bias or additional uncertainty in the predicted crash frequency obtained from the CPM. As the CV value increases, the prediction becomes less reliable because the bias has increased, the uncertainty has increased, or both. Values over 0.20 are considered unreliable for most applications.

Percent Bias

The percent bias measure indicates the relative error in the prediction. This measure is expressed as a percentage because the magnitude of the error is often correlated with the true (i.e., unbiased) value of the estimate. This characteristic facilitates the relative comparison of alternative CPMs and alternative applications of a given CPM.
The percent bias measure is computed using the following equation:

Bias = 100 × (N_p − N_p,true) / N_p,true

where
Bias = percent bias in the reported value
N_p = predicted crash frequency from the CPM, crashes/year
N_p,true = predicted true crash frequency, crashes/year

The percent bias measure can be used to describe the predicted crash frequency obtained from the CPM for a given application case. A percent bias value of zero indicates there is no bias in the reported overdispersion parameter or the predicted crash frequency. As the absolute value of the bias increases, the overdispersion parameter or CPM prediction becomes less reliable. An absolute value of bias in excess of 10% is considered unreliable for most applications.
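The measures above are straightforward to script. The following Python sketch computes the overdispersion parameter, the increased root mean square error, its coefficient of variation, and percent bias for a single site; the numeric values in the example are hypothetical, not taken from the Guide.

```python
import math

def overdispersion(var_m, e_m):
    """f(k) implied by the variance equation Var{m} = E{m} + f(k) * E{m}^2."""
    return (var_m - e_m) / e_m ** 2

def increased_rmse(n_p, n_p_true, k_reported, k_p_true):
    """sigma_e,I = [e^2 + sigma_abs^2]^0.5 for one site (crashes/year)."""
    e = n_p - n_p_true                                  # error in predicted frequency
    var_abs = abs(k_reported * n_p ** 2 - k_p_true * n_p_true ** 2)
    return math.sqrt(e ** 2 + var_abs)

def cv_increased_rmse(sigma_e_i, n_p_true):
    """CV_I; values over 0.20 are considered unreliable for most applications."""
    return sigma_e_i / n_p_true

def percent_bias(n_p, n_p_true):
    """Percent bias; absolute values over 10% are considered unreliable."""
    return 100.0 * (n_p - n_p_true) / n_p_true

# Hypothetical example: the CPM predicts 4.4 crashes/year with a reported
# overdispersion of 0.35, while the true values are 4.0 and 0.30.
sigma = increased_rmse(4.4, 4.0, 0.35, 0.30)
print(cv_increased_rmse(sigma, 4.0))   # approx. 0.37, above the 0.20 threshold
print(percent_bias(4.4, 4.0))          # approx. 10%, at the 10% threshold
```

In practice these would be evaluated over all sites of interest; the single-site form is shown only to make the equations concrete.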
Root Mean Square Difference

The root mean square difference (RMSD) is a measure of the variability of the difference between the CPM prediction with error in input values and the CPM prediction with the estimated values.

RMSD = [Σ (PRED_error − PRED_est)² / n]^0.5

where
PRED_error = predicted value from the CPM with measurement error
PRED_est = predicted value from the CPM with estimated values
n = data sample size

If expressed on a per-year basis, RMSD is divided by the number of years of data.

Mean Absolute Difference

The mean absolute difference is a measure of the average absolute difference between the CPM prediction with error in input values and the CPM prediction with the estimated values.

Mean Absolute Difference = Σ |PRED_error − PRED_est| / n

where
PRED_error = predicted value from the CPM with measurement error
PRED_est = predicted value from the CPM with estimated values
n = data sample size

If expressed on a per-year basis, the mean absolute difference is divided by the number of years of data.

Extreme Value

The extreme value is a measure of the magnitude of a high value of the mean absolute deviation (MAD). It is recommended to use the 85th percentile value, although any percentile value desired by the practitioner may be selected. The calculation is based on an assumed gamma distribution of the values of the absolute difference and uses the method of moments to determine the alpha and theta parameters of the gamma distribution using the following equations:

alpha = (Mean Absolute Difference / RMSD)²

theta = RMSD² / Mean Absolute Difference

The value of the absolute difference at the desired percentile level can be determined using online calculators, such as https://homepage.divms.uiowa.edu/~mbognar/applets/gamma.html, or using statistical textbooks. For example, using the 85th percentile and estimated gamma
distribution parameters, the practitioner estimates the value of the absolute deviation that 85% of sites would be expected to be less than or equal to or, conversely, the value that 15% of sites may exceed.

Spearman's Correlation Coefficient

The Spearman's correlation coefficient (Rho) is used to compare the network screening rankings (Step 1 of the Road Safety Management Process, Chapter 4 of the HSM) obtained using the CPM with measurement error against the rankings obtained using the CPM with the original estimated values. Note that the same sites must be represented on both ranked lists.

Rho = 1 − [6 × Σ (Rank_error − Rank_est)²] / [n × (n² − 1)]

where
Rank_error = rank number using the CPM with measurement error
Rank_est = rank number using the CPM with estimated value(s)
n = number of sites in the ranked list

Percentage of False Positives

For network screening (Step 1 of the Road Safety Management Process, Chapter 4 of the HSM), the top 30, 50, and 100 sites are ranked using the CPM with estimated values, and the percentage of those sites not included in the corresponding ranked lists produced using the CPM with measurement error is tabulated.

Modified R²

Even with a perfect SPF, some variation in observed crash counts would be observed due to the random nature of crashes. The modified R² value is a GOF measure that "differs from the ordinary R2 statistic only in that the amount of normal random variation has been subtracted from the total sample variation appearing in the denominator of the equation" (Fridstrom et al. 1995). As a result, this GOF measures the amount of systematic variation explained by the SPF. Larger values indicate a better fit to the data when comparing two or more competing SPFs. Values greater than 1.0 indicate that the SPF is overfit, and some of the expected random variation is incorrectly explained as systematic variation.
R² = [Σ (y_i − ȳ)² − Σ (y_i − ŷ_i)²] / [Σ (y_i − ȳ)² − Σ ŷ_i]

where
y_i = observed crash count at site i
ŷ_i = predicted crash value from the SPF at site i
ȳ = sample average of the observed crash counts

Mean Absolute Deviation

MAD is a measure of the average value of the absolute difference between observed and predicted crashes.
MAD = Σ |y_i − ŷ_i| / n

where
ŷ_i = predicted crash value from the SPF at site i
y_i = observed crash count at site i
n = validation data sample size

Coefficient of Variation of the Calibration Factor

The coefficient of variation of the calibration factor [CV(C)] is the standard deviation of the calibration factor divided by the estimate of the calibration factor, as shown in the following equation:

CV(C) = √V(C) / C

where
CV(C) = coefficient of variation of the calibration factor
C = estimate of the calibration factor
V(C) = variance of the calibration factor, which can be calculated as follows:

V(C) = Σ (y_i + k × ŷ_i²) / (Σ ŷ_i)²

where
y_i = observed crash count at site i
ŷ_i = uncalibrated predicted value from the SPF at site i
k = overdispersion parameter (recalibrated)

A value above 0.15 for the CV(C) is considered a measure of an unsuccessful calibration.

CURE Plots and Related Measures

A CURE plot is a graph of the cumulative residuals (observed minus predicted crashes) against a variable of interest sorted in ascending order [e.g., major road annual average daily traffic (AADT)]. CURE plots provide a visual representation of GOF over the range of a given variable and help to identify potential concerns such as the following:

• Long trends: Long trends in the CURE plot (increasing or decreasing) indicate regions of bias that practitioners should rectify through improvements to the SPF.
• Percent exceeding the confidence limits [outside 95% confidence interval (CI) (%)]: Cumulative residuals outside the 95% confidence limits indicate a poor fit over that range of the variable of interest. Cumulative residuals that are frequently outside the confidence limits indicate possible bias in the SPF.
• Vertical changes (Max_Cure): Large vertical changes in the CURE plot are potential indicators of outliers, which require further examination.
• Maximum value exceeding the 95% confidence limits (Max_DCure): This measures the distance between the cumulative residuals and the 95% confidence limits where the cumulative residuals are outside the confidence limits. The bigger the value, the poorer the fit.
• Average value exceeding the 95% confidence limits (Avg_DCure): While Max_DCure measures the maximum difference between the cumulative residuals and the 95% confidence limits, Avg_DCure measures the overall distance between the cumulative residuals and the 95% confidence limits for those points outside the confidence limits. As with Max_DCure, a smaller value indicates less bias in the SPF.

Further information about CURE plots can be found in Chapter 7 of The Art of Regression Modeling in Road Safety (Hauer 2015) and in The Calibrator: An SPF Calibration and Assessment Tool: User Guide (Lyon et al. 2016).
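The CURE-related measures listed above can be computed without plotting. The NumPy sketch below is one minimal way to do so; it assumes Hauer's formulation of the 95% limits (±1.96 × σ*(i), where σ*(i)² = σ²(i) × [1 − σ²(i)/σ²(N)] and σ²(i) is the running sum of squared residuals), and the function and variable names are illustrative, not from the references cited.

```python
import numpy as np

def cure_measures(aadt, observed, predicted):
    """Cumulative residuals sorted by AADT, 95% limits (Hauer's formulation),
    and the summary measures described in the text."""
    order = np.argsort(aadt)
    resid = np.asarray(observed, float)[order] - np.asarray(predicted, float)[order]
    cum_resid = np.cumsum(resid)
    s2 = np.cumsum(resid ** 2)                 # running variance of cumulative residuals
    limit = 1.96 * np.sqrt(s2 * (1.0 - s2 / s2[-1]))   # shrinks to zero at the last site
    outside = np.abs(cum_resid) > limit
    pct_outside = 100.0 * outside.mean()       # "outside 95% CI (%)"
    max_cure = np.max(np.abs(np.diff(cum_resid, prepend=0.0)))  # largest vertical change
    exceed = np.abs(cum_resid) - limit
    max_dcure = float(np.max(exceed[outside])) if outside.any() else 0.0   # Max_DCure
    avg_dcure = float(np.mean(exceed[outside])) if outside.any() else 0.0  # Avg_DCure
    return cum_resid, limit, pct_outside, max_cure, max_dcure, avg_dcure
```

Plotting cum_resid and ±limit against the sorted AADT values reproduces the CURE plot itself; the returned scalars summarize it for tabular comparison of candidate SPFs.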