Suggested Citation: "2. Analysis Approaches." National Academies of Sciences, Engineering, and Medicine. 2021. Improved Prediction Models for Crash Types and Crash Severities. Washington, DC: The National Academies Press. doi: 10.17226/26164.

2 ANALYSIS APPROACHES

2.1 SCOPE OF REPORT

We report here two types of crash frequency models by crash type and crash severity. Base condition models are estimated using only sites that meet the "base condition" and include only traffic volume as an explanatory variable; these models support the HSM Part C predictive methodology. Average condition models are estimated using all sites and contain exposure-related variables, such as average annual daily traffic (AADT) and driveways; they apply for average conditions of non-exposure variables.

For most facility types, we report base condition models to keep these models compatible with the methodology of the current HSM. For a few facility types, we needed to relax some of the base condition definitions to achieve a large enough sample size to estimate significant models. For a few others, the total sample size was much smaller, so we had to use all cases to estimate significant models; we report average condition models for these facility types, as well as for the rest of the facility types in Appendix A.

This report does not contain probabilistic crash severity models or models that include both exposure and non-exposure covariates. As will be discussed later, our efforts to estimate these types of models were unsuccessful. This section of the report documents our crash type definitions, our estimation approach for crash count models, our exploration of probabilistic crash severity models, and our exploration of improvements for the model calibration procedure.

2.2 CRASH TYPE DEFINITIONS

Crash Types

The selection of crash types for which models would be developed was based on several criteria:

1. The crash types included in the current HSM chapters for which proportions of total crashes are provided
2. The crash types identifiable from electronic crash records in the datasets used for the project
3. The crash types represented in the estimation and validation datasets
4. The crash types to which available CMFs in the HSM apply for each site type

While we tried to maintain consistency of the crash types estimated among all facility types, consideration of these criteria did result in some differences in the final array of crash type models from one facility type to another.

Note that models for pedestrian and bicycle crashes have not been estimated due to very small sample sizes in the available data. These crash types may still be analyzed using the existing HSM approach.

Note also that animal collisions are not included in any of the crash types (they are most likely to be identified as single-vehicle crashes). Our rationale for the omission is that animal crashes have more to do with environmental factors than road characteristics. Since the HSM predictive methods are focused more on providing guidance for selecting safety treatments or predicting expected crash counts related to road characteristics, it is not clear how models predicting animal collisions would fit into the model framework. We note the existence of a large body of research into animal-vehicle collisions and suggest that body of work be consulted for consideration of this collision type in safety management procedures.

We have defined the crash types shown in Figure 2-1 to estimate models:

Same Direction (SD)
• Rear End (RE)
• Sideswipe Same Direction (SSD)
• Turning Same Direction (TSD)

Intersecting Direction (ID)
• Angle (ANG)
• Turning Intersecting Direction (TID)

Opposite Direction (OD)
• Head On (HO)
• Sideswipe Opposite Direction (SOD)
• Turning Opposite Direction (TOD)

Single Vehicle (SV)
• Overturn or Roll Over (RO)
• Fixed Object (FO)
• Moving Object (MO)

Figure 2-1: General Taxonomy of Crash Types

The taxonomy shown in Figure 2-1 provides for several levels of disaggregation of the crash types according to the number of vehicles involved, their direction of travel, and the manner of the collision. The justification for creating these categories is as follows:

• Each crash type within each category involves vehicles colliding in the same way, that is, front to front, front to rear, front to side, and so on. This results in similar crash severity profiles, as confirmed by Zhang et al. (2007).
• Each crash type within each category is associated with a similar distribution of contributing factors, as assigned by investigating officers (Zhang et al. 2007). This suggests common covariates and exposure functions for these associated collision types.
• Single-vehicle and opposite-direction crashes have very different relationships with exposure (Ivan 2004), so while their collision patterns and contributing factors are similar, they could have very different model forms.
• Experience with crash type prediction suggests that splitting the crash count into too many categories cripples the estimation process, as the crash count for each type gets smaller and smaller. The aggregation categories defined here permit finding a balance that maximizes differences in crash severity and likely causal factors between groups and minimizes them within groups.

The data did not support successful estimation of models for all of these crash types for each facility type: in some cases the coefficients on the AADT variables were not significant or were negative, there were insufficient numbers of observed crashes, or the models did not converge. Also, for the urban/suburban segment models, multiple-vehicle crashes were classified as "driveway related" (MVD) and "multiple-vehicle non-driveway other" (MVN). In these cases, MVD included the following subtypes:
turning same direction (TSD), all intersecting direction (ID) types, and turning opposite direction (TOD). MVN included rear end (RE), head-on (HO), sideswipe same direction (SSD), sideswipe opposite direction (SOD), and MVN other (that is, crashes coded as parked vehicle or angle, though not at driveways or intersections). In addition to the above taxonomy, we estimated nighttime crashes (Night) for some facility types (urban/suburban segments). Table 2-1 lists the base condition crash type models that were estimated for each facility type.

Table 2-1: Base Condition Crash Type Models Estimated for Each Facility Type

Facility Type | MVD | MVN | MVN OTHER | SD | RE | SSD | ID | OD | HO | HO+SOD | SV | NIGHT
Two-lane rural
2U  | - | - | - | X | - | - | - | X | - | - | X | -
3ST | - | - | - | X | - | - | X | X | - | - | X | -
4ST | - | - | - | X | - | - | X | X | - | - | X | -
4SG | - | - | - | X | - | - | X | X | - | - | X | -
Multilane rural
4U  | - | - | - | X | - | - | X | X | - | - | X | -
4D  | - | - | - | X | - | - | X | X | - | - | X | -
3ST | - | - | - | X | - | - | X | X | - | - | X | -
4ST | - | - | - | X | - | - | X | X | - | - | X | -
4SG | - | - | - | X | - | - | X | X | - | - | X | -
Urban/suburban arterials
2U  | X | X | X | - | X | X | - | - | - | X | - | X
3T  | X | X | X | - | X | X | - | - | - | X | X | X
4U  | X | X | X | - | X | X | - | - | - | X | X | X
4D  | X | X | X | - | X | X | - | - | - | X | X | X
5T  | X | X | X | - | X | X | - | - | - | X | X | X
3ST | - | - | - | X | - | - | X | X | - | - | X | -
4ST | - | - | - | X | - | - | X | X | - | - | X | -
3SG | - | - | - | X | - | - | X | X | - | - | X | -
4SG | - | - | - | X | - | - | X | X | - | - | X | -

Notes: X = model estimated; - = no model estimated. Facility type codes: 2U = two-lane undivided segments; 3T = two-lane segments with two-way left-turn lane; 4U = four-lane undivided segments; 4D = four-lane divided segments; 5T = four-lane segments with two-way left-turn lane; 3ST = three-leg stop-controlled intersections; 4ST = four-leg stop-controlled intersections; 3SG = three-leg signal-controlled intersections; 4SG = four-leg signal-controlled intersections. Crash type codes: MVD = multiple-vehicle driveway related; MVN = multiple-vehicle non-driveway related; MVN OTHER = multiple-vehicle other; SD = same direction (all severity levels); RE = rear end; SSD = sideswipe same direction; ID = intersecting direction; OD = opposite direction (all severity levels); HO = head-on; HO+SOD = head-on plus sideswipe opposite direction; SV = single vehicle (all severity levels); NIGHT = nighttime.
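As a reading aid, the taxonomy of Figure 2-1 and the MVD/MVN regrouping used for urban/suburban segments can be written as lookup tables. The sketch below is illustrative only: the names `TAXONOMY`, `MVD_SUBTYPES`, `MVN_SUBTYPES`, `category_of`, and `urban_segment_group` are hypothetical and do not come from the report or the HSM. It classifies by subtype code alone, so it cannot represent the "MVN other" group, which depends on where the crash occurred.

```python
# Hypothetical lookup tables mirroring Figure 2-1; all names are illustrative.
TAXONOMY = {
    "SD": ["RE", "SSD", "TSD"],  # same direction
    "ID": ["ANG", "TID"],        # intersecting direction
    "OD": ["HO", "SOD", "TOD"],  # opposite direction
    "SV": ["RO", "FO", "MO"],    # single vehicle
}

# Urban/suburban segment regrouping as described in the text.
MVD_SUBTYPES = {"TSD", "TOD"} | set(TAXONOMY["ID"])  # driveway related
MVN_SUBTYPES = {"RE", "HO", "SSD", "SOD"}            # non-driveway


def category_of(subtype):
    """Return the Figure 2-1 category code (SD, ID, OD, SV) for a subtype."""
    for category, subtypes in TAXONOMY.items():
        if subtype in subtypes:
            return category
    raise KeyError(f"unknown crash subtype: {subtype}")


def urban_segment_group(subtype):
    """Assumed simplification: map a subtype code to MVD, MVN, or SV."""
    if subtype in MVD_SUBTYPES:
        return "MVD"
    if subtype in MVN_SUBTYPES:
        return "MVN"
    if subtype in TAXONOMY["SV"]:
        return "SV"
    raise KeyError(f"unknown crash subtype: {subtype}")
```

For example, `urban_segment_group("TID")` falls in the driveway-related group, while `urban_segment_group("RE")` falls in the non-driveway group.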

Delineation of Intersection Versus Segment Crashes

In the HSM methodology, roadway segment models are used to predict all crashes that occur on portions of roadway segments that are more than 250 feet from an intersection and non-intersection-related crashes that occur on portions of roadway segments that are within 250 feet of an intersection. Intersection models are used to predict all intersection and intersection-related crashes that occur within 250 feet of the intersection. The models for two-lane rural roads and for urban and suburban arterials apparently were developed to facilitate this application directly.

For multilane rural roads in states where the crash records do not indicate "intersection" or "intersection-related," all crashes occurring within 250 feet of the middle of an intersection are assigned to that intersection. The calibration procedure is expected to allow models developed for such cases to be applied to cases specified in the HSM methodology, and vice versa.

These models were developed to be as consistent with the HSM methodology as possible. In the Ohio database used for urban and suburban arterials and the California database used for multilane rural roads, however, crashes cannot reliably be identified as intersection or intersection-related. Thus, the intersection models being developed for those two databases and facility types will pertain to all crashes occurring within 250 feet of the center of an intersection, and the segment models will apply to crashes occurring outside this boundary. As noted previously, the calibration procedure will allow these models to apply to cases where intersection and intersection-related crashes can be identified in accordance with the HSM methodology.

2.3 MODEL ESTIMATION APPROACH

Crash Count Models

Because crash frequency is a count phenomenon, negative binomial (NB) regression models, or other count distribution estimation methods, are commonly used to build crash prediction models. Even though the NB model has some limitations (for example, it cannot overcome potential underdispersion problems, and the dispersion parameter may be biased for small sample sizes), this model is still the one most commonly used in univariate crash frequency data analysis. The NB model also provides the dispersion parameter that is required for the empirical Bayes weighting of model predictions and observed crashes in the HSM. In this research, the NB model has been applied for all count models developed.

The NB model, also called the Poisson-Gamma model, is well known to be able to handle the issue of overdispersion in count data, where the variance exceeds the mean in violation of the definition of the Poisson distribution. In the NB model, the mean parameter for each site, i, is

$$\lambda_i = f(\beta X_i)\,\exp(\varepsilon_i) \tag{2-1}$$

where $\varepsilon_i$ is a gamma-distributed disturbance term, $X_i$ is a vector of explanatory variables, and $\beta$ is a vector of estimable parameters (coefficients on $X_i$). The most common relationship between the explanatory variables and $\lambda_i$ is

$$f(\beta X_i) = \exp(\beta X_i) \quad \text{or} \quad \ln f(\beta X_i) = \beta X_i. \tag{2-2}$$

With this form, the relationship is also called a log-linear model. One reason the log-linear model is popular for counts is that it ensures the dependent variable (that is, the expected number of crashes
during a certain time period) is always positive or zero. Another reason is that taking the log of both sides of the equation results in a linear combination of the predictor variables (that is, the X's) on the right-hand side. This model form belongs to a category called generalized linear models (GLMs). In a GLM, the regression coefficients and their standard errors are typically estimated by maximizing the likelihood or log likelihood of the parameters for the data observed.

The variance of the NB model can be estimated as

$$\mathrm{VAR}(y) = E(y) + \alpha\,[E(y)]^2, \tag{2-3}$$

where y is the crash frequency data and $\alpha$ is the dispersion parameter.

Alternatives for Model Form

SPFs for roadway segments are formulated as

$$N = \exp\left(b_0 + b_1 \ln(AADT) + \ln(L)\right) \tag{2-4}$$

where N = expected average crash frequency per year for a roadway segment; AADT = annual average daily traffic (vehicles per day) on a roadway segment; L = length of roadway segment (miles); and $b_0$, $b_1$ = regression coefficients.

The value of the overdispersion parameter associated with N is determined as a function of segment length for two-lane and multilane rural facility segments as follows:

$$k = \frac{1}{\exp\left(c + \ln(L)\right)} \tag{2-5}$$

The following function was used for the overdispersion parameter for urban/suburban facility segments (except as noted for individual models):

[Equation 2-6 is not recoverable from the source text; it expresses k in terms of a fitted coefficient and the segment length L.]

For intersections, two alternative functional forms were considered:

$$N = \exp\left(b_0 + b_1 \ln(AADT_{maj}) + b_2 \ln(AADT_{min})\right) \tag{2-7}$$

and

$$N = \exp\left(b_0 + b_3 \ln(AADT_{total})\right), \tag{2-8}$$

where N = base total expected average crash frequency per year for an intersection; $AADT_{maj}$ = AADT (vehicles per day) for major-road approaches; $AADT_{min}$ = AADT (vehicles per day) for minor-road approaches; $AADT_{total}$ = AADT (vehicles per day) for minor- and major-road approaches combined; and $b_0$, $b_1$, $b_2$, $b_3$ = regression coefficients.

In this research, only $AADT_{maj}$, $AADT_{min}$ or $AADT_{total}$ were used for exposure for the SPFs, to be consistent with the HSM. Nevertheless, it is possible that different combinations of exposure variables can better
explain the number of crashes (Wang et al. 2017). For some facility types, other model forms were used; this is explained in detail in the relevant sections below.

Model Estimation and Fit Statistics

SPFs for all facility types and crash categories were estimated using standard statistical packages, such as SAS®. As indicated above, the negative binomial distribution was used to start. When the negative binomial overdispersion parameter estimated by maximum likelihood (k) is found to be 0, which happened for several intersection models, this indicates a Poisson distribution is more appropriate (IDRE-UCLA, SAS User Guide). We re-estimated the models with a Poisson distribution in those cases and report both models.

In addition to the parameter estimates and standard errors and the overdispersion parameter, the tables also provide Akaike's Information Criterion (AIC) and the Bayesian Information Criterion (BIC). Both consist of a goodness-of-fit term (log likelihood), along with a penalty to control for overfitting, and this penalty is a function of the number of parameters estimated. With both the AIC and BIC, lower is better. For a discussion of AIC and BIC, readers are referred to Dziak et al. (2012); suffice to say here that BIC provides a larger penalty for the number of parameters. Dziak et al. (2012) indicate that, while the BIC is more likely to lead to a more parsimonious model with some risk of underfitting, the AIC could lead to a model with good future prediction with some risk of overfitting, and the use of AIC versus BIC may depend on the application.

The mean absolute deviation (MAD) gives a measure of the average magnitude of variability of prediction. Smaller values are preferred to larger in comparing two or more competing SPFs. The MAD is the sum of the absolute value of predicted crashes minus observed crashes, divided by the number of sites. The values of predicted and observed crashes are from the calibration data:

$$MAD = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{y}_i - y_i\right|, \tag{2-9}$$

where $y_i$ = observed counts; $\hat{y}_i$ = predicted values from the SPF; and n = validation data sample size.

The mean squared prediction error (MSPE) is the sum of squared differences between observed and predicted crash frequencies, divided by sample size. MSPE is typically used to assess error associated with a validation or external data set:

$$MSPE = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)^2, \tag{2-10}$$

where $y_i$ = observed counts; $\hat{y}_i$ = predicted values from the SPF; and n = validation data sample size.
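The fit statistics described above are simple to compute once a model's log likelihood and predictions are available. The following is a minimal sketch, not the report's SAS procedure: the function names are hypothetical, and the AIC and BIC formulas are the standard textbook definitions (the report does not spell them out, and statistical packages may report variants).

```python
import math


def aic(log_likelihood, n_params):
    """Standard AIC: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Standard BIC: k ln(n) - 2 ln(L); lower is better."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


def _check(observed, predicted):
    # Guard against empty or mismatched inputs before averaging.
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")


def mad(observed, predicted):
    """Mean absolute deviation: average of |predicted - observed| over sites."""
    _check(observed, predicted)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def mspe(observed, predicted):
    """Mean squared prediction error: average of (predicted - observed)^2."""
    _check(observed, predicted)
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Made-up example data for four sites (not from the report).
observed = [0, 2, 1, 4]
predicted = [0.5, 1.5, 1.0, 3.0]
print(mad(observed, predicted), mspe(observed, predicted))
```

Because the BIC penalty per parameter is ln(n) rather than 2, BIC penalizes additional parameters more heavily than AIC whenever the sample has more than about seven observations, which is consistent with the statement that BIC favors more parsimonious models.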

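To make the model forms in this section concrete, the sketch below evaluates the segment and intersection SPFs, the length-dependent overdispersion parameter, and the NB variance. It assumes the log-linear forms N = exp(b0 + b1 ln(AADT) + ln(L)) for segments, N = exp(b0 + b1 ln(AADTmaj) + b2 ln(AADTmin)) and N = exp(b0 + b3 ln(AADTtotal)) for intersections, k = 1/exp(c + ln(L)) for rural segments, and VAR(y) = E(y) + α E(y)². The function names are hypothetical and the coefficient values are made-up placeholders, not estimates from this report.

```python
import math


def segment_spf(aadt, length_mi, b0, b1):
    """Segment SPF: expected crashes per year, proportional to segment length."""
    return math.exp(b0 + b1 * math.log(aadt) + math.log(length_mi))


def rural_segment_overdispersion(length_mi, c):
    """Overdispersion k = 1 / exp(c + ln(L)); shrinks as the segment gets longer."""
    return 1.0 / math.exp(c + math.log(length_mi))


def intersection_spf_maj_min(aadt_maj, aadt_min, b0, b1, b2):
    """Intersection SPF using major- and minor-road AADT separately."""
    return math.exp(b0 + b1 * math.log(aadt_maj) + b2 * math.log(aadt_min))


def intersection_spf_total(aadt_total, b0, b3):
    """Intersection SPF using the combined entering AADT."""
    return math.exp(b0 + b3 * math.log(aadt_total))


def nb_variance(mean, alpha):
    """NB variance: mean + alpha * mean^2 (reduces to Poisson when alpha is 0)."""
    return mean + alpha * mean ** 2


# Placeholder coefficients for illustration only (NOT from the report).
B0, B1 = -8.0, 0.9
n_one_mile = segment_spf(aadt=5000, length_mi=1.0, b0=B0, b1=B1)
n_two_mile = segment_spf(aadt=5000, length_mi=2.0, b0=B0, b1=B1)
print(n_one_mile, n_two_mile)  # the two-mile prediction is twice the one-mile one
```

Including ln(L) with a coefficient fixed at 1 makes the predicted crash frequency scale linearly with segment length, and setting the dispersion parameter to zero in `nb_variance` recovers the Poisson case in which the variance equals the mean.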
Washington et al. (2005) give guidelines for interpreting fit statistics and evaluating the suitability of crash prediction models.

Crash Severity Modeling

In general, crashes are classified into five severity levels: fatal injury (K); incapacitating injury (A); non-incapacitating injury (B); possible injury (C); and no injury or property damage only (O). Cumulative values of these levels are commonly defined, building from the highest level, e.g., KA indicates K and A level crashes, KAB indicates K, A and B crashes, etc.

For analyzing crash severities, the research team considered several methodologies. First, we considered ordered logit and probit models, using each crash as an observation. These models would have been used to split crash counts into categories of severity. In the preliminary results, some roadway geometric characteristics were found to be statistically significant. They showed that higher maximum speed limits and paved shoulders decrease the severity of a crash, whereas wider lanes increase it, which is clearly counterintuitive. Consequently, we suspected that omitted variable bias in the models caused these erroneous results, as the models did not include individual or crash characteristics (such as driver, passenger, vehicle, and so on), which are usually found most valuable for predicting the severity of individual crashes.

We therefore considered an alternative approach to investigating crashes by severity on an aggregate basis. This better suited the available data as well as the implementation context for the HSM, in which prediction by road segment or intersection is required, and demographic information about travelers is not available. Specifically, we considered a fractional split modeling approach, in which the proportion of crashes by severity level is predicted for each segment or intersection. The methodology and modeling results are excerpted from Yasmin et al. (2016) and summarized in Appendix B. The rest of this section summarizes the fractional split approach and our findings and recommendations regarding crash severity prediction.

Traditionally, the transportation safety literature has evolved along two major streams: crash frequency analysis and crash severity analysis. In crash frequency analysis, the focus is on identifying attributes that result in traffic crashes, and effective countermeasures to improve the roadway design and operational attributes are proposed. Crash severity analysis, on the other hand, is focused on examining crash events, identifying factors that affect the outcome, and providing solutions to reduce the consequences (injuries and fatalities) in the unfortunate event of traffic crashes. Recently, research in transportation safety has begun to bridge the gap between crash frequency and crash severity models. Specifically, researchers are examining crash frequency levels by severity while recognizing that, for the same observation record, crash frequencies by different levels of severity are likely to be dependent. Hence, as opposed to adopting the earlier univariate crash frequency models, researchers have developed multivariate models.

In multivariate approaches that are aimed at studying frequency and severity, the impact of exogenous variables is quantified through the propensity component of count models. The main interaction across different severity-level variables is sought through unobserved effects; that is, no interaction of observed effects occurs across the multiple count models. While this might not be a limitation per se, it might be beneficial to evaluate the impact of exogenous variables in a framework that directly relates a single exogenous variable to all severity count variables simultaneously. Such a framework models the observed propensities of crashes by severity level directly, while also recognizing the inherent ordering of crash severity outcomes.
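The quantities discussed in this section, cumulative severity counts such as KA and KAB and the per-site proportion of crashes by severity that a fractional split model takes as its dependent variable, can be illustrated as follows. This is a sketch with hypothetical names and made-up counts; it computes only the inputs, not the fractional split model itself.

```python
# Severity levels ordered from most to least severe (KABCO scale).
SEVERITY_ORDER = ["K", "A", "B", "C", "O"]


def cumulative_counts(counts):
    """Return counts for K, KA, KAB, KABC and KABCO from per-level counts."""
    totals = {}
    running = 0
    label = ""
    for level in SEVERITY_ORDER:
        running += counts.get(level, 0)
        label += level
        totals[label] = running
    return totals


def severity_proportions(counts):
    """Return the share of a site's crashes falling in each severity level."""
    total = sum(counts.get(level, 0) for level in SEVERITY_ORDER)
    if total == 0:
        raise ValueError("site has no crashes; proportions are undefined")
    return {level: counts.get(level, 0) / total for level in SEVERITY_ORDER}


# Made-up crash counts for one site (not from the report).
site = {"K": 1, "A": 3, "B": 6, "C": 10, "O": 30}
print(cumulative_counts(site))
print(severity_proportions(site))
```

A site with no fatal crashes would simply get a proportion of zero for K from `severity_proportions`, which is the situation that complicates fitting proportions for sparse severity levels.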

The fractional split approach is not without limitations. In field data, there are often no crashes for some specific crash severities in a given case, for example, fatal injury crashes. When this happens, such a segment cannot be used for modeling. To avoid cases with zero crashes for any of the severity levels, the research team aggregated roadway segments into extended super-segments (or arterials). To do this, the severity proportions had to be assumed to be consistent over all segments and intersections included in each super-segment, which was not very practical. In addition, once we aggregated the segments, information specific to them was lost. For these reasons, the research team decided not to adopt the fractional split model for predicting crash severity. Instead, we recommend predicting crash severity using count models, as we do for crash type.

2.4 ESTIMATION AND VALIDATION DATA

Estimating crash prediction models for the HSM requires datasets with adequate size, quality and scope of variables. Very few highway agencies have such data readily available. In order to limit the extent of the project budget expended on data collection, existing data sources were acquired to the extent possible for each facility type. It was also considered desirable, for consistency, to use data from the same states that were used to estimate models for the First Edition of the HSM. Two sources of readily available data were considered:

• The Highway Safety Information System (HSIS). HSIS is a multistate database that contains crash, roadway inventory, and traffic volume data for a select group of states. When HSIS was initially established, participating states were selected based on the quality and quantity of data available, and their ability to merge data from various files. For estimating the prediction models, HSIS data from Washington (two-lane rural segments), Minnesota (two-lane and multilane rural intersections, and urban and suburban segments) and California (multilane four-lane divided segments) were used.
• Ohio Department of Transportation (ODOT). Ohio is part of HSIS. However, in addition to the Ohio data that is part of HSIS, Ohio embarked upon a comprehensive project to collect data for implementation of the HSM and graciously provided the data they have assembled.

In order to validate the estimated models, it was necessary to have data from at least one more jurisdiction. The above datasets were sufficient for two-lane rural highways, but additional data sources had to be identified and in most cases data elements collected in order to form validation datasets. Table 2-2 lists the source of the data for estimation and validation for segments and intersections for each facility type. The subsequent chapters discuss the datasets in more detail, but a few overall notes about the selection of data are in order at this stage:

• For 4-leg signalized (4SG) intersections on two-lane and multilane rural highways, the Ohio dataset is used for model estimation because it has more cases than the Minnesota dataset. In the First Edition of the HSM, Minnesota data were used to estimate those models. Consequently, the base predictions for these models will be quite different from those made by the First Edition models.
• For four-lane undivided segments on multilane rural highways, only one state (Texas) could provide a useful dataset. Consequently, three years of the data were used for estimation and the fourth year was used for validation.
• For four-lane divided segments on multilane rural highways, data from two states are used for validation, as none of the three state databases was as large as would have been preferred, and having two states to validate against helped to better test the resulting models.

Table 2-2: Data Used for Estimation and Validation

Facility Type | Segments: Estimation | Segments: Validation | Intersections: Estimation | Intersections: Validation
Two-lane rural highways | Washington | Ohio | 3ST: Minnesota; 4ST: Minnesota; 4SG: Ohio | 3ST: Ohio; 4ST: Ohio; 4SG: Minnesota
Multilane rural highways | 4U: Texas (2009-11); 4D: California | 4U: Texas (2012); 4D: Illinois & Washington | 3ST: Minnesota; 4ST: Minnesota; 4SG: Ohio | 3ST: Ohio; 4ST: Ohio; 4SG: Minnesota
Urban/suburban arterials | Ohio | Minnesota | Ohio | North Carolina

Notes: Facility type codes: 2U = two-lane undivided segments; 3T = two-lane segments with two-way left-turn lane; 4U = four-lane undivided segments; 4D = four-lane divided segments; 5T = four-lane segments with two-way left-turn lane; 3ST = three-leg stop-controlled intersections; 4ST = four-leg stop-controlled intersections; 3SG = three-leg signal-controlled intersections; 4SG = four-leg signal-controlled intersections.
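Where only one state's data were available (the Texas four-lane undivided segments), the report holds out one year for validation: 2009-11 for estimation and 2012 for validation. A minimal sketch of that kind of split is shown below; the record layout (`site_id`, `year`, `crashes`), the function name, and the sample values are hypothetical and not taken from the project data.

```python
def split_by_year(records, estimation_years, validation_years):
    """Partition site-year records into estimation and validation sets by year."""
    estimation = [r for r in records if r["year"] in estimation_years]
    validation = [r for r in records if r["year"] in validation_years]
    return estimation, validation


# Made-up site-year records for illustration only.
records = [
    {"site_id": "S1", "year": 2009, "crashes": 2},
    {"site_id": "S1", "year": 2010, "crashes": 0},
    {"site_id": "S1", "year": 2011, "crashes": 1},
    {"site_id": "S1", "year": 2012, "crashes": 3},
    {"site_id": "S2", "year": 2011, "crashes": 4},
    {"site_id": "S2", "year": 2012, "crashes": 1},
]

estimation, validation = split_by_year(records, {2009, 2010, 2011}, {2012})
print(len(estimation), len(validation))
```

Note that a split by year validates against the same sites in a later period, which is a weaker test than validating against a different jurisdiction, as was done for the other facility types.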
