National Academies Press: OpenBook

Development of a Comprehensive Approach for Serious Traffic Crash Injury Measurement and Reporting Systems (2021)

Chapter: 6 Roadmap to Comprehensive Measurement of Serious Injuries Through Linkage

Suggested Citation:"6 Roadmap to Comprehensive Measurement of Serious Injuries Through Linkage." National Academies of Sciences, Engineering, and Medicine. 2021. Development of a Comprehensive Approach for Serious Traffic Crash Injury Measurement and Reporting Systems. Washington, DC: The National Academies Press. doi: 10.17226/26305.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

6 Roadmap to Comprehensive Measurement of Serious Injuries Through Linkage

Although we present two near-term options for measurement of serious injury using a medical-diagnosis-based metric, the most comprehensive, long-term solution is to link crash data to medical outcome data at the state level. In setting up data linkage systems for this purpose, it is also useful to consider linkage of state data systems broadly, because additional linkages will help answer additional questions. Thus, this section describes a roadmap to comprehensive data linkage at the state level. We define “comprehensive” as both adding datasets and including larger swaths of the crash-involved population.

At many points in the process, decisions might be made that will facilitate future linkages and increase comprehensiveness. However, some of these paths may result in additional expense, and states will need to decide whether the goal of comprehensiveness is cost-effective at that point. These decision points will be discussed where they appear.

It is worth noting that the center of this work is the crash. Thus, we will not consider datasets or linkages that do not involve a crash. That said, the principles of linkage and computing infrastructure discussed here would apply to any set of linkages among state datasets (e.g., roadway inventory linked to road repair datasets).

This report is divided into three major sections. The first presents a set of benefits of linkage beyond just meeting the requirements of MAP-21. The second identifies the basic requirements for a linked dataset at the state level. The third describes a series of steps for states to develop such a system. The set of steps, or roadmap, includes alternatives where possible, but in many cases, basic systems (such as complete statewide databases) have to be put in place before linkage can be successful.

6.1 Why Link?

This report began by recommending a medical-outcome-based metric for serious injury.
From that, data linkage is logically required. However, it is worth considering the broader set of benefits of data linkage beyond simply being able to count serious injuries for MAP-21 reporting purposes. The following is a sample list of activities that are enabled by data linkage:

• Calculation of comprehensive costs of crashes
• Cost-benefit analyses around interventions (e.g., infrastructure improvements) that take injury costs into account
• Cost-benefit analyses that take actual state costs into account (fatalities often cost states less than serious injuries in real dollars, even though cost-benefit analyses generally use a large number for each life lost)
• Optimized resource management for EMS and trauma around crashes
• Development of better triage models
• Prediction of high-cost crash locations for better interventions
• Improved data sharing between agencies that need the same datasets
• Identification of crash under-reporting through comparison between reported crashes and trauma/EMS
• Identification of injury hotspots and mechanisms
• Availability of linked data resources that might aid other entities (e.g., car companies) in understanding their fleet performance; this may become more important as automated vehicles enter the fleet

• Development of hidden-injury prediction models that could aid EMS and hospital treatment of crash-involved patients
• Better validation of EMS practices through the ability to link to diagnosis

In general, linked data systems enable richer analyses, and they can make data collection and access more efficient. By collecting each piece of information only once and sharing it across data systems, information needed in several datasets can be obtained more efficiently. The greatest barrier to a fully linked data system is the up-front cost of setting up the data collection, linked storage, and access infrastructure. States need a rationale for linkage that goes beyond “MAP-21 says so” and demonstrates enough real benefit to the state to justify the investment.

6.2 Requirements for Linkage

A good performance management framework that incorporates serious injury metrics requires data that have a set of key qualities (Cambridge Systematics, 2013). These include consistency, comparability, and comprehensiveness. Consistency is internal agreement among data elements and databases with overlapping data. Comparability means that performance metrics provided by different states mean the same thing and can be compared directly. Finally, comprehensiveness means that data systems can be used to answer the widest possible range of questions and include the widest possible range of cases. These overarching principles will be considered in the details discussed in Section 7.

To achieve the ideal data qualities described above, a system of data linkage between crash, EMS, hospital, and other datasets at the state level requires certain basic elements. First, there must be statewide databases with good coverage in each area. These datasets must conform to a common schema whenever there are data elements in common. There must be one or more identifiers that specify rows in each dataset that refer to the same individual, and there must be rules for access and a mechanism for secure access.
These requirements are discussed in more detail below.

6.3 Dataset Quality

To measure serious injuries in crashes in a state using data linkage, the datasets themselves must be in good enough condition to support the linkage process. Problems in datasets tend to compound: missing data in any dataset will break a link, so even if each dataset is of reasonable quality, the linked dataset may still be marginal. Thus, attention to the quality of each of the key state datasets is important. Several aspects of dataset quality must be considered.

6.3.1 Coverage

Coverage indicates the percentage of services that are reliably sending data to the state. Within a state, data come from a set of smaller organizations that provide the on-the-ground service being measured. For example, EMS data are typically collected from a large number of small, independent ambulance and rescue services spread around a state. In many cases, these services are staffed by volunteers in rural areas, and data collection may not be the highest priority for limited time and money.

An ideal database will collect from all of the relevant services in the state. That said, coverage need not be 100% to support good estimates of serious injury incidence. What is more problematic is biased coverage, where certain kinds of services or areas (e.g., urban areas) are more likely to be included in the state database than others. Because geography affects crash type, it also affects injury risk, so geographical bias in coverage of one database (e.g., EMS) will introduce bias into the linked dataset as well.

6.3.2 Schema Consistency

A database schema is essentially a codebook of the variables collected and their possible values. It is important to have consistent use of a common schema across all agencies or services providing data. Inconsistencies can occur in different ways. For example, certain data elements may be included in data from some units and not others. Alternatively, different units may use code values differently. The first issue might arise in hospital data if the hospitals themselves have individual data collection practices and then pull data to satisfy state reporting requirements.

One common example of this revolves around the use of the AIS injury coding system. In the U.S., hospitals were required to use the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) for coding medical records (WHO, 1992); since October 2015, they have been required to use the 10th Revision (ICD-10-CM). The ICD-CM is a general-purpose classification system for diagnoses of all health conditions and includes codes for both the nature of the injury and the causes of injury. Coding is done by trained medical coders who work from hospital records. Unlike AIS, ICD-CM does not include an explicit ranking of injury severity. To be used to identify seriously injured crash victims, ICD-CM must either be mapped to AIS or some other ranking system must be imposed on the coded injuries to assess severity. Only some hospitals (typically larger trauma centers) employ trained AIS coders to recode records into the AIS system. This means that within a state hospital database, there may be some hospitals that consistently provide AIS codes and others that provide only ICD-CM codes.
ICD codes can be translated into AIS codes, but the computer-based translation will be different from a human-coder-based translation. Thus, it is important to maximize the consistency and completeness of the incoming data from the set of independent sources across the state and to document inconsistencies in the way data are handled by different sources.

6.3.3 Quality Control

QC is a process of checking consistency and completeness at the individual data-element level. One basic QC issue is missing data. When data are missing in small numbers (e.g., <5% for any one data element), inferences can still be made reliably based on the remaining data, especially if missingness is arguably random. However, as the missingness rate increases, the reliability of inferences based on the remainder decreases. Moreover, the likelihood that data are missing at random also decreases.

QC should also check for consistency among related data elements. For example, if a crash is labeled as single-vehicle, there should be only one vehicle description included on the crash report. These types of redundancies are often built into the crash report form and can be used to test key data elements for consistency.

Quality checking is built into many state data systems. These datasets have typically been developed individually, and in the case of EMS, there are established national schemas with built-in QC rules. While good quality in component datasets is necessary for a good-quality linked dataset, it is not sufficient. QC rules should be developed for the linked dataset as well.
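
The two QC checks described in Section 6.3.3 (a missingness rate per data element and a consistency test between related elements) can be illustrated with a minimal Python sketch. The record layout, field names, and sample values below are hypothetical and are not taken from any state crash schema; the 5% threshold simply echoes the rule of thumb in the text.

```python
# Hypothetical crash records; field names are illustrative only.
crashes = [
    {"crash_id": "C1", "crash_type": "single-vehicle", "vehicles": ["V1"], "driver_age": 34},
    {"crash_id": "C2", "crash_type": "single-vehicle", "vehicles": ["V1", "V2"], "driver_age": None},
    {"crash_id": "C3", "crash_type": "multi-vehicle", "vehicles": ["V1", "V2"], "driver_age": 51},
    {"crash_id": "C4", "crash_type": "multi-vehicle", "vehicles": ["V1", "V2", "V3"], "driver_age": 27},
]

MISSING_THRESHOLD = 0.05  # the "<5%" rule of thumb mentioned in the text


def missing_rate(records, field):
    """Fraction of records in which `field` is absent or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def inconsistent_single_vehicle(records):
    """IDs of crashes labeled single-vehicle that do not list exactly one vehicle."""
    return [
        r["crash_id"]
        for r in records
        if r["crash_type"] == "single-vehicle" and len(r["vehicles"]) != 1
    ]


rate = missing_rate(crashes, "driver_age")
print(f"driver_age missing rate: {rate:.2f}",
      "(above threshold)" if rate >= MISSING_THRESHOLD else "(acceptable)")
print("inconsistent single-vehicle crashes:", inconsistent_single_vehicle(crashes))
```

The same pattern could be applied to a linked dataset, for example by checking that the event date on a linked EMS record agrees with the crash date.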

6.3.4 Timeliness

State crash datasets are used for planning purposes, and highway safety plans must be completed each year in July. Since crash-related fatalities may occur anytime within 30 days of a crash, the annual state crash dataset cannot be finalized until the end of January the following year. However, faster data entry and QC allow a finalized planning dataset to be released as early as possible, giving planners more time to use the data for their process.

Depending on the approach used, the linkage process itself may occur as data are collected or after datasets are finalized. In either case, timeliness in the linked data depends on timeliness in the original datasets. Moreover, QC processes using the linked dataset add to the time required to finalize datasets for use in highway safety planning. The linkage approach itself must produce a linked dataset in close to the same timeframe required for the original crash dataset.

6.4 State Datasets

This section provides a brief description of state datasets that might be linked within a comprehensive system. Where possible, each description indicates the available national schemas and the purpose of the linkage. By definition, this can be an ever-expanding list, so the databases discussed below should not be thought of as limiting scope.

6.4.1 Crash

The Federal Highway Administration’s (FHWA) Crash Data Improvement Program (CDIP) provides a roadmap for assessing and improving state crash data quality. In addition, the MMUCC represents a minimum data standard for state crash data. The MMUCC 3rd edition was published in 2008 (DOT, 2008) and is used in most states that use MMUCC. States are encouraged to update to the 4th edition, which notably has changed the definitions of the KABCO data elements in an attempt to better standardize injury reporting. Since MMUCC is a minimum standard, states will generally have a larger number of variables.
However, all variables should be collected and reported in the same way from all police units in the state.

6.4.2 Emergency Medical Services (EMS)

The National EMS Information System (NEMSIS), funded through the Office of EMS within the National Highway Traffic Safety Administration (NHTSA), has provided for a common national dataset, a database schema, and a national EMS registry. Although EMS data do not contain the diagnostic specificity necessary to provide a MAIS-type measure of serious injuries occurring in crashes, the well-developed national schema, close ties between EMS and hospital personnel and records, and the presence of crash location and other key data elements make NEMSIS an ideal intermediate dataset for linking crash to hospital outcome.

Other valuable measures of injury severity are present in EMS data, including the Centers for Disease Control and Prevention (CDC) Trauma Triage Criteria and the RTS. Moreover, the NEMSIS schema includes standard data elements from the National Trauma Data Standard (NTDS) to enable linkage between EMS and hospital trauma registry datasets (see Section 6.4.3). Information about the occupant’s condition immediately after the crash as well as the transit time are included in these datasets, providing additional information about patient condition and care.

One aspect of EMS datasets that affects linkage is the fact that each entry in the dataset is a patient run, or a particular point-to-point transport of a particular patient. In most EMS datasets, a patient who is transported a second time (e.g., transferring between hospitals after initial evaluation) will appear in a different record and there will not be a common identifier to link the two records. Thus, when linkage is made from crash to EMS and from EMS to medical outcome data, cases with transfers may be less likely to link to final discharge diagnosis (possibly made at the second hospital) and treatment. There are several proven approaches to handling this problem.

6.4.3 Trauma

State trauma databases are collected in a majority of states, though states vary in whether trauma data collection is mandatory or voluntary. They also vary in which hospitals or trauma centers are required to participate. In general, trauma databases include patients whose diagnosis falls into a group of categories that the American College of Surgeons (ACS) defines as “trauma”:

At least one ICD-9-CM injury diagnostic code in the range 800–959.9, excluding superficial injuries, and at least one of the following:

1. The patient is admitted to the hospital with at least a 24-hour stay;
2. Patient transfer via EMS transport (including air ambulance) from one hospital to another hospital;
3. Death resulting from the traumatic injury.

These diagnoses are, on average, more severe than those in a hospital discharge database (see Section 6.4.4) or an ED database (Section 6.4.5).

The ACS has developed the Trauma Quality Improvement Program (TQIP) to assist states and trauma centers in collecting high-quality trauma data. The ACS has also developed the NTDS, which should be used by state trauma databases, but it is similar to MMUCC in being a minimum dataset standard that is used with varying levels of compliance. TQIP, like CDIP, aims to improve compliance in all states. It should be noted that NTDS and NEMSIS are mutually compliant and integrated, facilitating data linkage between EMS and trauma databases in states.

One advantage of statewide trauma registries is that they typically include AIS codes in addition to ICD codes. Trauma registrars are commonly trained AIS coders and will incorporate coding into their data entry activities.
Thus, these datasets are easier to use with an MAIS 3+ definition of serious injury.

6.4.4 Hospital Discharge

Statewide hospital discharge datasets include all patient records at discharge for anyone admitted to a hospital within a state. Compared to state trauma registry systems, state hospital discharge data systems are more inclusive (they contain all injured patients treated in all acute care facilities), but provide less detail regarding the patients’ injuries (e.g., severity), mechanism of injury, and treatment details. Discharge datasets are based upon a universal billing standard (UB-04), while registry systems employ trained abstractors to review medical charts and record specific injury-related information. However, these datasets may be better for injury surveillance in general because they include all injuries that require hospitalization, rather than only the more serious injuries included in the trauma definition.

In 2003, the State and Territorial Injury Prevention Directors Association published a report of the Injury Surveillance Workgroup containing recommendations for using hospital discharge data for injury surveillance (Injury Surveillance Workgroup, 2003). The report not only covered the use of such data for injury surveillance, but also made recommendations for standardized reporting and analysis that would facilitate the use of such data. At the time, the group reported that over 40 states collected some data on hospital discharge, though it is unknown how complete the coverage was for these states.
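
To make the contrast between the discharge and trauma populations concrete, here is a minimal Python sketch that applies the ACS trauma definition quoted in Section 6.4.3 to a few hypothetical discharge records. The record layout, the field names, and the decimal string format for codes are assumptions for illustration. The code range treated as “superficial” (910–924.9) is also an assumption, since the definition as quoted does not list the excluded codes.

```python
# Hypothetical discharge records; field names and values are illustrative only.
records = [
    {"id": "D1", "icd9_codes": ["805.2", "E812.0"], "stay_hours": 72,
     "ems_transfer": False, "died_of_injury": False},
    {"id": "D2", "icd9_codes": ["920"], "stay_hours": 30,
     "ems_transfer": False, "died_of_injury": False},
    {"id": "D3", "icd9_codes": ["813.4"], "stay_hours": 6,
     "ems_transfer": False, "died_of_injury": False},
    {"id": "D4", "icd9_codes": ["486"], "stay_hours": 96,
     "ems_transfer": False, "died_of_injury": False},
    {"id": "D5", "icd9_codes": ["852.2"], "stay_hours": 3,
     "ems_transfer": False, "died_of_injury": True},
]

INJURY_RANGE = (800.0, 959.9)           # range quoted in the definition
SUPERFICIAL_RANGES = ((910.0, 924.9),)  # ASSUMPTION: codes treated as superficial


def numeric_code(code):
    """Return the code as a float, or None for non-numeric codes (E- and V-codes)."""
    try:
        return float(code)
    except ValueError:
        return None


def has_qualifying_injury(codes, excluded=SUPERFICIAL_RANGES):
    """True if any code is in the injury range and not in an excluded range."""
    for value in map(numeric_code, codes):
        if value is None:
            continue
        in_range = INJURY_RANGE[0] <= value <= INJURY_RANGE[1]
        is_excluded = any(lo <= value <= hi for lo, hi in excluded)
        if in_range and not is_excluded:
            return True
    return False


def meets_trauma_definition(rec):
    """Qualifying injury code plus a 24-hour stay, an EMS transfer, or death."""
    return has_qualifying_injury(rec["icd9_codes"]) and (
        rec["stay_hours"] >= 24 or rec["ems_transfer"] or rec["died_of_injury"]
    )


trauma_ids = [r["id"] for r in records if meets_trauma_definition(r)]
print(trauma_ids)  # ['D1', 'D5']
```

All five records would appear in a discharge dataset, but only two would also meet the trauma definition under these assumptions, which is the sense in which discharge data are more inclusive.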

A focus of the Injury Surveillance Workgroup’s (2003) report was the use of external-cause codes in the patient discharge report. External-cause codes are part of the ICD coding system and are used to code external causes of injury. The specific codes have been updated in moving from ICD-9-CM to ICD-10-CM. However, the principle of external-cause codes is that they identify causes, which allows data related to MVC to be identified and separated from the larger dataset. For data linkage, it is important to focus linkage on the cases related to MVC because these cases are a small subset of all hospital data.

Although over 40 states were collecting state hospital discharge data in 2003, many did not reliably include external-cause codes. The Healthy People 2020 program (see: http://www.healthypeople.gov/2020/topics-objectives/topic/injury-and-violence-prevention/objectives?topicId=24) includes objectives to increase the use of external-cause codes in both emergency and hospital discharge databases. An evaluation of the use of external-cause codes for injury surveillance in 2007 (Lawrence et al., 2007) indicated that over half of states either mandated external-cause codes or obtained over 85% compliance voluntarily. A more recent evaluation (Barrett & Steiner, 2014) indicated 92% compliance for inpatient discharge and 94% for ED data. Thus, a prerequisite to linkage between crash and hospital discharge is wide coverage of hospitals within a state and reliable use of external-cause codes for all participating hospitals.

The Healthcare Cost and Utilization Project (HCUP), sponsored by the Agency for Healthcare Research and Quality, works with states to standardize and make available state inpatient discharge data (SID) (Barrett & Steiner, 2014).
They report that 47 states participate in the HCUP SID, though the specific coverage of hospitals within a state is determined by the state and not clearly reported on the HCUP website. The SID provides a common data standard to facilitate comparison across states.

6.4.5 Emergency Department

ED data represent a further expansion of the sample available for linkage between crash and medical outcome. ED data include everyone seen in the Emergency Department, most of whom are never admitted to the hospital. Relative to trauma and hospital discharge datasets, ED datasets include patients who are much less severely injured, but who are much more numerous.

The HCUP program also works with states to standardize and make available state ED data through the State Emergency Department Databases (SEDD). The SEDD includes only those who were not admitted to the hospital, and as with the SID, hospital inclusion is determined by the state agency submitting data. Thirty-one states participate in the SEDD.

The data and the need for E-coding are similar for ED and hospital discharge datasets. These datasets include ICD external-cause codes, but generally not AIS codes. Consistent use of external-cause codes is necessary for linkage to the ED dataset as well.

6.4.6 Roadway Databases

Roadway data include a variety of characteristics of the road and the traffic on it, which are geographically referenced. These datasets may include physical characteristics (e.g., number of lanes, shoulder width), access control (e.g., public/private, toll), intersections (e.g., traffic control, intersection type), inventory (e.g., signs, pavement within a road segment), traffic characteristics (e.g., volume), structures (e.g., bridges), railroad crossings (e.g., signal type), pavement management (e.g., condition, repair history), and assets (e.g., guardrails, signs) (DeLucia et al., 2012).

The Roadway Data Improvement Program (RDIP) assists states in improving the quality of their roadway data and its management. This includes data elements and linear referencing systems. Safety-related data elements for roadway databases are specified in the Model Inventory of Roadway Elements (MIRE; Council et al., 2007), and RDIP encourages states to comply with the MIRE standard.

Linkage to crash records is a key part of roadway safety analysis. Linkage between crash and roadway is done on the basis of crash location (tied to the roadway referencing system) and is relatively straightforward compared to linkage between crash and medical outcome. This linkage, and achieving a certain level of location precision, is essential to planning roadway safety improvements. Moreover, linkage from crash to medical outcome can be carried into the roadway dataset to enhance safety analyses.

6.4.7 Driver Licensing

Driver license databases contain an inventory of all driver’s license numbers, along with the demographic and other information about the driver that is contained on the license. This information includes name, address, birthdate, sex, and self-reported height and weight, among other elements. These can be useful in both linkage (via name, address, and birthdate) and safety analysis (using age, sex, height, and weight).

State driver license files are generally complete, so coverage is not an issue. In addition, police reports routinely include the driver’s license number, so linkage is generally very simple. Privacy is usually the only hurdle to overcome in linking crash and license data, though this issue should be considered carefully.

6.4.8 Driver History

State driver history datasets are also indexed by license number but contain citations, arrests, and adjudications. These are critical datasets for understanding safety-related issues such as recidivism among drunk drivers.
Like driver license files, the driver history database should be readily linkable to crash via license number. Similarly, privacy is an issue to be considered.

The Governors Highway Safety Association (GHSA) recommends that states develop a single information system for driver licensing and history data (GHSA, 2014). They also support exchange of such information between states and development of a national driver database. At this time, no such database exists, and driver history is generally not shared between states.

6.5 Common Identifiers

To link people in any pair of datasets, one or more common identifiers must be present in both datasets. The gold standard of identifiers is a single, unique, permanent, person-specific identification code (ID) used in all datasets and assigned to all people in all datasets. However, several less ambitious forms of linking variables can also be used effectively.

The permanent person-specific ID code allows for analysis of events and treatments that occur outside of the time frame of a single crash event. This is ideal because analysis of follow-up treatment and long-term outcomes requires it. However, achieving a statewide person-specific ID is logistically very challenging. A person-specific ID code that is only used for a given event, but is available in all datasets, is much easier to implement and will allow effective analysis of serious injuries in crashes.

Using a single ID code across all datasets is also not necessary if each pair of datasets has a common person-specific identifier. In addition, not all people in any one database need the common ID code. For example, only a small number of crash-involved occupants are transported by EMS, so only these people need an ID code for linkage to the EMS dataset. Other occupants may be assigned a code, but they will not appear in the EMS dataset and their code will not be used. Linkage from crash to hospital would then pass through two stages (crash linked to EMS, and EMS linked to hospital) to get to the subset of people who were transported by EMS and admitted to the hospital.

Finally, identifiers do not have to be single or coded. The method of probabilistic linkage will be discussed in detail later in this report, but the key idea is that when a unique common identifier is not available in a pair of datasets, it may be possible to use a set of non-unique common variables to estimate the probability that a person in one dataset is the same as a person in the other dataset. Variables used for this process often include name, birthdate, age, sex, date/time of event, and location of event. Although these identifiers are not unique or single, they must still be common among the datasets to be linked.

6.6 Access Rules & Permissions

Privacy and data protection requirements for any data with personally identifying information (PII) are set by a combination of state-specific laws and the Health Insurance Portability and Accountability Act (HIPAA). HIPAA applies to health data, so once crash data are linked to health data, the HIPAA rules will apply. It is important to note that HIPAA does not prevent health data from being used for research or public health purposes or from being linked to other datasets. Instead, it sets out the conditions under which such uses may be made. These conditions include definitions of de-identification, requirements for security, and rules for access.

State-specific laws, however, may prevent linkage or use of a linked dataset, or may set additional conditions for permission and access. In some cases, these laws have had to be changed to facilitate data linkage. In any case, they must be known and their requirements addressed.

A good statewide data linkage system requires rules for access and software to allow appropriate access and prevent inappropriate access.
Rules for access must comply with HIPAA and state law, and the level of access may be different for different individuals. It is even possible that state laws that impede linkage may need to be changed. A de-identification method that complies with HIPAA allows for a much wider range of individuals to have access (to the de-identified data).
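
As a rough illustration of the probabilistic linkage idea introduced in Section 6.5, the following Python sketch scores candidate pairs by summing log-likelihood-ratio agreement weights over a set of non-unique common variables, in the style of Fellegi-Sunter record linkage. It is not the method this report goes on to describe. The field names, the sample records, the m- and u-probabilities (the chance that a field agrees for a true match and for a non-match, respectively), and the acceptance threshold are all hypothetical values chosen for illustration.

```python
import math

# ASSUMPTION: illustrative (m, u) probabilities per field, not estimated from real data.
FIELDS = {
    "last_name":  (0.95, 0.01),
    "birthdate":  (0.97, 0.003),
    "sex":        (0.99, 0.50),
    "event_date": (0.98, 0.02),
    "county":     (0.95, 0.10),
}


def match_weight(a, b):
    """Sum of log2 agreement/disagreement weights over the common fields."""
    total = 0.0
    for field, (m, u) in FIELDS.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def link(crash_people, ems_patients, threshold=10.0):
    """Map each crash person to the best-scoring EMS patient at or above threshold."""
    links = {}
    for person in crash_people:
        best = max(ems_patients, key=lambda e: match_weight(person, e), default=None)
        if best is not None and match_weight(person, best) >= threshold:
            links[person["id"]] = best["id"]
    return links


# Hypothetical records; E7 has a misspelled last name but otherwise agrees with P1.
crash_people = [
    {"id": "P1", "last_name": "GARCIA", "birthdate": "1980-02-14", "sex": "F",
     "event_date": "2019-06-01", "county": "A"},
    {"id": "P2", "last_name": "NGUYEN", "birthdate": "1995-11-30", "sex": "M",
     "event_date": "2019-06-01", "county": "A"},
]
ems_patients = [
    {"id": "E7", "last_name": "GARCA", "birthdate": "1980-02-14", "sex": "F",
     "event_date": "2019-06-01", "county": "A"},
    {"id": "E9", "last_name": "SMITH", "birthdate": "1970-01-01", "sex": "M",
     "event_date": "2019-06-01", "county": "A"},
]

print(link(crash_people, ems_patients))  # {'P1': 'E7'}
```

In this sketch, P1 links to E7 despite the name misspelling because agreement on birthdate, sex, event date, and county outweighs one disagreement, while P2 stays unlinked because agreeing only on sex, date, and county is weak evidence. A production system would typically also compare names approximately, restrict comparisons to plausible candidate pairs, and estimate the probabilities from data.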


The Moving Ahead for Progress in the 21st Century Act (MAP-21) requires a set of performance metrics to include assessment of serious injuries in crashes.

The TRB National Cooperative Highway Research Program's NCHRP Web-Only Document 302: Development of a Comprehensive Approach for Serious Traffic Crash Injury Measurement and Reporting Systems presents a roadmap for states to develop comprehensive crash-related data linkage systems, with special attention to measuring serious injuries in crashes.
