
Traffic Forecasting Accuracy Assessment Research (2020)

Chapter 1 - Introduction



1.1 Purpose of the Guidance Document

Accurate traffic forecasts for highway planning and design help ensure that public dollars are spent wisely. Forecasts inform discussions about whether, when, how, and where to invest public resources to manage traffic flow and to widen and remodel existing facilities, as well as where to locate, how to align, and how to size new ones. It is in the interest of transportation planners and policy makers to base such decisions on the most accurate possible traffic forecasts; however, it is also important to recognize that no forecast will be perfectly accurate. It is prudent to quantify the expected inaccuracy around traffic forecasts and to consider that uncertainty in making and communicating decisions. Together, more accurate traffic forecasts and a better understanding of the uncertainty around them can lead to a more efficient allocation of resources and build public confidence in the agencies that produce those forecasts.

NCHRP Project 08-110, "Traffic Forecasting Accuracy Assessment Research," sought to develop a process and methods by which to analyze and improve the accuracy, reliability, and utility of project-level traffic forecasts. For purposes of this study, the terms accuracy and reliability addressed how well the forecasting procedures estimated what actually occurred; the term utility encompassed how well a particular projected outcome informed a decision; and project was used in reference to either a single project or a bundle of closely related projects.

An important aspect of this research was that it aimed not only to analyze the accuracy of existing traffic forecasts, but also to develop a process for continuing to do so. In light of this dual objective, this report is organized in three parts:

• Part I: Guidance Document. Part I of this report provides the guidance and instructions agencies need to improve the accuracy, reliability, and utility of traffic forecasts as applied to transportation planning, design, and operation. Agencies that will benefit from this guidance include metropolitan planning organizations (MPOs), state departments of transportation (DOTs), and other agencies with responsibilities for traffic forecasting. Specifically, the guidance describes how to use the measured accuracy of past traffic forecasts to estimate the uncertainty around new traffic forecasts. It goes on to describe tools that engineers and planners can use to archive forecasts and to track their accuracy in the future. Finally, it describes how to apply the findings of forecast accuracy evaluations to improve the traffic models used to generate those forecasts. The chapters in Part I break down the guidance as follows:
  – Chapter 1 describes the research conducted for this project and provides an overview of the four key recommendations that resulted from the research. After reading Chapter 1, users should understand the basis for the recommendations, what the key recommendations are, and why they might wish to implement them. Ensuing chapters focus in more detail on the steps agencies can take to implement each of the four recommendations.

  – Chapter 2 describes how to use measured forecast accuracy to communicate uncertainty around new forecasts. The person preparing a traffic forecast would take these steps to identify and communicate the uncertainty, in terms of a range, at the time the forecast is made.
  – Chapter 3 describes ways to archive traffic forecasts. An important finding of this research was that it is easier to analyze forecast accuracy if the forecasts are archived at the time they are made, rather than dug up after the fact. The project team advises that the person at the organization who is responsible for the forecasts set up the archival structure.
  – Chapter 4 discusses methods used to report forecast accuracy. This activity would be conducted using the archived forecasts (discussed in Chapter 3) and accompanying observed data, and would be used to inform estimates of the uncertainty around new forecasts (as discussed in Chapter 2). It also would be used to improve traffic forecasting models (as discussed in Chapter 5).
  – Chapter 5 describes ways to use forecast accuracy data to improve or enhance new forecasting models. Agencies would engage in such activities based on the measured accuracy of the models.
  – Chapter 6 discusses the reasons to implement the recommendations and offers directions for future research.
• Part II: Technical Report. The technical report presents the detailed findings and results of the completed research and analysis on which the guidance has been based.
• Part III: Appendices. Eight appendices provide additional resources and information, including links to electronic and other downloadable resources; additional information about the development of the Forecast Card datasets; the project team's recommendations for implementing and extending the results of this research; a literature review that summarizes past efforts to document and analyze forecasting accuracy and examines the methods used in selected past studies and the issues cited as causes of inaccuracy; additional details about the project's data exploration as part of the Large-N analysis; and a more in-depth look at five of the six deep dives conducted in this project.

Most of the techniques and tools presented in this report are geared toward highway traffic forecasts but can be modified with modest effort for other modes. The expected audience is engineers and planners who are involved in generating traffic forecasts. Decision makers and planners who are consumers of traffic forecasts and wish to have a broad understanding of their typical accuracy will find the report summary and presentation file useful. This guidebook will help agencies "jump start" forecast accuracy archiving and analysis. It provides information to help forecasters communicate the importance of instituting internal archiving procedures and introduces ready-to-use tools found in the electronic appendix.

1.1.1 Notes on Forecast Accuracy, Reliability, and Utility

The objective of this study was to develop a process to analyze and improve the accuracy, reliability, and utility of project-level traffic forecasts. For the purpose of this project, these three terms can be defined as follows:
• Accuracy is the measure of how well the forecast estimates actual project outcomes. Accuracy includes, but is not necessarily limited to, forecast versus actual traffic volumes. Because accuracy depends on knowing the actual outcome, it can be measured only after a project opens.
• Reliability is the likelihood that a process applied to multiple, similar projects will generate similarly accurate forecasts for all such projects. In common language, reliability and accuracy sometimes are used interchangeably, but in scientific contexts, reliability has a distinct meaning: it is the likelihood that someone repeating an experiment will obtain the same result. A forecast might be considered reliable if a different analyst making a different set of reasonable assumptions (or using a slightly different version of the model inputs) would get nearly the same result. In this way, reliability is closely related to sensitivity analysis or risk and uncertainty analysis, in which the analyst tests a range of assumptions to see whether the projected outcome changes by a large or small amount. In this study, forecast reliability is based on a comparison of multiple forecasts to each other or of multiple observations to each other. In contrast to accuracy, forecast reliability can be evaluated at any point after the forecast is made and does not require waiting for the project to open. One important place reliability shows up is in the use of traffic counts. Traffic counts collected at the same location on different days generally show some variation. If counts on different days are close in value, they are reliable; if there is a large variation, they are unreliable. The best possible outcome is that the accuracy of the forecast exceeds the reliability of the traffic counts.
• Utility is how well the projected outcome informs a decision. Utility has to do with the value of the information provided. Consider, for example, developing a traffic forecast with a straight-line projection of traffic count growth versus with a four-step travel model. Either approach could be accurate and reliable, but if a key question faced by decision makers is how much diversion there is from parallel roads, then the straight-line forecast has no utility for answering the question. The same issue can arise for a model that has fixed time-of-day factors if the project requires the consideration of peak spreading. In this way, the utility of a forecast inherently depends on the questions being asked. One of the key justifications offered for the development of advanced models is that they are sensitive to a broader range of policy questions (Donnelly et al. 2010).

For this project, the team chose to focus its efforts largely on the question of accuracy as an important unanswered question that could be addressed effectively. That focus is reflected in the content of this guidance document and the technical report. Reliability and utility are also important in traffic forecasting, and several of the recommendations related to forecast accuracy also will promote greater reliability in forecasting. Specifically, the recommendations related to archiving forecasts and documenting important assumptions promote a level of transparency that will make it easier to replicate forecasts and test their reliability, even before the project opens.

Forecast utility is an important consideration that may be related to accuracy. For example, a forecast may be so inaccurate that it is not useful. However, the project team has elected to leave it largely to the judgment of the forecaster and to other researchers to consider how to enhance the utility of traffic forecasts as it relates to the questions they answer.

1.2 Research Summary

This section provides an overview of the study's research approach and findings. Further detail is provided in Part II.
1.2.1 Background

For several decades, scholars and critics of transportation planners and policy makers have focused international attention on forecast accuracy, documenting observed inaccuracy levels to decision makers and the public. Documented reasons for forecast inaccuracy have included poor data on which forecasts are based, inappropriate assumptions about future conditions, the use of overly simplistic forecasting procedures, and political motivations that sometimes lead to intentional distortion of forecasts. The project team's review of the literature revealed that the length of the forecast horizon, the nature of the facility for which forecasts are made, and stability or volatility in population growth and economic factors all influence forecast accuracy in predictable ways, and that forecasts will always be influenced by some factors that are unexpected, unpredictable, and therefore difficult to anticipate. Most past studies have been analytical rather than prescriptive. They report what was observed but offer little advice to planners, engineers, or policy makers as to how they may improve forecasting practice by transportation agencies and their consultants.

In very general terms, forecast accuracy is easy to understand as the extent to which a forecast of estimated traffic volumes at a single point in time (a date in the future) matches the traffic that is counted when that future date arrives. The issue of forecast accuracy becomes far more complicated when the topic is addressed in appropriate depth. To conduct meaningful analysis, it is necessary to make many assumptions and to adopt conventions requiring judgments for which no obvious standards of correctness or precision exist. Traffic volumes can be forecast for short or long periods of time, varying between peak 15-minute periods and entire days; they may or may not include weekends along with weekdays; and forecasts may be differentiated or aggregated across seasons of the year. Volumes can be forecast for each direction of traffic flow on a highway or for both directions combined. Flows can be forecast for short segments of roadway or averaged over long expanses.

Just as traffic volumes can be expressed in different ways, accuracy also can be expressed using various metrics. Absolute differences between forecast and measured traffic volumes can result in differing estimates of accuracy, depending on how they are aggregated across road segments and time periods, or on whether errors are expressed as percentages of forecast flows or of measured traffic volumes. Extreme values (outliers) affect assessments of accuracy. Outliers can be included in comparisons if it is believed they are valid measurements, or they can be excluded from forecast accuracy assessments if, in the judgment of the analysts, they result from measurement or recording errors.

This study fills gaps in the literature by proposing tools and techniques by which agencies, including state DOTs and MPOs, can improve the accuracy of their traffic forecasts by adopting new practices based on what others have done with some success. Other fields have demonstrated that reviews of past forecasts can lead to the adoption of improved forecasting practice; the National Oceanic and Atmospheric Administration (NOAA), for example, adopted a highly successful Hurricane Forecasting Improvement Program. Informed by a growing literature critiquing transportation forecasts and emulating the success of other fields, this study addresses ways transportation agencies can improve documentation and assessment of traffic forecasting experience to improve future applications. While being attentive to insights and explanations of forecast accuracy, the emphasis of this report is on practice. It addresses what agencies can do to improve the accuracy of their traffic forecasts.
Assumptions and compromises were necessarily made about the many sources of error, and differences of opinion about best practices for comparing accuracy and measuring traffic flows had to be weighed; the objective throughout was to improve the applicability of knowledge to agency practice.

1.2.2 Research Approach

To meet the study objective, the project team:

• Analyzed traffic forecasting accuracy and usefulness using information from various sources, including state DOTs, MPOs, counties, and other transportation agencies actively involved in forecasting travel demand in competitive modes;
• Assessed transportation agency experience with respect to the accuracy of various forecasting approaches;
• Identified methods for improving the flexibility and adaptability of available forecasting techniques to changing assumptions and input data;
• Considered alternative ways of incorporating risk and uncertainty into forecasts; and
• Identified potential methods that can help the traffic forecasting industry improve forecasting usefulness and accuracy while improving its ability to communicate and explain these forecasts to affected communities.

A review of the literature revealed that most prior studies of traffic forecasting accuracy had adopted either of two complementary approaches to assess the accuracy of forecasts. The project team decided to employ both approaches in this study.

The first approach relies on gathering a large sample of forecasts for which data were collected and the forecasts made sufficiently long ago that the horizon year of the forecasts has come, making it possible to compare the forecasts of traffic with measured traffic flows on the facilities for which the forecasts were made. With a large enough sample of such forecasts, statistical analysis can be used to examine correlations between forecast accuracy and data inputs, facility types, methods used to conduct the forecasts, and factors exogenous to the forecasts that influenced their accuracy. Because this analysis involved a large sample of cases, it is referred to throughout this report as the Large-N analysis.

The second approach identified in the literature review consisted of developing case studies of particular facilities for which forecasts were made at some date in the past, the projects were planned in detail and built, and resulting traffic flows were observed. Most often, such case studies have examined a single project or a very small number of projects, using customized data collection that included review of historical documents, before-and-after surveys of travelers, and interviews with decision makers who participated in the projects. The depth of the before-and-after analysis may lead researchers to identify with confidence specific sources of forecast errors (e.g., errors in inputs, incorrect assumptions, model specification, changes in the project definition, and others). The Federal Transit Administration (FTA) has conducted such before-and-after analyses of patronage and cost forecasts for major capital investments in public transit. Many more such investments occur nationally for highway construction projects, but a much lower proportion of forecasts for highway projects have been studied in this manner. The advantage of detailed case studies is that they allow a complex set of issues and interactions to be thoroughly investigated. This approach reveals the importance of assumptions made by modelers in relation to available data and the strengths or weaknesses of the particular models that were used. The disadvantage of case studies is that it is very difficult or impossible to generalize from particular cases. This research project included five complete case studies and one incomplete case study that was limited by the nature of the available data and documentation. These case studies are referred to throughout this report as deep dives.

Large-N studies and deep dives clearly complement one another by shining different lights on the same problem—the inaccuracy of traffic forecasts. Taken together, the two types of studies provide analysts with insights that add to forecasters' overall understanding.
Accordingly, the project team applied both approaches and compared findings from the Large-N analysis and the deep dives to reach the conclusions of this study.

Conducting the Large-N analysis required compiling a database of forecasts and counted traffic for about 1,300 projects from six states and four European countries. The resulting forecast accuracy database—the largest available database for assessing traffic forecast accuracy—allowed for the development of distributions of forecast errors and for analysis of the relationships between measured traffic volumes, forecast traffic volumes, and a variety of potentially descriptive variables. Where systematic errors were found in the forecasts, the project team could study whether the errors were functions of factors such as the type of project, the time between the forecast and the opening year, and the forecasting methods that had been used.

Combining the Large-N analysis with the deep dives was intended to reveal as many sources of error in traffic forecasts as possible; however, it was understood that some sources of error would remain unexplained simply because it is impossible to account for every deviation between forecasts and measured traffic. This constraint results in part because the historical data drawn from numerous, dissimilar sources were collected with varying levels of precision. It also reflects the uncertainty that is inherent in forecasting and, ultimately, the inability to anticipate and specify all of the numerous sources of potential disparities between assumptions and reality.

This research was intended to help agencies like state DOTs and MPOs improve their forecasts. Given this purpose, generating findings about forecast accuracy and model improvement alone was not sufficient. The findings also had to address the capacity of and challenges facing the organizations that collect the data, calibrate and operate the models, and report their findings in highly politicized environments. For these reasons, and others, the findings of this research address agencies and their processes as much as or more than they do the technical characteristics of forecasts. The project team asked transportation agencies about their standard practices and their greatest challenges. The most important findings included those about what organizations can do to strengthen their forecasts, and this guidebook is meant to be used by practitioners for this purpose.

To be sure that the recommendations would be useful in practice, the team made every effort to learn from the agencies that had produced the forecasts. For example, forecasters consistently identified inadequate data availability as their greatest obstacle to accurate forecasting. Whenever possible, the team tried to replicate what agency forecasters had done, including actually running the travel demand models when possible. Conclusions from this analysis led to judgments about what forecast and project outcome data should be collected and archived, including outcome data that may be measured and collected years or decades later. Agency culture, capacity, and practice also were considered as the project team developed guidelines for future deep dives or in-depth case studies of traffic forecasts.

In the following sections, the project team has summarized the methods, data, findings, and conclusions from the Large-N and deep dive analyses. Based on these findings, this guidebook was prepared to enable agencies to build data archives, to develop methods that will enable them to carry out their own assessments of forecast accuracy, and, hopefully, to systematically improve the accuracy and utility of their forecasts in the future.

1.2.3 Large-N Analysis: Data and Methodology

The Large-N analysis conducted for this study compared traffic forecasts made at the time projects were planned with flows that were measured after the projects were completed. To carry out this analysis, a database was assembled containing both traffic forecasts made when projects were planned and later traffic count data, as provided by agencies in six U.S. states and four European countries. The states that provided data were Florida, Massachusetts (one project), Michigan, Minnesota, Ohio, and Wisconsin; the four European countries were Denmark, Norway, Sweden, and the United Kingdom. Agencies in Virginia and Kentucky also contributed data that were not included in the database but could be added at a future date if additional resources are made available for that purpose.

The records were compiled from databases maintained by DOTs and complemented by project reports, traffic studies, and environmental impact statements, as well as other databases assembled by researchers for prior studies. Three types of data were incorporated (combined in the illustrative record sketched below):

• Information identifying and describing each project,
• Information about the traffic forecasts, and
• Information about traffic flows that were eventually counted.
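To make the three data types concrete, the following sketch shows one way a combined project record might be structured. It is only an illustration based on the fields described in this chapter; the class and field names are invented and do not represent the project's actual database schema or the Forecast Cards format.

```python
# Illustrative sketch only: field names follow the database description in the text,
# but this is not the project's actual schema or the Forecast Cards format.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class SegmentRecord:
    segment_id: str
    forecast_adt: float            # forecast average daily traffic for the target year
    counted_adt: Optional[float]   # post-opening count, if available
    count_year: Optional[int] = None
    count_method: Optional[str] = None

@dataclass
class ProjectForecastRecord:
    project_id: str
    improvement_type: str          # e.g., "resurfacing", "new bridge"
    facility_type: str
    location: str
    length_miles: float
    forecaster: str                # who made the forecast
    forecast_year: int             # year the forecast was made
    target_year: int               # opening, interim, or design year being forecast
    forecast_method: str           # e.g., "regional travel model", "traffic count trend"
    segments: List[SegmentRecord] = field(default_factory=list)
```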

The forecast accuracy database contains unique project identification numbers and each project's improvement type, facility type, location, and length. For traffic forecasts, the database also includes information about who made the forecast, the year each forecast was made, the forecast target year, and whether the forecast was for the opening year, a middle year in the life of the project, or a distant future year as far as 20 years after opening. Also recorded were the forecasting methods and traffic count information such as the dates of the counts, the method by which the counts were done, and the particular roadway segments where the counts were done. The forecast traffic and measured traffic in the target year were compared for each project, and several metrics were calculated to ascertain the level of inaccuracy in the traffic forecasts.

Forecasts were recorded for elements of projects that are referred to as "segments." For example, forecasts for an interchange improvement project could include segment-level estimates for each direction of travel on the freeway, for both directions of the crossing arterial, and for each of the ramps. The database includes at least some information about 16,360 segments of 2,348 projects, but at the time of this research, a substantial proportion of the projects in the forecast accuracy database had not yet opened to traffic, and some of the segments for which forecasts were made had no subsequent count data associated with them. Other projects did not pass quality control checks for inclusion in the statistical analysis. The statistical analysis of the Large-N data was done for a carefully filtered subset of 1,291 projects comprising 3,912 segments, and the records describing the remaining segments and projects are available for future analysis.

The database is not a random sample of all highway projects, which limited the ability of the project team to generalize from the analysis. The years in which projects in the database opened to traffic range from 1970 to 2017, with about 90% of the projects opening to traffic in 2003 or later. Details about the exact nature and scale of each project were not available in every case, but almost half of the entries in the database were design forecasts for repaving projects. Earlier projects were more likely to be major infrastructure capital investments, and more recent projects were more often routine resurfacing projects on existing roadways. This shift arose because some state agencies began routine tracking of all forecasts only within the past 10 to 15 years and, in earlier years, information was retained only for major investments. In addition to the mix of projects in the database, notable differences were found in the forecasting methods used across agencies. Because the traffic counts were of average daily traffic (ADT), comparisons could not be made of peak period traffic, traffic by day of the week, or traffic by season.

The project team evaluated the accuracy of opening-year forecasts for the practical reason that the interim and design years had not yet been reached for a large majority of projects included in the database. When opening-year traffic counts were unavailable for some projects, the project team used the earliest available traffic counts and adjusted the year-of-completion forecasts so that they could be compared with the counted volumes. To do this, the project team scaled the forecast to the year of the first post-opening count by linear interpolation so that both data points were for the same year. Data for the European projects were obtained from a doctoral dissertation in which forecasts already had been scaled to match the count year using a 1.5% annual traffic growth rate. The project team retained this approach for the European projects and interpolated between the opening-year and design-year forecasts for the U.S. projects.
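The following sketch illustrates the two adjustments described above. The function names and example values are invented; this is not the project team's code, only a minimal illustration of linear interpolation to the count year and of the fixed 1.5% growth-rate scaling used for the European data.

```python
# Minimal sketch of the forecast-to-count-year adjustment described above.
# Function and variable names are illustrative, not taken from the project's code.

def scale_us_forecast(opening_year, opening_forecast, design_year, design_forecast, count_year):
    """Linearly interpolate between opening-year and design-year forecasts
    to estimate the forecast ADT for the year of the first post-opening count."""
    if design_year == opening_year:
        return opening_forecast
    annual_change = (design_forecast - opening_forecast) / (design_year - opening_year)
    return opening_forecast + annual_change * (count_year - opening_year)

def scale_european_forecast(forecast_adt, forecast_year, count_year, growth_rate=0.015):
    """Scale a forecast to the count year using a fixed 1.5% annual traffic growth rate,
    as was done in the source dissertation for the European projects."""
    return forecast_adt * (1 + growth_rate) ** (count_year - forecast_year)

# Example: a 2008 opening-year forecast of 20,000 ADT and a 2028 design-year forecast
# of 26,000 ADT, with the first count taken in 2010.
print(scale_us_forecast(2008, 20000, 2028, 26000, 2010))  # 20600.0
```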

The obvious way to represent the accuracy of project-level traffic forecasts is to compare them with counts of actual traffic after the projects have been in service, but varying methods exist for doing so. The literature review revealed that previous studies of multiple projects defined errors or differences between forecast and measured traffic in different ways. Some studies measured error as the predicted volume minus the measured volume; using this metric, a positive result represents an overprediction. Other studies defined error as the measured volume minus the predicted volume; using this metric, a positive value represents an underprediction.

A popular metric used to determine the accuracy of traffic forecasts is the half-a-lane criterion. This criterion specifies that a forecast is accurate if the forecast volume differs from the measured volume by less than half a lane's worth of the constructed facility's capacity. If the forecast volume is more than half a lane below the facility's measured capacity, the facility could have been constructed with one fewer lane in each direction; if the forecast volume is more than half a lane above the facility's measured capacity, the facility likely needs one additional lane in each direction. Calculating whether a forecast traffic volume falls within half a lane of a facility's capacity requires making several assumptions, such as the share of the daily traffic that occurs during the peak hour.
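As an illustration of the assumptions such a calculation requires, the sketch below implements one plausible reading of the half-a-lane check. The peak-hour share (K-factor) and per-lane capacity are assumed values chosen for the example, not figures taken from this report.

```python
# Sketch of a half-a-lane accuracy check under stated assumptions; the K-factor and
# per-lane capacity below are illustrative choices, not values from this report.

def within_half_a_lane(forecast_adt, counted_adt,
                       k_factor=0.10,            # assumed share of daily traffic in the peak hour
                       lane_capacity_vph=1800):  # assumed per-lane capacity, vehicles per hour
    """Return True if the forecast differs from the count by less than the
    peak-hour equivalent of half a lane of capacity."""
    peak_hour_difference = abs(forecast_adt - counted_adt) * k_factor
    return peak_hour_difference < 0.5 * lane_capacity_vph

# Example: a 30,000 ADT forecast versus a 25,000 ADT count gives a 500 vph
# peak-hour difference, which is within half of an 1,800 vph lane.
print(within_half_a_lane(30000, 25000))  # True
```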
The forecast error—represented by the difference between the forecast traffic volume and the measured traffic volume—can be expressed as a percentage of the counted traffic or as a percentage of the forecast traffic. An advantage of the former method is that the percentage is expressed in terms of a measured or "real" quantity (observed traffic). An advantage of the latter method is that uncertainty relative to the forecast value can be expressed at the time the forecast is made (because it is accepted that the observed value will not be known until later). The project team also found a third approach used in some earlier studies, which evaluates forecast accuracy by comparing the ratio of measured (actual) traffic to forecast traffic.

For this research, the project team measured forecast accuracy as the percent difference from forecast (PDFF), calculated as follows:

    PDFF_i = [(Counted Volume_i − Forecast Volume_i) / Forecast Volume_i] × 100%        (I-1)

where PDFF_i is the percent difference from forecast for project i. With the PDFF, negative values indicate overpredictions (the traffic counted on the actual project is lower than the forecast), and positive values indicate underpredictions (the measured or actual outcome is higher than the forecast). This way of measuring forecast accuracy is appealing because it expresses the error as a function of the forecast, which always is known earlier than the traffic counts. The project team was able to use the distribution of the PDFF as measured across the projects in the dataset to portray the systematic performance of traffic forecasts.

When expressing the sizes of errors, some researchers have used the mean percentage error and others have preferred the mean absolute percentage error (MAPE), which disregards the positive or negative sign associated with each error. The MAPE allows a better understanding of the size of inaccuracies across projects because positive and negative errors tend to offset one another when calculating the mean percentage error. The project team continued in this tradition but used the PDFF, referring to the mean absolute percent difference from forecast (MAPDFF) instead of the MAPE. Thus, the project team measured the size of the errors as follows:

    MAPDFF = (1/n) × Σ_{i=1}^{n} |PDFF_i|        (I-2)

where n is the total number of projects.
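The short sketch below computes the PDFF for each project (Equation I-1) and the MAPDFF across projects (Equation I-2); the sample volumes are invented for illustration.

```python
# Sketch of the PDFF (Equation I-1) and MAPDFF (Equation I-2) calculations.
# The example volumes are invented for illustration only.

def pdff(counted_volume, forecast_volume):
    """Percent difference from forecast: negative = overprediction, positive = underprediction."""
    return (counted_volume - forecast_volume) / forecast_volume * 100.0

def mapdff(pdff_values):
    """Mean absolute percent difference from forecast across projects."""
    return sum(abs(p) for p in pdff_values) / len(pdff_values)

projects = [(18500, 21000), (9400, 8000), (30200, 33500)]  # (counted, forecast) pairs
errors = [pdff(c, f) for c, f in projects]
print([round(e, 1) for e in errors])   # [-11.9, 17.5, -9.9]
print(round(mapdff(errors), 1))        # 13.1
```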

When assessing project forecast accuracy, it is important to distinguish between comparisons by segment and comparisons by project. Road projects typically are divided into several links, or segments, within the project boundary. These links can be on different alignments or can carry traffic in differing directions. In the forecast accuracy database, each project was given a unique identification number, and the specific segments that made up the project also were identified. Comparisons of accuracy could thus be made using segments or by aggregating them into projects.

At a segment level, accuracy metrics are not independent. A project that contains multiple segments connected end-to-end likely has had traffic forecasts for all its segments made at the same time, using the same methods and employing the same external forecasts of population and economic growth. Across the various segments, the errors are likely to be highly correlated—and they are likely to be uniformly high or low. Whether the segments are treated as one combined observation or as several independent observations, the analyst would expect the average error to be similar. There would be a difference, however, in the measured t-statistics, because the larger sample of segments could suggest significance whereas the smaller sample of projects might not. Although segment-level analysis might seem less valid than project comparisons, it has other merits: a few measures of inaccuracy are better represented in an analysis of segments. When assessing forecast inaccuracy over roadways of different functional classes, for example, segment-level results provide better representation than aggregated results over the entire project.

In previous studies, some researchers weighted segments within a project by their volumes when arriving at accuracy figures for project forecasts. A limitation of weighting the segments is that segments have different lengths and thus should probably not be weighted by traffic volumes alone. For this study, the project team had little data on the lengths of links and chose not to weight project segments by their volumes.

To the extent possible, the project team reported the distribution of forecast errors at both the project level and the segment level. For project-level comparisons, the project team averaged traffic volumes across all segments and measured the error statistics by comparing the average forecast and average measured or actual traffic. The variables with which the volumes were compared in further analysis also were aggregated. Improvement type, area type, and functional class could differ by segment, but the most prevalent characteristic among the segments was used to classify the project. For example, if most of the segments in a project consisted of resurfacing and reconstruction with no major improvement, the project was considered to be of the resurfacing and reconstruction type even if one segment might include work otherwise classified as a major improvement. Forecast methods, unemployment rates, and years of forecast and observation were the same across all segments making up any given project and thus could be used directly for project-level analysis.
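As a concrete illustration of this aggregation, the sketch below rolls hypothetical segment records up to the project level by averaging volumes and taking the most prevalent categorical value; the column names and data are invented for the example.

```python
# Illustrative project-level aggregation of segment records, following the approach
# described above; the column names and data are invented for this example.
import pandas as pd

segments = pd.DataFrame({
    "project_id":   ["A", "A", "A", "B", "B"],
    "forecast_adt": [12000, 11500, 3000, 25000, 26000],
    "counted_adt":  [10500, 10000, 2800, 27000, 26500],
    "improvement_type": ["resurfacing", "resurfacing", "major improvement",
                         "added lanes", "added lanes"],
})

projects = segments.groupby("project_id").agg(
    forecast_adt=("forecast_adt", "mean"),          # average volumes across segments
    counted_adt=("counted_adt", "mean"),
    improvement_type=("improvement_type",           # most prevalent characteristic
                      lambda s: s.mode().iloc[0]),
)
projects["pdff"] = (projects["counted_adt"] - projects["forecast_adt"]) / projects["forecast_adt"] * 100
print(projects)
```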
Past research has used ordinary least squares (OLS) regression to compare the forecast traffic with the flows measured once the projects open to traffic. Such comparisons can be used to identify biases in estimates and usually are done by regressing the actual (counted) traffic volumes as a function of the forecast value. To consider multiplicative effects as opposed to additive effects, the project team scaled the regressors by the forecast value, expressed as follows:

    y_i = α + β·ŷ_i + γ·X_i·ŷ_i + ε_i        (I-3)

where y_i is the counted traffic volume for project i, ŷ_i is the forecast volume, and X_i is a descriptive variable. In such a formulation, γ = 0 indicates no effect of that particular term, whereas positive values would increase the forecast by that amount and negative values would reduce the forecast by that amount.
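A minimal sketch of how a regression of this form might be estimated is shown below, using statsmodels on invented data. It is not the project team's estimation code; the variable chosen here for X_i (forecast horizon) is only an example of a descriptive variable.

```python
# Sketch of the Equation I-3 formulation with forecast-scaled regressors, using invented data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
forecast = rng.uniform(2000, 40000, size=200)           # forecast ADT (y-hat)
horizon = rng.integers(1, 15, size=200)                  # example descriptive variable X
actual = 0.95 * forecast * (1 - 0.01 * horizon) + rng.normal(0, 1500, size=200)

X = np.column_stack([forecast, horizon * forecast])      # regressors scaled by the forecast
X = sm.add_constant(X)                                   # alpha term
model = sm.OLS(actual, X).fit()
print(model.params)   # [alpha, beta, gamma]
```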

In addition to the estimation of biases, this study examined how the distribution of errors relates to the descriptive variables. For example, it could be the case that forecasts having longer time horizons are unbiased with respect to those having shorter time horizons: such forecasts are not systematically higher or lower, but they could have larger errors as measured by the MAPE. To explore this issue, the project team extended the regression framework to use quantile regression instead of OLS regression. Whereas OLS regression predicts the mean value, quantile regression predicts the values for specific percentiles of the distribution. The project team estimated quantile regression models of the measured or counted traffic volumes as a function of the forecast volumes and other descriptive variables. This was done for the 5th percentile, 20th percentile, median, 80th percentile, and 95th percentile. The median value provides the expected value, whereas the 5th and 20th percentiles provide lower bounds on the expected value and the 80th and 95th percentiles provide upper bounds. This range of models allows for a comparison of the variability of the forecasts within ranges, as well as the more usual and simple estimates of means alone.

1.2.4 Large-N Analysis: Results

Part II of this report presents analysis of data from 3,912 segments making up 1,291 separate projects. The analysis is summarized graphically in Figures I-1 and I-2. Figure I-1 shows that traffic forecasts overpredicted future flows more often than they underpredicted them (i.e., the distribution of errors shown in Figure I-1 is heavier on the negative side). The MAPDFF is 17.31%, with a standard deviation of 24.93. Figure I-2 presents project forecast errors as a function of forecast volumes and shows that percentage errors decrease as traffic volumes increase. When expressed as a percentage, an error of a particular size is a smaller percentage of a larger forecast, so this trend was expected; the project team also found that the distribution of errors became less dispersed around the mean error as forecast volumes increased.

As discussed, quantile regression was used to explore the uncertainties inherent in forecasting traffic. It is useful to include that uncertainty as part of the information presented when developing and communicating forecasts. For example, in Figure I-3, bands can be seen within which an observer can be increasingly certain the actual traffic will lie. Using the data for the projects included in this study, the project team developed several quantile regression models to assess the biases in the forecasts based on the variables described in the previous section.

[Figure I-1. Distribution of PDFF (project level); horizontal axis: PDFF = (Actual − Forecast) / Forecast × 100.]

The models were developed for the 5th, 20th, 50th (median), 80th, and 95th percentile values. As expected, the models demonstrate rather wide ranges of potential actual traffic at a given level of certainty as the mean volume forecasts grow. The lines in Figure I-3 that depict various percentile values can be interpreted as the range of actual traffic for a given forecast volume. For example, it can be expected that 95% of all projects with a forecast ADT of 30,000 will have actual traffic below 45,578, and only 5% of the projects will experience actual traffic of less than 17,898. Not considering other variables, this range (17,898 to 45,578 for a forecast volume of 30,000) includes 90% of the projects.

[Figure I-2. PDFF as a function of forecast volume (project level); vertical axis: PDFF = (Actual − Forecast) / Forecast × 100.]

[Figure I-3. Expected ranges of actual traffic (base model); expected ADT versus forecast ADT, showing the perfect-forecast line and the 5th, 20th, median, 80th, and 95th percentile lines.]
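A minimal sketch of how such percentile bands could be estimated is shown below. It fits quantile regressions of actual ADT on forecast ADT using statsmodels and invented data, so the fitted values will not reproduce the numbers quoted above.

```python
# Sketch of quantile regression bands for actual vs. forecast ADT, using invented data;
# this is not the project team's model and will not reproduce the figures quoted above.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
forecast = rng.uniform(2000, 60000, size=500)
actual = 0.94 * forecast * np.exp(rng.normal(0, 0.25, size=500))   # noisy "observed" ADT

X = sm.add_constant(forecast)
bands = {}
for q in (0.05, 0.20, 0.50, 0.80, 0.95):
    fit = sm.QuantReg(actual, X).fit(q=q)
    bands[q] = fit.params

# Expected range of actual ADT for a forecast of 30,000 vehicles per day
for q, (intercept, slope) in bands.items():
    print(f"{int(q * 100)}th percentile: {intercept + slope * 30000:,.0f}")
```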

The project team could make a number of observations from the Large-N analysis. These observations and the supporting data are described in more detail in Part II but are summarized in these 11 statements:

1. Traffic forecasts show a modest bias, with actual ADT about 6% lower than forecast ADT.
2. Traffic forecasts show a significant spread, with a mean absolute PDFF of 25% at the segment level and 17% at the project level.
3. Traffic forecasts are more accurate for higher-volume roads.
4. Traffic forecasts are more accurate for higher functional classes, over and above the volume effect described above.
5. The unemployment rate in the opening year is an important determinant of forecast accuracy.
6. Forecasts appear to implicitly assume that the economic conditions present in the year the forecast is made will perpetuate.
7. Traffic forecasts become less accurate as the forecast horizon increases, but the result is asymmetric, with actual ADT more likely to be higher than forecast as the forecast horizon increases.
8. Regional travel models produce more accurate forecasts than traffic count trends.
9. Some agencies have more accurate forecasts than others.
10. Traffic forecasts have improved over time.
11. Of the forecasts reviewed by the project team, 95% were found to be "accurate to within half of a lane."

An important limitation to these observations is worth noting. The projects analyzed were selected based on the availability of data, not in a way that would create a random or representative sample of all projects. Therefore, a selection bias may exist that influences the above observations. Specifically, although the opening year of the projects examined ranges from 1970 to 2017, about 90% of the projects opened in 2003 or later. The project team did not always know the exact nature of a project, but inspecting the data revealed that the older projects tended to be major infrastructure projects, whereas the newer projects were more likely to be more routine work (e.g., improvements to existing facilities or repaving projects). Likewise, the information available about the type of project, the methods used, and the specific data recorded were all functions of choices made by the agencies that provided the data. These factors may interact in ways that make it difficult to draw firm conclusions about questions such as the effectiveness of various methods. For this reason, the project team elected to make all of the data underlying this report available for future researchers to analyze further (see the Forecast Cards repository and Forecast Card Data repository, further described in Part I, Chapter 3). The project team encourages agencies to develop their own datasets on projects similar to those for which they will create forecasts. Furthermore, the project team encourages agencies to upload data from their own project datasets to the Forecast Cards and Forecast Card Data repositories, further expanding this shared and sharable tool.

1.2.5 Deep Dives: Objectives, Data, Cases, and Methodologies

The statistical analysis of data for a large number of projects (the Large-N analysis) provided useful indications of the magnitudes of errors and suggested factors that might in general be causes of forecast errors. But the aggregate analysis that was done on the large sample could not determine with certainty the causes of forecast errors.
To the extent feasible, the project team tried to shed more light on factors that help to explain those errors more completely. The team conducted six fairly detailed case studies of traffic forecasts in different states for projects having distinct contexts and forecast results.

The project team's ability to conduct these deep dives was limited by time, money, and, most critically, the availability of data. Six cases is a small number, and the documented information available for one of the six cases was insufficiently complete to allow confident conclusions; however, the insights gained were substantial and richly complemented what had been learned from the Large-N analysis.

The deep dives investigated projects that were large enough in scope and budget to represent meaningful changes in the local transportation network. The projects also were chosen to represent a variety of types of highway improvements. These case studies also were chosen because the projects had been completed, were already open to traffic, and post-opening data were available to compare with the forecasts. For each case study, detailed information was available about the forecasts that had been conducted while the projects were being planned. In some cases, this included forecasting model runs that could be studied because they had been archived. Where model runs had not been preserved, detailed reports of the forecasts or indirect sources such as environmental impact reports were examined. In some cases, indirect sources, including environmental impact reports or other public documents, had been used to arrive at detailed forecasts.

The six projects chosen for the deep dives included a new bridge, the expansion and extension of an arterial on the fringe of an urban area, a major new expressway built as a toll road, the rebuilding and expansion of an urban freeway, and a state highway bypass around a small town. Part II of this report provides details of each case study project, including the location of each project, the most important design and contextual variables, sources of data collected for each, the methods and models used to make the forecasts, forecasts of traffic, and post-opening descriptions of conditions. Complementing the descriptions in Part II, Appendix H in Part III (titled "Deep Dives") provides additional detailed technical information for each case study. The six projects also reflected a variety of locations, as follows:

• Eastown Road Extension, Lima, Ohio;
• Indian Street Bridge, Palm City, Florida;
• Central Artery Tunnel, Boston, Massachusetts;
• Cynthiana Bypass, Cynthiana, Kentucky;
• South Bay Expressway, San Diego, California; and
• US-41 (later renamed I-41), Brown County, Wisconsin.

The number of deep dives was small partly because it was difficult to find suitable case studies. The project team did find a few agencies that carefully archived forecasts, but none had been doing so for more than a decade, and projects planned more recently were less likely to be completed and open to traffic. Where they still were present, long-time senior staff had kept their own records, and some staff recalled what was done in particular cases; however, staff turnover, retirements, and deaths made it rare to find such people available. Reflecting limited funding at agencies, a large proportion of relatively recent projects consisted of toll roads, making it difficult to identify toll-free highways for case studies.

The deep dives enabled the team to explore which elements of forecasts could be identified clearly as sources of inaccuracies.
In some cases, population and regional employment forecasts had been used as the basis for forecasts of traffic growth, and the population had not grown as forecast or an economic downturn had caused shortfalls in expected job growth. Where such departures from expected trends could be identified, their influences on the traffic forecasts were estimated. If major changes had been made in the scope of a project after the preparation of forecasts (e.g., changes in roadway alignment, the number of lanes, or the elimination or addition of entrances and exits), those changes could explain differences between forecasts and realized traffic.

In some, but not all, cases, the data and models that had been used to make the forecasts were available to the team. In these cases, the project team was able to rerun the models using the data for the population and employment growth that had actually occurred and the roadway features that had changed. Rerunning these models using actual (after-the-fact) data allowed the team to determine whether the sources of errors in the initial forecasts related to the original (before-the-fact) data or to aspects of the forecasting models themselves. To the extent possible, the team endeavored to attribute errors identified in the case studies to their sources, but some of the differences between forecast and measured outcomes remained unexplained. Where clear patterns enabled the team to reach conclusions about sources of errors, those conclusions were used to formulate advice for future forecasters.

1.2.6 Deep Dives: Results and Interpretations

The deep dive case studies were consistent with the Large-N analysis in that forecast traffic generally was higher than the actual traffic measured after the projects opened. The deep dives enabled the project team to explore in greater depth several factors that were likely to have contributed to this overestimation. The case studies demonstrated that the sources of errors in forecasts were the result of the varying contexts and conditions of particular projects, which made it difficult to generalize; however, a few general patterns can be stated.

A well-documented economic downturn and recovery occurred during the decade preceding this research, and it seemed likely that the forecasts examined had been based on erroneous expectations about underlying growth and price trends that would have been projected to continue but did not. The recession caused a loss of jobs and economic activity that could significantly affect work trips, leisure activities, and consumption patterns, which in turn would affect traffic flows. The deep dives revealed that traffic forecasts overestimated traffic in part because they were based on exogenous projections of future employment (jobs) and population that were too high, and on petroleum fuel prices that were too low, in comparison with what actually occurred. Projections based on past trends did not anticipate the dramatic economic perturbations that took place. In cases for which modeling software was available and the models were rerun with actual employment and population growth figures and true fuel prices, the forecasts that these models produced were much closer to the actual, measured traffic flows. In cases for which models were not available to be rerun, the calculation of elasticities (percent changes in traffic as a function of percentage changes in the input variables) led to similar conclusions. In one case, the economic downturn had—in addition to reducing employment, income, and travel—caused new development at the urban fringe to be delayed long enough that anticipated toll revenue was not realized.

In some cases, errors in traffic forecasts resulted from errors that had been made in exogenous estimates of "external" traffic (traffic that originated or terminated outside the project area). Taken as a given when a bypass was designed, the rate of growth in external traffic departed from the projections to the extent that traffic on the bypass was incorrectly forecast.
In another case, traffic projected to cross an international border was an input into the traffic forecasts for a project, and that exogenous forecast similarly proved to be incorrect, in part because of its insensitivity to the economic downturn.
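To illustrate the elasticity-based reasoning mentioned above, the sketch below adjusts a forecast for the difference between an assumed and an actual input value using an assumed elasticity. The elasticity and the example values are invented; this is not the project team's calculation.

```python
# Sketch of an elasticity-based adjustment: percent change in traffic as a function of the
# percent change in an input variable. The elasticity and input values are invented.

def elasticity_adjusted_forecast(original_forecast, assumed_input, actual_input, elasticity):
    """Scale a forecast by (actual_input / assumed_input) raised to the assumed elasticity."""
    return original_forecast * (actual_input / assumed_input) ** elasticity

# Example: a 40,000 ADT forecast assumed regional employment of 500,000 jobs, but only
# 450,000 materialized; with an assumed traffic-to-employment elasticity of 0.8,
# the adjusted forecast is about 36,770 ADT.
print(round(elasticity_adjusted_forecast(40000, 500_000, 450_000, 0.8)))  # 36767
```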

1.3.1 Lessons Learned

One of the most important and overarching conclusions from this study is that agencies should take far more seriously the analysis of their past forecasting errors so that they can learn from the cumulative record. It is tempting to assert that the future is always uncertain and thus forecasts of the future will always be wrong, but that response is far too glib. Agencies also may believe that they avoid political embarrassment by not archiving past forecasts and by not addressing the sources of their past forecast errors. This approach may be easy and low in cost in the short run, but it prevents agencies from improving future forecasts and from using their resources more efficiently. Forecasts are essential elements in the creation of effective highway plans and project designs, and because forecasts are always subject to error, it is essential that agencies document their forecasts and revisit them in order to identify the assumptions that lead to errors that compound over time.

Project documentation often is insufficient to allow agencies to evaluate the sources of their forecast errors. In the deep dives, the project team found that forecasting accuracy improved after accounting for several exogenous variables, such as employment and population. However, the effect of changes in other potentially important variables could not be ascertained for some of the projects. Improved documentation of the forecast methods would make such assessments more informative, particularly regarding the definition of the variables used in the model. Improved documentation can be achieved by adopting standardized approaches to documentation and by archiving past forecasts so that they remain accessible to agency staff in the future.

In recent years, some transportation agencies have started archiving their forecasts, and practitioners are beginning to see the benefits of that foresight. The data used in the Large-N analysis were provided by several state DOTs and several researchers who have studied forecasts over time. Some of their efforts have involved creating databases of all project-level traffic forecasts in recent years. Many of these forecasts are for projects that have not yet opened to traffic. The project team based the analysis in this report on about 1,300 projects that have opened and are in service, but information about thousands more transportation projects is available in the forecast accuracy database, waiting to be evaluated after those projects open. (For a description of how to access the data and archive agency-specific data, see Part I, Chapter 3.)

In this research project, forecast evaluation was most effective when archived model runs were available. The most successful deep dives, which yielded the most insight into sources of forecast errors, were the three cases for which archived model runs and associated data inputs were available to the project team. These cases provided deeper understanding of the parameters and methods used for forecasting traffic and allowed the team to directly test the effects of changes. Agencies frequently express skepticism about the usefulness of archived model runs; however, the project team was able to successfully run and learn from all three of the archived model runs that were provided. Some of these models had been used to make forecasts 15 or more years earlier, including some that were initially developed using software that has since been superseded by several later versions.

When forecasting models and input data are systematically archived, it becomes possible to compare the accuracy of various forecasting methods by comparing competing forecasts for the same project. For some cases, the project team was able to compare traffic forecasts that were simple straight-line projections of past trends with forecasts made using a travel model. Through this comparison, it was possible to show that models generally provide better results than trend extrapolation. Nonetheless, it was difficult to arrive at meaningful comparisons of the accuracy of forecasts in which different models were employed. The difficulty occurred both because the details of the models and their specific features were

typically not recorded as part of the data made available to the project team, and because model implementations differ even among agencies that state they are using the same models. The best way to compare types of models would be to produce competing forecasts for the same set of projects and compare their accuracy. This process would be equivalent to running a controlled experiment that accounts for all relevant factors.

A final observation, well illustrated by the quantile regression of the data in the Large-N analysis, is that, especially for complex and expensive capital investment projects, the most efficacious forecasting could involve the development and communication of ranges of future traffic. Instead of discussing forecasts simply as being "inherently subject to error," agencies can make their forecasts more useful and more believable to the public if they embrace and express uncertainty as an essential element of all forecasting.

1.3.2 Summary of Recommendations

From the lessons learned during this project, as well as the project team's experience across several decades of producing and using traffic forecasts, the authors of this report developed four major recommendations. The recommendations apply to agencies or organizations that are responsible for creating project-level traffic forecasts. Such agencies are expected to include MPOs and state DOTs, but they also could include the contractors or private firms that generate traffic forecasts for the public sector or for private investors.

Recommendation 1: Use a range of forecasts to communicate uncertainty.

Consistent with past research, the results of this study show a distribution of actual traffic volumes around each forecast volume. These distributions can provide a basic understanding of the uncertainty in outcomes surrounding a forecast. A goal of forecasting is to minimize the bias in this distribution and to reduce its variance such that the forecast traffic volumes more closely align with actual (counted) traffic volumes. The results of this study show that forecasts have tended to improve over time (subject to confounding with the types of projects represented in more recent forecasts), but the project team does not expect perfection to be achievable in the realm of forecasting. Instead of perfection, the goal should be to achieve project forecasts that are good enough to make informed decisions about the project.

A practical definition of a good enough forecast is one that comes close enough to the project's actual outcomes that the decision would have remained the same if it had been made with perfect knowledge. For example, if a forecast is used to decide how many lanes to build on a roadway, the conventional wisdom is that the traffic forecast should be "accurate to within half a lane" (a rough illustration of this check is sketched at the end of this discussion). The project team's analysis shows that 95% of the forecasts made for projects considered in this study met this threshold. A corollary definition of good enough is that, to a point, the decision makers are willing to accept the consequences of a sub-optimal forecast as a trade-off for the ability to move forward with a project, even with imperfect information. If the consequences of an imperfect decision are low, then fewer resources can be invested in forecasting, whereas more extensive study and more accurate forecasts may be warranted when the consequences are high.

This approach to defining a good enough forecast naturally distinguishes between smaller, more routine projects and larger mega-projects or projects that are otherwise unique. Similarly, the effect of traffic forecasts on decisions depends on the information that is most relevant to the decision. This research focuses specifically on project-level forecasts of ADT, but the information relevant to projects aimed primarily at improving traffic safety may be different.
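The half-a-lane rule of thumb can be made concrete with a minimal sketch. The daily service volume per lane assumed below (10,000 vehicles per day) is a hypothetical planning-level figure used only for illustration; it is not a value prescribed by this report, and an agency would substitute capacities appropriate to the facility type and its peaking characteristics.

```python
# Illustrative only: a rough "accurate to within half a lane" check.
# The per-lane daily service volume is an assumed, hypothetical figure.

DAILY_SERVICE_VOLUME_PER_LANE = 10_000  # vehicles per day per lane (assumed)

def within_half_a_lane(forecast_adt, actual_adt,
                       lane_volume=DAILY_SERVICE_VOLUME_PER_LANE):
    """Return True if the forecast error, expressed in lane-equivalents of
    daily traffic, is no more than half a lane."""
    lane_error = abs(forecast_adt - actual_adt) / lane_volume
    return lane_error <= 0.5

print(within_half_a_lane(24_000, 20_500))  # True: 3,500 ADT error is ~0.35 lanes
print(within_half_a_lane(24_000, 17_000))  # False: 7,000 ADT error is ~0.70 lanes
```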

To evaluate whether a forecast is sufficient to inform the decision at hand, the authors of this report recommend that forecasters explicitly acknowledge the uncertainty inherent in forecasting by reporting a range of forecasts. If an actual outcome at the low end or the high end of the range would not change the decision, then the project's sponsors can safely proceed with little worry about the risk of the project. If an actual outcome at the low end or the high end of the range would change the decision, that potential outcome should be treated as a warning flag. In this situation, further study may be warranted to better understand the risks involved, or the decision makers may choose instead to select a project with lower risk.

Multiple mechanisms are available for considering uncertainty in traffic forecasts. During the outreach efforts conducted to preview this recommendation, the project team talked to a number of analysts who preferred to consider uncertainty by running travel models multiple times with differing inputs (e.g., low, medium, and high employment growth forecasts). This approach has the advantage that it allows the travel model to account for the non-linear relationships between traffic volume and congested speed. A limitation of this approach is that the analyst must determine a reasonable range of inputs with which to run the travel model.

In this report, the project team presents an alternative method for determining a range of forecasts derived from the historic accuracy of traffic forecasts. The quantile regression models provide this mechanism. The models are estimated with the actual traffic volume as a function of the forecast traffic volume, and thus provide a means of predicting the range of expected traffic from a single forecast traffic volume. The use of quantile regression models allows the observer to consider the characteristics of the project and the forecast; for example, forecasts with a shorter time horizon may have a narrower range of expected outcomes. The ranges are empirical, meaning that they reflect the full set of errors that have occurred in the past rather than leaving it up to the analyst to determine a reasonable range of inputs (a minimal computational sketch of this idea appears at the end of this recommendation).

Using such empirically derived ranges can be beneficial because they may implicitly incorporate factors that an observer might not consider on his or her own; however, this approach can be limiting if the future looks very different from the past. For example, a risk for forecasts made in 2019 might be the unknown effects of emerging services and technologies such as car-sharing and autonomous vehicles, which were not factors for the projects in this study because those projects had already opened.

One important advantage of the quantile regression approach is that it is very simple to apply. Obtaining a range of ADT forecasts is as simple as tracing lines on a chart or entering values in a spreadsheet. Examples of ADT forecasts and their ranges are provided in this report and are derived from the full set of projects considered in this project. Part I, Chapter 2 describes the details of how to apply this method to obtain a range of forecasts and use measured accuracy to communicate levels of uncertainty around a particular forecast.
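The idea of predicting a range of actual traffic from a single forecast can be sketched in a few lines. The sketch below is a simplified illustration of the approach, not the quantile regression models estimated in this report: the file name, the column names, the log-linear specification with a single predictor, and the availability of the statsmodels library are all assumptions made for the example.

```python
# Illustrative only: estimating a range of expected actual traffic from a
# single forecast using quantile regression on an archive of opened projects.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical archive with one row per project:
#   forecast_adt = forecast volume, actual_adt = counted volume after opening
df = pd.read_csv("forecast_archive.csv")
df = df.dropna(subset=["actual_adt"])          # keep only opened projects
df["log_forecast"] = np.log(df["forecast_adt"])
df["log_actual"] = np.log(df["actual_adt"])

model = smf.quantreg("log_actual ~ log_forecast", df)
low = model.fit(q=0.05)    # 5th percentile of actual traffic given the forecast
high = model.fit(q=0.95)   # 95th percentile

new_forecast = 30_000      # a single forecast ADT for a proposed project
x = pd.DataFrame({"log_forecast": [np.log(new_forecast)]})
print("expected range of actual ADT:",
      int(np.exp(low.predict(x)[0])), "to", int(np.exp(high.predict(x)[0])))
```

In practice, the report's quantile regression models consider additional project and forecast characteristics; the single-predictor, two-quantile form above is deliberately minimal.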
Recommendation 2: Systematically archive traffic forecasts and collect observed data before and after the project opens.

As was discussed under Recommendation 1, understanding the historic accuracy of forecasts has value in part because it provides an empirical means of communicating the uncertainty of the outcomes surrounding a forecast. The ability to assess and communicate this uncertainty is predicated on having the data to support the analyses. The research that went into this project was possible because of two things: the foresight of transportation agencies that had started collecting the necessary data some time ago, and their willingness to share the information. The data accompanying this report provide a snapshot of forecasts made by a specific set of agencies for a particular set of projects. These data will become dated, however, as new projects continue

to be planned and to open. In addition, these data are general, and the experience of individual forecasting agencies may vary. For example, consider an agency that has invested in a specific set of traffic forecasting methods. That agency may be interested in understanding how well its specific methods perform. Likewise, a particular agency may frequently build a particular type of project and have an interest in understanding the range of actual outcomes as a function of the forecast traffic for similar projects.

For these reasons, the project team recommends that agencies responsible for traffic forecasts systematically archive their traffic forecasts and collect data on the actual outcomes after each project opens. It is recommended that the archiving happen at the time the forecast is made, because the project team can confirm the experience of other researchers that it is much more difficult to assemble the data afterwards. Ideally, the archiving and the collection of associated data should be made systematic, such that it is a normal part of the forecasting process and is certain to happen. The project team also recommends that the process be standardized so that comparable information is collected for all projects. Standardization will make the data collection easier, because the forecaster does not need to decide what to record and can instead fill in a template. Standardization also will ensure that the data can be more readily compared across projects and across agencies.

In this research, the project team was able to learn more from those projects for which more information was available. The basic project information available for use in the Large-N analysis allowed the team to create the overall distributions of forecast accuracy, consider the effect of different factors, and estimate the quantile regression models. The more detailed information available in the deep dives allowed the team to better understand why the forecasts were right or wrong.

Recognizing that archiving and data collection involve effort, and that more detailed project data require more effort to compile and archive, it is appropriate that the effort expended relate both to the intended use of the data and to the importance of the project. Accordingly, the research team recommends that agencies use a three-tiered approach to data collection and archiving, as follows (a minimal illustrative record layout is sketched after the list):

• Bronze. The first archiving level, termed Bronze, records basic information about the forecast and the actual traffic volume, as well as basic details about the type of project and the method of forecasting. Bronze-level archiving is recommended for all project-level traffic forecasts.

• Silver. The second level, termed Silver, includes all elements of Bronze-level archiving but adds project-specific details and nuances and documents assumptions made in the forecast that would otherwise not be captured. Silver-level archiving need not be used for all projects, but it is recommended for projects that are larger than typical or that represent new or innovative solutions without a long track record of accurate forecasts. The team also recommends that Silver-level archiving be applied to a sample of typical projects in order to monitor forecast accuracy for the kinds of projects that make up the largest number of forecasts.

• Gold. The third level, termed Gold, includes all elements of Silver-level archiving and builds further by focusing on the details that make the traffic forecast reproducible with new inputs after project opening. The ability to reproduce the forecast (rerun the model) allows the sources of forecasting error to be definitively identified. Gold-level archiving requires the most effort and is therefore recommended for unique projects and for innovative projects for which forecasts have not been generated previously. As with Silver-level archiving, and for similar reasons, the project team recommends that Gold-level archiving also be applied to a sample of typical projects.
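The Bronze tier can be visualized with a minimal record layout. The sketch below is only illustrative: the field names are hypothetical examples of the kinds of items an agency might capture, and the report's archiving recommendations (Part I, Chapter 3) should govern the actual field definitions.

```python
# Illustrative only: a minimal Bronze-level archive record and a simple CSV
# archive. Field names are hypothetical examples, not the report's template.
from dataclasses import dataclass, asdict
from typing import Optional
import csv

@dataclass
class ForecastRecord:
    project_id: str
    project_type: str            # e.g., "widening", "new bypass", "resurfacing"
    forecast_method: str         # e.g., "regional travel model", "trend extrapolation"
    year_forecast_made: int
    forecast_opening_year: int
    forecast_adt: float          # forecast average daily traffic at opening
    actual_opening_year: Optional[int] = None   # filled in after the project opens
    actual_adt: Optional[float] = None           # counted ADT after opening

records = [
    ForecastRecord("P-001", "widening", "regional travel model", 2012, 2018, 24_000, 2018, 21_500),
    ForecastRecord("P-002", "new bypass", "trend extrapolation", 2014, 2020, 12_000),  # not yet opened
]

# Write the records to a simple CSV archive so they can be pooled across projects.
with open("forecast_archive.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=list(asdict(records[0]).keys()))
    writer.writeheader()
    writer.writerows(asdict(r) for r in records)
```

A standardized record such as this is what makes later accuracy reporting straightforward: once a project opens, the analyst only fills in the observed fields rather than reconstructing the forecast from scattered files.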

Within the context of these three levels, Chapter 3 provides specific recommendations of what to archive and how to do so efficiently.

Recommendation 3: Periodically report the accuracy of forecasts relative to observed data.

The project team recommends that agencies responsible for producing traffic forecasts periodically report the accuracy of their forecasts relative to actual outcomes. Doing so will provide the empirical information necessary to communicate the uncertainty surrounding their traffic forecasts, as described in Recommendation 1. It also will ensure a degree of accountability and transparency in the forecasting process. For accountability, it removes any incentive to "nudge" the forecasts in one direction or another. For transparency, it makes clear that the agency is acting in good faith to produce the best possible forecasts and has nothing to hide. Transparency may be especially important if the analysis shows the forecasts to be biased in one direction or another, because it provides an opportunity to explain the reasons for that bias and to account for it by adjusting the expected range of future forecasts. For agencies that have a history of producing accurate forecasts, it provides an opportunity to demonstrate the effectiveness of their work. In such situations, those agencies could be justified in using a narrower range when estimating the uncertainty around future forecasts.

Reporting on the accuracy of forecasts relative to observed data would rely on the data systematically collected according to Recommendation 2. Such reports might include three main components:

• Updates on the overall distribution of forecast error, with possible exploration of distributions by project type and other dimensions (a minimal sketch of such a computation follows this recommendation).

• Updated estimates of the quantile regression models, made using local data, for the purpose of generating more specific ranges of expected outcomes as a function of the forecast.

• Specific deep dives aimed at understanding the sources of forecast error, for either typical projects or specific, important projects.

Chapter 4 discusses specific methods by which to report forecast accuracy and suggests a structure for practitioners to follow that is based on the research done in this project and aims to make the overall process less burdensome.
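A periodic accuracy report can begin with a simple summary of the archived data. The sketch below reuses the hypothetical archive file and column names introduced earlier; the error measure shown (the percent difference between counted and forecast volumes) is one common choice and is not the only statistic an agency might report.

```python
# Illustrative only: summarizing forecast accuracy from an archive of opened
# projects, overall and by project type.
import pandas as pd

df = pd.read_csv("forecast_archive.csv")
opened = df.dropna(subset=["actual_adt"])  # only projects with observed counts

opened = opened.assign(
    pct_diff=100 * (opened["actual_adt"] - opened["forecast_adt"]) / opened["forecast_adt"]
)

# Overall distribution of forecast error.
print(opened["pct_diff"].describe(percentiles=[0.05, 0.25, 0.5, 0.75, 0.95]))

# Distribution by project type, one dimension an agency might report on.
print(opened.groupby("project_type")["pct_diff"].agg(["count", "mean", "median"]))
```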
Recommendation 4: Consider the results of past accuracy assessments in improving traffic forecasting methods.

The traffic forecasts considered in this study were generated using multiple methods, the most common being the application of a travel demand model or the extrapolation of trends in traffic counts. Although extrapolating traffic count trends is a simple process, the analyst must still make choices about the details of the method, such as how long a past trend to consider and whether projections of growth should be linear. Travel demand models consider a broader set of factors in forecasting and come in a variety of forms. Often, the methods used to generate forecasts and the details of those methods are selected based on the judgment of the analyst. Travel demand models usually are estimated from travel survey data and are calibrated to match observed base-year traffic counts and other base-year data. Sometimes the models undergo sensitivity testing to ensure that they respond reasonably to changes in inputs, and sometimes they are tested in backcasting exercises to ensure that they can reasonably replicate past conditions. However, forecasting is distinct from backcasting and from comparisons against base-year conditions because the future is unknown at the time of the forecast. At the time of this study, the project team was not aware of efforts to consider how well travel models perform in forecasting as a means to improve the next generation of travel models. The project team recommends that such research be undertaken.

The project team recommends that, as they set out to improve their traffic forecasting methods or to update their travel demand models, transportation agencies consider the results of past forecast accuracy assessments. This assessment may take several forms:

• If deep dives have revealed specific sources of error in past forecasts, those sources should be given extra scrutiny in developing new methods. Conversely, if deep dives have revealed that a particular process is not a major source of error, then additional resources do not need to be allocated to further refining that process.

• Data collected on actual project outcomes (Recommendation 2) can be used as a benchmark against which to test a new travel model. Rather than focusing the validation only on the model's fit against base-year data, this approach would test whether the new model is able to replicate the change that occurs when a new project opens. This is akin to testing a model in the way it will be used, and it offers a much more rigorous means of testing.

• To the extent that Large-N analyses can be used to demonstrate better accuracy on the part of one method over another, that information should inform the selection of methods for future use. The project team was not able to demonstrate such differences in this research, largely because of the challenges faced in isolating the effect of the method on accuracy from the effect of other factors, such as the type of project. When conducting their own Large-N analyses, agencies can overcome these limitations by carefully recording the details of all projects, by controlling for these factors in the analysis, and by running comparative tests that show the outcomes of multiple methods for the same project (a minimal sketch of such a comparison follows this recommendation).

Chapter 5 describes the specific ways in which forecast accuracy data can be used to improve traffic forecasting methods. The project team recommends that these approaches be integrated into travel model development and improvement projects.
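A paired comparison of methods on the same projects can be sketched as follows. The data layout is hypothetical: it assumes an agency has archived, for each opened project, a forecast from its travel model, a parallel trend-extrapolation forecast, and the counted volume after opening. The file and column names are illustrative and are not part of this report's tools.

```python
# Illustrative only: a paired comparison of two forecasting methods applied to
# the same set of opened projects.
import pandas as pd

df = pd.read_csv("method_comparison.csv")  # hypothetical comparison dataset

def mean_abs_pct_error(forecast, actual):
    """Mean absolute percent error of a forecast series against counted volumes."""
    return (100 * (forecast - actual).abs() / actual).mean()

mape_model = mean_abs_pct_error(df["model_adt"], df["actual_adt"])
mape_trend = mean_abs_pct_error(df["trend_adt"], df["actual_adt"])

print(f"travel model MAPE:        {mape_model:.1f}%")
print(f"trend extrapolation MAPE: {mape_trend:.1f}%")

# A paired view: the share of projects for which the travel model was closer.
model_closer = ((df["model_adt"] - df["actual_adt"]).abs()
                < (df["trend_adt"] - df["actual_adt"]).abs())
print(f"travel model closer on {model_closer.mean():.0%} of projects")
```

Because both methods are applied to the same projects, the comparison controls for project type and context in the way the text above describes, rather than comparing accuracy across dissimilar sets of projects.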
1.3.3 Reasons to Implement These Recommendations

Traffic forecasts are used to inform important decisions about transportation projects, including the selection of which projects to build and certain design elements of those projects. Because decisions about transportation projects involve trade-offs between benefits and costs, those decisions inevitably have political impacts. Engineers and other technical experts have an ethical obligation to ensure that the accounting of benefits and costs is done in an objective manner, allowing the decision makers to focus on the trade-offs rather than on the quality of the information provided.

Objective forecasts are not necessarily accurate, nor are they necessarily viewed as credible in the eyes of decision makers or the public. Objectivity, accuracy, and credibility are all important, and they are related. An inaccurate forecast may lead to a sub-optimal decision for a specific project, but it may also undermine trust in forecasts made for other projects. If decision makers begin to dismiss forecasts because they do not view them as credible, then the decision-making process risks becoming purely political, without the benefit of objective information about the effects of the proposed project.

One strategy to prevent this outcome is to avoid drawing attention to inaccurate forecasts, and instead ask decision makers and the public to trust the technical expertise of those preparing the forecasts. Several characteristics of forecasts enable this strategy (Wachs 1990): forecasts are technically complex, they require subjective assumptions, and they cannot be verified until the intended action is taken. The first two characteristics make it easy to ask for reliance on the experts, while the third characteristic makes it easy to dismiss a retrospective view as irrelevant because the decision has already been made. This strategy follows a certain logic, but it also carries some risk, because if trust is broken, no clear mechanism exists by which to rebuild it.

Even if a forecaster is very objective and careful in the analysis, the forecast can turn out to be inaccurate. For example, an unexpected recession could occur at about the time a project opens. Worse, the forecaster's credibility could be undermined by competing forecasts created by others who are less objective or careful. For example, Flyvbjerg (2007) argues that some planners deliberately misrepresent the costs and benefits of large infrastructure projects to increase the likelihood that they will be built. Clearly, deliberate misrepresentation is worse than forecast error that results from an unforeseen factor. How can those forecasters who are careful and objective in their analysis distinguish themselves from those who are less so?

The distinction can be made by reversing the above strategy and instead being deliberately transparent. Like objectivity, transparency does not ensure that all forecasts will be accurate; however, it does send three clear messages:

1. The agency preparing the forecasts has nothing to hide;
2. Any inaccuracies are the result of unexpected factors and not deliberate misrepresentation; and
3. The agency is legitimately interested in learning from those inaccuracies and using them to improve.

If the agency can build a track record of accurate forecasts, it provides evidence with which to build trust in the abilities of its forecasters and establishes the credibility of future forecasts. At the same time, the agency gains the benefits associated with its ongoing efforts to use this information to build more accurate forecasts.

This approach aligns with the broader scientific process, which involves (1) making testable predictions (hypotheses or forecasts) based on theory and assumptions, (2) evaluating whether those predictions are correct, and (3) revising the theory or assumptions if they are not. In addition to the process of prediction and testing, community-level components are necessary to ensure the validity and credibility of science (Galileo Galilei 1638), including:

• Data recording and sharing,
• Reproducibility, and
• External review.

Archiving forecasts and ensuring that the archived data are available for external review helps to build this broader validity and credibility in much the same way that publishing results in peer-reviewed journals does for science more generally. Even if individual predictions or hypotheses are later shown to be inaccurate, the process ensures the advancement of science because the testing of each prediction provides a learning opportunity.

The recommended approach also facilitates the implementation of performance-based planning and programming (PBPP), which is a point of emphasis in federal transportation legislation, Moving Ahead for Progress in the 21st Century (MAP-21). PBPP aims to select transportation projects based on their expected ability to meet stated goals and objectives, and then to monitor and evaluate whether they are effective in doing so. Monitoring and evaluating the accuracy of traffic forecasting tools is a way of applying PBPP principles to the tools themselves. In addition, monitoring transportation project outcomes enables PBPP methods to be applied at the project level. Doing so can demonstrate the value delivered by those projects, and it helps forecasters develop professional judgment by providing them with a library of projects similar to those for which they may currently be forecasting. This approach would enable what Flyvbjerg refers to as reference class forecasting (Flyvbjerg et al. 2006), and it is especially valuable for new forecasters who do not have a lifetime of experience from which to draw.

Finally, the recommended approach is consistent with the standards for evidence-based decision-making put forth by the International Standards Organization (ISO) and codified in

their ISO 9000 standard (International Standards Organization n.d.). ISO 9000 focuses on developing a process for ensuring and monitoring quality, such as the process recommended in this guidebook, and is widely used in the private and public sectors in the United States and Europe.

Specific reasons to implement the four recommendations are:

• Recommendation 1: Use a range of forecasts to communicate uncertainty. Reporting a range of forecasts explicitly communicates the risk associated with forecasts, and it is possible that the range may result in a different decision or the introduction of strategies to manage the risk. If the project decision would be the same across the range of forecasts, communicating the range adds confidence that the decision is defensible. Acknowledging the uncertainty inherent in forecasting and reporting a range is also a way for the forecasting agency to protect its own credibility. For example, consider a forecast that a road will carry 20,000 ADT. When the road opens, the actual traffic is only 16,000 ADT. It is easy to criticize the forecast as inaccurate because the actual traffic is 20% below the forecast. However, if the original forecast had reported an expected value of 20,000 ADT within a range of 15,000 to 25,000 ADT, then the actual traffic would fall within the range and the forecast could be considered accurate.

• Recommendation 2: Systematically archive traffic forecasts and collect observed data before and after the project opens. In addition to enabling the remaining recommendations, systematically archiving forecasts and the associated project outcomes promotes transparency. The very fact that details about the forecasts will be transparent encourages a higher level of care and quality in the forecasts. This is especially true for forecasts archived at the Gold level, for which the full model runs are archived. Traffic forecasting models can be complex tools, and when applying them it is easy to miss details in managing files and scenarios, setting input parameters, and performing other specific tasks. If the model runs will be stored and documented in such a way that another person can reproduce the results, the original analyst is likely to be meticulous in applying the model. Archiving also prevents the files from becoming lost or disorganized on an individual's computer.

• Recommendation 3: Periodically report the accuracy of forecasts relative to observed data. Updating and reporting forecast accuracy results using actual, local data provides a better indication of the performance of the tools that a specific agency is using. These reports also can document improvements or better-than-typical accuracy. In addition, if an agency has a track record of accurate forecasts, using these data to update the quantile regression models will allow the ranges considered in Recommendation 1 to be narrower.

• Recommendation 4: Consider the results of past accuracy assessments in improving traffic forecasting methods. Another reason to review the accuracy of forecasts is to understand the advantages of particular travel models compared with others. Efficient use of resources continues to be an overarching goal of every transportation agency. By comparing forecasting methods and models, forecasters can compare cost differences across methods as well as differences in forecast accuracy and reasonableness in the real world. As a result, less costly forecasting techniques can be used for the types of projects that already are well understood, and the cost savings accrued on more routine projects can allow more resources and more intensive techniques to be applied to more challenging situations.
