Traffic Forecasting Accuracy Assessment Research (2020)

Chapter: Appendix F - Literature Review


APPENDIX F

Literature Review

Contents
1 Introduction
2 A History of Forecast Evaluations
3 Existing Systematic Review Programs
4 Summary of Existing Outcomes
5 Methods of Evaluation
6 Identified Problems with Forecast Accuracy
7 Gaps in Knowledge
References

1 Introduction

The current assessment of traffic forecasting accuracy in NCHRP Project 08-110 builds upon past efforts. This appendix summarizes those efforts and what can be learned from them for the current study. It begins by reviewing a history of important past forecast evaluations and then considers several existing systematic review programs. Beyond these formal programs, interest in the topic among the research community has increased over the past several years. Next, the best existing evidence on the accuracy of travel forecasts is reviewed, as summarized from a meta-analysis by Nicolaisen and Driscoll (2014). A selection of studies is then reviewed in further detail for the purpose of considering (1) the methods used to analyze forecast accuracy and (2) the issues cited as causes of inaccuracy.

2 A History of Forecast Evaluations

Table III-F-1 summarizes key aspects of previous studies evaluating forecast accuracy, providing a survey of the history of forecast evaluations. One of the first in-depth analyses of the predictions made for a major large-scale infrastructure investment was Webber's 1976 study of San Francisco's Bay Area Rapid Transit (BART) system (Webber 1976). BART was the first rail system constructed in a U.S. city whose dominant mode was the automobile. Webber analyzed virtually all forecast assumptions and predicted benefits. Other studies sponsored by research and government agencies also provided insights.

Similar, but smaller scale, comparisons were made on other projects in the 1980s. A British study in 1981 examined the forecasts of 44 projects constructed between 1962 and 1971 (MacKinder and Evans 1981). The authors found no evidence that more recent or sophisticated modeling methods produced more accurate forecasts than earlier or more straightforward methods. In North America, the U.S. DOT produced a report in 1989 that examined the accuracy of 10 major transit investments funded by the federal government. This report (Pickrell 1989), which came to be known as the "Pickrell Report," caused a public stir with its findings: most projects underachieved their projected ridership while simultaneously accruing capital and operating costs larger than expected. While the Pickrell Report and a number of other accuracy evaluations focused on transit projects, the resulting criticism often extends to travel forecasting in general. An aim of this research is to analyze roadway traffic forecast accuracy fully in its own right.

The first examination of the reasons for travel forecast inaccuracy was a 1977 study of psychological biases in decision making under uncertainty. Kahneman and Tversky (1977) proposed the concept of the "inside view," in which intimate involvement with a project's details during its planning and development phases leads to systematic overestimates of its benefits and underestimates of its costs. This was the first recognition of a systematic flaw in planning that today's literature calls "optimism bias." Kahneman and Tversky suggested the use of reference classes to correct these biases. Reference class forecasting uses the base rates and distributions of outcomes from similar situations in the past to improve forecast accuracy. Subsequent work by Flyvbjerg (2007) and Schmitt (2016) applied reference class forecasting to correct for biases in demand and cost forecasts.
Because highways are a separate reference class from transit, it is necessary to build the body of observed project outcomes in the highway realm, as can be done through this research.
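To make the mechanics concrete, reference class forecasting can be reduced to a simple empirical-quantile adjustment: collect the actual/forecast outcome ratios from a reference class of comparable completed projects, then scale a new forecast by a chosen quantile of that distribution. The sketch below is a minimal illustration with invented ratios and a hypothetical helper name, not a procedure prescribed by the sources cited here.

```python
import numpy as np

def reference_class_adjust(past_ratios, new_forecast, percentile=50):
    """Scale a raw forecast by a quantile of the actual/forecast
    ratios observed in a reference class of similar past projects
    (the base-rate logic of Kahneman and Tversky 1977)."""
    # percentile=50 applies the median historical bias; a more
    # conservative sponsor might choose a lower percentile.
    q = np.percentile(np.asarray(past_ratios), percentile)
    return new_forecast * q

# Invented reference class: actual/forecast demand ratios for
# comparable completed projects (values for illustration only).
past_ratios = [0.55, 0.68, 0.77, 0.81, 0.92, 1.01]
print(reference_class_adjust(past_ratios, new_forecast=40_000))
```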

The number of forecasting accuracy assessments has increased since 2000. Bent Flyvbjerg released his seminal work on forecasts for multiple modes (Flyvbjerg, Holm, and Buhl 2005). His article noted that demand forecasts were generally inaccurate and were not becoming more accurate over time. His conclusions were based on over 210 transportation projects from across the world. He identified potential causes for this inaccuracy, including inaccurate assumptions and exogenous forecasts (tied to the concept of optimism bias), deliberately slanted forecasts, issues with the analytical tools, and issues with construction or operation. Flyvbjerg suggested that one way to reduce forecast inaccuracy is to develop and apply reference classes to projects with large uncertainties.

Table III-F-1. Summary of historic studies.

From 2002 to 2005, Standard & Poor's publicly released annual reports on the accuracy of toll road, bridge, and tunnel projects worldwide. The 2005 report (Bain and Polakovic 2005), the most recent available publicly, analyzed 104 projects. It found that the demand forecasts for those projects were optimistically biased and that this bias persisted into the first 5 years of operation. It also found that the variability of truck forecasts was much higher than that of forecasts for lighter vehicles. The authors noted that their sample "undoubtedly reflects an over-representation of toll facilities with higher credit quality" and that actual demand accuracy for these types of projects is probably lower than documented in their report.

In their 2004 report (Bain and Plantagie 2004), Standard & Poor's noted optimism bias in forecasts of toll facilities versus non-tolled roadways. The accuracy of non-tolled road forecasts, based on a sample of over 150 projects from Flyvbjerg's database, was much better than that of toll road forecasts, with some projects heavily underforecasting demand. They generally found a 20–30 percentage point skew (optimism bias) between the two sets of forecasts and noted this was consistent with their previous studies.

NCHRP released a synthesis on estimating toll road demand and revenue in 2006 (Kriger, Shiu, and Naylor 2006). This study reported the accuracy of 26 toll road revenue forecasts, finding that forecast accuracy does not improve over time. It noted that "many of the problems that had been identified with the performance of traffic and revenue forecasts were related to the application of the model, less so to methods and algorithms." More specifically, this finding relates to the assumptions needed to operationalize the models, not to the data or methods. The synthesis recommended analyzing the forecasting inputs and exogenous forecasts, and improving the treatment of uncertainties and risks.

There have been a few recent studies examining the accuracy of non-tolled roadway forecasts. Buck and Sillence (2014) demonstrated the value of using travel demand models in Wisconsin to improve traffic forecast accuracy and provided a framework for future accuracy studies. Parthasarathi and Levinson (2010) examined the accuracy of traffic forecasts for one city in Minnesota. Giaimo and Byram (2013) examined the accuracy of over 2,000 traffic forecasts produced in Ohio between 2000 and 2012. They found the traffic forecasts to be slightly high, but within the standard error of the traffic count data, and they did not find any systematic problems with erroneous forecasts. The presentation also described an automated forecasting tool for "low risk" projects. The tool relies on trendlines of historical traffic counts and adjustments following procedures outlined in NCHRP Report 255 (Pedersen and Samdahl 1982) and updated in NCHRP Report 765 (CDM Smith et al. 2014).

The FTA has conducted two additional studies analyzing predicted and actual outcomes of large-scale federally funded transit projects, one in 2003 (U.S. DOT 2003) and another in 2007 (FTA and Vanasse Hangen Brustlin 2008). The FTA finds that transit forecasts are becoming more accurate over time and attributes that improvement to better scrutiny of travel forecasts and of the analytical tools used to produce them. Schmitt (2016) presented the results of his analysis of all forecasts for New Starts projects built in the United States through 2011.
The forecasts were incorporated into the Transit Forecasting Accuracy Database (TFAD), which contained 65 large-scale transit infrastructure projects from around the country. The research found that transit forecasts have a historical bias toward overforecasting ridership. Using these data, Schmitt statistically identified three reference classes for transit forecasting.

The research also investigated three commonly held beliefs regarding forecasting accuracy:

- More recent projects are more accurate than older ones (i.e., forecasts are getting more accurate as tools become more advanced);
- Forecasts are more accurate in later stages of project development than in earlier stages (i.e., the more that is known about the details of a project, the more accurate the forecast of demand); and
- Forecasts of smaller changes to the transit system are more accurate than forecasts of larger changes (i.e., smaller changes are easier to predict).

This research found that only the first commonly held belief had merit. Transit forecasts, on average, are biased but have been, slowly and non-monotonically, becoming more accurate over time. It is important to note, though, that this research focused on transit; NCHRP Project 08-110 extends the research to highway projects.

By the mid-2000s, some studies attempted to identify issues with forecasting practice or the associated analytical methods. TRB Special Report 288 (Transportation Research Board 2007) noted that "current practice in travel forecasting has several deficiencies that often diminish the value of these forecasts." The report identified four areas of deficiency: inherent weaknesses of the models themselves, errors introduced by modeling and forecasting practice, the lack or questionable reliability of data, and biases arising from the institutional climate in which models are used. The Travel Model Improvement Program released two reports in 2013 to assist with these areas: "Improving Existing Travel Models and Forecasting Processes: A White Paper" (RSG 2013b) and "Managing Uncertainty and Risk in Travel Forecasting: A White Paper" (RSG 2013a).

3 Existing Systematic Review Programs

Although individual studies analyzing the accuracy of travel forecasts are becoming more prevalent, ongoing programs of forecast reviews remain rare. Only three well-known recurring programs dedicated to reviewing predicted and actual outcomes are in practice.

Highways England, an agency within the UK Department for Transport, runs the Post-Opening Project Evaluation (POPE) program (Highways England 2015), the only known regular analytical review of non-tolled roadway forecasts in North America and Europe and by far the most comprehensive review of roadway forecasts. POPE assesses the accuracy of demand, cost, accident, and travel-time benefit forecasts. Over the past 11 years, Highways England has reviewed smaller roadway projects (i.e., those costing less than £10 million). It also reviews large projects (i.e., those costing more than £10 million) 1 year and 5 years after each project's opening, and a meta-analysis across all recent large projects occurs every 2 years.

The FTA's Capital Investment Grant program, commonly known as the New Starts program, requires before-and-after studies for every major project funded through the program (FTA 2016). Project sponsors are directed to archive the predictions, and the details supporting them, at two planning stages and at the federal funding decision stage. Approximately 2 years after project opening, project sponsors are required to gather information about the actual outcomes of five major aspects of the project: physical scope, capital cost, transit service levels, operating and maintenance costs, and ridership.
Project sponsors then analyze the predictions and actual outcomes and prepare a report that summarizes the differences between the two, documents the reasons for those differences, and highlights lessons learned that would inform FTA or other project sponsors about how methodologies or circumstances helped or hindered the predictions.

The FTA's New Starts program also allows project sponsors to enumerate the uncertainties inherent in their travel forecasts and provide information on how those uncertainties may affect the project forecast. The FTA has presented a "build up" method, in which separate forecasts are produced for individual sources of uncertainty, to help identify the key drivers of uncertainty from the travel model's perspective. Similar approaches could be considered for highway projects.

The National Oceanic and Atmospheric Administration's Hurricane Forecast Improvement Program (HFIP) is the only program that combines forecast accuracy evaluation with improved analytical methods, public communication of forecast uncertainty, and societal benefits (National Oceanic and Atmospheric Administration 2010). The HFIP's stated accuracy goals were hypothesized to require increased precision in data and analytical methods. The HFIP developed a process to justify and evaluate these investments by placing analytical methods into three streams:

- Stream 1 consists of existing analytical methods and is used for official, real-time forecasts;
- Stream 2 consists of advanced analytical methods that take advantage of increased computing power and increased data precision, but its forecasts are made offline; and
- Stream 1.5 consists of the elements of Streams 1 and 2 that seem to hold the most promise; its forecasts are made in real time but are not official.

The same input data are fed to all three streams. Efforts that demonstrate increased accuracy and skill are elevated to Stream 1.5 and eventually to Stream 1. In this way, empirically proven methods are implemented very quickly. In 5 years, the HFIP demonstrated a 10% improvement in tropical storm track and intensity forecasts (Toepfer 2015).

The HFIP is the only known program that uses a forecast skill metric in addition to traditional accuracy metrics. Advanced analytical methods must not only be accurate but must also provide better accuracy than simpler, less expensive methods. In this way, analytical methods proven better than simpler ("naïve") methods are recommended for immediate implementation, while shortfalls in accuracy and skill are noted and used to prioritize future research efforts.

The HFIP directly tied improvement goals in forecast accuracy to societal benefits: "Forecasts of higher accuracy and greater reliability are expected to lead to higher user confidence and improved public response, resulting in savings of life and property" (National Oceanic and Atmospheric Administration 2010). As the first years of the program produced many successes, the accuracy goals were raised, aiming eventually to provide residents a reliable 7 days' advance warning of an impending storm. The estimated benefit of avoiding an unnecessary evacuation is $1,000 per person, which has been estimated to accumulate to $225–$380 million for larger storms (Toepfer 2015). In this way, the HFIP sponsors are able to justify the cost of implementing more complex and expensive methods.

4 Summary of Existing Outcomes

Nicolaisen and Driscoll (2014) provide a recent meta-analysis of the demand forecast accuracy literature. That meta-analysis is not repeated here, but it is summarized to provide a baseline estimate of expected forecast accuracy.
Their analysis considers 12 studies that have a sizable database of completed road and/or rail projects, that provide distributions based on those projects, and that specify the sources of information considered. Table III-F-2 shows the studies included, and Table III-F-3 summarizes their results. Both tables are presented directly as they appeared in the source paper.

Table III-F-2. Summary of studies included in Nicolaisen and Driscoll (2014) meta-analysis.

Table III-F-3. Summary of results included in Nicolaisen and Driscoll (2014) meta-analysis.

The main finding from this paper is that the observed inaccuracy of forecasts, measured as actual demand minus forecast demand as a percentage of forecast demand, varies by the type of project:

- For rail projects, the mean inaccuracy is negative, meaning that actual demand is less than the demand that was predicted. The general range is that actual demand is 16%–44% less than forecast demand.
- For toll road projects, the mean inaccuracy is also negative, indicating that actual demand is less than forecast. The meta-analysis considered two studies of toll roads, with Bain (2009) showing a mean of -23% for a global sample of toll roads and Welde and Odeck (2011) showing a mean of -3% in Norway.
- For untolled road projects, the mean inaccuracy is positive, with most results showing 3%–11% more traffic in reality than was forecast.

Nicolaisen and Driscoll (2014) also note that for all types of projects there is considerable variation in the results, regardless of the mean. Few studies are available, particularly of untolled roads in the United States, so these results should be treated with a degree of caution. Nonetheless, it is interesting to note the difference in direction for untolled road projects relative to rail and toll road projects, with the forecasts predicting too little demand for untolled roads and too much demand for rail and toll roads. Some possible explanations for this difference are:

- There could be a methodological difference such that transit and rail demand is more difficult to predict for technical reasons: these are lower-share alternatives, good values of time (the willingness to pay to save travel time) are difficult to estimate, and transit markets and transit users are challenging to identify.
- It may be that rail and toll road projects get built only when the forecasts show strong demand, whereas untolled road projects tend to get funded regardless. This could lead to optimism bias in the forecasts, as suggested by Flyvbjerg (2007), or to self-selection bias, as suggested by Eliasson and Fosgerau (2013), where projects with forecasts that happen to be too low don't get built and therefore don't end up in the sample.
- It could also be that long-term trends over the past 40 years (growing auto ownership, the entry of women into the workforce, and high levels of suburbanization) combined to create a future that was not anticipated when the forecasts were made but that systematically pushed people toward using roads and away from transit.

While it is easy to speculate on the possible sources of error, it is difficult to know for certain what the issue is. As Nicolaisen and Driscoll (2014) note, "The studies that make the greatest effort to address this aspect are rarely able to provide more than rough indications of causal mechanisms." They go on to point out that a key challenge is the lack of the data needed to conduct such studies, in particular the infrequent availability of archived forecasts: "The lack of availability for necessary data items is a general problem and probably the biggest limitation to advances in the field." NCHRP Project 08-110 begins from this starting point: limited studies on untolled roads in the United States, little information on the sources of forecast errors, and a general lack of data to conduct such studies.
We consider how best to improve upon this situation. To do so, we review a selection of additional studies for two purposes: first, to consider the methods used to conduct accuracy evaluations, and second, to identify factors that may be sources of forecasting error.

5 Methods of Evaluation

The next question of particular relevance to this study is how to go about assessing forecast accuracy. For this question, we consider a selection of studies as summarized in Table III-F-4, which lists each study's research data, analysis procedure, key results, and suggestions or identified sources of error. These studies span different types of projects, including untolled roads, toll roads, and rail or transit projects. These studies, as well as those identified by Nicolaisen and Driscoll, reveal two main methods of evaluating the accuracy of forecasts: deep dives and Large-N studies.

Deep dives are examples in which a single project is analyzed in detail to determine what went right and what went wrong in the forecast. Individual before-and-after studies from the FTA Capital Investment Grant program are classic examples. These studies often involve custom data collection before and after the project, such as onboard transit surveys. The sources of forecast errors, such as errors in inputs, model issues, or changes in the project definition, are considered and identified. The advantage of deep dives is that they allow a complex set of issues to be thoroughly investigated. They also reveal the importance of the assumptions modelers make about the data and the particular models used. The disadvantage is that it is often unclear whether the lessons from one project can be generalized to others.

In contrast, Large-N studies consider a larger sample of projects in less depth. Flyvbjerg (2005) extols the virtues of Large-N studies as the necessary means of coming to general conclusions. Large-N studies often include a statistical analysis of the error and bias observed in forecasts compared to actual data. Flyvbjerg et al. (2006) present a Large-N analysis of 183 road and 27 rail projects, and Standard & Poor's conducted a Large-N analysis of a sample of 150 toll road forecasts (Bain and Plantagie 2004). Other examples of Large-N studies are the Minnesota, Wisconsin, and Ohio analyses (Parthasarathi and Levinson 2010; Buck and Sillence 2014; Giaimo and Byram 2013).

The two approaches are not mutually exclusive. For example, if enough deep dives are conducted, they can become the basis for a Large-N analysis. Schmitt provides an example of this with his analysis of FTA-funded rail projects (Schmitt 2016). NCHRP 08-110 applies both deep dives and Large-N analysis as complementary evaluation tools. Specifically, it uses Large-N analysis to measure the amount and distribution of forecast errors, including errors segmented by variables such as project type and various risk factors, and it uses deep dives to explore the sources of forecast error: if we got the wrong answer, why are we wrong? Two recent studies provide the most complete current thinking on how to approach each evaluation tool, and these served as a framework for the NCHRP study.

Whereas most studies focus on reporting descriptive statistics of forecast errors, Odeck and Welde (2017) define and apply a formal econometric framework for evaluating traffic forecast accuracy. The descriptive statistics, typically the percentage error (PE), mean percentage error (MPE), and mean absolute percentage error (MAPE), are useful and will continue to be used for descriptive purposes. The econometric framework is advantageous because it provides a simple but statistically robust method for estimating the bias.
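For reference, these descriptive statistics are conventionally defined as follows (a standard formulation; the report's own notation may differ slightly), where A_i is the actual value, F_i the forecast value, and n the number of forecasts:

```latex
\mathrm{PE}_i = 100 \cdot \frac{A_i - F_i}{F_i}, \qquad
\mathrm{MPE} = \frac{1}{n}\sum_{i=1}^{n}\mathrm{PE}_i, \qquad
\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\lvert \mathrm{PE}_i \rvert
```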
The econometric framework estimates the regression

A_i = α + β F̂_i + ε_i,

where A_i is the actual traffic on project i, F̂_i is the forecast traffic on project i, ε_i is a random error term, and α and β are parameters estimated in the regression.
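As an illustration of how this bias test might be run, the sketch below estimates the regression by ordinary least squares and jointly tests α = 0, β = 1. It uses the Python statsmodels package and invented data; it follows the spirit of the Odeck and Welde (2017) setup rather than reproducing their estimation.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)

# Invented data: forecast and actual opening-year traffic volumes
# for a hypothetical sample of 68 projects.
forecast = rng.uniform(5_000, 60_000, size=68)
actual = 0.90 * forecast + rng.normal(0.0, 2_000.0, size=68)  # mild bias

# Estimate A_i = alpha + beta * F_i + eps_i
exog = sm.add_constant(forecast)   # adds the alpha intercept column
fit = sm.OLS(actual, exog).fit()
print(fit.params)                  # [alpha_hat, beta_hat]

# Unbiased forecasts imply alpha = 0 and beta = 1 jointly.
print(fit.f_test("const = 0, x1 = 1"))
```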

Table III-F-4. Summary of select studies and methods.

Odeck and Welde (2017), "The accuracy of toll road traffic forecasts: An econometric evaluation."
- Research data: 68 toll road projects in Norway.
- Analysis procedure: Percentage error and mean absolute percentage error over the 68 projects; formally defines the econometric structure for analyzing accuracy.
- Results: Toll road traffic forecasts are underestimated but close to accurate, with a mean percentage error of only 4%. This sharply contrasts with international studies that found large overestimations (errors of more than -20%). The accuracy of forecasts has not improved since transport models became mandatory.
- Suggestions/sources of error: "It can be argued that one major advantage of the Norwegian system that other countries can learn from is this accumulated experience with forecasting where only one organization is responsible for overseeing the forecasting and where a standard software / framework is used, combined with little or no incentives to exaggerate forecasts."

Gomez, Vassallo, and Herraiz (2016), "Explaining light vehicle demand evolution in interurban toll roads: a dynamic panel data analysis in Spain."
- Research data: Spanish toll road network.
- Analysis procedure: Dynamic panel model to estimate demand; elasticities (relative change in travel demand induced by a relative change in each explanatory variable). Explanatory variables: GDP (provincial and national), employment (provincial and national), GDP per capita (provincial and national), previous-year demand, toll rates, fuel price and fuel cost (efficiency), and location (coast or interior).
- Results and suggestions/sources of error: Employment and GDP per capita are the more consistent explanatory variables for travel demand elasticity; location also matters.

Li and Hensher (2010), "Toll Roads in Australia: An Overview of Characteristics and Accuracy of Demand Forecasts."
- Research data: Australian toll road network.
- Analysis procedure: OLS and panel random-effects regression models.
- Results: Actual traffic was about 45% lower than predicted during the first year of operation. All other factors unchanged, the percentage error in the forecast falls by 2.44 percentage points for every additional year since opening.
- Suggestions/sources of error: Less toll road capacity when opened than forecast; elapsed time of operation (roads open longer had higher traffic levels); time of construction (longer construction delayed traffic growth and increased the error); toll road length (shorter roads attracted less traffic); cash payment (modern cashless payment increased traffic); and fixed versus distance-based tolling (fixed tolls reduced traffic).

Flyvbjerg et al. (2006), "Inaccuracy in traffic forecasts."
- Research data: 183 real projects around the world.
- Analysis procedure: Actual minus forecast traffic as a percentage of forecast traffic, in the opening year.
- Results: About half of the road projects have a forecasting error of more than ±20%, and 25% have an error of more than ±40%.
- Suggestions/sources of error: Uncertainties about trip generation and land use development.

Bain (2011), "On the reasonableness of traffic forecasts."
- Research data: Survey of forecasters.
- Analysis procedure: Surveyed forecasters to identify how accurate they expect forecasts to be.
- Results: Expected accuracy for an existing road is ±15% 5 years out and ±32.5% 20 years out; for a new road, ±25% 5 years out and ±42.5% 20 years out.
- Suggestions/sources of error: Projections of population, GDP, car ownership, households, employment, and fuel price (and/or fuel efficiency).

Bain (2009), "Error and optimism bias in toll road traffic forecasts."
- Research data: 100 toll road projects.
- Analysis procedure: Ratio of actual to forecast traffic, by years from opening.
- Results: On average, the ratio of actual to forecast traffic has a mean of 0.77 and a standard deviation of 0.26; that is, actual traffic falls 23% below forecast on average (traffic is overpredicted).
- Suggestions/sources of error: Experience with toll roads, tariff escalation, forecast horizon, toll facility details, surveys/data collection, private users, commercial users, microeconomics, and traffic growth.

European Court of Auditors (2013), "Are EU cohesion policy funds well spent on roads?" Special Report No. 5, Luxembourg.
- Research data: 24 road investment projects in Germany, Greece, Poland, and Spain.
- Analysis procedure: Comparison of forecast versus actual annual average daily traffic (AADT).
- Results: On average, actual traffic was 15% below forecast traffic, but the projects clearly improved safety and saved travel time.
- Suggestions/sources of error: Consider travel time savings, safety, etc., as performance measures; make improvements on the cost side.

FTA (2016), "Guidance on Before-and-After Studies of New Starts Projects."
- Research data: Guidance for transit before-and-after studies.
- Analysis procedure: Focus on trips-on-project, plus transit dependents and other measures.
- Suggestions/sources of error: Population and employment forecasts, housing trends and costs, global and local economic conditions, other planned transportation improvements, time-of-day assumptions, parking prices, fuel prices, and long-term changes in vehicle technology.

Anam, Miller, and Amanin (2017), "A Retrospective Evaluation of Traffic Forecasting Accuracy: Lessons Learned from Virginia."
- Research data: 39 studies from Virginia.
- Analysis procedure: (1) Obtain forecast volumes from the Virginia studies; (2) obtain observed volumes corresponding to the forecast year and location; (3) measure accuracy by comparing forecast volumes to observed volumes; (4) document assumptions made in assessing accuracy; and (5) identify explanatory factors of forecast accuracy.
- Results: The average value of the median absolute percent error across all studies was about 40%.
- Suggestions/sources of error: Forecast method (trend-based methods were more accurate than activity-based methods when two or more economic recessions fell between the base and forecast years and for long-term durations; long-term trend-based studies were more accurate than short-term ones) and forecast duration (accuracy increases as forecast duration decreases).

Kriger, Shiu, and Naylor (2006), "Estimating toll road demand and revenue."
- Research data: 15 U.S. toll roads opened between 1986 and 1999.
- Analysis procedure: Survey of practitioners; report of forecast versus actual revenues by year after opening.
- Results: On average, actual traffic was 35% below predicted traffic.
- Suggestions/sources of error: Long-range demographic and socioeconomic forecasts: land use (job and household growth rates according to a variety of national, state, and regional third-party sources); short-term economic fluctuations (a local oil-price shock and subsequent sharp regional economic downturn); inaccurate travel demand inputs; and value of time and willingness to pay.

Nunez (2007), "Sources of Errors and Biases in Traffic Forecasts for Toll Road Concessions."
- Research data: 49 worldwide toll road concessions.
- Analysis procedure: Considers the behavior of forecasters, promoters, and users, focusing on strategic decisions and possible overconfidence; estimates decreasing marginal utility of transport and value of time.
- Results: There is a strong "winner's curse" in toll road concessions; using a single average value of time can lead to overestimation.
- Suggestions/sources of error: Further disaggregate values of time; restructure the concession process to minimize the winner's curse, especially under high uncertainty.

Andersson, Brundell-Freij, and Eliasson (2017), "Validation of aggregate reference forecasts for passenger transport."
- Research data: Eight Swedish national forecasts for passenger traffic made between 1975 and 2009.
- Analysis procedure: Forecasts compared against a simple trendline as a reference; the forecasts are then adjusted, using elasticities, to correct for input errors in population growth, fuel price, fuel economy, car ownership, and GDP.
- Results: Since the early 1990s, forecasts for car traffic have generally predicted growth rates of around 1.5% per year on average, whereas actual growth has been around 0.8% per year. The model-based forecasts still outperform trendlines, and the models with corrected inputs outperform trendlines by a much wider margin.
- Suggestions/sources of error: Errors in input assumptions: average income (usually taken to be equal to GDP per capita); GDP growth (average absolute error over all forecasts of 3 percentage points); population; fuel price; car ownership (average absolute error of 3 percentage points); vehicle fuel economy; how population growth is distributed among types of municipalities; and license holding (which explains two-thirds of the error in the Samplan 1999 forecast).

Appendix F: Literature Review III-F-13 The null hypothesis is that the forecasts are unbiased, and in that case the estimated value of will be 0 and of will be 1. Odeck and Welde (2017) provide some minor variations on this approach, including a method to determine the efficiency of the forecasts, which are not repeated here. It is easy to see how this econometric framework can be extended to test additional segmentation or additional terms in the regression. For example, either and can be segmented by the type of project, the agency conducting the forecast, or the number of years between the forecast and the opening year. This provides a framework from which a wealth of factors can be explored with different levels segmentation depending on the number of observations in each segment. A second recent study provides a strong framework for how to approach deep dives. Andersson, Brundell-Freij, and Eliasson (2017) examine aggregate (not project-level) forecast of car traffic in Sweden. There are two elements of interest in their approach. First, they compare the forecasts to a reference forecast, which would be a simple trendline as would be projected at the time the forecast was made. They argue that the value of the forecast occurs when it is able to out-perform this simple trendline, and they find that the forecasts examined generally do out-perform. They also find that if errors in the input data are corrected, the forecasts out-perform the trendline to a greater degree. This approach provides a useful point of comparison, although it is more limiting in the case of project-level forecasts because it cannot be applied to evaluating forecasts of new facilities. Second, Andersson, Brundell-Freij, and Eliasson (2017) consider the forecast versus actual values of five important input variables: fuel price, fuel economy, car ownership per person, growth in GDP per capita, and population growth. Using elasticities for each, they estimate how having the correct input value for each of these terms would affect the forecast. Table III-F-5, taken directly from their paper, summarizes the results of their analysis. It shows that correcting for errors in these five inputs would reduce the root mean square error of the forecasts from 0.64 to 0.12, with the biggest benefit associated with getting the fuel price correct. This type of analysis is useful because it provides insight into why the forecasts are wrong, and where we should focus our efforts if we wish to improve the forecasts. The deep dives in NCHRP 08-110 will aim to provide a similar analysis for each project considered in detail (i.e., if we got this right, how much would we improve the forecast?). (Andersson, Brundell-Freij, and Eliasson 2017). Table III-F-5. Effect of correcting for input errors in forecasts of Swedish car traffic

6 Identified Problems with Forecast Accuracy

A component of the deep dives will be an effort to assess the sources of forecast error. A number of authors have proposed a range of hypotheses for what those sources may be. Generally, these can be grouped into three categories: technical problems, optimism bias, and selection bias (Nicolaisen and Driscoll 2014; Flyvbjerg 2007; Eliasson and Fosgerau 2013). Technical problems include limitations of the data and methods and assumptions made during the process; it has been noted that in some cases the impact of the assumptions on the forecast is greater than that of the forecasting model or method used. Optimism bias can result from the "inside view" or from political pressure to achieve certain forecasts to justify the project. Selection bias could occur because projects with high forecasts are more likely to get built, even if the underlying forecasts for all projects considered are unbiased. These latter two issues may explain the discrepancy between untolled highway projects and rail transit and toll roads: because the forecasts can play a larger role in whether the latter get built, there is more potential for both optimism bias and selection bias.

The importance of core assumptions to forecast accuracy (or inaccuracy) supports William Ascher's (1979) examination of forecasts in five areas: population, the economy (current-dollar and real GNP), energy (electricity, petroleum consumption), transportation, and technology. Ascher found improvements in forecasting method to be a secondary precursor to achieving a higher degree of accuracy; failing to capture the reality of the future context leaves little for the methodology to contribute. He also found that the more distant the forecast target date, the less accurate the forecast. He further identified systematic biases associated with the institutional sites of forecasts.

In examining the possible sources of error, we consider the explanations offered by a selection of studies, as summarized in Table III-F-6. The cells in the table indicate which of these studies cites each of the issues in the column headers as a possible source of error. The most commonly cited sources all relate to the economy: employment, GDP, and recession/economic conditions. Land use, population projections, and housing are also commonly cited. It is important to note that these studies generally hypothesize possible explanations rather than clearly demonstrate sources of error; nonetheless, they are useful in that they enumerate factors that can be considered in detail in the deep dives.

7 Gaps in Knowledge

The research reviewed here provided the NCHRP 08-110 project team a starting point for understanding existing evidence on forecast accuracy, as well as a strong foundation for how to approach such studies and what factors may contribute to inaccuracy. A limitation is that the projects considered are not necessarily representative of forecasts in general. There is strong representation of rail projects (U.S. DOT 2003; FTA and Vanasse Hangen Brustlin 2008), toll roads (Bain 2009; Odeck and Welde 2017; Kriger, Shiu, and Naylor 2006), and road projects in Europe (Andersson, Brundell-Freij, and Eliasson 2017; Welde and Odeck 2011; Highways England 2015), but limited studies of untolled traffic forecasts in the United States (Anam, Miller, and Amanin 2017; Buck and Sillence 2014; Parthasarathi and Levinson 2010). In fact, one report (Hartgen 2013) has called the unknown accuracy of U.S. urban road forecasts "the greatest knowledge gap in U.S. travel demand modeling." NCHRP 08-110 seeks to close that gap.

Table III-F-6. Summary of select studies and factors cited as contributing to accuracy issues.

Studies reviewed (number of factors each cites): Odeck and Welde 2017 (0); Gomez, Vassallo, and Herraiz 2016 (3); Li and Hensher 2010 (4); Flyvbjerg et al. 2006 (2); Bain 2011 (7); Bain 2009 (0); European Court of Auditors 2013 (0); Chatterjee et al. 1997 (1); Spielberg et al. 2007 (2); FTA 2013 (5); Anam et al. 2016 (2); NCHRP Synthesis 364 (2006) (7); Nunez 2007 (0); Andersson, Brundell-Freij, and Eliasson 2017 (6); Yang, Li, and Wu 2017 (4).

Number of studies citing each factor as an issue: employment (5); GDP (5); recession/economic conditions (4); trip generation/travel characteristics (4); land use (2); population projections (2); housing predictions (3); car ownership (2); fuel price (4); fuel efficiency (2); time savings (1); location (1); time of operation (1); toll road capacity (1); length of road (1); cash payment/value of time (2); ramp-up period (0); tolling culture (1); time-of-day (0); traffic calculations (1); forecast duration (1).

References

Anam, S., J. S. Miller, and J. Amanin (2017). "A Retrospective Evaluation of Traffic Forecasting Accuracy: Lessons Learned from Virginia." Presented at the 96th Annual Meeting of the Transportation Research Board, Washington, D.C.

Andersson, M., K. Brundell-Freij, and J. Eliasson (2017). "Validation of Aggregate Reference Forecasts for Passenger Transport." Transportation Research Part A: Policy and Practice, 96: 101–18. https://doi.org/10.1016/j.tra.2016.12.008.

Ascher, W. (1979). Forecasting: An Appraisal for Policy-Makers and Planners. Johns Hopkins University Press, Baltimore, MD.

Bain, R., and J. Plantagie (2004). "Traffic Forecasting Risk: Study Update 2004." In Proceedings of the European Transport Conference, Strasbourg, France. https://trid.trb.org/view.aspx?id=841273.

Bain, R. (2009). "Error and Optimism Bias in Toll Road Traffic Forecasts." Transportation, 36(5): 469–82. https://doi.org/10.1007/s11116-009-9199-7.

Bain, R. (2011). "The Reasonableness of Traffic Forecasts: Findings from a Small Survey." Traffic Engineering and Control (TEC) Magazine, May 2011.

Bain, R., and L. Polakovic (2005). "Traffic Forecasting Risk Study Update 2005: Through Ramp-Up and Beyond." Standard & Poor's, London. http://toolkit.pppinindia.com/pdf/standard-poors.pdf.

Buck, K., and M. Sillence (2014). "A Review of the Accuracy of Wisconsin's Traffic Forecasting Tools." https://trid.trb.org/view/2014/C/1287942.

CDM Smith, A. Horowitz, T. Creasey, R. Pendyala, and M. Chen (2014). NCHRP Report 765: Analytical Travel Forecasting Approaches for Project-Level Planning and Design. Transportation Research Board of the National Academies, Washington, D.C.

Eliasson, J., and M. Fosgerau (2013). "Cost Overruns and Demand Shortfalls—Deception or Selection?" Transportation Research Part B: Methodological, 57: 105–13. https://doi.org/10.1016/j.trb.2013.09.005.

European Court of Auditors (2013). "Are EU Cohesion Policy Funds Well Spent on Roads?" Special Report No. 5. Publications Office of the European Union, Luxembourg. https://www.eca.europa.eu/Lists/ECADocuments/SR13_05/SR13_05_EN.PDF.

FTA (2016). "Guidance on Before-and-After Studies of New Starts Projects." Federal Transit Administration, April 4, 2016. https://www.transit.dot.gov/funding/grant-programs/capital-investments/guidance-and-after-studies-new-starts-projects.

FTA and Vanasse Hangen Brustlin (2008). "The Predicted and Actual Impacts of New Starts Projects—2007: Capital Cost and Ridership."

Flyvbjerg, B., M. K. S. Holm, and S. L. Buhl (2005). "How (In)Accurate Are Demand Forecasts in Public Works Projects? The Case of Transportation." Journal of the American Planning Association, 71(2). https://trid.trb.org/view.aspx?id=755586.

Flyvbjerg, B., M. K. S. Holm, and S. L. Buhl (2006). "Inaccuracy in Traffic Forecasts." Transport Reviews, 26(1). https://trid.trb.org/view/2006/C/781962.

Flyvbjerg, B. (2005). "Measuring Inaccuracy in Travel Demand Forecasting: Methodological Considerations Regarding Ramp Up and Sampling." Transportation Research Part A: Policy and Practice, 39(6): 522–30. https://doi.org/10.1016/j.tra.2005.02.003.

Flyvbjerg, B. (2007). "Policy and Planning for Large-Infrastructure Projects: Problems, Causes, Cures." Environment and Planning B: Planning and Design, 34(4): 578–97. https://doi.org/10.1068/b32111.

Giaimo, G., and M. Byram (2013). "Improving Project Level Traffic Forecasts by Attacking the Problem from All Sides." Presented at the 14th TRB Transportation Planning Applications Conference, Columbus, OH.

Gomez, J., J. M. Vassallo, and I. Herraiz (2016). "Explaining Light Vehicle Demand Evolution in Interurban Toll Roads: A Dynamic Panel Data Analysis in Spain." Transportation, 43(4): 677–703. https://doi.org/10.1007/s11116-015-9612-3.

Hartgen, D. T. (2013). "Hubris or Humility? Accuracy Issues for the Next 50 Years of Travel Demand Modeling." Transportation, 40(6): 1133–57. https://doi.org/10.1007/s11116-013-9497-y.

Highways England (2015). Post Opening Project Evaluation (POPE) of Major Schemes: Main Report.

Kahneman, D., and A. Tversky (1977). "Intuitive Prediction: Biases and Corrective Procedures." DTIC Document. http://oai.dtic.mil/oai/oai?verb=getRecord&metadataPrefix=html&identifier=ADA047747.

Kriger, D., S. Shiu, and S. Naylor (2006). NCHRP Synthesis 364: Estimating Toll Road Demand and Revenue. Transportation Research Board of the National Academies, Washington, D.C. https://trid.trb.org/view/2006/M/805554.

Li, Z., and D. A. Hensher (2010). "Toll Roads in Australia: An Overview of Characteristics and Accuracy of Demand Forecasts." Transport Reviews, 30(5): 541–69. https://doi.org/10.1080/01441640903211173.

MacKinder, I. H., and S. E. Evans (1981). "The Predictive Accuracy of British Transport Studies in Urban Areas." Transport and Road Research Laboratory. https://trid.trb.org/view.aspx?id=179881.

National Oceanic and Atmospheric Administration (2010). "Hurricane Forecast Improvement Program: Five-Year Strategic Plan." http://www.hfip.org/documents/hfip_strategic_plan_yrs1-5_2010.pdf.

Nicolaisen, M. S., and P. A. Driscoll (2014). "Ex-Post Evaluations of Demand Forecast Accuracy: A Literature Review." Transport Reviews, 34(4): 540–57. https://doi.org/10.1080/01441647.2014.926428.

Nunez, A. (2007). Sources of Errors and Biases in Traffic Forecasts for Toll Road Concessions. PhD thesis, Université Lumière Lyon 2.

Odeck, J., and M. Welde (2017). "The Accuracy of Toll Road Traffic Forecasts: An Econometric Evaluation." Transportation Research Part A: Policy and Practice, 101: 73–85. https://doi.org/10.1016/j.tra.2017.05.001.

Parthasarathi, P., and D. Levinson (2010). "Post-Construction Evaluation of Traffic Forecast Accuracy." Transport Policy, 17(6): 428–43. https://doi.org/10.1016/j.tranpol.2010.04.010.

Pedersen, N. J., and D. R. Samdahl (1982). NCHRP Report 255: Highway Traffic Data for Urbanized Area Project Planning and Design. TRB, National Research Council, Washington, D.C.

Pickrell, D. H. (1989). "Urban Rail Transit Projects: Forecast Versus Actual Ridership and Costs. Final Report." https://trid.trb.org/view.aspx?id=299240.

RSG (2013a). "Managing Uncertainty and Risk in Travel Forecasting: A White Paper." FHWA-HEP-14-030. Travel Model Improvement Program (TMIP), Federal Highway Administration, Washington, D.C.

RSG (2013b). "Improving Existing Travel Models and Forecasting Processes: A White Paper." FHWA-HEP-14-019. Travel Model Improvement Program (TMIP), Federal Highway Administration, Washington, D.C.

Schmitt, D. (2016). "A Transit Forecasting Accuracy Database: Beginning to Enjoy the 'Outside View.'" Presented at the 95th Annual Meeting of the Transportation Research Board, Washington, D.C.

Toepfer, F. (2015). Presentation to the TRB Travel Demand Forecasting Committee Meeting, January 13.

Transportation Research Board (2007). Special Report 288: Metropolitan Travel Forecasting: Current Practice and Future Direction. Transportation Research Board of the National Academies, Washington, D.C.

U.S. DOT (2003). "Predicted and Actual Impacts of New Starts Projects: Capital Cost, Operating Cost and Ridership Data." Federal Transit Administration, Washington, D.C.

Webber, M. W. (1976). "The BART Experience—What Have We Learned?" Institute of Urban & Regional Development, October. http://escholarship.org/uc/item/7pd9k5g0.

Welde, M., and J. Odeck (2011). "Do Planners Get It Right? The Accuracy of Travel Demand Forecasting in Norway." European Journal of Transport and Infrastructure Research, 11(1): 80–95.
