National Academies Press: OpenBook

Identification and Evaluation of Freight Demand Factors (2012)

Chapter: 3. Modeling Transportation Demand Analyzing the Results

Suggested Citation:"3. Modeling Transportation Demand Analyzing the Results." National Academies of Sciences, Engineering, and Medicine. 2012. Identification and Evaluation of Freight Demand Factors. Washington, DC: The National Academies Press. doi: 10.17226/22820.


3. Modeling Transportation Demand – Analyzing the Results

As detailed in the previous section, the research team began this analysis by reviewing the historical literature and various recent studies on “What influences transportation demand?” and then developing a list of potential independent variables thought to influence the volume of freight transportation. Through the analysis of various segments of the U.S. economy in Section 2 above, duplicate variables were reconsidered and the best measures were chosen, with the requirement that the independent data be regularly generated, well understood, and freely available. The group of independent variables under consideration was thus reduced to a set of 23, whose correlation with one another and with the nine measures of freight demand over the entire 1980-2007 period was evaluated. Table 2 identifies the cross-correlations among the independent variables. Tables C-1 and C-2 in Appendix C identify how these independent variables correlate with, and therefore potentially explain, the dependent measures of freight transportation demand.

Correlation analysis enables a quick assessment of the quality of a linear relationship between two variables: the independent economic variables and freight demand [20]. Ultimately, to understand the relative importance of the independent variables as predictors, regression analysis was used to develop weightings or factors that indicate the relative change in freight demand from a unit of change in an independent variable. Regression analysis works because the explanatory power of each candidate independent variable is tested against the hypothesis “Is the relative variability simply random, or is there some coincident activity?” to arrive at a model that uses a subset of the independent variables to explain much of the variation in the dependent variable of freight demand.
The quality of a correlation measures the strength of the relationship between two quantities under the assumption that they are linearly related. These correlation analyses were useful in identifying how some of the relationships between the independent influencing variables and the dependent data on freight transportation demand changed over time. However, correlation measures give a poor understanding of the degree of impact, or weight, that a given independent variable might have in influencing freight demand. Additionally, independent variables from the time period (year) preceding the freight demand were tested to determine whether correlations between the independent factors and the freight demand measures in subsequent time periods were greater than the correlations within the same period.

20. Pearson’s correlation coefficient can be written as

$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \, \sigma_Y}$$

in which the covariance of two variables is divided by the product of their standard deviations. The basic assumption for the correlation coefficient is that two quantities are linearly related, meaning that if the distributions of X and Y over time do not follow a similar path, then the correlation coefficient is likely to be weaker. In interpreting a correlation coefficient, values fall between +1 and -1, where +1 indicates a perfect positive correlation in which the quantities move together in a direct relationship, and -1 indicates a perfect negative correlation in which an increase in one quantity is associated with a proportional decrease in the other.
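As a minimal sketch of the coefficient defined in footnote 20, the function below divides the covariance of two series by the product of their standard deviations. The function name and the sample series are hypothetical and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation: Cov(X, Y) / (sigma_X * sigma_Y).

    The 1/n factors in the covariance and the standard deviations cancel,
    so plain sums of deviations are used.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical illustrative series (not data from the study).
index = [1.0, 2.0, 3.0, 4.0, 5.0]
same_direction = [2.0, 4.0, 6.0, 8.0, 10.0]
opposite_direction = [10.0, 8.0, 6.0, 4.0, 2.0]

r_pos = pearson(index, same_direction)      # quantities move together
r_neg = pearson(index, opposite_direction)  # one rises as the other falls
```

The two sample series are exact linear functions of the index, so they sit at the two ends of the +1 to -1 range described in the footnote. A constant series has zero variance and would cause a division by zero, which this sketch does not guard against.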

Table 2 – Potential Cross-Correlation of Independent Variables

The matrix is symmetric, so only the lower triangle is shown. Row N lists the correlations of variable N with variables 1 through N, in order, ending with its own diagonal value of 1.00.

1. Real GDP in Chained 2005$: 1.00
2. Real GDP in Chained 2005$ / Capita: .998 1.00
3. Real Personal Consumption, Chained 2005$: .999 .995 1.00
4. Real Income in Chained 2005$ / Capita: .997 .996 .998 1.00
5. Total Housing Starts: .454 .455 .468 .476 1.00
6. Industrial Production Index: .990 .990 .985 .983 .455 1.00
7. Industrial Manufacturing Index: .992 .990 .987 .985 .463 1.000 1.00
8. Purchasing Managers Index: .171 .181 .171 .174 .550 .160 .167 1.00
9. Trade Weighted Foreign Exchange Index (Broad Trading Partners): .914 .920 .902 .906 .444 .937 .933 .125 1.00
10. Trade Weighted Foreign Exchange Index (Major Trading Partners): -.565 -.577 -.553 -.549 .153 -.521 -.520 -.163 -.402 1.00
11. Employment Total: .983 .989 .974 .976 .389 .988 .986 .139 .950 -.591 1.00
12. Employment in Wholesale Industry: .937 .951 .921 .928 .341 .958 .955 .112 .921 -.571 .977 1.00
13. Real Exports in Goods (in $): .939 .935 .928 .918 .215 .947 .945 .086 .853 -.662 .948 .935 1.00
14. Real Imports in Goods (in $): .979 .971 .982 .975 .472 .967 .971 .181 .831 -.539 .938 .895 .931 1.00
15. Total Capacity Utilization (% of Total): .008 .049 -.027 -.014 -.186 .037 .031 .264 .010 -.465 .114 .245 .195 .006 1.00
16. Inventory Sales Ratio, Chained 2005$: -.890 -.894 -.885 -.886 -.694 -.903 -.906 -.502 -.847 .418 -.877 -.851 -.805 -.877 -.135 1.00
17. Inventory Sales Ratio: -.956 -.952 -.954 -.947 -.508 -.947 -.949 -.342 -.846 .581 -.926 -.868 -.906 -.951 -.083 .929 1.00
18. Urban Gas Price (Real $): -.081 -.127 -.052 -.085 -.034 -.112 -.101 -.177 -.361 .216 -.224 -.283 -.046 .091 -.396 .179 .017 1.00
19. Retail Sales (Real $): .997 .997 .995 .995 .497 .991 .992 .195 .912 -.546 .981 .942 .928 .980 .027 -.909 -.956 -.090 1.00
20. (Lagged) IWTF: .915 .916 .902 .899 .433 .939 .936 .156 .960 -.485 .947 .929 .901 .859 .117 -.874 -.869 -.302 .915 1.00
21. Grain: .676 .661 .671 .664 .253 .676 .677 -.075 .600 -.277 .642 .616 .678 .697 -.045 -.542 -.646 .180 .659 .609 1.00
22. Coal + Grain Tonnage: .912 .915 .900 .903 .308 .914 .912 .032 .867 -.549 .923 .912 .908 .883 .146 -.778 -.854 -.128 .902 .870 .86 1.00
23. Coal Production: .930 .942 .915 .924 .299 .933 .929 .091 .909 -.638 .966 .966 .923 .875 .241 -.814 -.863 -.295 .926 .909 .654 .951 1.00

SOURCE: Developed by the Research Team
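The cross-correlations in Table 2 can be screened for pairs of candidate factors that carry largely the same information. The sketch below copies a handful of values from Table 2 and flags the pairs whose absolute correlation reaches a cutoff. The function name and the 0.9 cutoff are assumptions made for illustration; the study does not state such a threshold.

```python
# Selected pairwise correlations copied from Table 2.
TABLE_2_PAIRS = {
    ("Real GDP", "Retail Sales"): 0.997,
    ("Industrial Production Index", "Employment Total"): 0.988,
    ("Real GDP", "Inventory Sales Ratio"): -0.956,
    ("Total Housing Starts", "Purchasing Managers Index"): 0.550,
    ("Real GDP", "Total Housing Starts"): 0.454,
    ("Real GDP", "Purchasing Managers Index"): 0.171,
}

def collinear_pairs(pairs, cutoff):
    """Return the pairs whose absolute correlation is at or above the cutoff,
    strongest first."""
    flagged = [(abs(r), a, b) for (a, b), r in pairs.items() if abs(r) >= cutoff]
    flagged.sort(reverse=True)
    return [(a, b) for _, a, b in flagged]

# The 0.9 cutoff is an illustrative assumption, not a threshold from the study.
redundant = collinear_pairs(TABLE_2_PAIRS, cutoff=0.9)
for first, second in redundant:
    print(f"{first} and {second} move together closely")
```

Pairs flagged this way are candidates for dropping one member or for blending, since a regression that includes both cannot cleanly separate their effects.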

Results of this work include the statistical models that explain freight demand based on independent economic and demographic variables. A challenge in this research has been the high correlations among the independent variables themselves, generally referred to as multicollinearity. A statistically rigorous method of addressing multicollinearity is Principal Component Analysis (PCA), which constructs weights such that the variation in the linear composite of the candidate demand factors is maximized. Essentially, PCA uses a weighted average of the independent factors instead of the independent variables themselves. One consequence of PCA is that the individual components are lost to a composite index, or “blend,” of the otherwise somewhat duplicative independent factors in freight demand. Finally, parsimonious regression models were developed using both independent variables and principal components.

The regression models presented, particularly those based on PCA, indicate the importance of the independent variables that reflect economic activity, population, consumer or industrial sentiment, and currency exchange. The PCA-based models compare the log of the actual values with the log of the values of freight transportation demand, ensuring that relative changes and growth rates are properly compared. The interpretation of the model coefficients is less obvious than for simpler regression models, because the variables are blended together and lose their intuitive connection, but the results are generally very good at predicting freight transportation demand from the chosen independent factors. The PCA-based models were then used to form “back-casts” to illustrate how well this overall method works in forecasting freight demand.

Important Sub-Periods of Time

The diversity of the historical sample period (1980-2007) potentially affects the ability of the independent factors to predict transportation demand.
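The PCA blending described earlier in this section can be illustrated with a small, dependency-free sketch. It standardizes each candidate factor, builds their correlation matrix, and extracts the first principal component, whose weights give the composite with the greatest variance. Power iteration is used here only to avoid external libraries and is not claimed to be the research team's procedure; the function names and the three sample series are hypothetical.

```python
import math

def standardize(column):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / n)
    return [(v - mean) / sd for v in column]

def first_principal_component(columns, iterations=500):
    """Return (weights, scores) for the first principal component.

    columns: list of equal-length series, one per candidate factor.
    Power iteration on the correlation matrix converges to the leading
    eigenvector as long as the starting vector is not orthogonal to it.
    """
    z = [standardize(c) for c in columns]
    k, n = len(z), len(z[0])
    # Correlation matrix of the standardized factors.
    corr = [[sum(z[i][t] * z[j][t] for t in range(n)) / n for j in range(k)]
            for i in range(k)]
    w = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w_new = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in w_new))
        w = [v / norm for v in w_new]
    # Composite index: weighted blend of the standardized factors per year.
    scores = [sum(w[i] * z[i][t] for i in range(k)) for t in range(n)]
    return w, scores

# Hypothetical, strongly collinear series (not data from the study).
gdp = [100.0, 103.0, 107.0, 110.0, 115.0, 119.0]
retail = [50.0, 52.0, 53.5, 55.5, 58.0, 60.0]
employment = [90.0, 91.0, 93.0, 94.0, 96.5, 98.0]

weights, composite = first_principal_component([gdp, retail, employment])
```

The resulting `composite` series could then stand in for the three collinear factors in a regression, at the cost noted in the text: the coefficient on the blend no longer maps to any single factor.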
The relationship of transportation demand to the rest of the economy went through multiple changes during this period as regulations, cost, and reliance on transportation changed. Splitting the sample period into three smaller but more homogeneous spans of six, eleven, and eleven years, respectively, allows consideration of the political and societal changes that affected the relationship between economic activity and transportation demand in each: i) 1980 to 1985, ii) 1986 to 1996, and iii) 1997 to 2007.

Between 1980 and 1985, significant regulatory reform (Staggers Act, Motor Carrier Act, financial deregulation, and a lowering of marginal tax rates) took place, affecting previous long-term associations between the candidate influencer variables and freight transportation demand. Deregulation augured a new period of railroad mergers and free entry of new trucking companies. The U.S. economy experienced a double-dip recession and then transitioned into a much healthier period of growth through most of the 1980s. Besides the increased sales of Japanese-manufactured automobiles, many new consumer electronics manufactured in Asia became affordable to a majority of American consumers.

Between 1986 and 1996, with the exception of a moderate recession in 1990-1991, the U.S. economy experienced significant expansion along with rapidly expanding trade due to several significant geopolitical events (fall of the Berlin Wall, NAFTA). Between 1992 and 1996, the real value of imported goods increased on average 5.2 percent per year, and the real value of exports increased even faster (an average of 7 percent per year).

While economic growth was slower between 1997 and 2007, and structural changes in the trucking and railroad industries subsided, trade continued to expand rapidly, particularly containerized imports, with many years reporting double-digit growth, driving activity at ports and a rapid expansion in long-haul movement via railroads. Driven by imports, U.S. consumption and the trade deficit grew much faster than GDP.

Collinearity Means Redundancy

A primary goal of the study was to establish which candidate demand factors do the best job of explaining variations in freight demand. Because many of the independent factors measure similar economic activity, a certain amount of collinearity among these variables was expected. For example, real income per capita and personal consumption are likely to move together because, absent radical changes in savings rates, when people have more money, they spend it. Thus, a natural first step in the analysis was to investigate correlation among the variables that were candidates for use as independent factors and the nine measures of freight demand. A summary table that documents the correlation of the independent influencing variables with the various measures of freight demand is shown in Table 3. Candidate independent variables exhibited a range of correlation with the nine selected dependent measures of freight demand.
Those with correlations whose absolute value exceeds 0.75 are highlighted in Table C-1, but candidate variables with lower correlations might still be important because, in combination with other factors, they might improve the ability to predict a particular transportation demand variable. A good example is the price of gasoline. Even though this variable has a relatively low correlation with the dependent measures of transportation demand (absolute values ranging from 0.43 to 0.59), it emerges as a helpful explanatory variable, in conjunction with other variables, in the regression models that predict truck ton-miles and truck vehicle miles.

The goal is to build a regression model that is well specified, i.e., one that accounts for all important theoretical components. As discussed above, developing a well-specified model can pose challenges, especially when many of the demand factors tend to move similarly. When candidate demand factors are highly correlated, they present problems of multicollinearity: if both are used in a regression model, it becomes impossible to reliably distinguish whether demand factor A or factor B explains the change in freight demand.

The research team used multiple approaches to manage this potential duplication. Correlation tables like the one above were developed to help understand the various relationships. They were used in the effort to develop parsimonious yet highly explanatory regression models that would explain the variations in freight demand measures. Finally, as described above (and in detail below), PCA was used to compensate for the high degree of multicollinearity among the remaining independent variables.

Correlation measures were examined over the whole sample, 1980 to 2007 (28 yearly measurements), to determine whether and how these correlations changed during the sub-periods when different social, political, and economic factors were affecting transportation demand. The regression models that were ultimately developed use a subset of the independent variables to predict freight transportation demand over the entire time period.

Correlations between the Dependent and Independent Variables

This section presents the results of the first step in the statistical evaluation of the 23 independent factors in freight demand (as expressed by the nine selected measures of modal freight demand). The summary tables referred to below are presented in Appendix C. These correlations were explored based on the log-transformed actual values of the independent influencing variables. As discussed below, additional exogenous factors were considered (e.g., deregulation), but because these factors are represented by dummy variables, an analysis of correlation with other independent or dependent variables was not meaningful.

As in any statistical modeling exercise, it is important to bear in mind that correlation does not imply causation. “What we can conclude when we find two variables with a strong correlation is that there is a relationship between the two variables, not that a change in one causes a change in the other.” [21]

Table 3 below shows summary correlation ranks based on how well the candidate factors explain the nine measures of freight transportation demand via rail, truck, and domestic waterway. The factors are ranked based on correlations alone; thus for rail ton-miles, real GDP has the highest correlation and thus has the highest rank. The top five factors for each dependent variable are highlighted in the tables.
For rail tons, the most important candidate factors are inventory/sales ratio, imports in real U.S. dollars, Industrial Production Index, Industrial Manufacturing Index, and real GDP. The table provides initial suggestions on the most important independent influencing variables for each measure of freight demand. For example, the industrial production indices rank among the top five variables for both truck and rail freight demand. It is also interesting to note that some variables have more influence on freight demand via rail, while others have more influence on the trucking demand measures.

21. Robert D. Mason and Douglas A. Lind, Statistical Techniques in Business & Economics, Ninth Edition, 1996, p. 484.

Table 3 - Correlation Ranks of Candidate Demand Factors (Absolute Correlation Matrix)

| Candidate Demand Factor | Rail Tons | Rail Ton-Miles | Rail Train-Miles | Rail Car-Miles | Rail Rev Ton-Miles | Annual Truck Ton-Miles | Truck VMT | Water Tons | Water Ton-Miles |
|---|---|---|---|---|---|---|---|---|---|
| Real GDP | 6 | 4 | 6 | 6 | 3 | 3 | 3 | 21 | 4 |
| Real GDP per Capita | 8 | 6 | 8 | 5 | 6 | 2 | 2 | 17 | 7 |
| Real Personal Consumption | 7 | 7 | 9 | 7 | 7 | 8 | 8 | 16 | 3 |
| Real Income Per Capita | 9 | 8 | 11 | 8 | 8 | 7 | 7 | 19 | 6 |
| Total Housing Starts | 16 | 16 | 14 | 15 | 16 | 16 | 16 | 4 | 19 |
| Industrial Production Index | 4 | 1 | 4 | 3 | 1 | 4 | 4 | 20 | 2 |
| Industrial Manufacturing Index | 3 | 2 | 2 | 2 | 2 | 5 | 5 | 22 | 1 |
| Purchasing Managers' Index | 17 | 17 | 17 | 18 | 17 | 18 | 18 | 11 | 21 |
| Trade Wt. Broad Cur. Index | 14 | 13 | 15 | 14 | 13 | 9 | 9 | 14 | 13 |
| Trade Wt. Major Cur. Index | 15 | 15 | 19 | 19 | 15 | 15 | 15 | 3 | 20 |
| Total Employment | 11 | 3 | 3 | 4 | 5 | 1 | 1 | 9 | 9 |
| Employment in Wholesale Sector | 13 | 12 | 13 | 13 | 12 | 10 | 10 | 6 | 12 |
| Exports in Real $ | 10 | 9 | 12 | 12 | 9 | 11 | 11 | 8 | 10 |
| Imports in Real $ | 2 | 10 | 10 | 10 | 10 | 12 | 12 | 12 | 8 |
| Total Capacity Utilization | 19 | 19 | 18 | 17 | 19 | 19 | 19 | 1 | 22 |
| Chained Inv. Sales Ratio (BEA) | 12 | 14 | 7 | 11 | 14 | 14 | 14 | 18 | 16 |
| Inv. Sales Ratio (Census) | 1 | 11 | 5 | 9 | 11 | 13 | 13 | 15 | 11 |
| Urban Gas Price in Real $ | 18 | 18 | 16 | 16 | 18 | 17 | 17 | 2 | 23 |
| Retail Sales in Real $ | 5 | 5 | 1 | 1 | 4 | 6 | 6 | 23 | 5 |
| Lagged Inland Waterway Trust Fund Tax/Gallon | NA | NA | NA | NA | NA | NA | NA | 10 | 17 |
| Grain Tonnage | NA | NA | NA | NA | NA | NA | NA | 13 | 18 |
| Coal + Grain Tonnage | NA | NA | NA | NA | NA | NA | NA | 7 | 14 |
| Coal Production (Tonnage) | NA | NA | NA | NA | NA | NA | NA | 5 | 15 |

* NA indicates correlations were not determined for rail or truck demand variables with these waterborne-freight-related independent variables, which were added later in the analysis.

SOURCE: Developed by the Research Team
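The ranking idea behind Table 3, in which factors are ordered by the absolute value of their correlation with a demand measure, can be sketched as follows. The function name, factor labels, and correlation values are hypothetical and are not figures from Appendix C; ties are not treated specially in this sketch.

```python
def rank_by_absolute_correlation(correlations):
    """Map each factor to its rank (1 = largest absolute correlation)."""
    ordered = sorted(correlations, key=lambda name: abs(correlations[name]),
                     reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}

# Hypothetical correlations with one freight demand measure
# (illustrative values only, not figures from Appendix C).
example = {
    "Factor A": 0.91,
    "Factor B": -0.95,
    "Factor C": 0.40,
    "Factor D": -0.12,
}

ranks = rank_by_absolute_correlation(example)
top_two = [name for name, rank in ranks.items() if rank <= 2]
```

Ranking on the absolute value means a strongly negative factor, such as an inventory/sales ratio, can outrank a moderately positive one.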

Correlations by mode are presented in a series of tables (see Appendix C) that take a more detailed look at the candidate variables that influence freight transportation demand. For most of the rail freight demand measures, influencing variables including income (real GDP), the Industrial Production Indices, total employment, and inventory/sales ratios all appear to show a significant relationship (see Table C-2). Less significant are housing starts and the Purchasing Managers’ Index. For the trucking mode (see Table C-3), measures of income (such as GDP), production (such as industrial production), trade (such as imports and exports), inventory/sales ratio, and retail sales all appear to correlate strongly. Less correlated variables include real gas prices, total housing starts, and the Purchasing Managers’ Index.

Water tonnage correlates best with total capacity utilization, a surveyed measure of freight production capacity that fluctuates with the business cycle and that has exhibited a long-term downward trend concurrent with our nation’s industrial production. Tons of freight via water correlate reasonably well with similar manufacturing measurements, although it will be shown later that these variables do not provide a clear enough explanation for the long-term decrease in the volume of this kind of freight transportation.

Sub-Period Correlation Analysis

As described above, this research investigated whether the independent variables might have a different relationship to freight demand by time period. Tables contained in Appendix C illustrate how well each of the independent variables correlates with each of the nine freight demand (dependent) variables during the three time periods. The tables summarize, for the entire time period as well as the three sub-periods, the relative rankings of these correlations and their statistical significance (based on the t-statistic).
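A t-statistic for a correlation coefficient is conventionally computed from the coefficient r and the number of observations n as t = r·sqrt(n − 2) / sqrt(1 − r²), with n − 2 degrees of freedom. The sketch below applies that standard formula; the function names are hypothetical, and the critical values are taken from a standard Student's t table as an assumption of this example rather than from the report.

```python
import math

def correlation_t_statistic(r, n):
    """t-statistic for testing whether a correlation differs from zero.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom.
    """
    if n < 3:
        raise ValueError("need at least 3 observations")
    if abs(r) >= 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

def is_significant(r, n, critical_value):
    """Two-sided test: significant when |t| exceeds the critical value."""
    return abs(correlation_t_statistic(r, n)) > critical_value

# Full sample: 28 yearly observations, so 26 degrees of freedom.
# 2.056 is the standard two-sided 5% critical value of Student's t for 26 df.
full_sample = is_significant(0.75, 28, critical_value=2.056)

# Six-year sub-period (1980 to 1985): 4 degrees of freedom, critical value 2.776.
short_period = is_significant(0.75, 6, critical_value=2.776)
```

The same correlation of 0.75 clears the bar comfortably with 28 observations but not with six, which is why short sub-periods can show sizeable correlations that are still not statistically significant.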
Candidate demand factors that have correlations of 0.75 and above (or -0.75 and below) are more likely to be statistically significant at the 95% level when their t-statistic (with 26 degrees of freedom via a two-sided test) is greater than 1.96 or less than -1.96.

For Rail Tonnage (Table C-4), real GDP, for example, has high correlation during all the time periods except 1980 to 1985, where correlations are negative but not statistically significant. Inventory/sales ratio is negatively correlated over the full sample period including the early 1980s, but less so in the later periods. While not statistically significant during 1980-1985 (possibly due to the small number of observations), real value of imports appears to be one of the more important influences. Similarly, the Industrial Production Index and personal consumption are well correlated during the latter half of the sample, while not showing strong correlation during 1980 to 1985.

For Rail Ton-Miles (Table C-5), between 1980 and 1985, when the rail industry was changing due to deregulation and mergers, very few candidate demand factors other than inventory/sales ratio and total capacity utilization were statistically significant. Imports were important, albeit with a correlation just shy of 0.75. As the railroad industry changed, production-related factors such as industrial production, imports, exports, and GDP showed an important influence. Most recently, consumption-related variables, including retail sales, are significant factors of influence. As imports and exports often travel long distances to get from or to seaports, trade has become an important explainer of railroad ton-miles.

For Rail Train-Miles (Table C-7), detailed data from the AAR were available only from 1990 onward. Retail sales have the highest correlation for the full sample period, and more influence during the second half of the period than during the first. Production-related demand factors such as industrial manufacturing are important for the full sample, but their importance declines during 1997 to 2007. Interestingly, urban gas prices have a significant negative correlation during 1990-1996, but then have an even more significant positive correlation during 1997 to 2007, which suggests that railroads’ efficient use of fuel had a positive effect on the domestic intermodal market, which grew during this period.

For Rail Car-Miles (Table C-8), many independent economic variables have a significant correlation with demand. These include overall economic health variables such as GDP per capita and real income, but also more specialized variables such as imports, exports, and retail sales. Interestingly, neither urban gas prices nor the Purchasing Managers’ Index shows much correlation.

Correlations for Truck Vehicle-Miles (Table C-9) suggest that trucking is truly a broad reflection of our economy. Total employment and real GDP, as well as trade- and consumption-related economic indicators, are all important. The time-period analysis reveals again that consumption and imports emerge in the latter period as more significant factors than manufacturing.

Sub-period correlations for Truck Ton-Miles (Table C-10) indicate that total employment, real GDP, and production and manufacturing are all important indicators. The analysis by time period reveals similar trends, with personal consumption emerging, and employment diminishing, as the most important factor in truck ton-miles.
Trade variables have less importance to Truck Ton-Miles than they do to rail-related measures because trade generates longer freight hauls to and from ports at one end of the country, and longer hauls tend to move by rail.

Main Results

Checking correlations by time period helps validate the relationship between candidate independent variables and measures of U.S. freight demand. This analysis indicates which variables are likely to explain the largest portion of recent and potential future variations in freight demand. It also confirms that during the earlier part of the 1980-2007 period studied, production-related economic factors such as industrial production and total capacity utilization were more important, while in the most recent sub-period, retail sales and consumption factors became relatively more important. Overall GDP and industrial production were important to total tonnage transported, while trade-related variables such as imports and exports seemed to affect mileage-based measures due to the longer hauls involved.

Differences between the truck and rail modes are also important and interesting. For truck, the important variables are general economic indicators such as real GDP or total employment, while for rail, imports play a significant role in explaining variation because of their much longer average length of haul.

As the economy shifts (for example, as agricultural exports contribute more to the U.S. balance of trade), additional independent variables could be tested to determine whether they correlate with national demand for freight transportation.

The Usefulness and Challenge of Predicting Demand

This section considers the potential of using the independent economic variables from preceding periods as “early warning indicators” to predict subsequent shifts in freight demand. This occurs regularly in the real world as businesses and consumers react to activities or use indices to predict future economic activity. While changes in economic variables or indices do not necessarily translate to subsequent changes in buying behavior, physical activity, or freight demand, sometimes they do.

The Potential of Independent Factors as “Early Warning Indicators”

Some of the independent variables may correlate more strongly with the freight demand measures in a subsequent time period than with the same measures in the same period. This is intuitively logical because planned investment and purchasing decisions may be postponed or eliminated based on unfolding events, and those decisions often tie into freight transportation activities. The situation is similar to what might happen in a household if one of its wage earners gains a promotion or is laid off: not much might change immediately, but in subsequent time periods the household may make additional purchases or cancel a vacation.
In our research, if an independent influencing variable measuring economic activity or perception during time t-1 has a greater effect on freight transportation activities in time t than it does on freight transportation activities during the corresponding time period, it may be an “Early Warning Indicator.” As part of Task 4, the research team tested whether correlations between the independent factors and the freight demand measures in subsequent time periods were greater than the correlations within the same period. All the independent influencing variables were examined to determine whether their “lagged” values, i.e., their values in years 2000-2006, correlated better with the respective measures of freight demand in years 2001-2007 than their same-year values did. If they did, a “yes” was indicated in Table 4. Because our measures of freight transportation were primarily yearly summaries, our analysis was confined to “lags” of one year only. “Better” correlation means a higher absolute value of the correlation coefficient between the independent variable and the lagged measure of freight transportation; the five highest absolute values of this correlation to the lagged dependent variable are highlighted. Several of the independent variables (such as total housing starts, Purchasing Managers’ Index, Trade-Weighted Foreign Exchange Index (broad trading partners), inventory/sales ratio, and real urban gas price) showed some indication that they were useful as “early warning indicators” in this respect (see Table 4).
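The lag test described above can be sketched in a few lines. This is a minimal illustration, not the research team's code: the function name `lag_correlation_check`, the synthetic series, and the choice to compute the same-period correlation over the same set of demand years as the lagged one are all assumptions made for the example.

```python
import numpy as np

def lag_correlation_check(x, y, lag=1):
    """Compare |corr(x[t-lag], y[t])| with |corr(x[t], y[t])|.

    Returns (same_period_r, lagged_r, is_early_warning).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Same-period correlation, restricted to periods that also have a lagged value
    same_r = np.corrcoef(x[lag:], y[lag:])[0, 1]
    # Lagged correlation: indicator in period t-lag paired with demand in period t
    lagged_r = np.corrcoef(x[:-lag], y[lag:])[0, 1]
    return same_r, lagged_r, bool(abs(lagged_r) > abs(same_r))

# Synthetic illustration only (hypothetical values, not data from the study):
# demand in each year is driven by the indicator's value in the prior year.
indicator = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
demand = np.empty_like(indicator)
demand[0] = 5.0
demand[1:] = 2.0 * indicator[:-1] + 1.0

same_r, lagged_r, early_warning = lag_correlation_check(indicator, demand)
print(f"same-period r = {same_r:.2f}, lagged r = {lagged_r:.2f}, flag = {early_warning}")
```

A “yes” in Table 4 corresponds to the flag being true for a given pair of independent variable and demand measure.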

Table 4 - Lag Correlations in Comparison with Prior Year

Columns are the demand measures; rows are the candidate independent variables.

| Candidate Independent Variables | Rail Tons | Rail Ton Miles | Rail Revenue Ton Miles | Rail Train Miles | Rail Car Miles | Truck Ton Miles | Truck Vehicle Miles | Water Tons | Water Ton Miles |
|---|---|---|---|---|---|---|---|---|---|
| Real GDP | yes | yes | yes | no | no | no | no | no | yes |
| Real GDP per Capita | yes | yes | yes | no | no | no | no | yes | yes |
| Real Personal Consumption | yes | yes | yes | no | no | no | no | no | no |
| Real Income Per Capita | yes | yes | yes | no | no | no | no | no | no |
| Total Housing Starts | yes | yes | yes | yes | yes | yes | yes | no | yes |
| Industrial Production Index | yes | no | no | no | no | no | no | yes | yes |
| Industrial Manufacturing Index | yes | no | no | no | no | no | no | yes | yes |
| Purchasing Managers' Index | yes | yes | yes | yes | yes | yes | yes | yes | yes |
| Trade Wt. Broad Cur. Index | yes | yes | yes | yes | yes | yes | yes | no | yes |
| Trade Wt. Major Cur. Index | no | no | no | yes | yes | no | no | no | no |
| Total Employment | yes | yes | yes | no | no | no | no | no | yes |
| Employment in Wholesale sector | yes | no | no | no | no | no | no | no | yes |
| Exports in Real $ | no | no | no | no | no | no | no | no | yes |
| Imports in Real $ | yes | no | no | no | no | no | no | yes | yes |
| Total Capacity Utilization | yes | no | no | yes | yes | yes | yes | no | no |
| Chained Inv. Sales Ratio (BEA) | yes | yes | yes | no | yes | no | no | yes | yes |
| Inv. Sales Ratio (Census) | no | no | no | yes | yes | no | no | no | yes |
| Urban Gas Price in Real $ | yes | yes | yes | no | no | yes | yes | yes | yes |
| Retail Sales in Real $ | yes | yes | yes | no | no | no | no | yes | yes |
| Lagged Inland Waterway Trust Fund Tax/Gallon | NA | NA | NA | NA | NA | NA | NA | yes | yes |
| Grain Tonnage | NA | NA | NA | NA | NA | NA | NA | yes | no |
| Coal + Grain Tonnage | NA | NA | NA | NA | NA | NA | NA | no | no |
| Coal Production (Tonnage) | NA | NA | NA | NA | NA | NA | NA | no | yes |

SOURCE: Developed by the Research Team

NA indicates correlations were not determined for rail or truck demand variables with these waterborne-freight-related independent variables, which were added later in the analysis. For these variables, the values for the appropriate time t-1 were included in the regression analyses along with the other independent variables from time t to be considered as potentially influential on freight demand in time t.
Several very good predictive models using both the current and lagged independent variables are discussed below and presented in Appendix D.

Regression Analysis Results

While the correlation analysis provided a simple series of measures of statistical relationships, it is difficult to develop a definitive idea of how various causal factors may influence freight demand based solely on a paired relationship. While the various high correlations between the independent economic factors, during both current and previous periods, indicated a relationship between them and freight demand, regression analysis was used to identify the various weights, or levels of respective importance, for each of these factors. The regression models were constructed based on prior knowledge of freight demand trends, the correlation between dependent variables and independent variables, and statistical fit diagnostics. The regression models tested included candidate demand factors that were believed to be important determinants of demand (through the foregoing statistical analysis as well as theoretical motivations). Regression (ordinary least squares or OLS) models that were developed to identify weights for candidate demand factors are presented in detail in Appendix D, Regression Analysis Results and Diagnostics.

Regression as a Method to Identify Relative Importance

Linear regression models are a powerful approach to evaluate the effect of multiple influential variables on a dependent measure. Such variables may account for various macroeconomic conditions influencing freight demand, as well as controlling for exogenous impacts or one-time events such as recessions, mergers, or legislation that might impact the demand for freight. From an econometrics perspective, regression methods should ideally be motivated by theory and guided by real-world logic in order to scientifically test the validity of theories and mechanisms that influence freight demand. Following this fundamental belief, the research team developed a theoretical basis for testing and improving the various regression models.

Considerations for Model Construction

In addition to checking if models meet fundamental assumptions such as normality, constant variance, and independence, among others, careful attention was paid to constructing models that can be interpreted with ease and satisfy linearity assumptions. When developing econometric models, the data are commonly transformed using various mathematical functions. Among such transformations is the natural logarithm, which is advantageous in terms of fitting a model as well as interpreting the model.
Taking a logarithm of a variable may convert multiplicative relationships into additive relationships. Furthermore, variables with high growth curves that may resemble that of an exponential function are flattened to a linear trend. Interpretation of such transformations is dependent on where the transformation occurs. The table below provides a brief overview.

Table 5 – Interpretation of models using natural logarithm transformations

| Model Type (Dependent-Independent) | Representation | Interpretation |
|---|---|---|
| Log-Actual | LN(Y) = B0 + B1 X + E | A one-unit increase in X is associated with a B1*100% change in the dependent Y |
| Actual-Log | Y = B0 + B1 LN(X) + E | A 100% increase in X is associated with a B1 unit change in the dependent Y |
| Log-Log | LN(Y) = B0 + B1 LN(X) + E | A 100% increase in X is associated with a 100*B1 % change in the dependent Y |
| Actual-Actual | Y = B0 + B1 X + E | A one-unit increase in X is associated with a B1 unit change in the dependent Y |

In the context of this research, when relationships are developed in log-actual, the model is assumed to fit the following form:

$$y_t = e^{c} \, X_t^{\beta} \, e^{S \cdot (\text{Shock})}$$

where
$y_t$ = freight demand at time t
$c$ = constant
$X_t$ = candidate demand factor at time t
$\beta$ = weight of the candidate demand factor
$S$ = exogenous shock to the system

The S term explicitly adjusts for events that would impact freight demand, but would not necessarily be represented in changes to the independent variables. Examples of these included in this study are NAFTA and deregulation of the surface freight industry. Transforming the above model by taking natural logarithms on the left and right sides of the equation produces a model which identifies elasticity of demand. For example, if a candidate demand factor increases by 10 percent while β = 1 then freight demand would be expected to increase by 10 percent as well. The transformed model is shown below.

$$\log(y_t) = c + \beta \log(X_t) + S \cdot (\text{shock period dummy}) + v_t$$

$$v_t = \rho \, v_{t-1} + \varepsilon_t$$

where
$v_t$ = error term that is correlated over time
$\varepsilon_t$ = corrected error term that is uncorrelated over time
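As a rough illustration of the transformed model above, the sketch below fits log(y) on a constant, log(X), and a shock-period dummy by ordinary least squares. It is a minimal example under stated assumptions: the function name `fit_log_log`, the synthetic series, and the parameter values (c = 0.5, β = 0.8, S = 0.04) are hypothetical, and the serially correlated error term is left out so that the fit recovers the parameters exactly.

```python
import numpy as np

def fit_log_log(y, x, shock):
    """OLS fit of log(y) = c + beta*log(x) + S*shock; returns (c, beta, S)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    shock = np.asarray(shock, dtype=float)
    # Design matrix: constant, log of the demand factor, shock-period dummy
    design = np.column_stack([np.ones_like(x), np.log(x), shock])
    coef, *_ = np.linalg.lstsq(design, np.log(y), rcond=None)
    return float(coef[0]), float(coef[1]), float(coef[2])

# Synthetic illustration only (hypothetical values, not data from the study)
factor = np.array([100.0, 110.0, 125.0, 130.0, 150.0, 170.0, 180.0, 200.0])
shock_dummy = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
demand = np.exp(0.5) * factor ** 0.8 * np.exp(0.04 * shock_dummy)

c, beta, s = fit_log_log(demand, factor, shock_dummy)
print(f"c = {c:.3f}, beta = {beta:.3f}, S = {s:.3f}")
```

Here the fitted beta is the elasticity: a 10 percent increase in the factor is associated with roughly a 10 × beta percent change in demand.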

Correlated Error Corrections

Another concern in regression analysis is to correct for correlation among error terms from different time periods, also known as serial correlation. Serial correlation violates a fundamental regression assumption that error terms are uncorrelated. This is a concern in time-series models such as this one that have naturally trending time series. The consequences of serial correlation are that OLS is no longer efficient among linear estimators, and that standard errors are underestimated, therefore inflating the significance of independent predictors. To solve this issue, the research team used a technique known as an autoregressive model (denoted AR in summary tables) to compensate for errors in the estimated coefficients. This technique iteratively adjusts model estimates in order to ensure that error terms are not correlated over time.

Model Selection

The concept of model selection is arguably where the science and art of econometrics meet. To ensure that this work was reasonably completed, models were constructed following fundamental knowledge of the transportation and freight market and vetted using common statistical standards. Such standards include testing for heteroscedasticity (Breusch-Pagan Test and White Test), independence (Durbin-Watson), and multicollinearity (Variance Inflation Factors). In addition, model fitness was compared using R-squared values as well as relative goodness of fit measures such as the Akaike Information Criterion and the Bayesian Information Criterion. While models may be statistically fit, the expected direction of independent variable coefficients was a defining criterion in ensuring the logical sense of the models.

Factor Weights

Table 6 below shows estimates for the separate impacts of different independent influencing variables on the various measures of freight demand.
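The serial correlation check and an iterative autoregressive correction can be sketched as follows. The report does not say which AR estimation procedure was used, so the Cochrane-Orcutt iteration below is a stand-in chosen for illustration, not the research team's method. All function names and the synthetic data (true slope 0.5, true ρ = 0.8) are assumptions.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares coefficients via a least-squares solve."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def durbin_watson(resid):
    """Durbin-Watson statistic; values near 2 suggest little first-order serial correlation."""
    resid = np.asarray(resid, dtype=float)
    return np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2)

def cochrane_orcutt(X, y, max_iter=50, tol=1e-8):
    """Iterative AR(1) correction (Cochrane-Orcutt); returns (beta, rho)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta, rho = ols(X, y), 0.0
    for _ in range(max_iter):
        err = y - X @ beta  # residuals on the original scale
        new_rho = (err[1:] @ err[:-1]) / (err[:-1] @ err[:-1])
        # Quasi-difference the data so the transformed errors are uncorrelated
        beta = ols(X[1:] - new_rho * X[:-1], y[1:] - new_rho * y[:-1])
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return beta, rho

# Synthetic illustration only: a trending regressor with AR(1) errors
rng = np.random.default_rng(0)
n = 500
x = np.linspace(1.0, 10.0, n)
eps = rng.normal(0.0, 0.1, n)
v = np.zeros(n)
for t in range(1, n):
    v[t] = 0.8 * v[t - 1] + eps[t]
y = 2.0 + 0.5 * x + v
X = np.column_stack([np.ones(n), x])

dw_before = durbin_watson(y - X @ ols(X, y))
beta, rho = cochrane_orcutt(X, y)
resid = y - X @ beta
dw_after = durbin_watson(resid[1:] - rho * resid[:-1])
print(f"DW before = {dw_before:.2f}, rho = {rho:.2f}, DW after = {dw_after:.2f}")
```

A Durbin-Watson value well below 2 before the correction, and close to 2 afterwards, is the pattern the correction is meant to produce.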
Note that the factor weights summarize the coefficient estimates for all of the models developed, so that variables repeatedly used to predict a given mode were averaged for ease of interpretation. When interpreting these results, it is necessary to keep in mind that these are log-log models, which, as discussed previously, represent percentage changes. For example, a 10% increase in real GDP is estimated to be associated with a 5.5% increase in rail tonnage, holding all other factors constant. Similarly, a 10% increase in retail sales (in real $) is estimated to accompany a 9.7% increase in rail ton-miles, but only a 2.6% increase in truck ton-miles. The table also includes the estimated impact of the exogenous factors of NAFTA and freight industry deregulation, which presumably had a significant one-time effect on the overall demand for freight transportation. Following the ratification of NAFTA, freight demand increased, everything else being equal, by 4%, while during the time period following deregulation (and the rail consolidation that followed) there was a 7% decrease in rail tons. That does not mean there is less rail tonnage now than during 1980, but it does indicate a temporary drop for 1980 to levels lower than expected.
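The percentage reading of a log-log weight can be checked with a short calculation. The sketch below assumes that a Table 6 weight of 5.51 corresponds to an elasticity of 0.551, that is, that the weights are stated per 10% change in the factor. That is an inference from the real GDP example above, not something the report states. The function names are hypothetical.

```python
import math

def approx_pct_change(elasticity, pct_change_in_factor):
    """Linear approximation: % change in demand = elasticity * % change in factor."""
    return elasticity * pct_change_in_factor

def exact_pct_change(elasticity, pct_change_in_factor):
    """Exact % change in demand implied by y = A * x**elasticity."""
    return (math.pow(1.0 + pct_change_in_factor / 100.0, elasticity) - 1.0) * 100.0

# Assumed elasticity for real GDP on rail tons (Table 6 weight 5.51 read as per-10%)
gdp_elasticity = 0.551
approx = approx_pct_change(gdp_elasticity, 10.0)
exact = exact_pct_change(gdp_elasticity, 10.0)
print(f"approximate: {approx:.2f}%, exact: {exact:.2f}%")
```

The linear approximation gives the 5.5% figure quoted in the text. The exact figure is slightly lower, because the elasticity is below 1 and the exact relationship is a power function, not a straight line.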

TABLE 6 - Potential Factor Weights (Log Actual Models) of Independent Variables (generated as a result of multiple regression models)

Column order: Candidate Demand Factors; Rail Tons; Rail Ton-Miles; Rail Train-Miles; Rail Car-Miles; Rail Rev Ton-Miles; Truck Ton-Miles; Truck VMT; Water Tons; Water Ton-Miles

Real GDP 5.51 11.20 5.73 6.56 10.61 - -
Real GDP per Capita 5.83 - - 7.90 - - -
Real Personal Consumption 6.75 - - 3.26 2.81 - -
Real Income per Capita - - - - 11.51 - -
Total Housing Starts - - 1.48 - - - -
Lagged Housing Total 1.01 - - 1.13 - - -
Industrial Production Index 8.37 9.64 5.61 - 8.58 - -
Industrial Manufacturing Index - - - 4.67 - 2.30 2.30
Purchasing Managers' Index 1.00 - - - 2.23 - - 2.53
Trade Wt. Broad Cur. Index -1.43 - 2.47 - - - -
Trade Wt. Major Cur. Index - - - - - - -
Total Employment - 13.11 - - 20.06 3.70 3.69
Employment in Wholesale Sector - - - - - - -
Total Trade in Real $ 3.15 3.84 2.53 1.39 5.00 0.99 0.99
Exports in Real $ 1.54 2.80 - - - 0.55 0.55
Imports in Real $ - 2.37 - - 2.89 - -
Total Capacity Utilization 6.70 - 8.72 - - - - 8.57 6.87
Chained Inv.-Sales Ratio (BEA) - - -7.88 - - -1.94 -1.97
Inv.-Sales Ratio (Census) -5.82 -4.97 - -4.84 -7.75 - -
Real Gas $ - - - - - -0.46 -0.46 -0.90 -1.55
Lagged Real Gas from B.L.S. - - 1.88 - 1.00 - -
Retail Sales in Real $ 6.58 9.69 7.84 9.51 - 2.61 2.60
Rail Ton-Miles - - - - - - - -5.88
Lagged Inland Waterway Trust Fund Tax/Gallon - - - - - - - -1.26
Grain Tonnage - - - - - - - 0.35
Coal + Grain Tonnage - - - - - - - 0.93

Exogenous Impact
NAFTA Impacts 0.39 - - - - 0.14 0.14
Lagged NAFTA Impacts - 0.53 0.47 0.41 0.67 - - 1.12
Deregulation Impacts -0.7 - - - - - - -0.14

SOURCE: Developed by the Research Team

Tables in Appendix D, Regression Analysis Results and Diagnostics, present the results of the regression analysis. The appendix contains 18 tables, two for each of the nine dependent measures of freight demand. The odd-numbered tables present the regression analysis results, while the even-numbered tables present the diagnostic values. For each dependent variable, the research team posited several different possible regression models involving different selections of independent variables. The selections were based on the foregoing analyses of correlations and on theoretical considerations about the likely relationship of variables to one another. The number of models developed for each dependent variable ranges from three (in the case of the waterborne freight-related variables) to seven (in the case of rail tons).

There are a variety of ways of determining how effective a regression analysis is at relating fluctuations in the value of a dependent variable to fluctuations in the value of one or more independent variables. The regression model's goal is to find the right factors to form an equation of the form

Dependent Variable = function(Constant + Ind.Variable1, Ind.Variable2, ..., Ind.Variablen)

so that the value calculated for the Dependent Variable from the function comes close to the actual values. These differences can be measured via the sum-of-squares difference between the set of predicted values and the actual values. The R² measures the proportion of the total Sum of Squares (SStot) accounted for by the explained Sum of Squares (SSreg). A fully explained, or perfect, regression would have an R² of 1.00, while a value of .70 would mean that “70% of the variance in the dependent variable (in this case, fluctuating values of freight transportation demand) was explained by the specified independent influencing variables.”

It should be noted that a reasonable regression model is one that is a “parsimonious” representation of theoretical relationships, and relies on a select set of independent variables that allow the model to meet fundamental assumptions. The frugality of the model’s use of influencing variables is the guiding rationale for developing multiple alternative models, each with only a few independent variables. Developing such a model can be as much an art as a science, and therefore several different models for each freight demand measurement, using a limited set of variables, are shown in Appendix D.

Table D-1 shows regression results for the models that predict rail tons. The regression models explain dependent measures of freight transportation volumes well, as seen by the very high R² values and low standard errors. Model 1 suggests that a 10% increase in the Industrial Production Index is associated with an increase in tons moved by rail of approximately 8.4%, while a 10% increase in the Trade Weighted (Broad) Currency Index is associated with a 1.4% decrease in rail tons.

Table D-3 shows regression results for models based on rail ton-miles. The regressions fit the past very well, as can be seen by the very high R² values and low standard errors. Model 1 suggests that a 10 percent increase in real GDP is associated with an increase in rail ton-miles of approximately 11.2 percent, while a 10 percent increase in industrial production in Model 2 is associated with a 9.6 percent increase in rail ton-miles.
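The R² measure described in this section, the explained sum of squares divided by the total sum of squares, can be computed directly. The sketch below uses made-up numbers and a hypothetical function name. It also checks that, for an OLS fit with an intercept, SSreg/SStot equals the more familiar 1 − SSres/SStot.

```python
import numpy as np

def r_squared(actual, predicted):
    """R-squared as explained sum of squares (SSreg) over total sum of squares (SStot)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    ss_reg = np.sum((predicted - actual.mean()) ** 2)
    return ss_reg / ss_tot

# Synthetic illustration only (hypothetical values, not data from the study)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
slope, intercept = np.polyfit(x, y, 1)  # simple OLS line with an intercept
fitted = intercept + slope * x

r2 = r_squared(y, fitted)
r2_from_residuals = 1.0 - np.sum((y - fitted) ** 2) / np.sum((y - y.mean()) ** 2)
print(f"R-squared = {r2:.4f}")
```

The two forms agree only when the predictions come from a least-squares fit that includes a constant term. For other kinds of predictions, the residual-based form is the safer one to report.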

Table D-5 shows regression results for five different ways to model rail train miles. Again, the regressions fit the past very well, as can be seen by the very high R² values and low standard errors. Model 1 suggests that a 10% increase in real GDP is associated with an increase in rail train-miles of approximately 5.7%, reflecting the concurrent improvements in railroad efficiency that have increased the revenue that carriers generate per train-start. Model 1 also suggests that post-1995 rail train-miles were 0.5% higher, possibly based on the one-time “shock” of NAFTA.

Table D-7 shows regression results for models that predict rail car-miles. As before, the regressions fit the past very well, as can be seen by the very high R² values and low standard errors. Model 1 suggests that a 10% increase in real GDP is associated with an increase in rail car-miles of approximately 6.6%, while a 10% increase in (lagged) housing starts is associated with a 1% increase in rail car-miles. Lagged NAFTA is a dummy variable that controls for years in which the free trade agreement came into effect with a one-year lag. Thus, the model suggests that car-miles were 0.4% higher during 1994 and 1995, possibly as a result of the free-trade agreement.

Table D-9 shows regression results for models based on rail revenue ton-miles, as provided by the AAR. The regressions fit the past very well. Model 1 suggests that a 10% increase in real GDP increases rail revenue ton-miles by approximately 11%. The (lagged) NAFTA effect suggests revenue ton-miles were 0.6% higher during 1994 and 1995, possibly as a result of the free-trade agreement.

Table D-11 shows regression results for models based on truck ton-miles. These regressions likewise fit very well. Model 1 suggests that a 10% increase in total trade increases truck ton-miles by approximately 1%, while a 10% increase in real gas prices is associated with a 0.5% decrease in truck ton-miles.
Table D-13 shows regression results for models based on truck vehicle-miles. The regressions are good. Model 1 suggests that a 10% increase in total trade is associated with an increase in truck vehicle-miles of approximately 1%, while a 10% increase in the (chained) inventory/sales ratio is associated with a 1.7% decrease in truck vehicle-miles. These models also use auto-regressive corrections to reduce the loss of efficiency in the time series estimators.

Table D-15 shows regression results for models based on domestic waterway tonnage as reported by the Army Corps of Engineers Institute of Water Resources. The regressions are moderately good, with R² between .66 and .76. Model 1 suggests that a 10% increase in total capacity utilization is associated with an increase in waterborne tonnage of approximately 9%, while a 10% increase in the amount of grain and coal produced is associated with a 0.9% increase in waterway tonnage. There is also a negative correlation between waterway tonnage and the real price of gas. Due to the presence of serial correlation, the waterborne models utilize the auto-regressive correction to adjust estimates for efficiency loss in estimation.

The construction of the waterway ton-mile freight demand models was challenging for a number of reasons. Unlike trucking and rail freight service, which handle a diverse geography and set of commodity groups, the carriage of freight via inland waterways is limited primarily to the Ohio-Tennessee-Missouri-Mississippi River system and the Atlantic and Gulf Intracoastal Waterways.

Moreover, only a limited number of heavy, low-value, high-volume commodities move via barge. Over two decades of general economic growth in the U.S., waterway transportation demand actually decreased. With the exception of a tax applied to fuel used on the inland waterways that increased from $.04/gallon to the current rate of $.24/gallon between 1981 and 1995, there is very little positive correlation between changes in economic activity and waterway ton-miles.

Table D-17 shows regression results for models based on domestic waterway ton-miles as reported by the Army Corps of Engineers Institute of Water Resources. The regressions are good even though they rely on a high negative correlation with rail ton-miles, an intuitively understandable and statistically significant complementary variable. During the past 15 years, the railroad industry has captured an increasing share of long-haul movements of low-value freight, affecting waterway ton-mileage significantly. This is partially because, as rail service has gotten better and as the waterway network has become less efficient due to lagging maintenance, a larger share of freight that traditionally moved via waterway has become more contestable and now moves by railroad. Relationships of waterway ton-miles with total capacity utilization and real gasoline prices are comparable to those affecting waterway tonnage.

Overall Findings

The steps performed prior to the regression analysis helped establish the validity of these models. Established economic measurements were selected based on diversity and availability over the 28-year time period. Correlations between the independent influencing variables and the dependent measures of freight transportation demand were tested for consistency over the 28-year time period. Collinearity among the independent variables was noted to help with the appropriate choice-making.
“Dummy” or shock variables were considered for major socio-economic events such as the transportation deregulation of the early 1980s and NAFTA during 1993 and 1994. Several variables were also tested to determine if their “lagged” values provided additional explanatory value. The final result has been encouraging. A variety of parsimonious regression models were prepared for each of the nine measures of freight demand. With the slight exception of Waterborne freight, which decreased over a long period even as the economy grew, the explanatory measurement R² was quite high for nearly all the models. With the fundamental structure in place, and regression models relying on relatively few, not-too-collinear independent variables, it appears that these regression models provide a strong basis for explaining freight demand volumes.
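The collinearity screening mentioned above, which the Model Selection discussion ties to Variance Inflation Factors, can be sketched as follows. This is an illustrative implementation on synthetic data: the function name and the three hypothetical columns are assumptions, and the report does not give the thresholds the research team applied.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column: 1 / (1 - R^2) from regressing it on the other columns."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = []
    for j in range(k):
        target = X[:, j]
        # Regress column j on a constant plus all remaining columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

# Synthetic illustration only: two nearly identical series and one unrelated series
rng = np.random.default_rng(2)
n = 200
a = rng.normal(size=n)
b = a + 0.05 * rng.normal(size=n)  # almost a copy of a
c = rng.normal(size=n)             # unrelated to a and b
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(np.round(vifs, 1))
```

A large VIF on the first two columns flags that they carry nearly the same information, which is the situation a parsimonious model tries to avoid.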

Principal Component Analysis

Both the literature review and the analysis confirm that one of the primary challenges of researching the multiple economic, demographic, and other factors that influence freight transportation demand is that they are often similar to one another. Imports grow with exports as trade grows. GDP and population grow with new household formation, and so on. The correlation table in Appendix C identifies the generally high correlations among the independent variables themselves, which is generally referred to as multicollinearity. The challenge of collinearity is one of choice, arising when a regression model contains multiple independent variables that are highly correlated with one another. This statistical condition may result in inflated standard errors, meaning that statistical significance is harder to achieve, and in signs on coefficients that are vastly different from expectations. Depending on how variables are selected, some might appear statistically significant while others may appear meaningless, but these results will have more to do with the construction of the model than with the quality of the data. The challenge is to choose independent variables based on both intuition and the ability to produce the best prediction of the dependent variable. One statistically rigorous approach to dealing with these multicollinearity challenges is principal component analysis (PCA). PCA has been applied most notably by the Chicago branch of the Federal Reserve Bank in calculating their National Activity Index [22]. The Chicago Fed National Activity Index (CFNAI) is based on a weighted vector of 85 monthly indicators of national economic activity, providing a single summary measure with insight into turning points and fluctuations in the business cycle.
The PCA procedure first groups the 85 indicators into four categories (production/income, employment, personal consumption & housing, and sales/orders/inventories) and then determines the appropriate weighting of the 85 individual indicators, by month, into a single index component that the Chicago Fed uses to model overall economic activity. PCA provides estimates of independent variable weightings so that the variation in the linear composite of these candidate demand factors is best utilized, i.e., the most appropriate weighted average of the independent influencing variables can be used instead of a chosen subset of the variables themselves. Thus, PCA reduces inefficiencies in the selection process, minimizes multicollinearity, and offers a more accurate predictor of the dependent variable. The PCA was useful for the trucking and rail analyses because of the multiple independent variables that showed, through their correlations, that they had predictive value. The virtues of the PCA’s improved statistical results were weighed against the drawback of losing the intuitive connection that allows one, with a simpler regression model, to state that “If A goes up by 10%, then B goes up by x%”.

[22] Background on the National Activity Index, Feb, 2010, Chicago Federal Reserve digital asset publications.

Steps in Applying PCA to Freight Demand

Since the main goal of PCA is to remove multicollinearity, the list of independent variables was divided into categories such that each category is fully represented in the final model, while factors within each category that are highly correlated could be combined. Groups of variables were developed that measured employment, consumption, production, commodity prices and foreign exchange. Lag values of important candidate demand factors were also included. For example, correlation between real GDP and real GDP per capita would be naturally high. Similarly, correlation between industrial manufacturing and industrial production would also be high since the former is a part of the latter. Detailed compositions of the groups and their correlations, as well as the weights that form the principal components, are included in Appendix E. Table E-1 shows the six categories or variable groups that were developed.

Extracted Principal Components and Weights

Tables E-2 through E-7 show the results of the PCA for each category. Each category has enough underlying data delineation to explain nearly all the variation. Principal component methods develop coefficients such that maximum variation in the candidate demand factors is captured. The resulting linear combinations are uncorrelated with each other, which is desired. Table E-2 suggests that the first two commodity group principal components explain approximately 96% of the total variation in the candidate demand factors (refer to right-most column, Cumulative Proportion). The first component explains 78% with the second component adding another 19% to the cumulative proportion. Table E-3 indicates that the first two consumption group principal components explain approximately 94% of the total variation in the candidate demand factors. Table E-4 indicates that the two foreign exchange group principal components explain nearly 94% of the total variation with the first component explaining nearly 70%. Table E-5 indicates that the two production group principal components explain over 99% of the total variation with the first component explaining the vast majority of the variation.
Table E-6 indicates that the first two purchasing manager index and capacity utilization group principal components explain approximately 96% of the total variation with the first component explaining approximately 66% of the variation. Table E-7 indicates that the first two employment group principal components explain over 99% of the variation. Tables E-8 through E-13 present the relative weighting of the factors comprising each principal component group.

Results

“Parsimonious” regression models, that is, models with fewer and more significant explanatory variables, were developed for the seven truck and rail freight demand variables. These models are shown with their accompanying diagnostics in Tables E-14 through E-27. These regression models illustrate the relative importance of the grouped variables (e.g., production, employment, etc.) in determining freight demand. All the PCA-based models explain at least

90% of the variance of each dependent variable (R² > 0.90). While interpretation of the model coefficients is less obvious than in the simpler regression models, the PCA allows the incorporation of a broader range of independent influencing variables.

Table E-14 provides the results for rail tons regressed on principal components and Table E-15 shows the associated diagnostics. The regressions accurately fit historical values, as suggested by the very high R² and low standard errors. Model 1 suggests that a 10% increase in the employment principal component is associated with an increase in rail tons of approximately 0.8%, while a 10% increase in the commodity principal component is associated with a 0.3% increase in rail tonnage.

Table E-16 provides the results for rail ton-miles regressed on principal components and Table E-17 shows the associated diagnostics. The regressions fit the past very well, as can be seen by the very high R² and low standard errors. Model 1 suggests that a 10% increase in the production principal component is associated with an increase in rail ton-miles of approximately 1%, while a 10% increase in the commodity principal component is associated with a 0.1% increase in rail ton-miles.

Table E-18 provides the results for annual rail revenue ton-miles (reported by the AAR) regressed on principal components and Table E-19 shows the associated diagnostics. The regressions fit the past very well, as can be seen by the very high R² and low standard errors. Model 1 suggests that a 10% increase in the production principal component is associated with an increase in rail revenue ton-miles of approximately 0.9%, while a 10% increase in the commodity principal component is associated with a 0.1% increase in rail revenue ton-miles. According to this model, the advent of NAFTA was associated with a 0.9% increase in overall rail revenue ton-miles.

Table E-20 provides the results for rail train-miles regressed on principal components and Table E-21 shows the associated diagnostics. The regressions fit the past very well, as can be seen by the very high R² and low standard errors. Model 1 suggests that a 10% increase in the employment principal component is associated with a 1.1% increase in rail train-miles, while a 10% increase in the commodity principal component is associated with a 0.1% increase in rail train-miles.

Table E-22 provides the results for rail car-miles regressed on principal components and Table E-23 shows the associated diagnostics. The regressions fit the past very well, as can be seen by the very high R² and low standard errors. Model 1 suggests that a 10% increase in the production principal component is associated with an increase in rail car-miles of approximately 0.8%, while a 10% increase in the purchasing manager and capacity utilization component is associated with a 0.1% increase in rail car-miles.

Table E-24 provides the results for truck ton-miles regressed on principal components and Table E-25 shows the associated diagnostics. The regressions fit the past very well, as can be seen by the very high R² and low standard errors. Model 1 suggests that a 10% increase in total trade is associated with an increase in truck ton-miles of approximately 0.2%. These models have naturally trending time series with error terms that correlate over time. By using auto-regressive corrections AR(1) and AR(2), these correlated error terms are controlled for and their potential to

bias model estimates and associated standard errors is reduced. Compared to other models, the apparent impact of NAFTA is far greater at 9.7%.

Table E-26 provides the results for truck VMT regressed on principal components and Table E-27 shows the associated diagnostics. The regressions fit the past very well. Model 1 suggests that a 10% increase in the production principal component is associated with an increase in truck VMT of approximately 0.2%.

Overall Findings

PCA provides some limited benefits above and beyond well-constructed, parsimonious regression models. In this research, it did not necessarily produce better results. For the sake of a few additional percentage points of explanatory power over the Ordinary Least Squares models, many new collinear variables are introduced via a complex method that makes it difficult to explain the connection between an independent influencing variable and the resultant change in freight demand. The method weakens the intuitive connection of “if A does this, then it makes sense that B does that,” which is a fundamental benefit of a simple regression model.

With the challenge of finding good predictive variables that have an intuitive connection to the decrease in waterborne freight, the Research Team decided that PCA for Water Tons and Water Ton-Miles should not be performed. While the analysis could find a combination of factors that would predict the decrease in waterborne freight, as was shown by the negative correlations with many economic variables in Table C-1 in the Correlation Appendix, these connections are not intuitive and hence are more difficult to defend.

The quality of the regression models and the opportunity to improve the analysis by using a greater number of correlated independent variables lead to the conclusion that the PCA offers benefits. The PCA results, which use multiple independent variables, provide a more helpful explanatory model of truck and rail demand, at the expense of less transparent relationships.
This may be a tradeoff worth making if basic correlations and regression relationships are apparent and improved predictability is desirable. Such a result was not pursued for the waterborne traffic measures because the quality and diversity of correlations with independent variables were not sufficient. Because of the lack of correlation with a sufficient set of independent economic variables, PCA for Water Tonnage and Water Ton-Miles was not attempted.
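To make the transparency tradeoff concrete, the following is a minimal sketch of regressing a freight measure on a principal component. It assumes three hypothetical, highly collinear indicators and synthetic data rather than the study's variables, and it extracts only the first component by power iteration. The report's models use several components together with NAFTA and autoregressive terms, so this is an illustration and not the Research Team's procedure.

```python
import math
import random


def standardize(values):
    """Rescale a series to zero mean and unit sample standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def first_principal_component(columns, iterations=500):
    """Return (loadings, scores) of the first principal component.

    Uses power iteration on the correlation matrix of the input columns.
    """
    z = [standardize(c) for c in columns]
    k, n = len(z), len(z[0])
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / (n - 1)
             for j in range(k)] for i in range(k)]
    w = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(w[j] * z[j][t] for j in range(k)) for t in range(n)]
    return w, scores


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(y, fitted):
    """Share of the variance of y explained by the fitted values."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst


# Synthetic data: 28 annual periods, three indicators sharing one trend.
rng = random.Random(42)
trend = [100 + 3 * t for t in range(28)]
indicators = [[scale * v + rng.gauss(0, scale * 3) for v in trend]
              for scale in (1.0, 0.5, 2.0)]
freight = [50 + 2 * v + rng.gauss(0, 5) for v in trend]

loadings, scores = first_principal_component(indicators)
intercept, slope = fit_line(scores, freight)
fitted = [intercept + slope * s for s in scores]
fit_r2 = r_squared(freight, fitted)
print("loadings:", [round(w, 3) for w in loadings])
print("R-squared:", round(fit_r2, 3))
```

Because every indicator loads on the component with a similar weight, the fitted slope describes the response to a blend of all three inputs, which is the loss of the "if A, then B" reading discussed above.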

Reliability and Representative Tests

In order to evaluate the reliability of the models and their ability to predict actual freight demand, actual data were compared against the model predictions for past periods, a technique known as backcasting. Whereas forecasting is a prediction of future levels of freight demand, backcasting looks at how well the selected model can predict historical values and informs researchers about the actual fitness of the model, assuming that underlying economic relationships do not change drastically.

Base models, or models utilizing the full dataset for the 1980-2007 historical period, were selected at random from the pool of PCA models presented previously. Based on the model specification, the research team estimated a second set of models on a revised sample of the historical data that randomly excluded three years of data. This ‘sample model’ is used to test the assumption that the trends estimated in the main body of this research are not strongly influenced by any single data point or outlying influence. When three of the 28 periods of data are randomly omitted, a model without high-leverage observations should show nearly the same growth and levels as the base model. Please note that this testing is generalizable only to US national-level indicators, as the relationships between predictors and freight demand in different regions within the US or in different nations will likely be correlated at dissimilar levels.

If the sample models retain their predictive power, as demonstrated by a backcast that is similar if not identical to the full-sample backcast, then we can assume that the models are well specified and robust for estimating freight demand relationships. Extending the rationale of the backcast check, if underlying economic relationships and rates of change remain constant in future years, then the model would also be robust and appropriate for forecasting.
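The base-versus-sample comparison can be sketched as follows. This is an illustration only: it assumes a one-predictor least squares line and synthetic annual data in place of the report's PCA specifications, and the omitted years are the three that the report lists for truck ton-miles. All function and variable names are hypothetical.

```python
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def backcast(params, x):
    """Predict every historical period from a fitted (intercept, slope)."""
    intercept, slope = params
    return [intercept + slope * xi for xi in x]


# Synthetic stand-ins for a predictor and a freight demand series, 1980-2007.
years = list(range(1980, 2008))
rng = random.Random(7)
predictor = [100 + 3 * i + rng.uniform(-2, 2) for i in range(len(years))]
demand = [50 + 2 * p + rng.uniform(-5, 5) for p in predictor]

# Base model: all 28 periods. Sample model: three years left out.
omitted = {1984, 1997, 2002}
kept = [i for i, year in enumerate(years) if year not in omitted]

base_model = fit_line(predictor, demand)
sample_model = fit_line([predictor[i] for i in kept],
                        [demand[i] for i in kept])

# Both models backcast the full history, including the omitted years.
base_backcast = backcast(base_model, predictor)
sample_backcast = backcast(sample_model, predictor)

max_gap = max(abs(b - s) / abs(b)
              for b, s in zip(base_backcast, sample_backcast))
print(f"largest base-vs-sample gap: {max_gap:.2%}")
```

A small gap between the two backcasts suggests that no omitted year has high leverage on the fit; a large gap would point to an influential observation.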
Three models (rail revenue ton-miles, truck VMT, and truck ton-miles) were used to demonstrate the performance of the models developed in this research. Due to the generally stronger fit of models that include principal component predictors, these three examples use specifications with principal component independent variables.

To evaluate the forecasting accuracy, we use the Mean Absolute Percentage Error (MAPE). It is defined as:

$$M = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{A_t - F_t}{A_t} \right|$$

where $A_t$ is the actual value and $F_t$ is the forecast value.

For each of the following three backcast examples, three trends are graphed: the base model estimates, the sampled model estimates, and the actual freight demand indicator. When interpreting the results, the research team graphically examined the base and sampled estimates for any deviations. Significant deviations indicate that the base model is not robust for forecasting and early warning detection, whereas the absence of deviations indicates that the base model is robust for the study’s purposes.
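The MAPE defined above can be computed directly. The sketch below is a plain implementation of that formula on made-up numbers; it returns a fraction, so the percentages quoted with the figures would correspond to this value multiplied by 100. The function name and the sample values are assumptions for illustration.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, returned as a fraction.

    actual and forecast are equal-length sequences; actual values must be
    non-zero because each error is divided by the actual value.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equal in length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Made-up values: three periods are off by 2% and one is exact.
actual = [400.0, 420.0, 410.0, 380.0]
forecast = [392.0, 428.4, 410.0, 387.6]
error = mape(actual, forecast)
print(f"MAPE = {error:.2%}")
```

Because the errors are scaled by the actual values, the measure can be compared across series with very different magnitudes, such as ton-miles and VMT.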

Rail revenue ton-miles were estimated using the following function:

Rail revenue ton-miles = f(PCA commodity component, PCA production component, PCA consumption component, NAFTA indicator, autoregressive correction)

Data were available at the quarterly level between 1990 and 2009 Q3, which meant that the specified model could be estimated on 77 periods. Due to the larger data sample, seven randomly selected quarters of data were omitted from the analysis. The backcast predictions for the sampled data appeared to be statistically and graphically reasonable and close to the base data set estimates (see Figure 9). The MAPE is 1.82%, indicating a tight fit between the predicted freight demand data and the actual freight demand. Furthermore, there are no significant deviations between the base and sample estimates, indicating that the base model is robustly specified. When the model was tested against the missing data, the predictions closely followed the actual recessionary trend, which suggests that the underlying PCA variables are dependable for modeling exercises.

[Figure 9. Backcast versus Actual Rail Revenue Ton-Miles. Vertical axis: rail revenue ton-miles; horizontal axis: years, 1990 to 2008. Series plotted: Base (MAPE = 1.82), Sample (MAPE = 1.89), Actual.]

Truck ton-miles and truck VMT were also tested for reliability following a different backcast methodology. In order to randomly sample the data, a random number generator was used to draw years from the range 1980 to 2007. For truck ton-miles, the years 1984, 1997, and 2002 were omitted. For truck VMT, the years 1984, 1991, and 2004 were omitted.
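A reproducible version of such a draw might look like the sketch below. The function name and the seed are hypothetical, and the report does not say which generator was used; sampling without replacement is assumed so that no year is drawn twice.

```python
import random


def draw_omitted_periods(periods, count, seed):
    """Pick `count` distinct periods to leave out, reproducibly for a given seed."""
    rng = random.Random(seed)
    return sorted(rng.sample(periods, count))


years = list(range(1980, 2008))  # the 28 annual periods, 1980 through 2007
omitted_years = draw_omitted_periods(years, 3, seed=2012)  # seed is arbitrary
estimation_years = [y for y in years if y not in omitted_years]

print("omitted:", omitted_years)
print("periods left for estimation:", len(estimation_years))
```

Fixing the seed lets the same omitted years be regenerated later, which makes the sample model auditable.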

Truck ton-miles, using the PCA model, were estimated via the following function:

Truck ton-miles = f(PCA commodity component, PCA consumption component, NAFTA indicator, autoregressive corrections for one and two periods)

Figure 10 below plots the observed truck ton-miles as well as the estimates of the base model and the randomly sampled model. The randomly sampled model generates predictions that are very similar to those of the base model. The base model MAPE is 2.35%, which rises to 2.57% upon removal of three randomly selected data points. Thus, the truck ton-miles model is reasonable in its predictive ability.

[Figure 10. Backcast versus Actual Truck Ton-Miles. Vertical axis: truck ton-miles; horizontal axis: years, 1980 to 2005. Series plotted: Base (MAPE = 2.35), Sample (MAPE = 2.57), Actual.]

The truck VMT PCA model uses the following function:

Truck VMT = f(PCA commodity component, PCA consumption component, NAFTA indicator, autoregressive corrections for one and two periods)

Figure 11 below plots the observed truck VMT as well as the estimates of the base model and the randomly sampled model. The base model MAPE is 2.36%, which rises marginally to 2.42% upon removal of three randomly selected data points. Thus, the truck VMT model is reasonable in its ability to predict historical values and is not adversely or strongly influenced by specific years of data.

[Figure 11. Backcast versus Actual Truck VMT. Vertical axis: truck VMT; horizontal axis: year, 1980 to 2005. Series plotted: Base (MAPE = 2.36), Sample (MAPE = 2.42), Actual.]

These three backcast examples demonstrate that the specified models are robust and able to provide consistently reliable and reasonably accurate predictions of freight demand factors. The tests check for strength and consistency and are particularly important in understanding whether the regression models tested represent real relationships. Overall, the values generated by both the base and sample models are, on average, no more than 2.6% off from the actual values.

While this analysis has focused almost exclusively on annual data, it should be remembered that models based on quarterly data may offer reasonably robust predictions as well, subject to the availability of the relevant data. For example, the quarterly rail revenue ton-miles backcast demonstrated above overcomes seasonality in both the base and sample models and produces an accurate trend prediction. The annual analyses were successful in predicting the direction of trends, but do not offer the potential for more detailed analysis of shorter-term influences.


TRB’s National Cooperative Freight Research Program (NCFRP) Web-Only Document 4: Identification and Evaluation of Freight Demand Factors focuses on the identification of independent variables that may be used to explain gross measures of freight demand over time.
