Chapter 4 - Preconstruction Services Estimating Process and Models | Estimating Highway Preconstruction Services Costs - Volume 2: Research Report



Suggested Citation: "Chapter 4 - Preconstruction Services Estimating Process and Models." National Academies of Sciences, Engineering, and Medicine. 2016. Estimating Highway Preconstruction Services Costs - Volume 2: Research Report. Washington, DC: The National Academies Press. doi: 10.17226/23627.



CHAPTER 4

4.1 Introduction

This section presents the methodology followed to develop the data-driven PCS estimating models as described earlier (see Section 1.5, Task 4). Preconstruction service (PCS) activities typically span a long period (sometimes more than a decade), from planning to programming to preliminary design to final design. As the project moves into downstream PCS activities, more information about it becomes available, and consequently more accurate PCS cost estimating is possible. The accuracy of any estimate is directly related to the amount of information available about the project. As a result, a PCS cost-estimating process should be aligned with the typical project development process to reflect the maturity level of project definition.

A distinctive feature of the PCS cost-estimating process compared to construction cost estimating is that, as a project continues to be defined, it is broken down into different engineering functions, and each functional department takes charge of completing the functional analysis and engineering requirements for the project, as shown in Figure 4.1.

4.2 PCS Cost-Estimating Process

The PCS cost-estimating process during the project development stages is depicted in Figure 4.2. Very limited information about the project at the earliest stages of project development, such as planning and programming, makes it difficult to estimate PCS costs. However, there is a need to establish the probable and approximate PCS cost of the project for budgeting and funding authorization purposes. This estimated PCS cost can also be used as a baseline for monitoring and tracking the performance of PCS costs during the remaining PCS activities. Due to the low maturity of project definition at the early project development stage, only a parametric estimating approach, which is a common early and conceptual estimating method, is applicable.
In parametric estimating, project characteristics such as project type, project location, project length, and project complexity are used as the major predictive parameters to estimate the anticipated cost of PCS activities. Thus, this PCS estimating is called top-down estimating.

When the project moves into the preliminary design and final design stages, the overall project scope gets defined more accurately, and it becomes clear which functional engineering departments should be involved. For example, a right-of-way department may play a significant role in a new roadway construction project because new parcels need to be purchased from property owners, but the same department may have no role in a typical bridge rehabilitation project. When a project is determined to require a specific functional department's engineering service, the functional department needs to estimate the anticipated PCS work-effort hours or costs required to deliver the service. This estimate needs to be as accurate as possible for internal resource management purposes and for determining the consulting costs if the department decides to outsource the service, which is becoming a more popular option as many transportation agencies operate with fewer staff members.

With a better-defined scope of work and experience from similar previous projects, functional departments typically know what specific work tasks need to be completed for a given project. Some transportation agencies, such as Georgia DOT, Florida DOT, and Ohio DOT, have a well-defined work breakdown structure (WBS) for different engineering functions. As a result, PCS cost estimating at the functional department level can be performed as WBS-based estimating, and this estimating is called functional-level estimating in this guidebook.
The summation of all of the functional-level PCS cost estimates for a given project is the total PCS cost of the project, and this aggregation process can be called bottom-up estimating. The bottom-up estimating result needs to be cross-validated against the top-down estimating result as part of PCS cost monitoring and control.

[Figure 4.1. Examples of PCS activities throughout project development process. Stages: Planning, Programming, Preliminary Design, Final Design. Example activities: Need Assessment, Public Involvement, Hydraulic Study, Right-of-way Acquisition, Environmental Clearance, Economic Feasibility, PS&E, Roadway Design, Bridge Design, Underground Utilities, Surveying, Schematic Development, Traffic Control Plans, Geotechnical Investigation. Note: PS&E = plans, specifications, and estimates.]

[Figure 4.2. PCS cost-estimating process. Elements shown: Transportation Need; Planning; Programming and Preliminary Design; Final Design; Advertise and Award; Construction; General Project Data; PIN Assigned; Detailed Project Data; Final Project Data; Top-Down Estimating Model; Top-Down Estimate; Functional-Level Estimating Models; Bottom-Up/Functional-Level Estimate; Standard PCS Estimating Factors; Updated Historical Data; PCS Cost Monitoring and Control; Corrective Actions (as needed); Lessons Learned from Previous Projects.]
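The bottom-up aggregation and its cross-check against the top-down estimate can be sketched in a few lines of Python. This is only an illustrative sketch: the department names, dollar amounts, and the 15% review tolerance are all assumptions made for the example and do not come from the report.

```python
# Hypothetical functional-level estimates (dollars); illustrative values only.
functional_estimates = {
    "roadway_design": 120_000.0,
    "bridge_design": 80_000.0,
    "environmental": 45_000.0,
    "right_of_way": 30_000.0,
}


def bottom_up_total(estimates):
    """Sum the functional-level estimates into a project-level PCS cost."""
    return sum(estimates.values())


def cross_validate(top_down, bottom_up, tolerance=0.15):
    """Return (relative deviation, needs_review).

    The deviation is measured relative to the top-down estimate.
    The 15% tolerance is an assumed threshold, not one given in the report.
    """
    deviation = (bottom_up - top_down) / top_down
    return deviation, abs(deviation) > tolerance


top_down_estimate = 250_000.0  # hypothetical parametric (top-down) estimate
total = bottom_up_total(functional_estimates)
deviation, needs_review = cross_validate(top_down_estimate, total)
print(f"bottom-up total: {total:,.0f}")
print(f"deviation from top-down: {deviation:+.1%}")
print("document and explain the difference" if needs_review else "within tolerance")
```

An agency would replace the fixed tolerance with whatever threshold its monitoring and control procedure defines.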

The cost differences need to be documented and explained. For example, scope changes during the project development process would significantly affect the total PCS costs from the bottom-up estimating approach, resulting in a significant deviation from the top-down PCS cost estimate. Proper documentation and implementation of a feedback loop in the PCS cost-estimating process will also assist in developing a more accurate top-down PCS cost estimate for future projects by allowing the calibration of the top-down estimating method.

4.3 Overview of PCS Cost-Estimating Model Development

Creating a framework for developing a data-driven PCS cost-estimating model requires the integration of a number of steps (as shown in Figure 4.3). This cyclic process allows transportation agencies to make continuous improvements in their models, and it has been applied to develop the data-driven PCS cost-estimating models provided in the guidebook.

• Step I: Requirements Analysis. This step determines potential model usage and users as well as anticipated data requirements. This is a very important part of the process since it determines what approach [top-down or bottom-up (functional level)] should be used, the type of historical data required to build models, and potential data sources.

• Step II: Collect Historical Data. In this step, various databases are identified, studied, and compiled into one master database. The master database should be developed in accordance with the PCS cost-estimating needs determined during the requirements analysis stage. At this point, model developers should also determine if current historical databases and preconstruction data-collection procedures meet the minimum expectations of quality, quantity, and reliability required by the PCS cost-estimating approach to be performed. This guidebook describes some strategies to either improve existing databases or create new and more appropriate data sets.
Both top-down and functional-level PCS cost-estimating methods depend on historical data collected from previous projects. However, the level of detail of the databases and their configuration depend on the PCS cost-estimating approach and the final users of the estimating models. Thus, it is suggested that transportation agencies customize their databases in accordance with the needs of different users. For example, a division director may require access to historical data with general project characteristics (e.g., project location, type of work, expected overall complexity) in order to make strategic decisions using an early top-down estimating model. On the other hand, the geotechnical engineering department may need other types of information, such as regional geology, subsurface soil conditions, and the required efforts and costs for a series of laboratory tests, in order to estimate the cost of preconstruction geotechnical activities.

• Step III: Identify Factors Affecting PCS Costs. Once a master database is established, the next step is to identify significant factors to estimate PCS costs. The selection of these variables may be done with experience and engineering judgment, but a structured statistical process (discussed in Chapter 3 of the guidebook) also greatly helps in narrowing down influential variables, because it typically better defines the relationships among input variables and between input variables and PCS costs.

• Step IV: Develop/Update PCS Database. This step involves the development of a suitable PCS database of the significant factors identified in the previous step and historical PCS data, for the subsequent development and implementation of data-driven PCS cost-estimating models. Along with the quality and reliability of the input factors used to produce PCS cost estimates, the amount of available historical data may be a decisive factor in meeting the desired level of accuracy.
Chapter 3 of the guidebook presents a detailed description of the procedure associated with the development and optimization of a PCS database. Some of the strategies in this chapter are intended to minimize data management efforts while still producing reliable PCS cost estimates.

[Figure 4.3. PCS cost-estimating model overview. Cycle of steps: I. Requirements Analysis; II. Collect Historical Data; III. Identify Factors Affecting PCS Costs; IV. Develop/Update PCS Database; V. Develop Model; VI. Validate and Implement Model.]

• Step V: Develop Model. In this step, top-down and/or bottom-up/functional-level estimating models are developed using the available historical data and preselected input variables. This involves a combination of qualitative and quantitative procedures. The qualitative part comes from the experience and judgment of model developers and users, who must make adequate use of the historical data and appropriately read, understand, and use the outcomes of the models. The quantitative part is the use of mathematical and statistical tools to process the available historical data into reliable PCS cost estimates. The guidebook describes four major quantitative tools: multiple regression, decision tree, and artificial neural network used in top-down estimating (Chapter 4 of the guidebook), and the three-point estimation approach for functional-level estimating (Chapter 5 of the guidebook).

• Step VI: Validate and Implement Model. This last step consists of two parts: the validation of the models to ensure satisfactory performance, and the implementation of the validated models. The models that are developed are tested for their performance, and only the models that meet the expectations of the agency can be implemented. Once the performance of a PCS cost-estimating model is determined to be satisfactory, it is ready for implementation in actual upcoming projects.
An efficient implementation of PCS cost-estimating models involves an appropriate interpretation of the model outputs and their incorporation into decision-making procedures, a reliable system to track the performance of PCS cost estimates and expenses throughout the project development process, and a mechanism to capture and assess lessons learned from previous projects to enhance the performance of PCS cost-estimating practices. Chapter 6 of the guidebook presents specific implementation practices and generic systems to track the performance of PCS costs and capture lessons learned. The implementation and monitoring methodologies in Chapter 6 of the guidebook are equally applicable to top-down and bottom-up/functional-level estimating approaches.

4.4 Requirements Analysis

This step determines potential model usage and users as well as anticipated data requirements. This is an important part of the process as it defines what approach [top-down or bottom-up (functional level)] should be used, the type of historical data required to build models, and potential data sources.

4.5 Collect Historical Data

Within the construction industry, it is commonly accepted that collecting and archiving data on past project estimates and actual costs is a successful way to improve future estimates. The same principle applies to PCS cost estimating. Using specific project information and the corresponding actual PCS costs and/or work hours from previous in-house projects and consultant contracts creates a knowledge base that is valuable in creating more accurate future estimates.

Today, highway agencies collect PCS data along with associated project costs and store them in various data management systems as part of their inventory or accounting systems. In a typical agency, these data management systems or data inventories can range from in-house spreadsheets to commercially available programs, and are developed through manual data collection during the preconstruction phase.
Other pieces of information that may be relevant to the estimation of PCS costs might be obtained from less-formal data sources, such as paper-based and electronic documents not arranged in a database fashion. All possible data sources must be considered at this early stage of implementing data-driven PCS cost-estimating techniques before proceeding with the identification of project characteristics (hereinafter referred to as "factors" or "input variables") affecting PCS costs.

4.6 Identify Factors Affecting PCS Costs

The identification of factors that affect PCS costs is an important task when developing PCS cost-estimation models. Factors are distinctive characteristics of a project, for instance its length or level of complexity. The successful identification of factors that have a direct influence on the total PCS cost allows for the development of an efficient PCS database. Table 4.1 presents a variety of representative factors that were identified for highway projects based on existing literature, conversations with state DOT personnel, and review of project management documents generated at preconstruction stages.

The values of the factors can be numerical, Boolean, or nominal. Numerical values are numbers such as length and number of bridges involved. Boolean variables can only have two values, generally yes or no. Nominal variables are categorical, where values are grouped quantitatively or qualitatively. For example, terrain type can be categorized as level, rolling, or mountainous.

It should be noted that some factors presented in Table 4.1 are alternatives to each other. For example, the number of lanes and lane width describe the same feature of the roadway (its width). As such, only one of the two factors may be necessary.

The factors listed in Table 4.1 and their values are only a small example. Transportation agencies have developed their own values for various factors. For example, project type classification for the Iowa DOT may vary from that of the Montana DOT. Each DOT can use its own classification system and its associated values. Some factors presented in the table can indicate the level of work involvement for multiple activities. For instance, project length can be an indicator for the level of surveying required, the expected number of plan sheets, efforts for right-of-way acquisition, and so forth. Currently, not all of these factors are collected in a structured format. As such, a limited number of available factors were used to illustrate the process of developing a PCS cost-estimation model in this report.

Table 4.1. Potential factors affecting PCS costs.

| Category of Factors | Factors | Description | Variable Type |
|---|---|---|---|
| Project information | Project type | Replacement, interchange, new construction, reconstruction, rehabilitation, widening, and reconstruction | Nominal |
| | Pavement type | Asphalt/cement | Nominal |
| | Highway classification | Freeway, principal arterial, collector, etc. | Nominal |
| | Overall project complexity | Low, medium, high | Nominal |
| | Project location | Urban/rural | Boolean |
| | ROW acquisition required | Yes/no | Boolean |
| | Construction costs | Cost in dollars | Numerical |
| Geometry, topography, and geology | Length | Length in miles | Numerical |
| | Number of lanes | 2, 4, 6, etc. | Numerical |
| | Roadway width | Width in feet | Numerical |
| | Divided roadway | Yes/no | Boolean |
| | Terrain type | Level, rolling, mountainous | Nominal |
| | Special geotechnical consideration required | Yes/no | Boolean |
| | Typical section | Open section, curb and gutter, combination | Nominal |
| Surveys | Topographic survey | Level of details required | Nominal |
| | Pavement elevation survey | Yes/no | Boolean |
| | Hydraulic survey | Yes/no | Boolean |
| | Utility surveys | Yes/no | Boolean |
| | Traffic survey | Yes/no | Boolean |
| | Stream crossing | Yes/no | Boolean |
| | Traffic noise impact analysis | Yes/no | Boolean |
| Design complexity | Horizontal alignment change | Yes/no | Boolean |
| | Vertical alignment change | Yes/no | Boolean |
| | Roadway crossing/intersection | Yes/no | Boolean |
| | Railroad crossing | Yes/no | Boolean |
| | Stream crossing | Yes/no | Boolean |
| | Sidewalk | Addition, improvement, or none | Nominal |
| | Type of sidewalk/shoulder | None, sod, aggregate, bituminous, concrete | Nominal |
| | Standard design exception | Yes/no | Boolean |
| | Number of plan sheets | Expected number of plan sheets | Numerical |
| | Level of service | A, B, C, D, E | Nominal |
| | Context-sensitive design | Yes/no | Boolean |
| Structural design | Predominant type of bridges/culverts | Reinforced concrete, steel, etc. | Nominal |
| | Number of bridges/culverts | Number of bridges/culverts | Numerical |
| | Bridge sufficiency rating | 0-100 | Numerical |
| | Bridge width | Width in feet | Numerical |
| Environmental factors | NEPA classification | Categorical exclusion (CatEx), environmental assessment (EA), environmental impact analysis (EIA) | Nominal |
| | Biological resources report/assessment | As required | Nominal |
| Traffic control | Work zone safety and mobility level | Basic/intermediate/major | Nominal |
| | AADT | Average annual daily traffic | Numerical |
| | Staging of construction | Yes/no | Nominal |
| | Crash severity | Crash severity rating | Numerical |
| | Access control | None, partial, full | Nominal |

Case studies conducted on nine DOTs led to the identification of a set of factors that could maximize the performance of PCS cost-estimating models. Even though the PCS cost-estimating modeling tools described in the guidebook have the ability to adapt to different preconstruction databases, transportation agencies should consider, to the maximum extent practical, the collection of these eight pieces of information for each project in order to use them as inputs in their PCS cost-estimating models. These eight items are:

1. Project type,
2. Complexity,
3. NEPA classification,
4. Early construction cost estimate,
5. Length of project,
6. Number of bridges involved in the project,
7. Number of lanes, and
8. Project location.

Project type is an important factor that, if appropriately used, may help to substantially improve the performance of PCS cost-estimating models. The following section discusses how different project classification systems may be incorporated into data-driven PCS cost-estimating procedures.

4.6.1 Project Classification

Various agencies use different classification systems to suit their strategic goals. These classifications help in defining the scope of the estimating model to be used in accordance with the agency's needs.
Since different types of projects have unique design requirements, effective project classification schemes are expected to enhance the accuracy of PCS cost estimates. However, estimating models can only be improved if these classification systems are consistently included in the data inventory, as shown in the following hypothetical example:

Assume: An agency classifies projects into three different groups: reconstruction, rehabilitation, and resurfacing. The agency may divide the available historical data into these three types of projects and create three independent models, one for each project type. Or model developers may include the project type as an input variable in the model. Or, to optimize the estimate further, they might try both and determine which approach yields the most reliable output.

Common classification schemes used in top-down estimates at the project level are based on complexity and type of work. These variables could also be considered when estimating PCS costs at the functional level; however, they may not be sufficient in some functional areas. For example, the estimation of the costs of environmental studies may be approached in a different manner for projects near wetland areas than for those not located near wetlands. Likewise, the geotechnical department could prefer the use of different estimating models depending on the geological conditions surrounding the project.

Figure 4.4 shows an example of the project classification system used by Iowa DOT. For the purpose of data collection, the geometry-based classification categorizes projects into four classes: point-based, line-based, polygon-based or multi-line projects, and other projects.

Table 4.1. (Continued).

| Category of Factors | Factors | Description | Variable Type |
|---|---|---|---|
| Permits required | U.S. ACE, state water resource board, FAA | Various permits required | Boolean |
| Public involvement | Number of parcels affected | Indicates the amount of negotiation efforts with landowners for ROW acquisition | Numerical |
| | Preliminary land use | Residential, commercial, farming | Nominal |
| | Special land use | National parks, Indian reservations, etc. | Nominal |
| | Cultural resource management effort | None, low, medium, high | Nominal |
| Miscellaneous factors | Hazardous waste | Presence of hazardous waste material at the site resulting in special design requirements (yes/no) | Boolean |
| | Guardrail | Addition/removal/improvement/none | Nominal |

Note: PCC = Portland cement concrete, HMA = hot-mix asphalt, RCB = reinforced concrete bridge. Figure 4.4. Geometry-based project classification (Iowa DOT 2012).
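The idea of testing whether project classification improves an estimate can be illustrated with a small Python sketch. It compares a single pooled model against one model per project type, using leave-one-out error. Everything here is an assumption made for illustration: the records are made up, and the "model" is a deliberately simple ratio of PCS cost to construction cost rather than any of the techniques in the guidebook.

```python
from collections import defaultdict
from statistics import mean

# Made-up historical records for illustration only:
# (project_type, construction_cost, actual_pcs_cost)
HISTORY = [
    ("reconstruction", 10_000_000, 1_200_000),
    ("reconstruction", 8_000_000, 1_000_000),
    ("reconstruction", 12_000_000, 1_380_000),
    ("rehabilitation", 5_000_000, 400_000),
    ("rehabilitation", 6_000_000, 450_000),
    ("rehabilitation", 4_000_000, 340_000),
    ("resurfacing", 2_000_000, 60_000),
    ("resurfacing", 3_000_000, 100_000),
    ("resurfacing", 2_500_000, 80_000),
]


def pcs_ratio(record):
    """PCS cost as a fraction of construction cost."""
    _, construction_cost, pcs_cost = record
    return pcs_cost / construction_cost


def leave_one_out_errors(records):
    """Absolute percentage errors when each project is predicted from the
    average ratio of the remaining projects in the same list."""
    errors = []
    for i, (_, construction_cost, actual) in enumerate(records):
        others = records[:i] + records[i + 1:]
        predicted = mean(pcs_ratio(r) for r in others) * construction_cost
        errors.append(abs(predicted - actual) / actual)
    return errors


def pooled_error(records):
    """Mean error of one model fitted to all project types together."""
    return mean(leave_one_out_errors(records))


def per_type_error(records):
    """Mean error of independent models, one per project type."""
    groups = defaultdict(list)
    for record in records:
        groups[record[0]].append(record)
    errors = []
    for group in groups.values():
        errors.extend(leave_one_out_errors(group))
    return mean(errors)


print(f"one pooled model:   {pooled_error(HISTORY):.1%}")
print(f"one model per type: {per_type_error(HISTORY):.1%}")
```

With real data the comparison could go either way, which is why the text suggests trying more than one approach and keeping the one that yields the most reliable output.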

4.6.2 Data Cleaning and Transforming

Data quality is one of the main issues confronted when developing a data-driven model. Some data attributes may have a significant number of missing values, and such variables may need to be removed before developing a model. For example, the number of land parcels to be purchased for ROW may have a significant effect on total PCS costs. However, if most projects do not have the data to fill a given variable field, using that variable is likely to confuse the model and lower the accuracy of its predictions. Also, while some data-driven models accept missing values, other methods will simply not work when data values are missing. Thus, such data should either be recorded manually or not be used.

Similarly, some data attributes may have unexpected values if the data-collection system does not validate data on entry. For instance, if the length of a project is recorded as "505,50" by mistake instead of in a proper numerical format (505.50), such data will cause errors when developing a model. The use of checklists, numerical data field validation, and similar measures in the data-collection system helps avoid the collection of incomplete or incorrect data, but a database assembled manually from other databases may still contain such errors. These data should either be transformed to a proper format or removed. If required, a regular data quality evaluation may be performed on the data stored in the database.

Another aspect of data transformation is generating additional input data attributes from existing ones. For example, project complexity can be derived from the work type, land use, project length, environmental permits, and design complexities. A categorical or numerical project complexity may correlate more strongly with PCS costs than the individual factors do.
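The two cleaning steps described in this section, repairing malformed numeric entries and dropping attributes with too many missing values, might look like the following Python sketch. The field names, the sample records, and the 50% missing-value cutoff are hypothetical choices for the example; the report does not prescribe a threshold.

```python
def parse_length(raw):
    """Convert a raw length entry to float, or None if it cannot be repaired.

    Assumption: a single comma with no period is a mistyped decimal mark,
    as in the "505,50" example. Data that uses commas as thousands
    separators would need a different rule.
    """
    if raw is None:
        return None
    text = str(raw).strip()
    if not text:
        return None
    if text.count(",") == 1 and "." not in text:
        text = text.replace(",", ".")
    try:
        return float(text)
    except ValueError:
        return None


def missing_share(records, field):
    """Fraction of records in which the field is absent or empty."""
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def usable_fields(records, fields, max_missing=0.5):
    """Keep fields whose missing share does not exceed the assumed cutoff."""
    return [f for f in fields if missing_share(records, f) <= max_missing]


# Hypothetical raw records; values are illustrative only.
RECORDS = [
    {"pin": "A1", "length": "505,50", "parcels": None},
    {"pin": "A2", "length": "12.3", "parcels": None},
    {"pin": "A3", "length": "abc", "parcels": "4"},
    {"pin": "A4", "length": "7", "parcels": None},
]

print([parse_length(r["length"]) for r in RECORDS])
print(usable_fields(RECORDS, ["length", "parcels"]))
```

Entries that cannot be repaired come back as None so that they can be reviewed manually or excluded, matching the "transform or remove" choice described above.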
4.6.3 Optimization of Data Management Efforts

The optimization of data management efforts is done to select and manage the most effective input variables while minimizing data-collection, cleaning, and processing efforts. It is an iterative process that starts with the factors that represent the lowest data management effort for the agency and determines whether those factors are good enough to develop satisfactory models. If the performance of the models is not satisfactory, other factors that may increase the management efforts continue to be added until satisfactory models are developed.

There are two types of efforts that must be considered during the implementation of a data-driven PCS cost-estimating system. The first is the initial effort required to collect, clean, and evaluate the suitability of the data for PCS cost estimating. The second is the effort related to the maintenance/update of the already created PCS database and cost-estimating models. The following example provides a better understanding of the difference between these two types of efforts.

Assume: During the initial development of data-driven PCS cost-estimating models, the model developer considers that the distance between the agency's headquarters and the job site may be a valuable input for estimating PCS costs. However, this piece of project information has not been collected to date. This means that the model developer will have to invest a substantial amount of effort to check previous projects' documents, measure this distance, and provisionally add this information to the PCS database in order to evaluate its value for PCS cost estimating. Two things may happen at this point: (1) this piece of information may show a poor performance as an input variable and thus be discarded from the PCS database, or (2) it may positively contribute to the estimation of PCS costs.
If the latter occurs, the agency would have to incorporate this piece of information into its regular data-collection procedure. This corresponds to the maintenance efforts mentioned in the previous paragraph (second type of effort).

In comparison with the initial efforts required to collect and evaluate the piece of data mentioned in this example, maintenance efforts would be substantially lower. The future collection of this information to update the PCS database would not significantly increase data management efforts once it is included in the data-collection protocol. For example, there are a number of information technology tools that can instantly provide the distance between two different locations. Thus, this distance could just be uploaded into the system along with other general project information.

The optimization of data maintenance efforts, in the long run, will have a greater impact on the agency's day-to-day data management activities. The decision of whether to invest in the initial efforts required to evaluate the suitability of a potential input variable, given the risk of wasting time and other resources if the variable is discarded, must be made based on the potential influence of the variable on PCS cost estimating as determined by the experience and professional judgment of agency personnel.

Having identified all potential factors affecting PCS costs, the agency should proceed to rate each of them in accordance with the expected effort that would be required to continue tracking and recording them. They could be rated as high, medium, and low. As shown in Figure 4.5, the agency will start by creating a preliminary PCS database considering only low-effort variables and will move up in the scale of effort until reaching a satisfactory level of performance of the PCS cost-estimating models.
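The iterative procedure described here, starting with low-effort factors and adding higher-effort tiers only while model performance is unsatisfactory, can be sketched as a short loop. In this sketch the `evaluate` callable stands in for fitting and testing a real model; the factor names, the toy error function, and the target error are all hypothetical.

```python
def select_factors(factors_by_effort, evaluate, target_error):
    """Add factor tiers from low to high effort until the model is good enough.

    factors_by_effort: dict mapping "low"/"medium"/"high" to factor names.
    evaluate: callable taking a list of factor names and returning an error
              measure (lower is better); a placeholder for model fitting.
    Returns the selected factors and the last error measured.
    """
    selected = []
    error = None
    for tier in ("low", "medium", "high"):
        selected = selected + factors_by_effort.get(tier, [])
        error = evaluate(selected)
        if error <= target_error:
            break  # satisfactory performance reached; stop adding effort
    return selected, error


# Hypothetical effort ratings assigned by agency staff.
FACTORS = {
    "low": ["project_type", "length"],
    "medium": ["num_bridges"],
    "high": ["distance_to_hq"],
}

# Toy stand-in for model evaluation: each factor removes an assumed share
# of estimating error from a 40% starting point.
ASSUMED_GAIN = {
    "project_type": 0.15,
    "length": 0.10,
    "num_bridges": 0.08,
    "distance_to_hq": 0.02,
}


def toy_evaluate(selected):
    return 0.40 - sum(ASSUMED_GAIN[f] for f in selected)


chosen, final_error = select_factors(FACTORS, toy_evaluate, target_error=0.10)
print(chosen)
```

In this toy run the loop stops after the medium-effort tier, so the high-effort factor never needs to be collected.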

4.7 Develop/Update PCS Database

Once the potential set of input variables has been defined for PCS cost-estimating modeling, the agency can proceed with the development of the PCS database. As discussed earlier in this chapter, there are different PCS cost-estimating approaches (top-down and bottom-up/functional level) that may be used by different types of users. This situation might require the development of multiple databases to be used at different levels and within individual functional areas. For instance, an agency may find it practical to use two separate PCS databases for the development of top-down estimates for paving and bridge projects. Likewise, each functional area within this agency (e.g., geotechnical, environmental, structural) may keep its own PCS database of work-effort hours for functional-level estimating. The data management techniques described here can be applied to develop each of these databases. The scale of the process and the data sources considered for each database vary in accordance with the scope of models previously defined during the requirements analysis.

For management purposes, these databases should be considered as a single PCS database system rather than as separate entities. Several pieces of data may be contained in more than one database. Thus, data management efforts should concentrate on creating a single master database or a relational database that is connected to smaller databases, which can be accessed by users without much difficulty.

After identifying the factors that may be influencing PCS costs, historical data associated with these factors must be gathered to create a preliminary PCS database. Most transportation agencies maintain a large number of databases to record and store data generated throughout a project's life cycle. A lot of additional information is stored in paper-based and electronic documents not arranged in a database-friendly fashion.
As a result, the development of a preliminary database may need the use of multiple data sources (see Figure 4.6) that can be combined using unique PINs. For example, the right-of-way acquisition division may have the total land area and total number of parcels acquired for the right-of-way, and terrain information may be collected in a structured format by the survey division. Many of those databases are likely to contain data tied to a unique PIN. A consolidated PCS database can be easily developed using the PIN.

Although data-driven estimating approaches depend on the amount of data, more data inputs do not always mean better estimates. PCS costs are time sensitive. Over a long period of time, the typical PCS cost structures of an agency can vary for multiple reasons, such as inflation, changes in planning and design practices, and employee turnover. Therefore, the amount of data used for PCS cost-estimating purposes must not be so small that it prevents estimating models from efficiently correlating the input variables with the observed total PCS costs, nor so large that it fails to reflect current design rates and practices.

[Figure 4.5. Optimization of data maintenance efforts.]

[Figure 4.6. Consolidation of existing data sources.]
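Consolidating several divisional data sources on a shared PIN can be sketched as a simple keyed merge. The division names follow the example in the text, but the PIN format, field names, and values below are hypothetical, and a real agency would more likely perform this join in its database system.

```python
def consolidate_by_pin(*sources):
    """Merge several {pin: {field: value}} sources into one record per PIN.

    If two sources supply the same field for a PIN, the later source wins.
    """
    merged = {}
    for source in sources:
        for pin, fields in source.items():
            merged.setdefault(pin, {"pin": pin}).update(fields)
    return merged


def complete_records(merged, required_fields):
    """Keep only the projects that have every required field."""
    return [rec for rec in merged.values()
            if all(field in rec for field in required_fields)]


# Hypothetical extracts from three divisions, keyed by PIN.
row_division = {"P-001": {"parcels_acquired": 12, "row_area_acres": 8.5}}
survey_division = {"P-001": {"terrain": "rolling"}, "P-002": {"terrain": "level"}}
accounting = {"P-001": {"pcs_cost": 310_000.0}, "P-002": {"pcs_cost": 95_000.0}}

database = consolidate_by_pin(row_division, survey_division, accounting)
usable = complete_records(database, ["terrain", "parcels_acquired", "pcs_cost"])
print([rec["pin"] for rec in usable])
```

Projects that lack a required field are filtered out rather than silently filled in, which keeps incomplete records from reaching the estimating model.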

The size of a PCS database is defined by the number of potential input variables and the amount of historical data for each of those variables (illustrated in Figure 4.7). The magnitude of these two dimensions must be carefully determined to avoid unnecessary data management efforts and to maximize the performance of the PCS cost-estimating models. The optimum amount of data per input variable is constrained by time. Data from projects executed during the previous 5 to 7 years are usually enough for the development of efficient cost estimates.

4.7.1 Evaluation of Factors Affecting PCS Cost

Data attributes in the preliminary database are potential factors that may influence PCS costs. Some of them have a significant effect on PCS costs; others may not. Thus, each factor should be analyzed to understand its effect on PCS costs. The evaluation of these factors is conducted at two different stages: first, through an analysis of the behavior of these factors in previous projects, which is the procedure described in this section, and then using specific model performance indicators resulting from the use of different estimating tools. (This is the reason the arrows in Figure 4.7 move in two directions.) The latter stage is discussed later in this report.

Figure 4.8 shows an example of how the influence of each factor on cost can be determined through a simple analysis of the available data. The figure shows the total costs spent on various PCS activities in 53 projects awarded by Iowa DOT. Some factors, such as wetland permits, are a very small component of the total PCS cost compared to factors such as the existence of bridges within the project. This indicates that data on the presence of bridges provide more useful information to the estimating model than data on wetland permits; therefore, bridge data should be prioritized for collection.

Figure 4.7. Dimensions of a PCS database.
Figure 4.8. Components of preconstruction costs.
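The kind of simple analysis behind Figure 4.8 can be approximated by ranking each PCS activity by its share of total historical cost. The sketch below is illustrative only: the component names and dollar amounts are invented, and `component_shares` is a hypothetical helper, not the procedure used for the Iowa DOT data.

```python
def component_shares(projects):
    """Return (component, share of total PCS cost) pairs, largest first."""
    totals = {}
    for project in projects:
        for component, cost in project.items():
            totals[component] = totals.get(component, 0.0) + cost
    grand_total = sum(totals.values())
    shares = {name: total / grand_total for name, total in totals.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


# Hypothetical PCS costs per activity for two past projects (dollars).
projects = [
    {"bridge_design": 120_000, "roadway_design": 80_000, "wetland_permits": 2_000},
    {"bridge_design": 60_000, "roadway_design": 90_000, "wetland_permits": 1_000},
]

ranked = component_shares(projects)
```

Components at the bottom of the ranking are candidates for lower data-collection priority, subject to the second evaluation stage based on model performance indicators.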

Along with the experience of model developers, there are formal methods that may help to evaluate the influence of factors before proceeding to the development of estimating models. Descriptive statistics and visualization techniques, such as scatter plots and box plots, can provide a better understanding of the data and their relationship to PCS costs. When a clear pattern is observed, those factors should be included in further model development.

As another example, an analysis was conducted with a Montana DOT sample data set that included more than two dozen factors to identify major cost influencers. The analysis identified six factors as significantly affecting PCS costs (see Figure 4.9). It should be noted that the analysis was based on a data sample with only three types of project scope (chip seal, mill and fill, and reconstruction), and it should not be generalized.

4.8 Top-Down Model Development

After developing a database and identifying relevant and important data attributes, a number of modeling techniques can be applied to find the best-performing model. Three modeling techniques (multiple regression, decision tree, and artificial neural network) are presented briefly. The data sets used for each model are different and are used for illustration purposes only.

Various data-mining systems that are available to develop these models include R-statistics, RapidMiner, Weka, STATA, SAS, IBM SPSS, and Microsoft Data Mining Client for Excel. As all these systems were developed with a wide audience in mind, they may be regarded as complicated to use by transportation agencies. It is suggested that agencies test the software programs and pick the most suitable one.

Microsoft Data Mining Client for Excel is a relatively easy-to-use system once setup is completed. Also, it is an Excel-based tool, as the name suggests.
Because of its ease of use, this system has been used to demonstrate various data-mining models presented in the PCS cost-estimating guidebook. In this study, three data-mining techniques are presented: multiple regression, decision tree, and artificial neural networks. These are only three of many different data-mining techniques available.

4.8.1 Multiple Regression

Multiple regression is a statistical technique that determines a relationship between a dependent variable (also known as a response, output, or outcome variable) and multiple independent variables, which are usually referred to as predictor, explanatory, input, or regressor variables (Allison 1999). Multiple regression is the simplest top-down PCS cost-estimation model of the three presented in this research. The concept of multiple regression is fairly similar to that of linear regression. Instead of using a single data attribute as the input variable, as in linear regression, multiple regression uses multiple data attributes as input variables simultaneously. In the case of PCS cost estimation, the output variable is estimated PCS cost, and the input variables are project characteristics such as project type and project length. The model can be represented in a simple equation:

$$\text{Estimated PCS cost} = C_0 + C_1 \times V_1 + C_2 \times V_2 + \cdots + C_n \times V_n \qquad \text{(Eq. 4.1)}$$

where:
$V_i$ = ith input variable,
$C_0$ = intercept (PCS cost when all variables are equal to zero),
$C_i$ = coefficient associated with the ith input variable, and
$n$ = number of input variables.

A positive coefficient shows that PCS cost increases with the value of the corresponding input variable. A negative coefficient indicates an inverse relationship between PCS cost and the value of the corresponding input variable. The process of developing a multiple regression model is illustrated in Figure 4.10. Selection of suitable independent variables was explained in an earlier section.
In this step, variable selection is done statistically by checking the P-values of all independent variables. If a variable's P-value is high (greater than 0.05; see Table 4.3), another iteration can be performed by removing that variable and checking whether the adjusted R-squared value increases.

It may be noted that, unlike the other two modeling techniques presented in the following sections, multiple regression cannot use nominal variables as input variables. Nominal (categorical) variables are descriptive variables that are not ordered; that is, they cannot be arranged in a logical order in accordance with their impact on the dependent variable. Examples are the type of project, the name of a county, or the number of the DOT district in which a project was built. The exception, as Table 4.2 shows, is a nominal variable with only two categories, which can be coded as an indicator variable. This is a significant limitation of the multiple regression model; nonetheless, it is the simplest modeling technique. Table 4.2 shows the types of variables suitable for regression model development according to Allison (1999). Although all three types of variables can be used to build a multiple regression model, interval variables (as defined in Table 4.2) are always preferred. Lastly, to use multiple regression modeling, there must be at least as many projects as the sum of the number of input variables and the dependent variable (Allison 1999).

As can be seen in Table 4.2, all independent variables in a multiple regression model should either be numeric or be transformable into a quantitative logical scale. Once a model is developed, its performance can be measured using various model performance indicators. Most statistical software packages yield complex outputs that are not easily understood by the average engineer with little or no advanced education in statistics. However, to simplify the interpretation of the outputs in the multiple regression method, model developers can focus their attention on the three elements defined and explained in Table 4.3. It should be noted that the R-squared and adjusted R-squared values correspond to the entire model, while the standard error and P-value are model performance indicators at the variable level.

Figure 4.9. Factors affecting PCS costs - paving projects (Montana DOT sample data).
Figure 4.10. Multiple regression PCS cost-estimating model development.
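As a rough illustration of how the variable types suitable for multiple regression can be turned into numeric inputs, the sketch below encodes one project record. The ordinal scale (1 to 5) and the pavement indicator (0 = concrete, 1 = asphalt) follow the examples given in Table 4.2; the record field names and both helper functions are assumptions made for this example.

```python
# Ordinal scale and indicator coding taken from the examples in Table 4.2.
SCOPE_COMPLEXITY = {
    "very simple": 1,
    "simple": 2,
    "neutral": 3,
    "complex": 4,
    "very complex": 5,
}
PAVEMENT_INDICATOR = {"concrete": 0, "asphalt": 1}


def encode_project(project):
    """Convert one project record (hypothetical field names) to numeric inputs."""
    return [
        float(project["length_miles"]),                 # interval variable
        float(project["bridges"]),                      # interval variable
        float(SCOPE_COMPLEXITY[project["scope"]]),      # ordinal variable
        float(PAVEMENT_INDICATOR[project["pavement"]]), # indicator variable
    ]


def enough_projects(n_projects, n_inputs):
    """Check the minimum sample size: inputs plus the dependent variable."""
    return n_projects >= n_inputs + 1


row = encode_project(
    {"length_miles": 2.4, "bridges": 1, "scope": "complex", "pavement": "asphalt"}
)
```

A county name or district number has no such ordering, which is why the text excludes it from the regression inputs.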

The model can be optimized by using a cyclic process intended to discard, one by one, those independent variables that do not show a statistically significant impact on final PCS costs (P-value greater than 0.05). Variables are discarded one by one to allow the model developer to understand the effect of a variable's removal on the new model's P-values. For the purposes of the guidebook, a cycle refers to an iteration of removing an independent variable and regenerating the model using the remaining variables. The term "cycle" will be applied not only to multiple regression models, but also to the decision tree and artificial neural network models discussed in the guidebook.

Model developers should also look at the R-squared and adjusted R-squared values during each cycle. By removing variables with P-values greater than 0.05, the model is expected to improve, as measured by an increase in the adjusted R-squared value and a reduction in the difference between this value and the R-squared value. However, when using a data set with a high degree of uncertainty, such as the one used to build an early PCS cost-estimating model, it is possible for the best model to include independent variables having P-values greater than 0.05. In that case, the developer will notice that the adjusted R-squared value decreases between two cycles. In this situation, model developers should select the model from the previous cycle, which is the one that provided the largest adjusted R-squared value. It should also be noted that the R-squared value is not expected to increase between cycles.

Type of variable: Interval variable
Description: Variable measured in such a way that the difference between two values is meaningful. An increase from 200 to 220 design hours is equivalent to an increase from 340 to 360 design hours.
Examples: Length of project; number of bridges

Type of variable: Ordinal variable
Description: A variable that may be arranged in a logical order, assigning numeric values in accordance with position in the arrangement. Unlike interval variables, two equal increments of these values cannot be clearly compared.
Examples: 1 = very simple scope; 2 = simple scope; 3 = neutral; 4 = complex scope; 5 = very complex scope

Type of variable: Indicator variable (also called "dummy variable")
Description: Nominal variable with only two possible categories identified with binary values (0 and 1) for computation purposes.
Examples: 0 = concrete pavement; 1 = asphalt pavement

Table 4.2. Types of variables suitable for multiple regression.

Model performance indicator: R-squared (R2) and adjusted R-squared
Description: These represent the percentage of the variability in the dependent variable that can be explained by the multiple regression model.
Values: 0.00 (0%) to 1.00 (100%); 1.00 would mean that the model perfectly fits the observations. Adjusted R2 is always lower than R2.
Use: The closer the adjusted R2 and R2, the better the model. An increase in the adjusted R2 represents an improvement of the model.

Model performance indicator: Standard error
Description: This indicator refers to the standard error for each variable. It measures the variability in the value of each coefficient. It is similar to the standard deviation of the mean values for the coefficients.
Values: The magnitude of this value depends on the level of uncertainty associated with its respective variable. The larger the standard error, the higher the uncertainty.
Use: In addition to being used to indicate the variability of a given coefficient, it is used to set confidence intervals and create stochastic models.

Model performance indicator: P-value
Description: The P-value measures the level of significance of a given independent variable for the estimation of the dependent variable.
Values: 0.00 (0%) to 1.00 (100%); 1.00 would mean that the given independent variable has no impact on the dependent variable at all, and 0.00 would represent the opposite.
Use: P-value below 0.01 (1%) = highly significant variable; P-value below 0.05 (5%) = significant variable; P-value above 0.05 (5%) = discard variable

Table 4.3. Model performance indicators - multiple regression.
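To make Eq. 4.1 and the R-squared indicators of Table 4.3 concrete, the following is a minimal sketch of an ordinary least-squares fit written with the standard library only. The data (project length, number of bridges, PCS cost in thousands of dollars) are invented for illustration, and all function names are hypothetical. P-values are deliberately left out because they require a t-distribution; the statistical packages named earlier report them directly.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_multiple_regression(rows, costs):
    """Least-squares coefficients [C0, C1, ..., Cn] of Eq. 4.1 (normal equations)."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, costs)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def estimate_pcs_cost(coefficients, row):
    """Eq. 4.1: C0 + C1*V1 + ... + Cn*Vn."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))


def r_squared(coefficients, rows, costs):
    mean_cost = sum(costs) / len(costs)
    ss_res = sum((y - estimate_pcs_cost(coefficients, r)) ** 2 for r, y in zip(rows, costs))
    ss_tot = sum((y - mean_cost) ** 2 for y in costs)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_projects, n_inputs):
    return 1.0 - (1.0 - r2) * (n_projects - 1) / (n_projects - n_inputs - 1)


# Hypothetical history: (length in miles, number of bridges) -> PCS cost ($1,000s).
rows = [(1, 0), (2, 0), (3, 1), (4, 1), (5, 2), (6, 2)]
costs = [73.0, 88.0, 206.0, 235.0, 351.0, 367.0]

coefficients = fit_multiple_regression(rows, costs)
r2 = r_squared(coefficients, rows, costs)
adj_r2 = adjusted_r_squared(r2, len(rows), len(rows[0]))
```

In a cycle as described above, a developer would refit with one variable removed and keep the new model only if `adj_r2` does not fall.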

An example of multiple regression is provided within the guidebook in Section 4.2.1.

4.8.2 Decision Tree

A decision tree groups projects with similar characteristics, identifies the more important cost influencers, and presents them visually. A decision tree model consists of nodes and branches, just like a tree (Figure 4.11). At the root or top node, the most important attribute is used to develop the first set of branches. The importance of each variable is evaluated by statistical software using expected information gain (i.e., better understanding of the data fluctuation) or similar measures after using that variable. The expected information gain is defined as the reduction in impurity or entropy of the data after using that variable. For example, if PCS costs of reconstruction projects (which would usually be very high) and resurfacing projects (which would be lower) are expected to be very different, then the project type variable will enable the branching of the root node into two isolated branches. The branching continues until each branch holds a preset minimum number of data points. If any branch has fewer than the desired number of data points or projects, the branching stops. This is known as pruning. Pruning is necessary to reduce the model's overfitting (i.e., development of branches based on very few data points, which might produce unrealistic results and is especially problematic in data sets with outliers). The values at the end node represent the average output value of all projects that fall under that particular branch of the decision tree.

When the PCS cost for a new project is to be determined, the prediction is made by following the corresponding branches based on the values of input variables (i.e., factors affecting the PCS costs). The benefit of a decision tree is that it can provide a visual illustration of the internal computations used by the model.
Further, the chart developed can then be used to compute PCS costs without any software. It can also use categorical variables in addition to the types of variables described previously in the Multiple Regression section.

The performance of a decision tree model can be measured using mean absolute percentage error and mean absolute error, as presented in Table 4.4. The mean absolute percentage error can be presented mathematically as:

$$\text{Mean absolute percentage error} = \frac{1}{n}\sum_{i=1}^{n}\frac{\left|\text{Actual PCS Cost}_i - \text{Estimated PCS Cost}_i\right|}{\text{Actual PCS Cost}_i} \qquad \text{(Eq. 4.2)}$$

where $n$ = number of projects in the validation data set.

Figure 4.11. Decision tree model visualization.

Model performance indicator: Mean absolute percentage error
Description: Mean absolute percentage error measures the deviation of predictions from actual values.
Values: Any non-negative value. A value of 0.00% would mean that the model perfectly fits the observations.
Use: It is usually calculated for both the training and validation data sets. The mean absolute percentage error for the validation data set is used to identify the combination of independent variables that best fits the observations.

Model performance indicator: Mean absolute error
Description: It measures the average absolute difference between the predicted and actual values.
Values: Its value depends on the magnitude of the output variables (i.e., larger output variables will tend to have larger mean absolute error in terms of magnitude).
Use: It is more challenging to determine the accuracy of a single model using this measure. However, when two models based on the same data set are compared, this measure can be used to identify the better model.

Table 4.4. Model performance indicators - decision tree.
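The two indicators in Table 4.4 are straightforward to compute once a model has produced estimates for a validation set. The sketch below assumes the mean absolute percentage error is expressed as a fraction (multiply by 100 for a percentage); the cost figures are invented for illustration.

```python
def mean_absolute_percentage_error(actual, estimated):
    """Average of |actual - estimated| / actual over the validation projects."""
    return sum(abs(a - e) / a for a, e in zip(actual, estimated)) / len(actual)


def mean_absolute_error(actual, estimated):
    """Average of |actual - estimated|, in the same units as the costs."""
    return sum(abs(a - e) for a, e in zip(actual, estimated)) / len(actual)


# Hypothetical validation set (dollars).
actual = [100_000.0, 250_000.0, 80_000.0, 400_000.0]
estimated = [110_000.0, 225_000.0, 80_000.0, 440_000.0]

mape = mean_absolute_percentage_error(actual, estimated)  # 0.075, i.e., 7.5%
mae = mean_absolute_error(actual, estimated)              # 18,750 dollars
```

Because the mean absolute error scales with project size, it is most useful for comparing two models built on the same data set, as the table notes.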

In addition to the measures in Table 4.4, the minimum and maximum errors can also be used to check the accuracy of the model.

When developing a decision tree, the factors affecting PCS costs can be selected based on attribute evaluation techniques such as subset evaluation and principal component analysis. Such functionality may or may not be available, depending on the software being used. Additionally, some software may automatically perform such evaluations when a decision tree is being generated. These techniques provide either ranks of the individual factors that affect PCS costs, based on each factor's influence on the output variable, or ranks of various combinations of factors, based on each factor's influence on the output variable.

An example of a decision tree is provided in the guidebook in Section 4.2.2.

4.8.3 Artificial Neural Networks

An artificial neural network is a learning system that has the ability to generalize and learn from data by modeling the neural connections in human brains. Typically an artificial neural network consists of an input layer, a hidden layer or layers, and an output layer. Input values are assigned to each of the independent variables in the input layer; then these values are processed through the hidden layer(s) (working as a black box); finally, a single value is obtained through the output layer. The output value in this case is the estimated PCS cost of the project whose features were used in the input layer of the model. This method is capable of modeling nonlinear relationships among variables with high accuracy; however, this accuracy depends on the quality, amount, and reliability of the data used to build the model.

Berry and Linoff (1997) define an artificial neural network as a powerful, general-purpose tool readily applied to estimation, classification, and clustering, which are sometimes best approached as "black boxes" with mysterious internal workings.
Figure 4.12 is a diagram of a basic artificial neural network with two independent variables and one hidden layer. This model is powerful, but the internal calculations are not visible to the users of the model.

Adding numbers to the independent variables and weights to the artificial neural network in Figure 4.12 helps to make sense of the operation of a neural network. Figure 4.13 shows some sample values for these elements and how these values are modified as they move in the direction of the arrows until reaching the output layer. The procedure shown in this figure corresponds to the simplest way to calculate a dependent value in a neural network, but it is enough to explain the fundamentals of this method. Actual procedures followed by statistical software applications are usually more complex. To move a value from one node to the next, this value is multiplied by the weight of the corresponding arrow. The value taken by each node is equal to the sum of all values transmitted from the previous layer.

Figure 4.12. Basic artificial neural network diagram.
Figure 4.13. Artificial neural network calculation.

A general process of developing an artificial neural network is illustrated in Figure 4.14. This is similar to decision tree development. In this case, the relative variable indicator (RVI), which indicates the importance of each input variable, is used to select the influencers. The variables with the lowest RVIs are discarded one by one on a per-cycle basis to improve the accuracy of the model. It is also possible that a combination of variables with low and high importance may provide better accuracy than a combination of only the high-importance variables. As such, various combinations of variables should be tried. As noted in the Decision Tree section, attribute selection methodologies that provide the ranks of combinations rather than single attributes can also be used.

The RVI is another model performance indicator that can be used in addition to the other indicators mentioned in the decision tree discussion (see Table 4.5).

Figure 4.14. Artificial neural network PCS cost-estimation model development.

Model performance indicator: RVI
Description: This indicator measures the impact of each independent variable in the calculation of values for the dependent variable.
Values: The sum of RVI values of all independent variables is equal to 100%.
Use: Used to identify the independent variables that represent the lowest contribution to the model, which are discarded one by one on a per-cycle basis.

Table 4.5. Additional model performance indicators - artificial neural network.

An example of an artificial neural network is provided within the guidebook in Section 4.2.3.

4.9 Validation of Models and Selection

The accuracy of PCS cost-estimation models can be measured using several goodness-of-fit tests. To obtain a better idea about the accuracy of a model, random sampling should be done for testing/validation. There are two general methods of generating training data sets and testing data sets: holdout and k-fold cross validation. In the holdout procedure, a fraction of the data (usually 67%) is used as training data to develop a model. Then predicted and actual values from the remaining 33% of data points are compared to test the accuracy of the model.

In k-fold cross validation, usually 10-fold, the data set is partitioned into k folds (say 10 parts). One fold is used for testing, while the remaining data are used for training. This process is repeated until each part of the data has been used for validation of a model developed using the remaining parts of the data. Thus, k models are developed and tested. The errors calculated from each model are then averaged to calculate the overall accuracy of the model.

Given that any of the three PCS cost-estimating techniques described previously in this chapter may show the best performance under different databases and estimating conditions, transportation agencies are encouraged to use all three approaches. The final PCS cost-estimating model would preferably be the one with the lowest average error.

The identification of the most accurate model among the three causal methods does not mean that its accuracy is high enough to fulfill the expectations of the agency. Thus, it is suggested that agencies establish standard parameters to either accept a given model and proceed with its application, or reject the model and review the original data set again, looking for possible errors or opportunities to improve data quality.

4.10 Bottom-Up (Functional-Level) Model Development

The preconstruction phase includes the delivery of many intermediate products and services, such as environmental investigations, geotechnical studies, public involvement, and permitting. The level of effort required to complete many of these tasks is often influenced by project location, resources affected, and regulations activated by the project rather than by a specific project characteristic such as lane miles or bridge length (American Association of State Highway and Transportation Officials 2008).
As a result, the best way to quantify these services is to develop a scope of work for the effort required to complete each.

Functional-level cost estimation is a form of bottom-up estimating. The scope of work can be divided into smaller work tasks, which can be estimated individually. These smaller estimates are then combined to form a total estimate for a specific service. A bottom-up estimate is typically prepared by a person who is involved in monitoring the project, such as a senior designer who will manage the team to complete the work (Larson and Gray 2011).

Figure 4.15 illustrates the key steps that are taken to form a functional-level estimate. Once the scope of work has been defined, the tasks required to fulfill the scope must be identified. To simplify this process, some DOTs have a standard task inventory, also known as a WBS, which contains a comprehensive list of common activities that are typically required during preconstruction. Each work task in this inventory can then be assigned a level of effort to complete and, hence, a rate of pay for that effort. After the hours of each specific work task have been multiplied by the relevant payment rate, the costs of the tasks can be combined to calculate the total PCS cost estimate.

4.10.1 Use of Functional Level PCS Cost Estimating

Once there is a scope of work, an office will then need to assess who will complete the work. Should an in-house team be used or external consultants (see Figure 4.16)? While it appears most agencies would prefer to perform work in-house, this is not always possible. The amount of PCS work that is outsourced varies from state to state. Some DOTs have sufficient staff capacity and expertise to complete the majority of work internally, while other agencies employ consultants more frequently. A functional-level estimate can be used to quantify the number of work hours that will be required by a PCS team to complete a given work package.
This can play a significant role in management's decision on whether to perform the work with in-house resources. If the estimated work effort does not require specialized services and can be accommodated into the department's schedule, then a decision to do the work in-house can be made. The estimate can aid the distribution and monitoring of forward workload to available team members.

The use of consultants to assist state DOTs with PCS is predicted to increase (Wiegers 2000). This surge in contracting external services has led to the implementation of various state policies and consultant services manuals. Within these documents, DOT engineers are often required to perform detailed in-house cost estimates or independent cost estimates for the work to be contracted out (Touran and Lopez 2006). The Brooks Act, introduced in 1972, requires that all applicable architectural and engineering service contracts be awarded in accordance with an open negotiation process on the basis of demonstrated competence and qualifications. Federal regulation stipulates a "detailed cost estimate, except for contracts awarded under small purchase procedures, with an appropriate breakdown of specific types of labor required, work hours, and an estimate of the consultant's fixed fee for use during negotiations" [Office of the Federal Register, n.d. (23 CFR)]. A functional-level estimate fulfills these requirements.

Figure 4.15. Functional-level estimating process.
Figure 4.16. Functional estimate sequence.

4.10.2 Assigning Hours to Work Tasks

Assigning a range of possible hours for any given task recognizes uncertainty and allows a three-point estimate to be formed. The weighted average hours calculated provide the best available indication of how many hours will be required for a task, given the historic distribution of work hours from previous projects.

To combine the minimum, most likely, and maximum values from the range estimate into a single number, a weighted average number of hours can be calculated using Equation 4.3.

$$\text{Weighted average hours} = \frac{\text{Min.} + 4 \times \text{Most Likely} + \text{Max.}}{6} \qquad \text{(Eq. 4.3)}$$

This equation is based on a historical distribution of work-effort hours for the project type being estimated. It weights the most likely hour estimate four times more heavily than either the maximum or minimum hour estimate. The output of this equation is the expected value of the number of hours required for the specific task. An example is shown in Figure 4.17 for estimating the work-effort hours required for utility coordination and documentation for a project.

Figure 4.17. Calculating the weighted average hours required for utility coordination and documentation given an estimate range.
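Equation 4.3 and the hours-times-rate roll-up described for functional-level estimates can be sketched as follows. The hour ranges and hourly rates are hypothetical; the first range was chosen only so that the result is 14.5 hours, and it does not reproduce the actual inputs of Figure 4.17. Rounding up to whole hours is an assumption made for this example.

```python
import math


def weighted_average_hours(minimum, most_likely, maximum):
    """Eq. 4.3: three-point weighted average of work-effort hours."""
    return (minimum + 4 * most_likely + maximum) / 6


def total_pcs_cost(tasks):
    """Sum rounded-up hours times hourly rate over (min, likely, max, rate) tuples."""
    return sum(
        math.ceil(weighted_average_hours(low, likely, high)) * rate
        for low, likely, high, rate in tasks
    )


# Hypothetical range for one task: 8 to 23 hours, most likely 14.
hours = weighted_average_hours(8, 14, 23)   # (8 + 56 + 23) / 6 = 14.5
rounded_hours = math.ceil(hours)            # rounded up to whole hours

# Hypothetical work breakdown: (min, most likely, max, hourly rate in dollars).
tasks = [
    (8, 14, 23, 95.0),
    (20, 30, 46, 120.0),
]
estimate = total_pcs_cost(tasks)
```

Keeping the minimum and maximum alongside the weighted average preserves the range, which is useful when actual hours are later recorded against the estimate.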

The most probable number of work-effort hours required for the task of utility coordination and documentation is 15 (after rounding up from 14.5). This total number of hours can now be assigned pro rata to the levels of expertise needed to complete the task.

Creating better functional-level estimates is an ongoing process that can be continuously improved. The flowchart in Figure 4.18 illustrates how a WBS and database feed into the development of a functional-level PCS estimate. Recording the actual PCS work-effort hours/costs that correspond to each past estimate strengthens the quality of the database, which in turn supports more accurate future estimates. All three elements in this diagram affect the success of each other.

For engineering offices with no formal estimating process, this feedback loop will take some time and effort to develop. Defining work tasks within a WBS is the best place to start. Once tasks are clearly identified, estimates of their work effort can be created. Review of each estimate compared to actual PCS work-effort hours will then provide the first pieces of data to the database. Over time, the quality of the database will improve as more projects' actual work-effort hours are recorded.

4.11 Implementation of PCS Estimating Models

4.11.1 Output Interpretation and Limitations

The selected model is ready for application if it shows a satisfactory performance in accordance with the expectations of the agency. It is important to recognize the scope limitations of each model when applying it to real projects. For example, if a given DOT develops a model using historical preconstruction data from its previous corridor projects, the final selected model is only applicable to corridor projects awarded by this agency. Likewise, if an agency develops a model using data from previous paving projects completed in a given county, the model would be only applicable to paving projects to be executed in this county by this agency.
4.11.2 Continuous Improvement

Since causal methods rely on historical data to estimate future values, PCS cost-estimating models can be constantly improved over time as more projects are executed and more data are collected, regardless of the model selected (multiple regression, decision trees, or artificial neural networks). The model development procedure may be repeated on a regular basis every 2 or 5 years (or a period of time that the agency considers convenient), or before this time if observed PCS cost performance measures do not meet the standard parameters established by the agency.

4.11.3 Use of Output as a Decision-Making Tool

By definition, decision-making procedures involve the selection of the most suitable option from a set of alternatives based on the preferences and selection criteria of the decision makers. As a decision-making tool, PCS cost estimates can be used to select the design methods and technologies that best suit the needs and resource availability of an agency. For example, a given agency may decide whether to use in-house or external designers based on the expected PCS costs associated with each of these alternatives. Likewise, decisions can be made at the functional level related to the design of specific project activities. For instance, the geotechnical engineer may use PCS cost estimates to determine the cost implications of using 3-D technologies to model earthwork activities instead of traditional 2-D excavation and backfill plans. To make a comparison between these two design techniques, the model developed to estimate the cost of this work package must include a variable (probably a dummy variable) that indicates the design approach to be used.

Regardless of the nature of the decision to be made, decision makers should take into consideration the limitations of the PCS cost-estimating models.
This means that if the agency intends to compare the cost of two different design approaches, it must develop two different models following the procedures described in this research.

Figure 4.18. Feedback loop for continuous improvement of PCS functional-level estimating.

Key performance indicators (KPIs) can be used to measure the effectiveness of PCS cost-estimating models. The KPIs are of two types, each serving a different purpose: measuring the performance of the model, and tracking the performance of a PCS cost estimate throughout the project preconstruction period. Table 4.6 describes the KPIs proposed in the guidebook. Within the guidebook, Appendix B: Project Monitoring – Preconstruction Services Progress, Part III, presents a template that DOTs may use to track and record values for these KPIs in a given project.

Table 4.6. PCS cost estimate – key performance indicators.

At the Model Level
The following KPIs measure the overall performance of the PCS cost-estimating model. Before drawing conclusions or taking corrective actions to improve preconstruction practices or the performance of the model, the agency should analyze the following three KPIs obtained from applying the model to a series of projects. Corrective actions or model redevelopment may be needed only if one or more of these KPIs shows an average behavior that does not meet the agency's expectations.

Construction cost growth (CCG) (%): This KPI is intended to justify the use of PCS cost-estimating models. It represents the variation, as a percentage, of the early construction cost estimate in comparison with the actual construction cost of the project.

Final cost performance index (FCPI): This KPI measures the accuracy of the model by comparing the PCS cost estimate with the actual PCS cost.

Final cost of lost design effort (FCLDE) ($): The FCLDE is the total cost, in dollars, of activities associated with the development of discarded alternative designs. It also includes the cost of those portions of the original design that in the end were not used to construct the project. Lower FCLDE values represent better utilization of agencies' resources to perform final designs.

At the Project Level
The following KPIs track the performance of the PCS cost estimate throughout the preconstruction period. They allow the project manager to detect anomalies in the performance of preconstruction activities and take corrective actions in a timely manner.

Cost performance index (CPI): Unlike the FCPI, which is calculated at the end of the preconstruction period, this KPI compares the PCS cost estimate with actual PCS costs at any single moment during the preconstruction period. This indicator can only be determined in bottom-up estimates, by comparing the estimated cost of completed work packages with the actual cost incurred by the agency to perform this work.

Cost of lost design effort (CLDE) ($): Unlike the FCLDE, which is calculated at the end of the preconstruction period, this KPI refers to the cost, in dollars, of activities associated with the development of discarded alternative designs at any single moment during the preconstruction period. A large value may indicate a poorly defined project scope.

Design placement (DP) ($): This KPI is the total PCS expenses incurred by the agency at any single point during the preconstruction period. It is more suitable for top-down estimates, since the lack of detail in these models does not allow the calculation of CPIs. It is interpreted by comparing its value with the total PCS cost estimate and applying the project manager's professional judgment.

Estimate at completion (EAC) ($): The EAC is an adjusted estimate of the total PCS cost, calculated from the known cost of completed work packages plus the expected cost of uncompleted work packages. This KPI can only be calculated for bottom-up estimates.

The Project Management Institute (PMI) recommends a methodology for capturing lessons learned (King 2008). King's methodology consists of a series of questions that the project team should answer and record at the end of each project. These questions relate to three key areas: people, process, and product.
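The report defines the cost-based KPIs of Table 4.6 in words only, so the following is a minimal sketch of how three of them might be computed, not the report's own formulation. The `WorkPackage` structure, the function names, and the dollar figures are all hypothetical. The EAC follows the table's wording directly. The CPI is assumed to take the conventional earned-value form (estimated cost of completed work divided by its actual cost), and the CCG is assumed to be expressed relative to the early estimate; an agency may adopt different conventions.

```python
from dataclasses import dataclass


@dataclass
class WorkPackage:
    """One bottom-up work package (hypothetical structure, not from the report)."""
    name: str
    estimated_cost: float      # PCS cost estimate for the package, in dollars
    actual_cost: float = 0.0   # cost incurred so far, in dollars
    completed: bool = False


def construction_cost_growth(early_estimate: float, actual_cost: float) -> float:
    """CCG in percent, assumed here to be relative to the early estimate."""
    return (actual_cost - early_estimate) / early_estimate * 100.0


def cost_performance_index(packages: list[WorkPackage]) -> float:
    """CPI over completed packages, assumed as estimated cost / actual cost."""
    done = [p for p in packages if p.completed]
    estimated = sum(p.estimated_cost for p in done)
    actual = sum(p.actual_cost for p in done)
    if actual == 0:
        raise ValueError("no actual cost recorded for completed work packages")
    return estimated / actual


def estimate_at_completion(packages: list[WorkPackage]) -> float:
    """EAC: known cost of completed packages plus estimates for the rest."""
    return sum(p.actual_cost if p.completed else p.estimated_cost for p in packages)


# Illustrative figures only; they do not come from the report.
packages = [
    WorkPackage("survey", estimated_cost=40_000, actual_cost=50_000, completed=True),
    WorkPackage("environmental", estimated_cost=60_000, actual_cost=75_000, completed=True),
    WorkPackage("final design", estimated_cost=100_000),
]
cpi = cost_performance_index(packages)   # 100,000 estimated / 125,000 actual
eac = estimate_at_completion(packages)   # 50,000 + 75,000 + 100,000
ccg = construction_cost_growth(2_000_000, 2_500_000)
```

Under the assumed CPI convention, a value below 1 means completed work has cost more than estimated. Evaluating the same function once every work package is complete would give the FCPI.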
Table 4.7 shows some examples of these questions by category. Answers to these questions can be used to improve preconstruction practices or PCS cost-estimating models. Within the guidebook, Appendix B: Project Monitoring – Preconstruction Services Progress, Part IV, presents a template to assist project teams with recording their answers.

Table 4.7. Capturing lessons-learned methodology (King 2008).

People
Description: Questions in this category should relate to team effectiveness and stakeholder interactions.
Sample questions:
• What did we learn about staffing (skills, knowledge, experience) that will help us on future projects?
• What are the lessons learned about the issues that caused conflict among the team, and about the manner in which we resolved the problems and took corrective action?

Process
Description: Questions in this category should relate to the inputs, tasks, and outputs of the project processes.
Sample questions:
• Were there any tools, techniques, or programs used on this project that should be used or avoided for future projects?
• How effective was, or is, our data inventory? For whom, what, and when were these data collected?

Product
Description: Questions in this category should relate to the project deliverables and success factors.
Sample questions:
• What is being done well or needs to be improved to define, evaluate, and ensure quality for the design?
• What is being done well or needs to be improved to manage agency expectations?

4.11.4 Implementing Database Maintenance and Model Development Within an Agency

An agency may choose to maintain databases and develop data-mining models in whatever fashion best aligns with its resources and organizational structure. Numerous approaches are possible. One is to collect data and maintain databases from a central location. A centralized office may also be responsible for creating models with relevant data for decentralized offices (counties or districts). Under this arrangement, a dedicated team with thorough knowledge of the models and data processes is responsible for all models, and a typical engineer need only input the key characteristics of a project into the model to obtain a cost estimate. Such an arrangement removes the need to train all PCS staff in data-mining techniques and ensures continuity of data capture and analysis across the agency.
