Two common questions regarding APC system design are how much accuracy is needed from APCs and how many APC units are needed to obtain an adequate sample size. Answering those questions requires facing a related question: when an agency generates measures such as peak load, passenger-miles, and route boardings from APC data, how timely and precise must those estimates be?

9.1 Sample Size and Fleet Penetration Needed for Load Monitoring

According to the Transit Data Collection Design Manual (16), the passenger-count-based statistic requiring the greatest precision is average peak load on heavy-demand routes, a measure used to adjust headway. A reasonable target precision to ensure that the route is neither overcrowded nor overserved is 5% or 6%, effectively limiting permissible load bias on crowded segments to about 5%.

Sample size needed to achieve this target precision depends on the bias and coefficient of variation (cv) of load estimates. Fleet penetration needed, in turn, depends on the number of daily trips on the route-direction-period being analyzed, the data recovery rate, and how the instrumented fleet is distributed. Fleet penetration of 10% will afford about 20r observations per quarter for a route-direction-period with five trips per day, where r is the data recovery rate. (For example, if r = 80%, then 20r = 16 observations would be obtained.) If needed, greater sample sizes can be achieved by simply concentrating equipped vehicles on heavy-demand routes, at the expense of low-demand routes, for which less precision in load estimates is needed.

We have posited elsewhere that APCs make possible a more precise method of scheduling and service quality monitoring focused on extreme values of load rather than mean values. Extreme values reflect the impacts of load variability and service regularity as well as frequency and better reflect the quality of service as felt by passengers.
Estimating extreme values requires a far greater sample size than estimating mean values, which is an argument favoring instrumenting the entire fleet with APCs, a course being pursued by Tri-Met.

9.2 Accuracy and Sample Size Needed for Passenger-Miles

All U.S. agencies receiving federal assistance and operating in urban areas are required to report annual systemwide passenger-miles by mode to the National Transit Database (NTD). Traditionally, these estimates are made from a sample of manually counted ons and offs. Agencies can use a standard sampling and estimation procedure that requires on-off counts on 549 or more trips (39), or they can use any other sampling method that achieves a precision of ±10% or smaller at the 95% confidence level. Because manual on-off counts are labor intensive, there is a natural desire to find less burdensome measurement and estimation methods, including using APC-generated counts (40).

One factor in using counts measured by APCs is the accuracy of the counts themselves, which, as the previous chapter shows, depends not only on sensor accuracy but also on data processing techniques used for parsing, screening, and balancing. The second factor is having an adequate sample size. The two factors are related; the less accurate the counts, the larger a sample is needed. This section deals with that accuracy/sample size trade-off.

For all but the smallest transit agencies, sampling requirements for NTD passenger-miles reporting are considerably less demanding than are other uses of the data such as monitoring load or boardings by route, because the NTD precision requirement is only applied to a whole year's sample aggregated systemwide. Therefore, meeting the NTD requirement should be easy for almost any transit system with APCs. However, because the NTD requires that alternative sampling methods be statistically justified, the following section examines passenger-miles sampling and estimation with APCs in detail.
CHAPTER 9: APC Sampling Needs and National Transit Database Passenger-Miles Estimates
9.2.1 Standard Error Targets in the Presence of Bias

Let

  $\bar{Y}$ = mean passenger-miles per trip
  $b$ = relative bias in the passenger-miles estimate ($b = \text{bias}/\bar{Y}$)
  $\bar{y}$ = estimated mean passenger-miles
  $se$ = standard error of the passenger-miles estimate
  $rse = se/\bar{Y}$ = relative standard error

The precision specification can be interpreted as:

\[
P\left(-0.1\bar{Y} \le \bar{y} - \bar{Y} \le 0.1\bar{Y}\right) = P\left(\bar{Y} - 0.1\bar{Y} \le \bar{y} \le \bar{Y} + 0.1\bar{Y}\right) \ge 0.95 \tag{5}
\]

Subtracting $E[\bar{y}] = \bar{Y}(1 + b)$ and then dividing by $se$,

\[
P\left(\frac{-0.1\bar{Y} - b\bar{Y}}{se} \le \frac{\bar{y} - \bar{Y} - b\bar{Y}}{se} \le \frac{0.1\bar{Y} - b\bar{Y}}{se}\right) \ge 0.95
\]

By the Central Limit Theorem, the middle term approaches a standard normal variate as sample size increases; therefore, using the notation $\Phi()$ = cumulative standard normal distribution, the precision requirement becomes

\[
\Phi\left(\frac{0.1 - b}{rse}\right) - \Phi\left(\frac{-0.1 - b}{rse}\right) \ge 0.95 \tag{6}
\]

From relation 6, selected values of permitted relative standard error for a given value of relative bias are shown in Table 11. For manual data collection, assumed bias-free, the permitted relative standard error is 0.051; with 8% relative bias, the permitted relative standard error falls to 0.012. To be safe, a transit agency would do well to limit the permissible bias in passenger-miles or load to less than 8%.

Table 11. Relative standard error required versus measurement bias.

  Measurement Bias*    Permitted Relative Standard Error*
  0.00                 0.0510
  0.01                 0.0500
  0.02                 0.0471
  0.03                 0.0423
  0.04                 0.0365
  0.05                 0.0304
  0.06                 0.0243
  0.07                 0.0182
  0.08                 0.0122
  0.09                 0.0061

  *Relative to mean passenger-miles per trip

9.2.2 Sample Size and Coverage Requirements

The determination of sample size requirements assumes three stages of sampling: in stage 1 all routes are selected; in stage 2, for each route, certain timetable trips are selected; and in stage 3, for each selected timetable trip, certain days are observed. The assumed cv's for trip-level passenger-miles at stages 2 and 3 are:

  $cv_2 = 0.9$ = cv of timetable trip means (within route)
  $cv_3 = 0.3$ = cv of daily passenger-miles (within a given timetable trip)

The assumed values are conservative estimates based on experience with data from many transit agencies. The values reflect the fact that, for a given route, most variation in trip-level passenger-miles is due to differences in where trips fall within the timetable (peak/off-peak, inbound/outbound), rather than random differences between days.

Sample size requirements derived in this section are based on the weekday sample only; the addition of weekends, sampled with the same degree of fleet penetration as on weekdays, will improve precision, although not by much.

The effective penetration rate ($f_3$) is defined as the expected fraction of the daily schedule observed each day. It is the product of fleet penetration rate and data recovery rate.

Covering Every Weekday Trip

With an effective fleet penetration rate as small as 1% and careful rotation, every weekday timetable trip can be observed at least once per year. The annual estimate is determined by calculating average passenger-miles for each timetable trip, expanding by number of days that trip was operated, and summing over all timetable trips. Stratifying to this level is a very effective estimation technique because it eliminates the effect of variability between timetable trips. The weekday sample size requirement is

\[
n \ge \max\left(N_2, \left(\frac{0.3}{rse}\right)^2\right) \tag{7}
\]

where $N_2$ equals the number of weekday timetable trips and $rse$ is the permitted relative standard error from Table 11. For bias up to 8% and for all but the smallest transit systems, the $N_2$ term will control; that is, it is sufficient to simply observe every timetable trip once.

Covering Most Weekday Trips (Two-Stage Sampling)

Logistics and data recovery problems can frustrate plans to observe every weekday timetable trip. The following plan
assumes that only a percentage ($f_2$) of the timetable trips is covered. The estimation procedure is to get an average for each timetable trip that was observed, determine the route average (per trip), expand each route average by the number of trips operated per year, and then sum over all routes. The relative standard error of the estimate is given by

\[
rse^2 = \frac{cv_2^2 \left(1 - f_2\right)}{f_2 N_2} + \frac{cv_3^2}{D f_3 N_2} \tag{8}
\]

where $D$ is the number of weekdays in the year (about 252). For all but the smallest transit systems, the stage 3 term (the second term of equation 8) will be insignificant, and the size of the relative standard error will depend mostly on $f_2$.

Using equation 8, Figure 20 shows the required timetable coverage $f_2$ versus the number of trips in the timetable ($N_2$) for selected values of bias and effective penetration rate. Degree of coverage is restricted to values of 85% or greater, because lower coverage rates suggest poor logistical management with likely sampling biases (e.g., whole routes being missed or seriously undersampled). With 5% bias, 85% coverage is sufficient even for an agency with only 200 trips in the weekday timetable and 1% effective penetration. Only smaller systems with moderate to large bias will need greater timetable coverage or effective penetration.

9.2.3 Intentional Sampling

The recommended estimation procedures just described involve unintentional sampling: the APCs collect data all year long, and the agency just rolls it up. This approach assumes that instrumented buses, for reasons beyond NTD passenger-miles estimation, are being circulated in a manner that covers the entire schedule regularly. Intentional sampling methods with limited sample sizes are clearly inferior, unless data processing procedures are still so undeveloped that each trip's data must be manually checked.

[Figure 20: required timetable coverage rate (vertical axis, 0.80 to 1.00) versus number of trips in the weekday timetable (horizontal axis, 0 to 1200). Curves shown: 8% bias with 1% and 4% effective penetration; 5% bias with 1% and 4% effective penetration; 2% bias with 1% effective penetration.]

Figure 20. Timetable coverage rate required versus timetable size.
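
The quantities behind Table 11 and Figure 20 can be checked numerically. The following is a minimal Python sketch, not the report's own procedure. It assumes relation 6 has the form Φ((0.1 − b)/rse) − Φ((−0.1 − b)/rse) ≥ 0.95, that the weekday sample size requirement (equation 7) is n ≥ max(N2, (cv3/rse)²), and that equation 8 is rse² = cv2²(1 − f2)/(f2·N2) + cv3²/(D·f3·N2), with cv2 = 0.9, cv3 = 0.3, and D = 252 as stated in the text. All function names are hypothetical, and the bisection search and the closed-form inversion of equation 8 for f2 are illustrative choices made here.

```python
import math
from typing import Optional


def phi(z: float) -> float:
    """Cumulative standard normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def coverage_probability(b: float, rse: float) -> float:
    """Left side of relation 6: chance the estimate is within 10% of the true mean."""
    return phi((0.1 - b) / rse) - phi((-0.1 - b) / rse)


def permitted_rse(b: float, target: float = 0.95) -> float:
    """Largest rse satisfying relation 6 for relative bias b (0 <= b < 0.1).

    Bisection is an illustrative choice; it works because the probability
    decreases steadily as rse grows.
    """
    lo, hi = 1e-6, 1.0  # relation holds at lo and fails at hi
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if coverage_probability(b, mid) >= target:
            lo = mid
        else:
            hi = mid
    return lo


def full_coverage_sample_size(b: float, n2: int, cv3: float = 0.3) -> int:
    """Equation 7 (assumed form): weekday trips to observe when every trip is covered."""
    return max(n2, math.ceil((cv3 / permitted_rse(b)) ** 2))


def rse_two_stage(f2: float, f3: float, n2: int,
                  cv2: float = 0.9, cv3: float = 0.3, d: int = 252) -> float:
    """Equation 8 (assumed form): rse when a fraction f2 of timetable trips is covered."""
    return math.sqrt(cv2 ** 2 * (1.0 - f2) / (f2 * n2) + cv3 ** 2 / (d * f3 * n2))


def required_coverage(b: float, f3: float, n2: int,
                      cv2: float = 0.9, cv3: float = 0.3, d: int = 252) -> Optional[float]:
    """Smallest f2 meeting the permitted rse, from solving equation 8 for f2.

    Returns None when the stage 3 term alone already exceeds the permitted
    variance, so that no coverage rate can meet the target.
    """
    slack = permitted_rse(b) ** 2 - cv3 ** 2 / (d * f3 * n2)
    if slack <= 0.0:
        return None
    k = slack * n2 / cv2 ** 2  # requirement: (1 - f2) / f2 <= k
    return 1.0 / (1.0 + k)


if __name__ == "__main__":
    for bias in (0.0, 0.05, 0.08):
        print(f"bias {bias:.2f}: permitted rse {permitted_rse(bias):.4f}")
    print("required coverage, 5% bias, 1% eff. penetration, 200 trips:",
          required_coverage(0.05, 0.01, 200))
```

Under these assumptions the permitted values agree with Table 11 to the printed precision for the biases checked, and the required coverage for 5% bias, 1% effective penetration, and 200 timetable trips comes out just under 0.85, consistent with the statement in Section 9.2.2.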