National Academies Press: OpenBook

Multiagency Electronic Fare Payment Systems (2017)

Chapter: Chapter Seven - Fare Data Management and Reporting

Suggested Citation: "Chapter Seven - Fare Data Management and Reporting." National Academies of Sciences, Engineering, and Medicine. 2017. Multiagency Electronic Fare Payment Systems. Washington, DC: The National Academies Press. doi: 10.17226/24733.

Below is the uncorrected machine-read text of this chapter, intended to provide our own search engines and external engines with highly rich, chapter-representative searchable text of each book. Because it is UNCORRECTED material, please consider the following text as a useful but insufficient proxy for the authoritative book pages.

Because survey respondents were involved in procurement and deployment, the questionnaire focused on collecting and reporting maintenance and ridership topics rather than strategic and tactical planning. In many cases, the processes for analyzing data are developed after the system is accepted. These processes include cleaning, organizing, and managing the data to enable their operational and strategic use. To that end, staffers at WMATA, the Washington, D.C., regional transit authority, were interviewed for a case example on both their data management practice and the types of analysis they perform on the data sets they collect.

LITERATURE REVIEW

Agencies, academics, and vendors are using fare data to develop long- and short-term plans by generating performance measures and applying other analytic techniques. In 2011, the Transportation Research journal (Pelletier 2011) published an extensive literature survey documenting the use of fare data. The article groups the research studies and implementations into three categories:

• Strategic planning (travel time analysis, demand forecasting, mode choice, user behavior, and ridership profiles);
• Tactical planning (trip data, pattern behavior, service adjustments, origin-destination data, journeys and transfers); and
• Operational concerns (revenue, crowding, providing better travel time and load information to customers, and equipment performance and maintenance).

Organizations that manage regional fare systems are also exploiting their data sets more extensively. They are beginning to integrate the data with intelligent transportation system applications and weather data (Chu 2014). The integration requires a thoughtful approach to joining multiple data sets. In the Expert-Modeling journal, Chu relates that “each application . . . requires a different [integration] approach and data enrichment procedure.” Taking this into consideration, many transit fare vendors are offering tools for big data analytics that use electronic fare data (Bus Ride Magazine, Focus on Fare Collection series 2015).

Yet the benefits of fare data have not been fully realized. An evaluation report on the LA TAP program (Bazilio Cobb Associates 2013) reported a “substantial positive impact on the amount and quality of information . . . [H]owever, the potential benefit from the use of this information has not been fully realized.”

PUBLISHED PERFORMANCE MEASURES AND OPEN DATA

Most agencies publish typical ridership and revenue data as part of their “vital data” or public-facing dashboard. Several organizations publish more detailed reports about their regional or single-agency smart card program; these include the Clipper Update and the ORCA Joint Board Program Management Report. The frequency of the performance reports differs: the Clipper Update is published monthly, whereas the ORCA report is published quarterly. The WMATA planning group publishes performance studies on a dedicated blog, which is discussed in the case example. Some agencies provide a portal where citizens can view data by different dimensions. The CTA posts its ridership data on an open data portal (Figure 24).

FIGURE 24 CTA ridership view from its Open Data Portal [Source: CTA Portal (http://www.transitchicago.com/ridership/#open)].

More fare data are being made available to researchers and the public. For example, Washington State DOT is sponsoring research with the University of Washington to collect and publish electronic transaction data for “Transportation Planning and Travel Demand Management Uses” in support of its strategic goal “to optimize existing system capacity” (Results WSDOT, Moving Washington Forward, Research Portfolio).

Several questions were asked in the survey about performance metrics, customer information, and data usage.

Survey Question: Survey respondents were asked to describe the performance metrics that they collect.

According to responses, the major performance metrics cover equipment availability, reliability, and accuracy. The reliability data include mean cycles between failures (MCBF), mean time to repair (MTTR), availability, and failure rates for events such as bill jams or rejections. Many smart card programs outsource maintenance to the vendor, who submits a maintenance report. The reported results are typically graded by the agency as failing (red), just meeting (yellow), or exceeding (green) the contracted reliability requirements (see Figure 25 from the MTC Clipper Update, March 2016 report).

Programs need to report ridership and revenue data such as ridership by classification, farebox recovery rate, and average cost per service hour. Some agencies generate detailed planning data such as origin/destination, mode choice, and ridership profiles by different dimensions. UTA collects key performance metrics such as card validation accuracy, validation response times, validator availability, correct fare accuracy, back office web interface response time, data store availability, data store completeness, availability of the electronic fare inspection solution, electronic fare inspection response time, electronic fare inspection accuracy, card load/reload accuracy, and customer web interface response time.

Typical detailed customer analytics include data and longitudinal comparisons of measures such as:

• Number of unique cards
• Number of active cards
• Card usage by agency (percentage change)
• Unique card use by agency (shows percentage of users who transfer)
• Sales channels by sales type
• Payment transactions vs. loads and product sales
• Monthly ridership usage, including average weekday boardings by agency (side-by-side graphs and charts).

Some analyses are presented as tables and others as charts and graphs (see the examples of a graph versus a bar chart for presenting average weekday boardings in Figures 26 and 27). The differing resolution of the two presentations shows different levels of detail. For example, the Clipper graph shows a drastic dip in boardings around the New Year for 2016, while the ORCA presentation, although showing the relationship from year to year, presents the relative change of each quarter by agency.

FIGURE 25 MTC Clipper Monthly Performance Metrics [Source: MTC’s Clipper Update March 2016, System Performance, December 2015–February 2016 (excerpt). Note: the colors are based on a rating scale found at http://mtcmedia.s3.amazonaws.com/files/Device_and_System_Performance_Level_Rating_Scale.pdf].

FIGURE 26 Graph example of average weekday boardings for Clipper (Source: MTC’s Clipper Update March 2016).
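The reliability measures named in the survey responses (MCBF, MTTR, and availability) and the red/yellow/green grading of vendor maintenance reports can be illustrated with a short calculation. The following is a minimal sketch, not any agency's or vendor's actual method: the DeviceLog fields, the availability formula based on scheduled hours, and the 5 percent margin that separates "just meeting" from "exceeding" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceLog:
    """Hypothetical one-period summary for a fare device (assumed fields)."""
    cycles: int             # transactions (cycles) processed in the period
    failures: int           # failures counted against the device
    repair_hours: float     # total hours spent restoring service
    scheduled_hours: float  # hours the device was scheduled to be in service


def mcbf(log: DeviceLog) -> float:
    """Mean cycles between failures: cycles divided by failures."""
    return log.cycles / log.failures if log.failures else float("inf")


def mttr(log: DeviceLog) -> float:
    """Mean time to repair, in hours per failure."""
    return log.repair_hours / log.failures if log.failures else 0.0


def availability(log: DeviceLog) -> float:
    """Share of scheduled hours in which the device was in service (assumed definition)."""
    return (log.scheduled_hours - log.repair_hours) / log.scheduled_hours


def grade(actual: float, contracted: float, margin: float = 0.05) -> str:
    """Grade a higher-is-better metric against its contracted value.

    red    = below the contracted value (failing)
    yellow = at or slightly above it (just meeting); the margin is an assumption
    green  = above the contracted value by more than the margin (exceeding)
    """
    if actual < contracted:
        return "red"
    if actual <= contracted * (1 + margin):
        return "yellow"
    return "green"
```

For a device with 120,000 cycles, 4 failures, 6 repair hours, and 600 scheduled hours, the sketch gives an MCBF of 30,000 cycles, an MTTR of 1.5 hours, and an availability of 99 percent. Against a contracted MCBF of 20,000 the device would be graded green under the assumed margin.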

FIGURE 27 Bar chart example of average weekday boardings for ORCA; chart title: “ORCA as a % of Total Weekday Boardings” (Source: ORCA Board Report).

Respondents with mobile apps collect measures such as:

• Total number of downloads per day (by operating system, cumulative downloads)
• Average transaction price
• Utilization by operating system (e.g., iOS, Android)
• Fare/ticket composition
• Percentage of mobile app sales compared with (a) credit card sales and (b) pass sales.

Survey Question: Respondents were asked a related question about what type of information they collect on their customers.

Survey results showed that many agencies allow customers to register their cards and that the minimum data they collect are name, phone number, and e-mail address (Figure 28). Many agencies also collect travel behavior data, although these data are not directly associated with individual customers. Some smart card programs post customer travel and transaction histories and trip experiences to the customers’ accounts. “Other” category responses included personal information for reduced fare customers, cell phone number, and purchase history. Two agencies specifically stated that “no information is collected” or “none unless the customer registers their card and opts to provide information.”

When collecting information on customers, agencies need to protect personally identifiable information. Systems that store bank-issued card or payment data for autoload and reload product functions may be required to comply with payment card industry security standards.

FIGURE 28 Survey response to information collected on customers (bar chart labels indicate percentage and count) (sample size = 22).

CASE EXAMPLE: WMATA DATA MANAGEMENT, ANALYSIS, AND VISUALIZATION

Considering the volume of fare data collected on a daily basis, the data can be accessed and analyzed only when the data sets are managed. As one of the first agencies to implement a common electronic fare system in the United States, WMATA has been collecting and managing fare data since the initial deployment (W. Wu, WMATA, personal communication, Mar. 2016). To make its data available to downstream users, WMATA has implemented robust data management, addressed the clean-up issues typical of fare data, integrated fare data with data from multiple systems, and provided analysis tools for downstream users. One of the beneficiaries is its planning staff, which often publishes studies and analyses using WMATA’s multimodal fare data. WMATA planners then provide the results, the analytic methods, and the data for public review on the Metro Planning Blog (as shown in Figure 29) (PlanItMetro 2016).

FIGURE 29 WMATA’s planning blog [Source: PlanItMetro (http://planitmetro.com)].

WMATA Fare Data Management

The fare data management process begins with the raw fare data obtained from the SmarTrip fare collection system. Data from all the regional agencies are stored in the NextFare5 database. Each agency has its own procedures for extracting and analyzing the data. The fare system also includes a report generator to view and generate reports from the database. WMATA has developed its own processes and data stores (or “data mart”) to retain the data and provide fast, efficient access to information for analysis and reporting. Table 12 describes the data set, type, content, and usage for three data sets used at WMATA by downstream stakeholders, including the public.

Cleaning Process

WMATA staff has built a procedure to extract, transform, and load (ETL) the data from the operational NextFare5 database to the fare data mart. The cleanup procedures are usually embedded in the ETL procedures. Cleaning is based on ensuring the data are “consistent, complete and correct” (E. Durham and K. Vamsi, Business Intelligence Group, WMATA, personal communication, April 2016). The procedures check for complete data in mandatory fields, correct formats, meaningful data, and logical consistency between different data files. The procedures are driven by known problems and issues seen in the data. For example, for the Metrorail service, where transactions are captured when customers tap on and off the system, a tap-on card identifier (ID) is matched to a tap-off ID to build the origin-destination (OD) pair. A missing origin (tap-on transaction) or destination (tap-off transaction) will cause an error. The ETL captures the error and applies a correction to handle the missing data. In addition, once the data have been cleaned, the ID is usually stripped from the data to protect individual privacy from data users.

WMATA differentiates cleaning from integration processes. A cleaning process validates that every mandatory field is filled, that the data in those fields are formatted correctly, and that the data are meaningful and logically consistent. The OD pair example is a logical consistency rule. An integration process might assign a different format to data (change the date format), aggregate data (group bus stops to zip codes), or apply an algorithm to data (infer an OD pair for bus travel by tracking the fare card through its daily usage). Integration implies that an inference or analysis is made about the data.

When fare data come in late, the approach may require an integration process rather than a cleanup process. It depends on the downstream use of the data; for example, when late-arriving transaction data belong to a period that has already been stored in an aggregated format, the integration process may need to be reapplied to the source data to ensure correct and valid information.

Sometimes integration needs drive clean-up requirements. For example, an agency might want to track the progress made in fixing a fare card validator error. In this example, integration requires matching the maintenance work order and the fare transaction data through the equipment, failure, and status codes. A typical work order may record the failure code in a text field rather than in specific failure and status code fields, so the cleaning process requires parsing the failure code from the text field and creating a field in which it resides. The procedure is implemented as an ETL, and the resulting data are stored in the appropriate data mart so that they can be integrated with the fare data.

TABLE 12 WMATA FARE DATA SOURCES

Data set name: Nextfare5
  Type: Operational data (raw). Retention: one (1) year. Updated twice per day.
  Content: Includes every transaction by card ID (origin-destination pairs).
  Uses: Planning. Used for customer-based information that uses card IDs to study customer behavior, such as:
    • Retention rate
    • Abandonment rate
    • Influx (new user) rate
    • Change in usage (mode use by card and change over time), including use of purse on card (e.g., stored value, benefits, parking) and change in usage, e.g., when Congress changed the benefits deduction from $240 to $120
    • Anonymous user usage
    • Bus: service demand only (due to only having boarding data).
    Other users: Accounting, modes (bus, rail, MetroAccess).

Data set name: Data Mart
  Type: Raw and aggregated. Retention: in perpetuity. Updated daily.
  Content: All data from raw and aggregated. Longitudinal and aggregated data are available for users by facts/dimensions based on user requirements.
  Uses: Data visualization and reporting tools. Tableau is used by Planning for analytics and reporting. Cognos is used for analytics and reporting. Users can query data and generate their own reports.

Data set name: Ridership
  Type: Normalized (made consistent over time).
  Content: Normalized for route changes, equipment failures, and other operational issues.
  Uses: Used to generate ridership information that is presented to the Board and the public.

Source: WMATA interview results.
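The Metrorail logical consistency rule described in the cleaning process (matching a tap-on to a tap-off for the same card to build an OD pair, capturing records with a missing origin or destination, and stripping the card ID afterward) can be sketched in a few lines. This is an illustrative sketch only and not WMATA's ETL: the Tap record layout, the function name, and the error labels are assumptions, and the sketch only flags incomplete trips because the source does not say what correction the ETL applies.

```python
from dataclasses import dataclass


@dataclass
class Tap:
    """Hypothetical raw rail transaction (field names are assumptions)."""
    card_id: str
    station: str
    minute: int   # minutes since the start of the service day
    kind: str     # "on" for entry, "off" for exit


def build_od_pairs(taps):
    """Match each tap-on to the next tap-off of the same card.

    Returns (pairs, errors). Pairs carry no card ID, mirroring the step of
    stripping the identifier once cleaning is done. Errors list the taps
    that could not be matched so a later step can decide how to correct them.
    """
    pairs, errors = [], []
    open_tap = {}  # card_id -> tap-on still waiting for its tap-off
    for tap in sorted(taps, key=lambda t: (t.card_id, t.minute)):
        pending = open_tap.get(tap.card_id)
        if tap.kind == "on":
            if pending is not None:
                # Two entries in a row: the earlier trip has no destination.
                errors.append(("missing_tap_off", pending.station, pending.minute))
            open_tap[tap.card_id] = tap
        elif tap.kind == "off":
            if pending is None:
                # Exit without an entry: the trip has no origin.
                errors.append(("missing_tap_on", tap.station, tap.minute))
            else:
                pairs.append({
                    "origin": pending.station,
                    "destination": tap.station,
                    "entry_minute": pending.minute,
                    "travel_minutes": tap.minute - pending.minute,
                })
                del open_tap[tap.card_id]
        else:
            errors.append(("unknown_kind", tap.station, tap.minute))
    # Any entry still open at the end of the day has no destination.
    for pending in open_tap.values():
        errors.append(("missing_tap_off", pending.station, pending.minute))
    return pairs, errors
```

Keeping the unmatched taps in a separate list, rather than discarding them, leaves room for the kind of correction step the text mentions and for reporting how often each problem occurs.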

Unlike the data in the Nextfare5 database, the fare data in WMATA’s data mart are clean, validated, and already integrated, aggregated, and stored with other data sources. With this level of data treatment, the application of analytical methods becomes repeatable, fast, and reliable.

Integration Process

The process of joining more than one data set is an integration process. When fare data are combined with other data, they become much more useful. For example, when a drop in ridership on certain days can be shown against inclement weather, the economic effects of bad weather can be measured and analyzed, operations resource needs predicted, and other impacts assessed.

The integration process is driven by a thorough vetting and description of analysis and reporting needs, issues, and integration requirements as articulated by the stakeholder. WMATA staff develops a requirements specification for each data integration and performance measurement project to ensure that all of the internal customer’s business needs are met. Each specification uses the Business Process Engineering Data Mart Requirements Specification Format (see text box). The skills required to accomplish these tasks include database management, programming, data analytics, and systems engineering.

WMATA’s Business Process Engineering Data Mart Requirements Specification Format

Each specification includes the following sections:
• Points of contact (Information Technology staff and stakeholders)
• Project purpose (business objective, scope)
• Project description (including mockup of resulting report, visualization, or dashboard)
• Assumptions and constraints
• Interface requirements
    Principles
    System input (data sets)
    System outputs (services, visualizations, products)
• Requirements; types of requirements include:
    Business
    User roles
    Access requirements
    Reporting requirements
• Availability
• Business process workflow diagram
• Acceptance criteria

Source: WMATA Business Intelligence Group, Outline of WMATA Customer Travel Time Requirement Specification, Document Version #1, Feb. 9, 2016.

One such project developed a dashboard for SmarTrip account customers to view the travel time histories for their five most frequented trips. The scope, as defined in WMATA’s customer travel time requirement specification (Document Version #1, Feb. 9, 2016), included:

• Provide a web interface to allow registered SmarTrip customers to view their travel time scores;
• Enhance customer satisfaction by putting delays in the context of successful trips;
• Enhance transparency;
• Educate customers about travel time standards; and
• Provide an incentive for more users to register their SmarTrip cards.

The project description lists two phases and describes the performance measures that will be presented to the customer. Phase 1 performance measures include the total on-time score as a percentage; the number of miles traveled on Metrorail; the number of trips made on Metrorail; and the number of stations visited.

In developing the project specifications, specific data sources are identified as the system inputs, among them: automatic fare collection (AFC)/data network connector (DNC) data on entries and exits (the same data used to generate current usage reports); a table of WMATA service standard travel time ranges for station pairs by service period from the office of performance (CPO); and information about miles between all station pairs from CPO. The requirements sections also list business requirements, such as the included data (“records that have [OD] fields filled out will be included in all computations”); the excluded data (“records with actual travel time less than [two minutes] of the minimum time between a certain OD pair will be excluded from all computations”); and the performance measures, such as the percentage of trips on time for a given OD pair and station pairs with five or more trips in the previous three months.

This approach may appear onerous; however, it results in a documented process that identifies inputs, outputs, functions, assumptions, responsibilities, and success criteria.

Data Integration: WMATA’s Thoughts

WMATA’s advice is “do not integrate your data until you know your business requirements” (E. Durham and K. Vamsi, Business Intelligence Group, WMATA, personal communication, April 6, 2016). WMATA staff offered these four observations:

• Work with a knowledgeable data customer. WMATA’s data integrators work closely with their internal customers to make the best choices for integrating data sources. There might be competing sources for data at different resolutions and levels of precision. A knowledgeable customer will know which data set has the highest quality, best reliability, greatest timeliness, and easiest availability.
• Understand how to test and validate integrated data. The last part of the integration process, particularly when the process is based on inferences, is to validate the algorithms used to integrate data sources. Comparing the algorithm results to observations (“truth”) is not always possible, and it is often labor-intensive. Data integrators can test multiple algorithms and, if they produce similar results, use them to validate each other. The validation methods will vary depending on the downstream user’s business objectives and requirements.
• Create repeatable processes. Many times, researchers will extract data from different sources and manually build the integration. This approach works for one-time research projects, but not for organizations where operational data must be reviewed and reported on a more frequent basis. As demonstrated by the WMATA staff’s cleaning and integration processes, these efforts are time-consuming and labor-intensive. Even when vendors are hired to integrate data sets, best practices for ETL development, results validation, and integrated data set storage are still necessary to ensure quality and repeatability. WMATA staff have integrated universal, enterprise formats, which are incorporated through data governance procedures so the data are accessible to all.
• Stage data to be presented effectively. For fast and efficient use, WMATA stages all the information needed to answer a question or respond to a query. If data need to be seen in more detail, WMATA creates separate but linked data tables to store the more detailed information.

WMATA Performance Analysis and Visualization

WMATA uses fare data for the usual revenue and ridership assessments, but it also uses the data to analyze customer behavior and system performance. For example, fare data are used to understand long-term trends, inform policy, and measure performance, as well as to learn about customers’ usage, such as “riders’ travel patterns, journey times, and much more” (Antos 2014). These analyses are better understood when data from multiple sources are displayed using innovative visualizations. As one of the few North American agencies that collect customer station entry and exit data, WMATA has exceptional information by location. The planning group has made great use of the available data, having published studies based on its SmarTrip fare data on topics such as:

• Metrorail ridership by day (see Figure 30). The visualization showed more than 3,000 individual data records representing more than one billion rail trips over six years.
• Metrorail ridership during holidays that fall on weekdays compared with normal weekday schedules.
• Metrorail ridership by origin, destination, day of week, and quarter-hour intervals over the course of several years.
• Metrorail transit delay calculation, and the impact of transit delays based on fare OD pairs (see Figure 31). The study results will provide general travel time information to all riders, and may eventually be provided to registered users on their accounts to show their trip travel time history.
• Customer ridership by time of day usage. Figure 32 shows rider characteristics (related to frequency of use over a month) by period of day.

FIGURE 30 Metrorail system ridership by day (excerpt 5 of 9 years) (Source: PlanItMetro).
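A study such as Metrorail ridership by origin, destination, day of week, and quarter-hour interval amounts to grouping cleaned OD records by those dimensions and counting them. The sketch below shows one way to do that grouping with the Python standard library. It is an illustration under stated assumptions, not the planning group's method: the trip tuple layout, the function names, and the choice to bucket by entry time are all hypothetical.

```python
from collections import Counter
from datetime import datetime

# Day names are listed explicitly so the output does not depend on locale.
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def quarter_hour(ts: datetime) -> str:
    """Label the 15-minute interval containing ts, e.g. 08:07 -> '08:00'."""
    start_minute = ts.minute - ts.minute % 15
    return f"{ts.hour:02d}:{start_minute:02d}"


def ridership_counts(trips):
    """Count trips by (origin, destination, day of week, quarter hour).

    trips is an iterable of (origin, destination, entry_time) tuples;
    bucketing by entry time is an assumption made for this sketch.
    """
    counts = Counter()
    for origin, destination, entry_time in trips:
        key = (origin, destination,
               DAYS[entry_time.weekday()], quarter_hour(entry_time))
        counts[key] += 1
    return counts
```

Because the result is a plain counter keyed by the four dimensions, it can be summed over any subset of them, for example to get ridership by day of week alone.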

FIGURE 31 Metrorail transit delay calculation (Source: PlanItMetro).

FIGURE 32 Customer base ridership by time of day (Source: PlanItMetro).

The Metrorail transit delay information based on fare data (tap in and tap out), shown in Figure 31, illustrates the variation in travel times for a single origin-destination pair. These data will be made available as a performance metric on WMATA’s website to show the most traveled OD station pairs and the impact of delays on the service. Although the transit delay data provide a snapshot of the impact of different types of delays, they do not show the impact on each customer. For example, most commuters travel round-trip between the same two stations on their regular commutes. In a new initiative, Metro will generate a performance metric for each registered SmarTrip customer’s travel time history between his or her most frequented OD pairs.

In the time of day study (Figure 32), riders were sorted by their frequency of use. The study revealed that the “infrequent,” “occasional,” and “rarely used” riders composed the majority of midday trips. At the time of the study, categories of user could not be reliably distinguished because limited-use fare media were available. Since the elimination of limited-use paper media in January 2016, WMATA planners believe that riders will pay for, keep, and use their plastic cards, and that the data from the fare transactions will better reflect travel behavior and frequency of ridership.
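The per-customer travel time metric can be pictured by applying the business rules quoted from the requirements specification: include only records with both OD fields filled, exclude implausibly short trips, keep station pairs with five or more trips, and report an on-time percentage for the most frequented pairs. The following is a rough sketch under several assumptions and is not WMATA's specification or code. In particular, the quoted exclusion rule is ambiguous, so the sketch simply drops trips shorter than the lower bound of the standard range, and it treats a trip as on time when it does not exceed the upper bound. The function name and data layout are hypothetical.

```python
from collections import defaultdict


def on_time_scores(trips, standards, min_trips=5, top_n=5):
    """On-time percentage for one customer's most frequented OD pairs.

    trips:     iterable of (origin, destination, travel_minutes)
    standards: {(origin, destination): (min_minutes, max_minutes)},
               a stand-in for the service standard travel time ranges
    """
    flags = defaultdict(list)  # OD pair -> list of on-time booleans
    for origin, destination, minutes in trips:
        if not origin or not destination:
            continue  # rule: only records with OD fields filled are included
        standard = standards.get((origin, destination))
        if standard is None:
            continue  # no standard available for this pair (assumed handling)
        low, high = standard
        if minutes < low:
            continue  # assumed reading of the exclusion rule for short trips
        flags[(origin, destination)].append(minutes <= high)

    # Keep pairs with enough trips, then take the most frequented ones.
    eligible = {pair: f for pair, f in flags.items() if len(f) >= min_trips}
    top = sorted(eligible, key=lambda pair: len(eligible[pair]), reverse=True)[:top_n]
    return {pair: round(100 * sum(eligible[pair]) / len(eligible[pair]), 1)
            for pair in top}
```

The thresholds are parameters so that the five-trip minimum and the five-pair display limit mentioned in the text can be changed without touching the logic. The three-month window is assumed to be applied when the trips are selected, before this function is called.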


TRB's Transit Cooperative Research Program (TCRP) Synthesis 125: Multiagency Electronic Fare Payment Systems describes the current practice, challenges, and benefits of utilizing electronic fare payment systems (EFPS), such as smart cards. This synthesis reviews current systems and identifies their major challenges and benefits; describes the use of electronic fare systems in multimodal, multiagency environments; and reviews next-generation approaches through existing implementation case examples.
