National Academies Press: OpenBook

Open Data: Challenges and Opportunities for Transit Agencies (2015)

Chapter: Chapter Eight - Conclusions

Suggested Citation:"Chapter Eight - Conclusions ." National Academies of Sciences, Engineering, and Medicine. 2015. Open Data: Challenges and Opportunities for Transit Agencies. Washington, DC: The National Academies Press. doi: 10.17226/22195.


CHAPTER EIGHT - CONCLUSIONS

SUMMARY OF PROJECT SCOPE

The primary purpose of this synthesis is to determine transit’s experience with open data, how agencies have opened their data, and the uses of the data. A survey was used to collect key information about open transit data and was sent to 67 transit agencies around the world. There was a 100% response rate. Of the 67 surveys received, three were from Canadian agencies and 14 from European agencies. The project examined and documented the state of the practice in open data using the following five elements:

• Characteristics of open transit data
  – Reasons for choosing to provide open data
  – Standards and protocols for providing open data
  – Underlying technology used to generate the open data
• Legal and licensing issues and practices
  – Legal and licensing issues
  – Public disclosure practices
• Uses of open data
  – Applications
  – Decision-support tools
  – Visualizations
• Costs and benefits of providing open data
• Opportunities and challenges
  – Techniques for engaging users and reusers of data
  – Challenges associated with providing open data
  – Impacts on transit agencies and the public and private sectors

The project was conducted in four major steps as follows:

• Literature review;
• Survey to collect information on a variety of factors;
• Analysis of survey results; and
• Interviews conducted with key personnel at agencies that have experience with open transit data.

This section of the report contains the project’s findings, lessons learned, and conclusions.

PROJECT FINDINGS

Key statistics from the study are as follows:

• Fifty-seven or almost 83% of the survey respondents provide open data; and
• The top four reasons for not providing open data are:
  – Too much effort to produce the data/not enough time or people to do the work required;
  – Too much effort to clean the data;
  – Concern that the agency cannot control what someone will do with the data; and
  – Concern regarding the accuracy of the data.

KEY FINDINGS

In summary, based on the literature review, the responses to the questionnaire, and the case examples, there are four key findings of this synthesis project:

1. Although the costs of providing open data are not well understood, the benefits to the agency, public, and community strongly support open transit data. The availability of open transit data encourages innovation that could not be accomplished solely by agency staff. The rapid creation of new mobile and Internet platforms, requiring new information technology (IT) development, would create a strain on typically limited agency resources. By focusing the limited resources on providing accurate, reliable, and timely open data, an agency can cost-effectively provide its information to the public, relying on third parties (e.g., application developers) to create customer applications and conduct data analyses.

   The overall benefits experienced by survey respondents included the following:
   • Increased awareness of the agency’s services;
   • Empowerment of customers;
   • Encouragement of innovation outside of the agency;
   • Improvement in the perception of the agency (e.g., openness/transparency);
   • Provision of opportunities for private businesses;
   • Encouragement of innovation internally;
   • Improvement in our market reach;
   • Greater efficiency and effectiveness as an agency;
   • Increased return on investment (ROI) from existing web services;
   • Cost savings; and
   • Ability to reassign staff.

   The Moscow case study shows the power of open data to change decision processes.

2. Engaging application developers, other data users, and customers is an approach that can accomplish several critical tasks, including:
   • Obtaining feedback on data anomalies and data quality issues;
   • Ensuring that some portion of the applications developed by third parties meet the needs of customers; and
   • Learning more about how people want to use/reuse agency data.

   There are several ways to engage developers and customers. Results of the survey indicated that the most effective methods are conducting face-to-face events, conferences, and “meetups.” Meetups are informal meetings to discuss particular topics, such as app development. For example, Mobility Lab in Arlington, Virginia, hosts meetups to discuss transportation issues and support programmers who are interested in transit; biking and walking; and open data, data visualization, and mapping.

3. The results of the literature review and survey indicate that standards and commonly used formats should be used to facilitate the generation and use of the open data. The literature discusses how standards are used to generate the open data, such as the case of many scheduling software packages providing schedule data in General Transit Feed Specification (GTFS) format. Further, using standards makes it easier to transfer applications from one agency to another, which was the case when Worcester Regional Transit Authority (WRTA) was looking for applications; it was easy to take applications that were developed for Chicago Transit Authority (CTA) and adapt them for WRTA because of the standards used. Without standards, planning and operations analyses, such as those described by Wong (19) and Catalá et al. (75), could not be accomplished easily.

4. Opening transit data results in innovation that could not be accomplished within a transit agency. That is not to say that the intellect does not exist in a transit agency; rather, it is an issue of having sufficient resources to develop applications and conduct analyses at the scale that can be done in an open market. Stephen Goldsmith, in the article “Open Data’s Road to Better Transit” (102), mentions that “some members of the American Public Transportation Association believe that open data initiatives have catalyzed more innovation throughout the industry than any other factor in the last three decades.”

FINDINGS BASED ON FIVE ELEMENTS

Specific findings based on the aforementioned elements are as follows.

Characteristics of Open Transit Data

As expected, the top three types of open data are routes, schedules, and station/stop locations. This result correlates directly with the high use of GTFS, which requires these data elements, among others. Further, these data types are required to perform trip planning, which is the subject of many customer applications developed using open transit data. The next most common type of open data is real-time information, which is provided by more than half of the survey respondents. This corresponds to the use of either GTFS-realtime or Service Interface for Real Time Information (SIRI) by almost half of the survey respondents.

The most prevalent underlying technologies that produce open data are scheduling software, geographic information system (GIS) software, computer-aided dispatch (CAD)/automatic vehicle location (AVL), and real-time arrival prediction software. This finding is expected, given the types of open data reported by survey respondents.

The overwhelming reasons for opening transit data are related to customer information: increasing access to this information, and improving the information and customer service. This result corresponds with almost all of the survey respondents indicating that providing open data is a way to maintain or increase ridership.
Improving perception of the transit system and fostering innovation were the next most frequently reported reasons for opening data.

The factors that went into the decision about what data to open were driven primarily by the ease of releasing the data (more than half of the survey respondents indicated this). The next two most prevalent decision factors were observing what other transit agencies have done regarding open data and deciding internally without asking any groups outside the agency.

A variety of standards and formats are being used, including GTFS (47 or 83.9% of respondents), Extensible Markup Language (XML) (26 or 46.4%), and comma-separated values (.csv) (18 or 32.1%), followed by GTFS-realtime (15 or 26.8%). The degree of openness in the four categories mentioned is as follows:

• Thirty-two or 57.4% of the respondents reported that the data are completely open (everyone has access).
• Forty-seven or 83.6% reported that the data are available in formats that are easily retrieved and processed.
• Forty-nine or 87.3% reported that there is no cost for the open data.
• Forty-three or 79.2% reported that there are unlimited rights to use, reuse, and redistribute data.

Legal and Licensing Issues and Practices

Twenty-nine or 50.9% of the survey respondents reported that their agency requires a license or agreement to use the open data. The top three elements that license agreements cover are the right to use the agency’s data; nonguarantee of data availability, accuracy, or timeliness; and liability limitations for missing or incorrect data. Almost 60% (16 responses of 27) of respondents indicate their agency requires acknowledgment of a license agreement before data can be accessed. Only one respondent reported agency legal issues resulting from the release of open data to the public.

According to the respondents, the top three steps that agencies took to publicly disclose data are to (1) convert transit data into formats suitable for public use; (2) improve data quality to ensure accuracy and reliability; and (3) adopt an open, nonproprietary data standard.

Uses of Open Data

The top five types of customer applications that have been developed as a result of providing open data are (in descending order of frequency) trip planning, mobile applications, real-time transit information (arrival/departure times, delays, detours), maps, and data visualization. The top five decision-support tools that have been developed are data visualization, service planning and evaluation, route layout and design, performance analysis, and travel time and capacity analysis.

Almost two-thirds (33 or 63.5%) of respondents reported their agencies do not track usage of open data. The two most prevalent methods of tracking are to monitor data downloads and to keep track of applications developed. For mobile applications, an equal number of respondents reported Android and iOS applications. Sixteen respondents reported a total of almost 266 million API calls per month.

Costs and Benefits of Providing Open Data

The top five types of costs associated with providing open data are staff time to update, fix, and maintain data as needed; internal staff time to convert data to an open format; staff time needed to validate and monitor the data for accuracy; staff time to liaise with data users/developers; and web service for hosting data. Almost 90% (43 or 89.4%) of respondents could not quantify how much time is spent on any of these activities.
Although activities required to provide open data were identified by respondents, resource requirements varied widely. There was limited information regarding the actual labor required from specific staff in the organization or the costs associated with open data.

Finally, the top three benefits experienced by survey respondents are (1) increased awareness of their services, (2) empowerment of customers, and (3) encouragement of innovation outside of the agency. Almost 70% (33 or 69.6%) of the respondents engage or have a dialogue with existing and potential data users and reusers. Twenty-five or 75.8% of the respondents engage data users and reusers to obtain feedback on data anomalies and data quality issues. Twenty-four or 60% of the respondents use face-to-face events to engage these groups.

Opportunities and Challenges

In terms of impacts on the agency and the public and private sectors, the majority of impacts reported by respondents were positive. The organizational impacts on the agency ranged from increased transparency to better and more accurate internal data to lower costs to provide information. Impacts on the customer were numerous, including better and more accessible information for customers; better perception, visibility, and awareness of services; and improved customer satisfaction. The majority of negative impacts were related to resources needed to maintain an open data program.

In terms of impacts on the public, creating and improving access to additional and higher quality public services was mentioned, along with improving public perception/image of transit, making transit more competitive, providing better regional coordination of services, encouraging innovation, and providing a better transit experience.
The impacts on the private sector are primarily providing business/commercial and development opportunities, including new and expanded companies (e.g., creating a new ecosystem of private entrepreneurs), enabling innovation and the creation of applications that may not have been created by the public sector, and adding value to existing public services.

Challenges were noted by survey respondents in five areas, as follows:

• Resources and organizational issues, which largely consist of limited resources and securing support for an open data program;
• Data quality and timeliness issues, which largely describe having to ensure data quality, completeness, timeliness, accuracy, and equity;
• Standards and formatting issues;
• Marketing issues relating to making the open data known and addressing branding issues;
• Technical issues, which consist of tracking users, including who has built what apps and how successful they have been; complying with developers’ wishes; determining how to provide large amounts of data in a timely manner; and having a process in place to make the data available when new schedules are released.

LESSONS LEARNED

Respondents cite the following lessons learned:

• Data quality and accuracy
  – Put quality checks in place when opening data
  – Test the data before releasing it to developers
  – Start small in terms of the amount of open data offered, and expand it once confident in the data quality of new sources/data sets
  – Ensure data are compatible with or identical among the different formats in which they are made available
• Cost of open data
  – Staff to support an open data program is needed to implement such a program
  – Use standards to make it easier to provide open data
  – Select a technology vendor that supports open data or require it in the contract with the vendor
• Organizational and institutional effects, including changes within and external to the agency
  – Agencies have to get comfortable with providing data when they are accustomed to providing only transit service.
  – Open data will not solve every customer requirement.
  – Customers will recognize which third-party services are most effective, and they will not hold the agency responsible for poor third-party services.
  – It is important that agencies not interfere with the market to ensure that the benefits of competition can be realized.
  – An open data program should be supported by a project champion.
  – Staff roles must be carefully assigned.
  – Buy-in from coordinating agencies is crucial.
  – Open data are a fundamental part of an overall information system.
  – Agencies must ensure that data reuse complies with public policies.
• Developing relationships with and engaging data users and reusers
  – Early engagement with potential users is key. Find out what they want and how they want it. Try to track who is developing what, particularly to understand the successes and failures.
  – Respond quickly to opportunities. Developers work on much shorter schedules than planners.
  – Make it as easy as possible for developers to access data, and make the license understandable.
  – Developers will help determine the quality of the data if the agency provides a forum for this type of feedback.
  – Developers will know the latest mobile platforms and can use these with the data.
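The data quality lessons above ("put quality checks in place" and "test the data before releasing it to developers") could be approached with a small automated check run before each release. The following is a minimal sketch, not a method described by the survey respondents. It assumes the agency publishes GTFS and checks only two things: that stop coordinates are plausible and that every stop referenced in stop_times.txt exists in stops.txt. The function names and the sample data are hypothetical; the column names (stop_id, stop_lat, stop_lon, trip_id) come from GTFS.

```python
import csv
import io


def read_table(text):
    """Parse one GTFS text file (CSV with a header row) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def check_feed(stops_txt, stop_times_txt):
    """Return a list of human-readable problems found in the two tables.

    Hypothetical helper for illustration; a real release process would
    cover many more files and rules.
    """
    problems = []
    stop_ids = set()
    for row in read_table(stops_txt):
        sid = row["stop_id"]
        if sid in stop_ids:
            problems.append(f"duplicate stop_id {sid}")
        stop_ids.add(sid)
        try:
            lat, lon = float(row["stop_lat"]), float(row["stop_lon"])
        except ValueError:
            problems.append(f"stop {sid}: non-numeric coordinates")
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append(f"stop {sid}: coordinates out of range")
    # Every stop referenced by a trip must be defined in stops.txt.
    for row in read_table(stop_times_txt):
        if row["stop_id"] not in stop_ids:
            problems.append(
                f"trip {row['trip_id']}: unknown stop_id {row['stop_id']}"
            )
    return problems


# Made-up sample data: S2 has an impossible latitude, and trip T1
# references a stop (S3) that is not defined.
STOPS = (
    "stop_id,stop_name,stop_lat,stop_lon\n"
    "S1,Main St,42.26,-71.80\n"
    "S2,Elm St,142.0,-71.81\n"
)
STOP_TIMES = (
    "trip_id,arrival_time,departure_time,stop_id,stop_sequence\n"
    "T1,08:00:00,08:00:00,S1,1\n"
    "T1,08:05:00,08:05:00,S3,2\n"
)

if __name__ == "__main__":
    for problem in check_feed(STOPS, STOP_TIMES):
        print(problem)
```

An empty result would mean that none of these particular checks failed; it would not establish that the feed is accurate, so developer feedback of the kind described above remains useful.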
CONCLUSIONS

Several conclusions can be drawn from the results of the synthesis.

• The benefits of providing open transit data far outweigh the costs. Benefits accruing to the agency itself, customers, and the public and private sectors are far-reaching. Several of the survey respondents discussed using open data as a way to improve their agency’s ability to conduct analyses internally and the perception of public transit within the community. In addition, the agency’s transparency as a result of open data has had more positive than negative effects. In a time when public agencies are being scrutinized more than ever, providing data about their operations and internal processes reflects overcoming the old thinking that data should not be released beyond providing paper schedules.
  – The impacts of open transit data on customers and the general public are significant. Now customers (and those who have not yet taken transit) have free tools that essentially break down the barriers to using transit, such as interpreting a paper schedule or map. Further, real-time information makes it easier to plan trips. In addition, the tools resulting from open data satisfy the desire many people have for obtaining travel information almost instantaneously. However, one important factor in assessing the customer and public impact is ensuring that the tools being developed actually fulfill the customers’ needs.
  – The impacts on the private sector have been encouraging over the past several years. Applications and visualizations that perhaps could not have been conceived or developed by a transit agency have been created. These apps have changed the nature of travel, where in some cases, the public transit option is more prominent and understood. Further, this has resulted in businesses being established that might not have existed if not for open transit data. Finally, developers are creating innovative ways in which to analyze the data, resulting in potential improvements in service.
• The legal fears often thought to be barriers to opening transit data have not been realized. The survey results show that only one agency responding to the survey experienced any legal issues resulting from the release of open data to the public. The literature, survey results, and case examples indicate that simple agreements with data users and reusers can accomplish what is needed to ensure proper use and distribution of the data, along with rules regarding the use of logos, images, and so forth. In addition, as stated in the survey responses, having a plan in place to handle irresponsible users is critical. For example, several techniques for managing irresponsible users included contacting developers and discussing the problem, and limiting or terminating access to the data.
• Standards greatly facilitate the use of open transit data. Although this sometimes requires additional effort in producing the open data, it makes it much easier for the data to be used. Clearly, from the literature review and survey, GTFS has become a de facto standard, with at least 726 agencies using it. Further, it is being used, as reported, in a number of agencies that are just beginning to open their data, particularly those that have provided only paper-based data (e.g., schedules) for fixed-route services (and in some cases, no information at all about other services). In addition, the use of standards has facilitated traditional planning and analysis of transit data, as reported extensively in the literature. Further, even vendors with proprietary products have developed “translators” that reformat the data within their software to one of the standard formats. Finally, standards are still evolving. Open standards, such as GTFS, OpenStreetMap, and OpenTripPlanner, have led the way in the transit industry and are being used extensively to create new applications.
• Engaging with data users and reusers has the potential to increase the value of the applications and visualizations that are developed. Engaging with developers and the public will ensure that developers are taking customers’ needs into consideration. Further, there are many different ways to engage users. In addition, the survey responses indicate that methods of engagement might be based on the sophistication of the agency in terms of open data.
• Several factors lead to a successful open data program.
  – Obtaining and maintaining management-level support for such a program and avoiding bureaucratic delays. This factor speaks to embracing transparency, realizing that transit will be more visible in the community and that there is the potential to improve the perception of transit as a result of providing open data.
  – Recognizing the need for the appropriate level of resources needed to provide and maintain open data.
  – Establishing ways to monitor data accuracy, timeliness, reliability, quality, usage, and maintenance is a key component of an open data program. Making a decision as to whether each application based on the open data will be tested is part of this factor. Some agencies let the market decide if an application is good or not, and others test each application.
  – Creating and maintaining licensing or registration to ensure that if a data user or reuser is misusing the data, action can be taken with minimal effort.
    As suggested by Bay Area Rapid Transit (BART) and Massachusetts Department of Transportation (MassDOT), a license or registration should be simple, conveying the basic principles associated with using the data.
  – Having an ongoing dialogue with developers and customers regarding the open data program has been shown to increase the value of the data and products that are based on the data.

SUGGESTIONS FOR FUTURE STUDY

Based on the survey results and literature review, the following areas are suggested for future study:

• Using open data to support performance measurements. Although the literature covers visualizations that examine transit performance, there is no guidance for transit agencies regarding effectively using open data to perform these types of analyses.
• Amount of staff time that is required. This will vary by the volume of data opened and depends on the system architecture. A study that examines just how much time is required by various departments and staff and a discussion of the actual costs associated with an open data program could be helpful. For example, if a tool is used to export the data from a scheduling system, this one-time investment facilitates reuse afterward (effectively lowering ongoing costs). Such a study should include examining how much time and resources are spent on public records requests, which sometimes are made when open data are not available.
• Guidance describing each step in setting up an open data program. Such a document would contain sections relating to the factors mentioned in the final conclusion and detail about the process that many agencies use to issue open data when there is a change to their data (e.g., when there is a schedule change) and when they provide new data elements. In addition, this guidance could contain items describing various types of engagements with data users and reusers.
• Guidance to use GTFS to depict nonfixed-route transit services. Although work is being conducted by organizations, such as The World Bank and Ride Connection in Portland, Oregon, more guidance would be helpful in this area, given the large number of demand response systems in the United States.
• Guidance to create accessible applications. Guidance for application developers could include the key elements of an accessible application. Because this issue has not been directly addressed by the Americans with Disabilities Act (ADA), it is a topic that could be helpful to developers as they are conceptualizing their products.
• Open fare system data. Most open transit data are operational in nature, but such data do not always contain fare information because of the sensitive nature of the data. With the proliferation of electronic fare collection, particularly smartcard and mobile fare payments, developing and using open fare data is of interest to transit agencies.
• Changing the corporate culture. There is a gap in knowledge regarding how to show that open transit data have value to the corporate culture within the transit industry. Further, describing practices that encourage the development and dissemination of open transit data, rather than hindering them (e.g., bureaucratic processes), would be helpful.
• Open transit data from the developers’ perspective. This report is written primarily from the transit industry experience with open transit data. More information is needed from the developers to better understand their needs and concerns.
• Policies on app centers. Transit agencies could use guidance on the policies regarding making applications available on their websites; this is another gap in the literature. It would be helpful if this guidance included information about how to determine which apps are included in an app center, disclaimers to consider, and so forth.

• Visualization and other open transit data tools. There is an evolution of tools that use open data to visualize important aspects of transit operations (e.g., performance), but there is limited information about these tools.
• Open transit data ROI. Although the literature contains information discussing the value of open transit data, there is a gap in describing how to calculate ROI. If open transit data have value, it is helpful to quantify it to factor into decision making about which data sets to open and how to open them. For example, the Open Government Portfolio Public Value Assessment Tool (PVAT), described earlier in the report, can help agencies determine the public value of their open government initiatives.
• How to use metadata. Metadata, which describes the characteristics of data, is a critical part of any data set, including open transit data. The use of metadata in open transit data is not covered in the current literature. For example, Metropolitan Transportation Authority (MTA) in New York is conducting a travel pattern project in which mobile phone metadata are being analyzed to understand trip flows.
• Crowdsourcing to combine open transit data with other data and open source software evolution. This is another gap in the current literature. For example, as mentioned in the report, TriMet created a new open source, multimodal trip planner (OpenTripPlanner) that uses OpenStreetMap (OSM), a crowdsourced open data set designed for routing. Guidance to agencies regarding the use of crowdsourced data/information and open source software would be extremely valuable as it would help agencies move away from proprietary solutions.
• Open transit data as an element of enterprise architecture. Information on this topic is lacking in the literature and would be most helpful to transit agencies for accommodating open data as they build or rebuild their IT infrastructure. This includes identifying automation (e.g., automated generation of open data from scheduling software), relational database management systems, and use of a cloud-based IT framework to facilitate inclusion of open transit data.
• Procurement processes that support open transit data and open source software. The literature and transit industry practices are lacking regarding how to procure solutions that support open data and open source software. Guidance for agencies to procure “best value” solutions that support open data would be helpful to move the industry away from proprietary solutions.
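As one illustration of the "automated generation of open data from scheduling software" mentioned under enterprise architecture above, the sketch below converts schedule records into the contents of a GTFS stop_times.txt file. It is a hedged example only. The input record layout (keys trip, stop, seq, arrive_s, depart_s, with times in seconds after midnight) is an assumption made for illustration and does not correspond to any particular scheduling product; the output column names are the GTFS ones.

```python
import csv
import io

# Column names taken from GTFS stop_times.txt.
GTFS_STOP_TIMES_FIELDS = [
    "trip_id", "arrival_time", "departure_time", "stop_id", "stop_sequence",
]


def to_gtfs_time(seconds):
    """Format seconds after midnight as HH:MM:SS.

    GTFS allows hours above 24 for trips that run past midnight,
    so the hour is deliberately not wrapped.
    """
    hours, rem = divmod(seconds, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def export_stop_times(timepoints):
    """Return stop_times.txt content for a list of schedule records.

    `timepoints` uses a hypothetical internal layout; see the lead-in.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=GTFS_STOP_TIMES_FIELDS, lineterminator="\n"
    )
    writer.writeheader()
    # Sort so each trip's stops appear in sequence order.
    for tp in sorted(timepoints, key=lambda t: (t["trip"], t["seq"])):
        writer.writerow({
            "trip_id": tp["trip"],
            "arrival_time": to_gtfs_time(tp["arrive_s"]),
            "departure_time": to_gtfs_time(tp["depart_s"]),
            "stop_id": tp["stop"],
            "stop_sequence": tp["seq"],
        })
    return buffer.getvalue()


# Made-up sample records, intentionally out of order.
SAMPLE = [
    {"trip": "T1", "stop": "S2", "seq": 2, "arrive_s": 29100, "depart_s": 29100},
    {"trip": "T1", "stop": "S1", "seq": 1, "arrive_s": 28800, "depart_s": 28830},
]

if __name__ == "__main__":
    print(export_stop_times(SAMPLE), end="")
```

Running a step like this whenever a new schedule is released is one way an agency might address the challenge, noted earlier, of having a process in place to make data available at schedule changes.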


TRB’s Transit Cooperative Research Program (TCRP) Synthesis 115: Open Data: Challenges and Opportunities for Transit Agencies documents the current state of the practice in the use, policies, and impact of open data for improving transit planning, service quality, and treatment of customer information.
