Suggested Citation:"2. Phase I Information-Gathering." National Academies of Sciences, Engineering, and Medicine. 2020. Developing a Guide to Ensuring Access to the Publications and Data of Federally Funded Transportation Research. Washington, DC: The National Academies Press. doi: 10.17226/25855.

After feedback from the panel at the mid-project briefing, Dr. Flannagan also attended the Research Advisory Committee (RAC) meeting in Louisville, KY in July 2017. She presented the status of the project at that time and solicited feedback as well as contacts who would be willing to review a draft of the Guide. This additional feedback was invaluable in helping the team better understand the state DOTs’ perspectives.

1.2.1.4 Translating and Interpreting the Literature and Peer Guidance to State DOTs

This stage, the translation and interpretation of all we had learned into guidance for the research community, was pivotal to the success of the project. It was perhaps the most challenging of all stages of the project because it involved the synthesis and integration of a significant amount of data, information and knowledge into a concise, targeted guidance document. The team is grateful to the panel, key individuals in U.S. DOT, and state DOT volunteers for providing early peer review of the draft structure and design of the Guide.

Details for the four information-gathering subtasks, including research instruments, sources and populations, are provided in the sections below.

2 PHASE I INFORMATION-GATHERING

Phase I involved tapping into a variety of information sources to compile what is known about providing public access to products of research, including both reports and data. The primary source of guidance and the foundation for the research project was the information found on the U.S. DOT Public Access Plan guidance pages (https://ntl.bts.gov/public-access), which are hosted and maintained by the National Transportation Library. The guidance, recommendations and best practices found on these pages were written by a team of U.S. DOT employees drawn from the U.S. DOT Office of the Assistant Secretary for Research and Technology (OST-R).
Additionally, the Chief Data Officer, the CIO’s office and individuals from the legal department made major contributions to the content of these pages. The information available on the NTL website provides important explanations of how to comply with the U.S. DOT Public Access Plan requirements, using best data management practices.

In Phase I, we approached information-gathering through several mechanisms. These included a literature review of both peer-reviewed and non-peer-reviewed (“gray”) literature (Task 1a), stakeholder interviews (Task 1b), a review of existing data management plans (DMPs) (Task 1c), and a review of existing training materials (Task 1d). The results of these efforts are summarized in Sections 2.1-2.4 below.

2.1 Task 1a: Literature Review

2.1.1 Literature Review Strategy

The purpose of the literature review was to understand what is available for state Departments of Transportation (DOTs) to draw from for guidance in implementing a plan for providing public access to federally funded transportation research. Our intent was to be as comprehensive as necessary to identify important scholarship and best practices from the profession, regardless of domain of focus. In doing so, we covered both the underlying theory and rationale for good practices for research product preservation. This included coverage of “gray” literature, which includes technical reports, project final reports, websites, and other non-peer-reviewed sources, as well as published peer-reviewed literature. The “gray” literature represents a large portion of the research publications produced by state DOTs.

2.1.2 Literature Review Sources

The literature review included authoritative sources (e.g., commercial databases and Google Scholar), specific journals that pertain to the research products and data management and curation field, important texts in the field, and gray literature sources. Gray literature sources included conference proceedings, meeting discussions, seminar presentations, technical reports, white papers, manuals, guidelines and professional communications. We expected that the gray literature would hold a wealth of practical knowledge and experience because of the emerging nature of the topic. The search included Transport Research International Documentation (TRID) for sources of information about current practices in all stages of the life-cycle model. The review panel recommended the literature review not be included in the final published report.

2.1.3 Results of the Literature Review

From the full literature search, we retained 773 references for review. The literature review was conducted category by category according to the areas of guidance we anticipated would be in the final product. Table 2-1 summarizes the categories, counts and relevance to transportation for the references found. Of the 773 references in the final review, 577 (74.6%) were from the gray literature. These topics are relatively new and as such are likely to be discussed in white papers, guidance documents, conference proceedings, meetings or internal working documents. The majority of the references were to general practices and contexts, with the exception of research products and data types. In this case, we drew heavily from TRID references.
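As a quick arithmetic check on the figures above, the following Python sketch recomputes the category total and the gray-literature share from the counts reported in Table 2-1. The variable names are illustrative assumptions and are not part of the project's materials; only the numbers come from the report.

```python
# Reference counts by area of focus, copied from Table 2-1 of the report.
# The dictionary and variable names are hypothetical, chosen for illustration.
table_2_1 = {
    "General Research Products & Data Sharing": 178,
    "Research Product & Data Types": 113,
    "Legal Issues (Citation, Copyright)": 92,
    "Sharing in Collaborative Research Communities": 75,
    "Metadata": 73,
    "Compliance, Costs, Quality and Metrics": 57,
    "Digital Registries & Repositories": 54,
    "Research Publications & Data Planning": 40,
    "Research Publications and Data Principles & Policies": 34,
    "Training & Certification": 22,
    "Research Data Service Models": 14,
    "Life-Cycle Models": 11,
    "Research Publications & Data Archiving-Preservation": 10,
}

TOTAL_REVIEWED = 773    # total references retained for review
GRAY_LITERATURE = 577   # references drawn from the gray literature

# Sum of the per-category counts; this should equal the reported total.
category_total = sum(table_2_1.values())

# Gray-literature share as a percentage, rounded to one decimal place.
gray_share = round(100 * GRAY_LITERATURE / TOTAL_REVIEWED, 1)

# Area of focus with the greatest number of references.
largest_area = max(table_2_1, key=table_2_1.get)

print(f"Category total: {category_total}")
print(f"Gray-literature share: {gray_share}%")
print(f"Largest area: {largest_area}")
```

Run as written, the sketch reports a category total of 773 and a gray-literature share of 74.6%, which agrees with the figures stated in the text.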

Table 2-1 High-Level Characterization of the Literature Review

Area of Focus | Number of References | General or Transportation Specific
General Research Products & Data Sharing | 178 | Mix
Research Product & Data Types | 113 | Transportation
Legal Issues (Citation, Copyright) | 92 | General
Sharing in Collaborative Research Communities | 75 | Mix
Metadata | 73 | General
Compliance, Costs, Quality and Metrics | 57 | General
Digital Registries & Repositories | 54 | General (By Domain)
Research Publications & Data Planning | 40 | General
Research Publications and Data Principles & Policies | 34 | General
Training & Certification | 22 | General (By Institution)
Research Data Service Models | 14 | General
Life-Cycle Models | 11 | General
Research Publications & Data Archiving-Preservation | 10 | General
Total References Reviewed | 773 |

The greatest number of citations pointed to research product and data sharing. While this is contextually relevant, we see sharing more as a result of good guidance and good practice than as a primary guidance factor. From this literature, we learned that research cultures and incentives are important factors. These factors vary across and within organizations. The advent of “big science” and of projects that are too large for any one researcher to navigate is shifting the way researchers think about data reuse and sharing. When data sharing is collaborative and simultaneous, researchers can see the benefits of sharing. When data sharing and reuse are longitudinal, and the benefits accrue to subsequent rather than to original researchers, it is more difficult to get buy-in from researchers.

The second largest number of citations was to research products and data types. Understanding research products and data types is essential to designing a full-service infrastructure: one that supports access and discovery (e.g., registries) and one that supports use (e.g., repositories).
The literature also reminded us that good practices for managing operational data may be used to manage research data. While the context varies, the practices translate. The search of TRID surfaced a number of references to applied research by data types. These citations are important because they are transportation-specific. The literature review also highlighted the fact that there are communities and pockets of good practices around data management across the transportation community. The challenge is to blend and extend the good practices that exist regardless of whether they come from an operational or a research context.

The third largest number of literature citations was to metadata standards and practices. This is not surprising because metadata has been an important aspect of management and curation for the past 15 years. The literature review highlighted the fact that there are both general, all-purpose metadata standards and standards that are peculiar to data types and subject domains. From these citations, we learned that while there are core metadata standards for publications (e.g., Dublin Core), the most commonly used metadata standard for research is the Project Open Data Metadata Model. While these metadata standards may support discovery and access, in some instances they are insufficient to support sharing and use. This is the case with particular data types and research domains, where there are domain- or format-specific metadata standards.

One additional area that is very important to guidance and pertinent to key questions raised by researchers and administrators is compliance, costs, quality and metrics. This is a fairly new area of research and practice. In this case, we found that the majority of the research is found in the gray literature and is general rather than specific to transportation. The literature review highlighted gaps for this factor, which result from a lack of experience and practice. We learned that we can identify basic cost categories, but that actual cost estimates and models come from the experience of organizations that operate repositories.

We found 54 references to digital registries and repositories. Again, because this is a fairly recent area of practice, largely concentrated in the last 10 years, the majority of the guidance is coming out of the gray literature. While there are some general discussions of the topic, the bulk of the literature tends to focus on specific domains where specialized repositories are warranted.
One of the important lessons learned here concerns the evolving standards for quality assessment and certification of repositories. The literature suggests that institutions with repository infrastructures are self-certifying.

Like the digital repository category, citations on research products planning mostly came from the gray literature and were general rather than domain-specific. From the literature, we estimate that research in this area dates from the last three to five years. While there are plans referenced in the gray literature, we learned that the challenge lies in operationalizing those plans into processes and ensuring adoption at the individual researcher level. Within the academic environment, there seem to be particular communication gaps between researchers, intermediaries and administrators. We expect this will be the case in all types of organizations.

Closely aligned with planning are principles and policies. It is not surprising that the literature here is very recent and somewhat limited. Policies and principles are emerging as a result of the need for planning. However, they appear to follow published plans rather than to precede them.

Because of the institution-specific and applied nature of training and certification, we found few formal references in the peer-reviewed literature. In fact, most of what was surfaced in the literature review came from interviews with peer institutions. It appears that there are a few good practice models that most institutions tend to follow. Certification models and practices are also evolving, and although they are widely known, there may not yet be sufficient experience to publish peer-reviewed results.

Finally, there were only a small number of articles on archiving and preservation, life-cycle models, and service models. This is because few organizations have sufficient experience with data repositories and registries to be able to speak authoritatively to these topics. There are a few exceptions, including University of Michigan’s Inter-university Consortium for Political and Social Research and the UK Data Archive.

2.2 Task 1b: Stakeholder Interviews and Survey

2.2.1 Interview Goals and Objectives

The research team cast a wide net across the federal government and other data-intensive organizations. It was clear from the results of the literature review that the published literature, whether formal peer-reviewed or gray literature, did not address all of the areas of guidance. In addition, it was not clear how far departments and agencies had progressed in operationalizing approved public access plans and policies. Our expectation was that any department or agency that had a publicly available and approved plan would also have some operational experience of value to state DOTs. Our intent for the interview process was to learn who had done what, what they had learned in the process, what good practices they could share, and what they saw as future needs and next steps.

2.2.2 Interview Strategy

The first step in the interview process was the preparation of an interview guide and a set of key questions. While the guide and questions provided a framework that ensured we covered similar ground across organizations, each interview was free form and conversational. We assembled a list of federal agencies and departments for the NCHRP panel to review and approve. We identified contacts at each of those organizations and invited every institution to an interactive interview, which was either in-person or virtual depending on their geographical location. Those organizations that accepted the invitation were scheduled for an interview that lasted 60 to 90 minutes. The interviews were conducted by two to three members of the research team. Some organizations were not available for interviews or were in the process of transitioning new people into roles.
In those cases, we were pointed to publicly available materials. Each interview was recorded and transcribed for working purposes only, to allow the research team to participate in and focus on the conversations.

The research team experienced a low initial response rate for two stakeholder groups: state DOTs and transportation researchers. To fill these gaps, we developed an online survey for distribution to those stakeholders. While these were not formal interviews, they provided some input from key stakeholder groups. The data collection form was designed to follow the interview guide.

2.2.3 Sources Interviewed

The research team conducted 12 interactive interviews and reviewed detailed materials from an additional four organizations. The organizations are listed below:

Interactive Interviews with Organizations
● Department of Defense
● Department of Energy
● Department of Interior
● Dept. of Transportation Chief Data Officer
● Dept. of Transportation Data Curator
● National Science Foundation
● National Institutes of Health (NIH)/NLM
● National Oceanic and Atmospheric Administration (NOAA)
● Smithsonian Institution
● USGS
● Veterans Administration
● World Bank

Materials Reviewed from Organizations Not Available for Interviews
● Dept. of Agriculture
● Department of Energy
● U.S. Fish and Wildlife Service
● U.S. National Park Service

Of these organizations, USGS, NOAA, the Veterans Administration, the Department of Energy and the National Science Foundation have the most developed programs.

2.2.4 Observations from the Interviews

We offer three general observations from the interviews and surveys, as well as observations about the guidance that state DOTs could derive from individual organizations.

2.2.4.1 Observation 1: Different Stages of Development of Plans and Policies.

While each of the organizations may have a published and fully or partially approved or final plan, they are all at different stages of implementing those plans. The variation extends to state agencies within organizations as well as across organizations. For example, we would have expected NIH/NLM to be the most advanced given their history with PubMed. However, at the time of the interview NIH/NLM were at the point of issuing a Request for Information about the state of operationalization across organizations. USGS appears to be the most advanced of agencies in transitioning from their historical access models to the newly mandated plan. USGS, though, is not representative of all of the Department of Interior, which does not have a plan, but instead has defaulted to each individual service. There appears to be a variation in practices across those services as well. This is also the case for the Department of Defense, where the Air Force appears to be ahead of the other services.
2.2.4.2 Observation 2: Importance of the Department or Agency’s Public Access Mandate.

While each organization may have a published plan that has been approved by the White House Office of Science and Technology Policy (OSTP), how those plans are operationalized may be influenced by the nature of the organization’s mandate to provide access. For example, the U.S. DOT has a distinct advantage in that the NTL was a “born digital” organization with a mandate to provide access to transportation information. On the other hand, the Department of Defense is guided by the Code of Federal Regulations in making basic and applied research information available. While we may draw guidance from other organizations, it is important to understand the context when interpreting that guidance.

2.2.4.3 Observation 3: Factors Influencing the Value of Guidance to State DOTs

Across all interviews and review of materials, we noted a recurring set of factors that influence the ease or difficulty of translating the plan into an operational process. We summarize those factors below.

1) Whether there is a robust historical process in place to manage publications. A long and successful history of making research available also correlates with the availability of training and support services. Organizations with historical foundations, such as USGS and NIH, are at considerably different stages than U.S. DOT in implementing an approved plan. Many of the organizations we interviewed use the Research Performance Progress Report (RPPR), an annual tool that captures and reports on the results of research funded. This is a formal way of tracking and accounting for research funds and their associated products.

How does this translate to guidance for state DOTs? While the U.S. DOT is younger than other departments and agencies, it has an advantage in the digital infrastructure established for the NTL. The NTL can provide the infrastructure to state DOTs that other organizations may lack.

2) Whether researchers are intramural, extramural or a mix. Tracking and accounting for research products and data is less complex in organizations where research is conducted by intramural researchers (e.g., Dept. of Energy national laboratories, Smithsonian Institution). Intramural research within a Department ultimately has one administrative authority. Environments of mixed intramural and extramural research involve multiple levels of administration and different tracking models and present a different set of challenges.
Perhaps the most challenging environments are those where research is conducted by subcontractors who are beyond the reach of the funding agencies. This is the case in many organizations, but particularly where contract research is conducted by academic institutions.

How does this translate to guidance for state DOTs? Agencies such as USGS, NOAA, Department of Agriculture and National Science Foundation (NSF) have comparable administrative and research environments, with both intramural and extramural research being conducted with federal dollars, and may be useful as models for tracking mechanisms that extend beyond intramural research.

3) The involvement of intermediaries in providing support services. Where libraries, national laboratories, data centers, information officers, and Institutional Review Boards are available to provide guidance and support services, there seems to be a more comprehensive capture of research results. Good practice examples here might be the USGS, Department of the Air Force, and the Department of Agriculture.

How does this translate to guidance for state DOTs? There are few good practice examples here other than USGS, Department of Agriculture, and the Department of the Air Force. The challenge for all services in the Department of Defense is the reduction and consolidation of Army Libraries across the country, and elimination of the position of the Navy Librarian. The U.S. DOT has facilitated the development of transportation librarian and knowledge networks across the country. In addition, some state DOTs have information officers and librarians.

4) The complexity of types of research products and data types and formats. Almost every organization we interviewed or whose materials we examined had a complex set of research products and data. Some research data types were more complex than others, though. For example, NOAA manages very large weather data sets that are continuous and are maintained for long periods of time. USGS also maintains large climate science data sets. The Department of Energy has one of the most advanced repository models, though practices may vary across the national laboratories. The Veterans Administration has complex medical and individual health records with highly sensitive personal data.

How does this translate to guidance for state DOTs? From a registry and repository infrastructure design perspective, state DOTs may draw guidance from all of these organizations. It is important, though, to remember that the literature review surfaced many good practices for capture and storage by types of transportation. Before we look to other organizations, we consider what exists within the transportation community.

5) The demand for access to research products and results by the research and the business community. Those organizations which appear to be the most advanced in implementing their approved plans have a long-standing demand for access to their research.
For example, the USGS has been meeting the demand for access to its minerals and land use data since the 1930s, when it established its first distribution centers, and it moved those print distribution services to a digital platform in the late 1990s.

How does this translate to guidance for state DOTs? Organizations clearly leading the way here are USGS, Department of Agriculture, Department of Education and NOAA. At the state level, our observation is that such demands exist and may be satisfied by the NTL’s guidance on metadata and inclusion in the NTL digital library.

6) The nature of the research community itself, including existing patterns and cultures surrounding exchange of research and use of others’ research. Where the culture of the community and the nature of the science are collaborative, mechanisms and incentives to provide access and use others’ research already exist. Where the science is individual and competitive, such mechanisms and incentives may be weaker. Ultimately, compliance is in the hands of individual researchers to ensure that their research products and data are made available and usable.

How does this translate to guidance for state DOTs? Of all the organizations interviewed, USGS, Department of Agriculture and NOAA appear to have similar research communities: a mix of theoretical and applied research, a mix of big science and focused research, and a similar mix of collaborative and individualistic research cultures.

7) The alignment of products and data used for daily operations of institutions, and those that are generated for research. Where the alignment is close, there are opportunities to leverage best practices managing both. This is an interesting issue: some organizations fund research that they may not implement, but make available to others to implement. The Department of Transportation funds research that has value to states, to private industry, and to public citizens. There are some important overlaps in the types of products and data that are generated through normal transportation operations and the type of products and data generated through research.

How does this translate to guidance for state DOTs? For this factor, we would look for comparability and guidance to the USGS, to NOAA and to the Department of Agriculture in particular. If NIH and the Department of Defense were further along in implementing their policies, they would be good sources of guidance as well.

No single organization we interviewed presents a context that is clear and distinct on all of these factors. No single factor is a determinant of success. Every situation is a variation, and even situations that have similar conditions for a single factor will vary in practice. We summarize our evaluation of which organizations are using best practices for each area in Table 2-2. These organizations should be the first models to turn to in each area.

Table 2-2 Organizations with Best Practices for Life-Cycle Stages

Area of Focus | Best Practice
Research Publications & Data Planning | USGS, Agriculture
Compliance, Costs, Quality and Metrics | USGS, Energy
Research Publications and Data Principles & Policies | USGS, NOAA
Life-Cycle Models | USGS
Research Product & Data Types | USGS
Training & Certification | USGS
Digital Registries & Repositories | USGS, NOAA
Metadata | Project Open Data Model
Research Publications & Data Archiving-Preservation | USGS, Energy
Research Data Service Models | USGS
Legal Issues (Citation, Copyright) | USGS, NOAA, Agriculture
General Research Products & Data Sharing | USGS, NOAA, Agriculture
Sharing in Collaborative Research Communities | USGS, NOAA, Agriculture

In general, while there are lessons to be learned from every organization we interviewed, it is clear that USGS is the most advanced in implementing its plans and policies. It is also the organization that is most similar to U.S. DOT in terms of administrative structures, funding sources and mechanisms, complexity of research products, research communities and cultures, and mix of theoretical and applied research.

2.2.5 Survey Results

The survey was sent to two groups of stakeholders – those who attended a “speed dating” networking event at the TRB annual meeting and members of TRB committees that are related to data in some way. We received 35 responses from stakeholders who were from a variety of institutions listed in Table 2-3. Where appropriate, these were grouped as shown in the third column. The survey itself is provided in Appendix A Survey Contents, and basic survey responses are provided in Appendix B Survey Responses.

Table 2-3 Primary work sector of survey respondents

Work Sector             Number of Responses   Category
U.S. DOT                4                     Federal Agency
US Fed                  1
State DOT               11                    State/Local Agency
Local/Regional gov't    3
Academia                3                     Research Organization
Research Institution    3
Industry                1
Contractor/consultant   7
No response             2                     Missing

A key observation from the survey is that transportation research is characterized by its variety, both in funding sources and in data types produced. This level of variety presents particular challenges for preserving all types of products of transportation research, making them discoverable, and making them usable.

A second observation is that researchers currently have too little support for data management practices (Figure 2-1) and are storing data in ways that put those data at risk of being lost (Figure 2-2). Forms of institutional support for data preservation vary by the type of organization (Figure 2-3).
For example, researchers in federal agencies and, to a lesser degree, state agencies tend to have dedicated data managers, administrative offices, libraries and research support units. In contrast, those in research organizations tend to rely on colleagues and IT units for support.

Figure 2-1 Availability and sufficiency of policies and support for data management

Figure 2-2 Data storage locations reported by survey respondents

Figure 2-3 Sources of support for data management within respondents' organizations

When asked why they did not share data (if they did not share), respondents in research organizations most commonly indicated either that they did not have the right to make the data public, that they needed to publish first, or that the data should not be available. Those from state/local agencies were more likely to cite lack of time or lack of procedures, as well as not having the rights.

Finally, when asked for their general views on data sharing, researchers from federal agencies and research organizations indicated their willingness to share data without restriction and held the view that research data can have uses beyond the original project. However, they also had concerns about misinterpretation in the absence of a good codebook. Researchers in state DOTs showed less agreement in their opinions.

Perhaps the most telling result from this question was that nobody indicated that they were satisfied with their current ability to integrate and access data from other sources. These current attitudes provide a base for the much-needed development of a sharing culture within transportation.

2.3 Task 1c: Data Management Plans

2.3.1 Data Management Plan Review Strategy

The project team approached review of data management plans (DMPs) in two ways. First, the literature review provided some background on DMPs, their purpose and their use. Second, the project benefited from a parallel project funded by NSF to review DMPs and their use. A summary of our findings is presented in the next section.

2.3.2 Data Management Plan Review Summary

Data management plans (DMPs) generally contain five sections:
1. A description of the data to be generated in the project
2. A description of the standards that would be used in developing and structuring the data
3. A policy statement about how others would be able to gain access to the data
4. A policy statement about what others would be permitted to do with the data
5. A statement on how the data would be archived.

Studies of how researchers have responded to the DMP requirement reveal that many researchers do not appear to understand the intent behind the requirement or how to respond to what is being asked of them. Other researchers have pushed back against the requirement as yet another government mandate that distracts them from engaging in actual research, or have argued that requiring data to be shared will lead to misunderstanding and misuse of the shared data, which in turn will hinder rather than help advance research [1].

Changing the culture of how research is practiced in any field will take time as well as dedicated resources, support and education.
Efforts to promote culture change will need to take into consideration how research is currently practiced, be able to clearly articulate measurable benefits for individual researchers and for the field as a whole, recognize potential near and long-term costs or drawbacks, identify the gaps between current practices and desired outcomes, and provide actionable plans to close these gaps.

[1] http://www.nejm.org/doi/full/10.1056/NEJMe1516564

Data management plans are only one piece of the effort to enable research data to be shared with others in ways that retain or enhance its value, but they are a critical component. Reviewing data management plans can reveal how well researchers understand what they are being asked to do and how prepared they are to do it. Recent studies indicate that there is work to be done on both fronts.

A 2012 study surveyed principal investigators (PIs) of grants submitted to the NSF (Steinhart et al., 2012) to understand their response to the new DMP requirement. These PIs reported widespread uncertainty over the (then) new requirement and what actions needed to be taken to adequately respond. The survey results also revealed that researchers were seeking guidance and assistance in developing DMPs.

Later studies on DMPs and data management practices confirmed these early findings and provided additional details on researchers’ understandings of data management requirements and their strategies for responding. Librarians at Oregon State University surveyed researchers in 2015 on their data management practices and discovered that research assistants handle the majority of data management-related tasks, with the exception of data sharing, which is addressed by the PI (Whitmire et al., 2015). This indicates that there may be a need to expand the scope of outreach and training efforts beyond the researchers themselves. In addition to confirming that the types and formats of DMPs vary widely by discipline, they noted that the activities taken to support managing and working with data varied as well. Finally, they noted with surprise that a large percentage of faculty were managing their own data servers rather than using university-based data storage services. The comments they received indicated that researchers were simply unaware that the university provided data storage services.

A study of DMPs written by researchers at the University of Michigan reached similar conclusions. DMPs appear to be written solely for the purpose of submitting a grant proposal rather than being seen as living documents to define and guide the data management process to be followed throughout the project (Carlson, 2017).

Though the NSF DMP requirement has been around for six years, these studies indicate that many researchers are still unaware of what is expected of them or how to respond. This is despite efforts from funding agencies, librarians and others to develop resources and support for researchers to raise awareness of the requirement and to guide them through the process of creating an effective DMP.
The DMPTool in the United States and the DMPonline website in the United Kingdom provide resources to walk researchers through the process of developing a DMP for their research. The Libraries at the Massachusetts Institute of Technology have produced an extensive guide for researchers on how to manage, share and preserve data [2]. Other academic libraries have followed suit and produced guides on managing data for researchers at their institutions. Support agencies with responsibilities for curating data, such as the Inter-university Consortium for Political and Social Research (ICPSR), the UK Data Service and the Australian National Data Service, have developed guides for researchers, not only for taking advantage of the data services they provide but also to explain the intent of funding agency requirements and what is expected from researchers in addressing these requirements.

[2] https://libraries.mit.edu/data-management/

Though it is unclear how effective the guidance and support provided by libraries and other service agencies has been for particular communities or individual researchers, as a whole the DMP requirements do not appear to be well understood or followed by researchers. We expect that in transportation research, if nothing different is done, the experience will be similar as DMPs become required. Thus, the key lessons of this review that influenced development of the guidance in Phase II are as follows:
1. DMPs must be emphasized as a tool for enabling (and making efficient) public access to research data
2. DMPs must be emphasized in training and DMP training needs to be provided (or mandated) for a wide variety of stakeholders in the research process (i.e., not just PIs)
3. Support for writing DMPs is critical and worthwhile. This might include references to the available tools, but should not be limited to that.

2.4 Task 1d: Training Materials

2.4.1 Training Materials Review Strategy

While the literature review uncovered some training materials, most materials for this project were found through stakeholder interviews and reviews of agencies with more advanced data preservation policies and procedures.

2.4.2 Training Materials Review

Evidence demonstrates that researchers are often not prepared to think critically about how they will manage their data to make it discoverable, accessible, understandable and usable by others outside of their own research group (Steinhart et al., 2012). This is not terribly surprising as data management knowledge and skills are generally not included as a part of the formal curricula of graduate school departments, leaving students to figure out how to manage their data on their own (Carlson et al., 2013). Furthermore, requirements for researchers to share data outside of their lab are relatively new in many fields, including transportation, where standards, practices and resources for doing so are still being developed. Given this situation, providing training, guidance and certification is an essential component of compliance with the U.S. DOT Public Access Plan.

Following good practice in data management requires more than just the introduction of new tools; it also requires a cultural shift in how research is practiced. Training will be required not only to introduce the products and tools designed to support data management, sharing and curation but also to place them into a larger context.
It is clear that good data management, sharing and curation requires the active participation of multiple types of stakeholders across the research life cycle. Each stakeholder type will have a different role to play in the process depending on his or her responsibilities and areas of expertise. It follows that the training, guidance and resources developed to support data management, sharing, and curation activities will need to be aligned with each stakeholder type. The relevant training and stakeholder types are:
1. Baseline awareness training for everyone
2. Training for researchers
3. Training for executive and management roles
4. Training for research support roles

2.4.2.1 General Awareness Training for all Stakeholders

General awareness training on data management, sharing and curation is needed for all personnel receiving transportation research funds to establish a shared understanding as to what data management is and why it is important. This type of training is also needed for all stakeholders to see the big picture of the research life cycle and develop a common vocabulary to facilitate the movement of data between the stages. A common thread of general awareness training is the importance of introducing all stakeholders to a high-level understanding of:
1. Why data sharing is a good idea (both for the community and the individual)
2. The role of Data Management Plans (DMPs)
3. Stakeholder roles and responsibilities
4. Available support (tools or personnel)

Good sample materials for general awareness training can be found at:
• http://damaro.oucs.ox.ac.uk/induction.xml
• https://www.cessda.eu/Research-Infrastructure/Training/Research-Data-Management
• https://www.icpsr.umich.edu/icpsrweb/content/deposit/guide/index.html
• https://www.dataone.org/education-modules
• http://dmtclearinghouse.esipfed.org

2.4.2.2 Training for Researchers

As the producers of the data, researchers will require training in how to develop data sets that can be discovered, understood and used by others outside of their immediate research team. This includes developing DMPs: both a plan that satisfies the requirements of the DOT or other funding agency when applying for a grant and longer plans that articulate how the data they generate will be managed over the course of its life cycle. Researchers will need to be trained in best practices in managing and organizing their data to ensure easy access for those who need it and to help others understand how the different elements of the data relate to one another. They will need to be trained on how to document and describe their data using an appropriate metadata scheme to convey information about the data in ways that promote understanding and trust.
Researchers will also need to be able to consider issues surrounding how to share the data, such as selecting a data repository, how to connect the data to publications, presentations and other outputs, and what rights over the data they will give others who want to use it. Researchers will need to know how to support the curation and preservation of their data beyond the life of the research project in which it was generated. Finally, researchers should receive training in using data generated by others effectively and ethically, including how data should be cited.

Based on the results of the DMP review, we also recommend that training for researchers be aimed at a broad set of researchers beyond the PI and other project leaders. Research staff and even students should be trained in good data management practices and DMP construction. Moreover, DMPs as a tool to accomplish the goals of data sharing should be a particular focus of researcher training.

A number of organizations and agencies have recognized the need for training and guidance to support data management, sharing and curation practices. Their available materials can provide a good starting point for transportation-specific training for researchers. These sources are summarized in Table 2-4.

Table 2-4 Sources of support for data and publication preservation training

Source                               Topics Covered                                      Website(s)
California Digital Library           DMP workflow                                        https://dmptool.org
Digital Curation Centre              DMPs                                                https://dmponline.dcc.ac.uk/
ICPSR                                General awareness, DMPs, data curation practices    https://www.icpsr.umich.edu/icpsrweb/content/deposit/guide/
USGS                                 All areas                                           https://www2.usgs.gov/datamanagement/index.php
University of Minnesota              DMPs                                                https://www.lib.umn.edu/datamanagement/DMP
Johns Hopkins University Libraries   DMPs (with context)                                 http://dms.data.jhu.edu/data-management-resources/plan-research/write-a-data-management-plan/grant-reviewers-guide/
National Transportation Library      DMPs                                                https://ntl.bts.gov/publicaccess/creatingaDMP.html

2.4.2.3 Focused Training for Transportation Researchers

Training directed specifically at transportation researchers is rather limited at the moment. Transportation librarians have recognized the importance of data management, sharing and curation to the current research environment. The National Transportation Library has developed detailed guidance [3] to support researchers submitting grants to the U.S. DOT for both intramural and extramural research activities. The guidance provided is primarily focused on developing a DMP and does not extend into other aspects of working with data or into other stages of the research life cycle.

2.4.2.4 Training for Executive and Management Roles

We did not find much in the way of training resources designed for institutional research managers and policy-makers. However, one notable resource is the Collaborative Assessment of Research Data Infrastructure and Objectives (CARDIO) [4] tool developed by the Digital Curation Centre. CARDIO is designed to help departments, research groups or organizations within higher education institutions assess their infrastructure, staff skill levels, support of management, and other resources in assuring that data are adequately managed. The tool is administered by gathering information to determine a maturity rating in 30 relevant areas covering organization, technology and resource aspects of managing and curating data.

[3] https://ntl.bts.gov/publicaccess/creatingaDMP.html
[4] http://www.dcc.ac.uk/projects/cardio

2.4.2.5 Focused Training for Support Roles

In addition to training for researchers, there are more specialized training programs geared toward data management and curation professionals, particularly librarians. Metadata is used by librarians in building digital collections and in making content, including research data, accessible to their patrons. The University of North Carolina has developed an online tutorial [5] to provide librarians and others with a basic introduction to what metadata is, the various elements one can use to describe research data, and best practices for assigning metadata. The National Information Standards Organization (NISO) has developed a primer on understanding metadata that is broad enough to be appropriate for audiences beyond information professionals.

Preservation and curation is another area where librarians, archivists and other information professionals have developed a number of educational programs. MIT hosts the Digital Preservation Management workshops and tutorial [6], a program designed to teach information professionals about effective preservation techniques for digital objects. The Primary Trustworthy Digital Repository Authorization Body (PTAB), the group behind the ISO standard for the “Audit and certification of trustworthy digital repositories” (ISO 16363), has produced several digital preservation courses [7]. These include a workshop on the Open Archival Information System (OAIS), the standard set of requirements for an archive or repository to provide long-term preservation of digital information. The Digital Preservation Coalition has produced a handbook [8] on preservation designed to be authoritative and practical in nature. The guide describes how digital resources can be managed for the long term and the issues and challenges that will need to be addressed.
For librarians supporting research in the transportation field, the Transportation Librarians Roundtable is a monthly web conference series and forum for staying abreast of important developments in the field. This group has sponsored numerous presentations on issues relating to managing, sharing and curating data, including case studies on creating DMP guidance, capturing data, and metadata management.

[5] http://guides.lib.unc.edu/metadata
[6] http://www.dpworkshop.org/
[7] http://www.iso16363.org/resources/basic-training-for-digital-preservation/
[8] http://dpconline.org/handbook

2.4.3 Training Design and Delivery

Although multiple training programs and resources exist to teach researchers and support staff the knowledge and skills needed for managing, sharing and curating data, there are gaps between what is available and what is needed for researchers at state DOTs. Training programs in these areas are often developed by agencies that support the research activities of a particular discipline. ICPSR, for example, has educational programming for social scientists, DataONE has programs for environmental scientists, and the USGS supports geologists and others doing earth science research. At a high level, the content provided by these organizations addresses similar topics and provides general guidance that would be applicable to almost any type of research being conducted. However, there are disciplinary and other differences in how research is practiced, what tools, software and equipment are used to generate and analyze data, and how findings are published and shared that need to be accounted for and addressed in any training program. Any training program will need to account for and speak to the cultures and practices of researchers at state DOTs to be effective.

That said, existing programs can still serve as potential models for the structure and content of training programs for state DOTs. The USGS in particular was cited in the interviews we conducted as being an exemplar in providing useful resources and guidance to researchers. The use of a defined research life cycle as a means to ground their educational programs, break down concepts to make them easier to understand, and help researchers make connections between these concepts was seen as a strength of the USGS. The life-cycle models and other concepts used by the USGS and other organizations can be appropriated and modified to serve the needs of researchers in state DOTs.

Training programs should be based upon the Public Access Plan and other stated requirements produced by the DOT, with a focus on making these requirements understandable and actionable by researchers and their support staff.
Among other things, the Public Access Plan requires researchers to:
• Make their publications available after 12 months (unless an embargo is enacted)
• Submit their publications to the DOT National Transportation Library digital repository
• Ensure public access to final research data, subject to necessary restrictions such as security, individual privacy, or confidentiality
• Make their research data accessible for search, retrieval and analysis
• Develop data management plans that describe their strategies for making their data publicly available
• Ensure that research project descriptions are submitted to the Transportation Research Board’s Research-in-Progress (RiP) database and are updated throughout the project

Even if researchers are familiar with these and other requirements, it is unlikely they will possess a sufficient understanding to identify the steps needed to satisfy them. Looking beyond the researchers themselves, additional training programs will be needed to help research managers, compliance officers, and others who provide support for research understand and respond to these requirements.

The majority of the existing training programs are geared toward academic librarians and are not specific to the needs of the state DOT community. Though some researchers are affiliated with academic institutions with libraries that provide support for data management, sharing and curation, many state DOT researchers are not and therefore do not have access to such expertise or services. In addition to providing training that is targeted to the needs of state DOT researchers, training is needed for those taking on the responsibility of supporting them. The current offerings listed above may provide a foundation for building a training program for personnel who provide research support, but as with researchers, a more targeted training program informed by the culture and practice of DOT research will be needed.
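
As an illustration only, the Public Access Plan requirements listed above could be tracked for each project with a simple checklist. The following is a minimal sketch in Python; the class name, the field names and the example project are assumptions made for illustration and are not part of the Public Access Plan or of any DOT tool.

```python
from dataclasses import dataclass, fields


@dataclass
class PublicAccessChecklist:
    """Hypothetical per-project checklist; field names are illustrative only."""
    publication_available_within_12_months: bool = False
    publication_submitted_to_ntl_repository: bool = False
    final_data_publicly_accessible: bool = False
    data_searchable_and_retrievable: bool = False
    data_management_plan_written: bool = False
    rip_record_submitted_and_updated: bool = False


def unmet_requirements(checklist):
    """Return the names of checklist items that are still False, in field order."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


# Example project that so far has only a DMP and a current RiP record.
project = PublicAccessChecklist(
    data_management_plan_written=True,
    rip_record_submitted_and_updated=True,
)
for name in unmet_requirements(project):
    print("Still to do:", name)
```

A training program could use a list like this to show researchers and support staff which requirements remain open at each stage of a project, though the actual items and their wording would need to come from the Public Access Plan itself.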


The TRB National Cooperative Highway Research Program's NCHRP Web-Only Document 270: Developing a Guide to Ensuring Access to the Publications and Data of Federally Funded Transportation Research provides guidance for state departments of transportation (DOTs) to help them meet the requirements of the U.S. DOT Public Access Plan requiring preservation of the products of all federally funded transportation research.

The document is released in parallel with NCHRP Research Report 936: Guide to Ensuring Access to the Publications and Data of Federally Funded Transportation Research.
