National Academies Press: OpenBook
Suggested Citation:"Chapter 7 - Managing Research Data." National Academies of Sciences, Engineering, and Medicine. 2020. Guide to Ensuring Access to the Publications and Data of Federally Funded Transportation Research. Washington, DC: The National Academies Press. doi: 10.17226/25704.


CHAPTER 7. Managing Research Data

In This Section
» Definition of Research Data
» Explaining Essential Requirements for Research Data
» Going Beyond: Research Data Management and Access
» Understanding Data Preservation
» Deciding What to Preserve: Data Scope and Coverage
» Deciding Which Formats to Preserve
» Managing Quality of Research Data
» Understanding Metadata Standards and Metadata for Transportation Data
» Deciding Where to Preserve Data
» Understanding How Long to Preserve Data
» Chapter Checklist

Definition of Research Data

The U.S. DOT is a complex research environment containing a range of data. A literature review of the Transport Research International Documentation (TRID) database produces a rich set of references to operational research data management practices in transportation. The results illustrate the many facets of transportation data and identify some natural research communities within the larger domain. These include the following:

• General data management issues
• Air transport data
• Bridge asset data
• Construction data
• Crash and safety data
• Driver data
• Engineering data
• Freight and cargo data
• Environmental and land use data
• GIS data
• Intelligent transport system data
• Marine transport data
• Materials science data
• Planning and design data
• Railway data
• Road data
• General statistical data
• Traffic data
• Transit data
• Transportation information systems and signage data
• Vehicle and asset data
• Video and photogrammetric data
• Weather and climate data

Explaining Essential Requirements for Research Data

This aspect of essential compliance focuses on managing, accessing, and using the data that support federally funded transportation research. The U.S. DOT has provided guidance but left some important decisions and actions to research organizations. For written research products, the U.S. DOT and TRB have provided the registry for discovery and the repository for storage. For the supporting research data, however, the organization itself is responsible for selecting a registry and repository. To be in compliance, state DOTs and other research organizations must therefore make the key decision of which registry and repository to use. The U.S. DOT has provided guidelines for selecting a data registry and repository for preserving and providing access to data. Additionally, the U.S. DOT has provided a list of repositories that meet essential requirements.

To comply, organizations must do the following:

• Ensure the data preserved are those used to draw research conclusions in the written research product.
• Guarantee that data are stored in an open format or describe in the data management plan (DMP) which proprietary formats are used and why.
• Certify the quality of the data and that data are interpretable, understandable, and usable by providing explanatory materials within the data package.
• Choose a compliant data registry and repository to store data and make sure that those chosen
  • Will meet the guideline of conformance to the U.S. DOT Public Access Plan at https://ntl.bts.gov/public-access/guidelines-evaluating-repositories,
  • Are able to generate and maintain a persistent identifier for the data, and
  • Have a long-term preservation strategy.
• Create metadata for the research data to support discovery, availability, and access.

Evaluating Repositories for Conformance with the Public Access Plan

The U.S. DOT has provided a list of repositories that currently meet its essential requirements (see page 11 for a listing). As this is an evolving field, research organizations are also advised to use the searchable listing of data repositories at the Registry of Research Data Repositories (re3data.org) as a starting point for locating potential archiving options for their data. Researchers evaluating local or other data repositories as the option for storing and preserving their data should ensure the repository will

• Promote an explicit mission of digital data archiving;
• Ensure compliance with legal regulations and maintain all applicable licenses covering data access and use, including, if applicable, mechanisms to protect privacy rights and the confidentiality of respondents;
• Have a documented plan for long-term preservation of its holdings;
• Apply documented processes in managing data storage;
• Perform archiving according to explicit workflows across the data life cycle;
• Enable users to discover and use the data and refer to data in a persistent way through proper citation;
• Enable reuse of data, ensuring appropriate formats and application of metadata;
• Ensure data integrity and authenticity;
• Be adequately funded and staffed and have a system of governance in place; and
• Possess a technical infrastructure that explicitly supports the tasks and functions described in internationally accepted archival standards, such as the Open Archival Information System (OAIS).

Complying with essential requirements means that state DOTs and other research institutions have to make choices with both short- and long-term implications. A short-term strategy for meeting essential requirements means selecting an option that will be conformant with the U.S. DOT Plan. An organization’s choice of repository is an essential part of the DMP, which is now part of every research funding proposal. Given the impacts of this choice, it is important for state DOTs and other research institutions to make an “institutional” rather than individual decision. Allowing researchers to make individual decisions may result in (1) increased and redundant institutional costs if multiple repositories are chosen, (2) a scattering of the institution’s data across different repositories, and (3) a challenging foundation from which to define a long-term strategy. Institutions should make short-term strategic choices that will also support their long-term goals.

Going Beyond: Research Data Management and Access

There are several ways an institution can move beyond essential compliance in the area of data management and access.
Some of these include the following:

• Expand the scope and coverage of research data by including data from fields other than transportation, including data from sources other than the U.S. DOT, and including more of the data than just what was used to produce analyses for deliverables.
• Broaden the scope and coverage of formats supported by the solution.
• Include preservation services in the solution.
• Expand data quality management and assurance.
• Choose metadata standards and provide metadata services.
• Build an institutional repository solution.

Key Websites
• https://re3data.org
• https://ntl.bts.gov/public-access/data-repositories-conformant-dot-public-access-plan

Understanding Data Preservation

Digital preservation is distinct from basic data storage. While basic data storage addresses short-term access and business continuity needs, the data are at high risk of loss over the long term without the active intervention provided by data preservation (Figure 6). Unfortunately, few data sets appear to be deposited into a long-lived repository with dedicated digital preservation systems and staff (including hardware, metadata, and organization) that ensure the data are available and understandable in the long term. Devices fail, formats become obsolete, and accidents or disasters can destroy data.

Figure 6. Digital Preservation: A Real World Example
[Image: An old 9-track tape]
Digital preservation is “the active management of digital content over time to ensure ongoing access.” Whereas hard copy materials require minimum maintenance to last decades or centuries, digital content requires active management to make sure it can be accessed over the long term. Consider, for example, books created centuries ago. Most can still be viewed and understood by modern eyes. The paper may yellow or become brittle, but the contents would still be readable with the naked eye. In contrast, consider a 9-track tape just a few decades old. To read it, you need specialized equipment and trained staff. The media itself is often much more fragile and can become corrupted or destroyed more readily than simple paper.

Tip: Plan to use archival and open formats from the beginning of the research plan, if possible.

A key aspect of digital preservation is guaranteeing the integrity of content over time by ensuring that a file’s essential elements are preserved, context is documented, and content is traceable to its point of creation.[1] While the digital preservation process can differ considerably depending upon the type of data being preserved, it is essential that the integrity of the information be the foremost goal.

Deciding What to Preserve: Data Scope and Coverage

Researchers generate many types and versions of data during the course of a research project. To go beyond the minimum, state DOTs and research institutions should have a set of guidelines that can help determine which data from the research life cycle to preserve. Not all data can or should be preserved. Storage can be expensive and resource intensive. At a minimum, researchers should preserve all data necessary to replicate the findings of a published or significant study. Researchers may go beyond the minimum by identifying and organizing the data prior to depositing with a repository, rather than relying on others to determine the important and useful data.

Key questions to consider when appraising data include the following:[2]

How significant are the data for research?
Factors to consider include

• Substantive value of the collected information,
• Time frame of the information,
• Uniqueness of the collected information,
• Relationship to previous studies,
• Scope of the data,
• Influence of these data in the transportation fields,
• Data collection methodology, and
• Ability to use the collected information for secondary studies.

How significant are the source and context of the data, particularly in regard to scientific progress and society?
Data must have demonstrated importance to the community, as determined by the following:

1. Substantive value and its influence on scientific knowledge,
2. Likely value to science and/or society over time, and
3. Uniqueness.
It is also important to place a high value on data that permit policy analysis and research addressing broad public policy issues or transportation policy more specifically; safety countermeasure evaluation and recommended practice; relationships between transportation policy and other impacts such as environment, mobility, and equity; and transportation economics.

Is the information unique?
Determine whether the data are the only source or are the most complete source for significant information. Data that contain information not available in other sources are more likely to warrant permanent retention than records containing data duplicated in other sources. Even if data are unique, however, they may not warrant continued preservation, depending on the other appraisal criteria.

How usable are the data?
Consider how the usability of the data is affected by the way they were gathered, organized, presented, or analyzed. For example, does the scope of the data cover a national population sample or a representative subsample of the population? Do the data offer enough depth and breadth of information to support a wide range of research methodologies?

Consider how the technical considerations affect the usability of the data. For example, some electronic records may pose such technological challenges that extraordinary measures may be required to recover the information, while other records containing similar documentation (either electronic records or records in another format) may be usable with much less effort.

Consider how the physical condition of the preservation media affects the usability of the data. For example, some media may have deteriorated such that the data contained are unreadable.

Are the data related to data in other repositories?
Data that add significantly to the meaning or value of other data already archived are more likely to warrant retention than data lacking such a relationship. Examples would be data that fill substantive gaps, that round out existing subject area concentrations, or that constitute a new version of or addition to data collections in the holdings. Data that are a chronological continuation of data already held by the archives are likely to warrant permanent retention, particularly if the older segments of the data are used often.
What are the cost considerations for long-term maintenance?
This consideration should play a significant role only in marginal cases. In such cases, an appraisal should balance the anticipated research potential of the data with the resource implications of retaining them permanently. If data carry significant costs for acquisition, processing, archiving, and distribution, the value of the data must clearly outweigh the costs. Other things being equal, data with low long-term cost implications are more likely to warrant permanent retention than those data with high long-term costs.

What is the volume of data?
Data that are clearly of value on the basis of the guidelines listed above should be designated for permanent retention regardless of the size/volume of the data. The size/volume of a collection should be a factor in the decision making only when the permanent value is marginal.

Deciding Which Formats to Preserve

While transportation data come in many formats, interviews with individual researchers within the field found quantitative tabular data the most frequently used. Table 5, “Long-Term Preservation of Data: Types and File Formats,” lists commonly used data formats and provides a high-level reference to data types and formats recommended for long-term preservation purposes. The format used to store data can affect a repository’s ability to preserve the content for long-term access and therefore should be carefully considered. Generally, open formats are better suited for preservation than proprietary ones. Consult the repository to which you intend to submit your data for guidance on selecting formats.

Note: Table 5 should primarily be used to understand the variety of data types and possible formats. All of the data types and formats in this table are pertinent to transportation research.

Table 5. Long-Term Preservation of Data: Types and File Formats

Type of data: Quantitative tabular data with extensive metadata (e.g., data set with variable/code labels and defined missing values)
Acceptable formats for sharing, reuse, and preservation:
• SPSS portable format (.por)
• Delimited text and command (“setup”) file (SPSS, Stata, SAS, etc.) with metadata information
• Structured text or markup file of metadata information (e.g., DDI XML file)
Other acceptable formats for data preservation:
• Proprietary formats of statistical packages: SPSS (.sav), Stata (.dta), MS Access (.mdb/.accdb)

Type of data: Quantitative tabular data with minimum metadata (e.g., data with or without column headings or variable names and no other metadata or labeling)
Acceptable formats for sharing, reuse, and preservation:
• Comma-separated values (.csv)
• Tab-delimited file (.tab)
• Delimited text with SQL data definition statements where appropriate
Other acceptable formats for data preservation:
• Delimited text (.txt) of given character set (only characters not present in the data should be used as delimiters)
• Widely used formats: MS Excel (.xls/.xlsx), MS Access (.mdb/.accdb), dBase (.dbf), and OpenDocument Spreadsheet (.ods)

Type of data: Geospatial data (vector and raster data)
Acceptable formats for sharing, reuse, and preservation:
• ESRI Shapefile (essential: .shp, .shx, .dbf; optional: .prj, .sbx, .sbn)
• Geo-referenced TIFF (.tif, .tfw)
• CAD data (.dwg)
• Tabular GIS attribute data
Other acceptable formats for data preservation:
• ESRI Geodatabase format (.mdb)
• MapInfo Interchange Format (.mif) for vector data
• Keyhole Markup Language (KML) (.kml)
• Adobe Illustrator (.ai), CAD data (.dxf or .svg)
• Binary formats of GIS and CAD packages

Type of data: Qualitative textual data
Acceptable formats for sharing, reuse, and preservation:
• eXtensible Markup Language (XML) text according to an appropriate Document Type Definition (DTD) or schema (.xml)
• Rich Text Format (.rtf)
• Plain text data, ASCII (.txt)
Other acceptable formats for data preservation:
• Hypertext Markup Language (HTML) (.html)
• Widely used proprietary formats: MS Word (.doc/.docx)
• Proprietary/software-specific formats: NUD*IST, NVivo, and ATLAS.ti

Type of data: Digital audio data
Acceptable formats for sharing, reuse, and preservation:
• Free Lossless Audio Codec (FLAC) (.flac)
Other acceptable formats for data preservation:
• MPEG-1 Audio Layer 3 (.mp3), but only if originally created in this format
• Audio Interchange File Format (AIFF) (.aif)
• Waveform Audio Format (WAV) (.wav)

Type of data: Digital image data
Acceptable formats for sharing, reuse, and preservation:
• TIFF version 6 uncompressed (.tif)
Other acceptable formats for data preservation:
• JPEG (.jpeg, .jpg), but only if originally created in this format
• TIFF (other versions) (.tif, .tiff)
• Adobe Portable Document Format (PDF/A, PDF) (.pdf)
• Standard applicable RAW image format (.raw)
• Photoshop files (.psd)

Type of data: Digital video data
Acceptable formats for sharing, reuse, and preservation:
• MPEG-4 (.mp4)
• Motion JPEG 2000 (.mj2)

Type of data: Documentation and scripts
Acceptable formats for sharing, reuse, and preservation:
• Rich Text Format (.rtf)
• PDF/A or PDF (.pdf)
• HTML (.htm)
• OpenDocument Text (.odt)
Other acceptable formats for data preservation:
• Plain text (.txt)
• Some widely used proprietary formats: MS Word (.doc/.docx) or MS Excel (.xls/.xlsx)
• XML marked-up text (.xml) according to an appropriate DTD or schema, e.g., XHTML 1.0
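As an illustration of how Table 5 might be applied early in a project, the following minimal sketch sorts tabular data files into tiers by file extension so that files in proprietary formats can be flagged for conversion before deposit. It is a sketch under stated assumptions, not a prescribed procedure: the function names and tier labels are hypothetical, and the extension sets cover only the quantitative tabular rows of Table 5 as read here.

```python
from pathlib import Path

# Illustrative subset of Table 5 (quantitative tabular data only).
# The tier labels are hypothetical shorthand for the table's two format columns.
PREFERRED_TABULAR = {".csv", ".tab", ".por"}
ACCEPTABLE_TABULAR = {".sav", ".dta", ".xls", ".xlsx", ".ods"}


def classify_format(filename: str) -> str:
    """Return 'preferred', 'acceptable', or 'review' for a tabular data file."""
    suffix = Path(filename).suffix.lower()
    if suffix in PREFERRED_TABULAR:
        return "preferred"
    if suffix in ACCEPTABLE_TABULAR:
        return "acceptable"
    # Anything not covered by this subset needs a manual check against Table 5.
    return "review"


def audit(filenames):
    """Group file names by tier so non-preferred files can be converted early."""
    report = {"preferred": [], "acceptable": [], "review": []}
    for name in filenames:
        report[classify_format(name)].append(name)
    return report


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(audit(["counts.csv", "survey.sav", "model.bin"]))
```

Extensions that appear in more than one row or column of Table 5 (for example .dbf, .tif, .txt, and .pdf) are deliberately left out of this sketch, because the extension alone does not determine the tier.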

For more information on data types and file formats, go to http://www.data-archive.ac.uk/

At a minimum, data should be stored in preferred formats to ensure long-term access and reuse. If preferred preservation formats are not available, acceptable formats may be used. Researchers may go beyond the minimum by creating the preferred formats from the beginning of the project rather than waiting until the end to convert files.

Managing Quality of Research Data

Generally accepted digital preservation standards are not the only features to consider in the assessment of where to deposit data. Data curation is another important aspect. Curation enhances collections so they are complete and self-explanatory for future users. That is, through curation, those responsible for preserving content “ensure that the preserved information is independently understandable to the user community, in the sense that the information can be understood by users without the assistance of the information producer.”[3] During curation, data are reviewed and cleaned for accuracy and completeness. This may include recoding missing or out-of-range codes, as well as enhancing and adding labels, metadata, and other documentation. Repositories may offer a broad range of curatorial options.

Understanding Metadata Standards and Metadata for Transportation Data

Two important metadata standards often cited in the data preservation literature are Dublin Core and the Project Open Data Metadata Schema (https://project-open-data.cio.gov/v1.1/schema). The former was developed with publications in mind and provides a few generic attributes. These attributes may not be sufficient to enable researchers to learn enough about data. In addition, this scheme was designed to support formal and final publications, rather than project documentation or project reports.
For research publications generated from contract research or research projects, the Project Open Data Metadata Schema may be preferred. In addition, there are domain-specific metadata schemas that are relevant to types of research, including geospatial data, environmental science data, biological data, and so forth.[4]

Metadata is the area that is most often supported by intermediary services. A common option is to provide a form to enable researchers to generate metadata. These tools support metadata capture and can also provide easy access to master data vocabularies, for example, geographical locations, International Organization for Standardization (ISO) country names, and Multipurpose Internet Mail Extensions (MIME) types. They can also provide easy access to authoritative vocabularies such as the Transportation Research Thesaurus. Good metadata are critical for basic access and discovery, so it may prove useful to have an intermediary review metadata once the form is complete.

Metadata guidance and services are a current gap in the transportation research management life cycle. In data collection, transportation researchers overwhelmingly indicated that they used no metadata standards to describe their research products and data. Researchers also noted that metadata creation services often were not available and that even when they were available, they were not sufficient to meet the research needs. The discrepancy between the literature devoted to metadata standards and services and their use by transportation researchers is noteworthy. Metadata is a critical success factor in the operationalization of the U.S. DOT’s Public Access Plan, and it appears to pose a significant gap.

At a minimum, researchers should provide a basic description of their data, including using simple Project Open Data metadata standard attributes such as the following:

• Title,
• Creator,
• Description,
• Subject terms, and
• Geographic coverage.

Researchers may go beyond the minimum by using established metadata standards used by their disciplines. Several online lists provide guidance on selecting a disciplinary metadata format. Disciplinary repositories can also help generate the preferred metadata.

Need help choosing a disciplinary metadata format? Go to one of the following:
• http://rd-alliance.github.io/metadata-directory/standards/
• http://www.dcc.ac.uk/resources/metadata-standards
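The minimum description listed above (title, creator, description, subject terms, and geographic coverage) can be captured as a small structured record. The following is a minimal sketch, assuming a simple key-value layout: the key names mirror the attribute list in this chapter rather than the official field names of the Project Open Data Metadata Schema, the helper name is hypothetical, and all values are invented placeholders.

```python
import json

# Attribute names follow this chapter's minimum list; they are illustrative
# keys, NOT the official Project Open Data field names.
REQUIRED_FIELDS = (
    "title",
    "creator",
    "description",
    "subject_terms",
    "geographic_coverage",
)


def build_record(**fields):
    """Return a metadata record, refusing to proceed if any attribute is empty."""
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    return {name: fields[name] for name in REQUIRED_FIELDS}


# Placeholder values for illustration only.
record = build_record(
    title="Example traffic count data set",
    creator="Example State DOT Research Office",
    description="Hourly traffic counts supporting an example research report.",
    subject_terms=["Traffic counts", "Traffic data"],
    geographic_coverage="Example County",
)

print(json.dumps(record, indent=2))
```

A check of this kind could sit behind the metadata entry form described above, so that an incomplete description is caught before deposit rather than during a later review. Mapping these keys to a specific standard's field names is a separate step that depends on the repository chosen.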

Deciding Where to Preserve Data

Several options exist for preserving research data (see Figure 7, “Research Data Repository Options”), ranging from metadata registries to focused disciplinary repositories. Whichever option is chosen, researchers should at a minimum select long-lived repositories that commit to core digital preservation standards and provide essential curation services. Chapter 9 details issues to consider when a repository is being selected. Researchers may go beyond the minimum by selecting repositories that offer additional features to enhance the immediate and long-term reuse of the data.

The National Transportation Library (NTL) maintains a regularly updated list of data repositories conformant with the U.S. DOT Public Access Plan (https://ntl.bts.gov/public-access/data-repositories-conformant-dot-public-access-plan). The Registry of Research Data Repositories (https://www.re3data.org/) is an online tool to help identify existing international repositories for research data.

Simple Data Registry of Metadata

A simple data registry of metadata is a catalog of information that links to the actual data collections, which are typically stored elsewhere. Such a registry provides a central inventory of collections relevant to or sponsored by an organization. The registry can be maintained through collecting and refreshing metadata, but without the need to host the actual data. In essence, registries can outsource the digital preservation heavy lifting to long-lived repositories while simply maintaining a catalog pointing to the data sources. The challenge of such a registry is that it hosts no data. If metadata are not automatically harvested, it can be difficult to maintain (add, update) the records for external content.
The ROSA P registry at NTL hosts documents but does not host data sets; it thus serves the important purpose of providing a centralized simple data registry for transportation data under the U.S. DOT Public Access Plan. This means that research organizations must find a repository in which to deposit their data so that the data will be preserved for the long term and so that metadata can be updated in the ROSA P registry as needed. What follows are descriptions of available repository options, including core strengths and weaknesses.

Figure 7. Research Data Repository Options
[Diagram: ROSA P; General Research Data Repository (Breadth); Domain-Specific Data Repository (Depth and Curation); Institutional Data Repository (Breadth and Local)]

OPTION 1: General Research Data Repository

Beyond a simple registry is the general research data repository. This is a repository that accepts a wide variety of data formats from a wide variety of disciplines. Because of the heterogeneity of data accepted, general research data repositories typically provide basic metadata and take minimum curation or enhancement actions. Two examples of this type of repository are Zenodo and figshare. General repositories make it very easy to deposit content, although the heterogeneity of collections can make browsing and discovery challenging. Likewise, the broad base of data covered may mean that tools and services, such as online analysis packages, may be lacking for the type of data to be archived.

OPTION 2: Institutional Data Repository

Like general research repositories, institutional data repositories, such as the University of Michigan Deep Blue Data repository, typically cover a wide range of data formats and subject areas, with minimum and broad metadata coverage. Again, this makes data deposit easy but removes some descriptive power that aids search and discovery. That said, institutional data repositories have several strengths. They are often supported by the institution’s library, which means they are intended to be durable and to persist over the long term. They are neutral and interdisciplinary, which makes them especially attuned to diverse data without a home in a more specialized repository. Institutional data repositories are local and can provide in-person services and resources that more distant repositories cannot.

OPTION 3: Domain-Specific Data Repository

Yet another repository option is a domain-specific repository, such as the Inter-university Consortium for Political and Social Research (ICPSR) or the Protein Data Bank.
Domain repositories “serve a scientific community, which may be a traditional academic discipline, a sub-discipline, or an interdisciplinary network of scientists, united by a common focus.”[5] They offer specialized metadata and curatorial enhancements specific to the domain or specialty. ICPSR, for instance, uses the Data Documentation Initiative (DDI) metadata standard, which records details specific to surveys and other social science data collections typically archived at ICPSR and encodes values at the granular variable level. Disciplinary repositories can also provide other specialized features, including disclosure expertise for human subject research, customized preservation services such as format validations and migrations, and specialized tools to enhance the user experience. Additionally, “they seek to know what [their] community wants and expects in terms of content, format, delivery options, support, and training.”[6] Because disciplinary repositories offer focused, deep collections, users may have an easier time searching and browsing for relevant content.

Before entering into third-party agreements with any repository provider, consider the legal terms carefully. If there are any concerns, consult your counsel.

Understanding How Long to Preserve Data

A frequently asked question is how long data should be preserved. The short answer is, generally, for as long as the data have value for their community of users. This can be challenging to determine, as it is difficult to predict how far into the future a data set will have value. Instead of focusing on the maximum amount of time for preservation, it may be more effective to consider what the minimum preservation period should be and to develop a checklist for determining the value of the data set after that minimum time has elapsed.

Funding and publishing agencies will sometimes define their expectations for the retention or availability of a data set. Currently, the NSF Engineering Directorate states that the “Minimum data retention of research data is three years after conclusion of the award or three years after public release, whichever is later.”7 If such an expectation is stated, it should be used to inform the minimum preservation period.

A checklist for determining whether preservation should continue will naturally vary according to the type of data, the needs of the researchers who make use of the data, and the specific value the data have for the community. The questions listed in “Deciding What to Preserve” on page 57 are also relevant for deciding whether preservation should continue. Other key questions might include the following:

• How much and what kind of use have the data received over time?
• Who depends upon having access to the data? How would they be affected by not having access to the data?
• Are the data connected to publications or other research outputs? Would not having access to the data harm someone’s ability to understand or trust these publications?
If data preservation will take place through a third-party repository (see “Deciding Where to Preserve Data” on page 63), make sure that the preservation services meet the needs of the research organization by carefully reviewing the terms of service, including how long the repository provider will commit to preserving the data.
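The NSF Engineering Directorate rule quoted above ("three years after conclusion of the award or three years after public release, whichever is later") amounts to a small date calculation. The following is a minimal sketch in Python, not an official tool. The function names `add_years` and `minimum_retention_date` are hypothetical, and two points are assumptions: that "three years" means calendar years, and that a February 29 start date falls back to February 28 in a non-leap year. Check the funder's current guidance before relying on any computed date.

```python
from datetime import date
from typing import Optional


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later.

    Assumption: a Feb 29 start date falls back to Feb 28 when the
    target year is not a leap year.
    """
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def minimum_retention_date(
    award_end: date,
    public_release: Optional[date] = None,
    years: int = 3,
) -> date:
    """Earliest date at which the minimum retention period has elapsed.

    Takes the later of (award end + years) and (public release + years).
    If the data have not been publicly released, only the award end
    date is considered.
    """
    candidates = [add_years(award_end, years)]
    if public_release is not None:
        candidates.append(add_years(public_release, years))
    return max(candidates)


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    print(minimum_retention_date(date(2020, 6, 30), date(2021, 1, 15)))
```

A date computed this way marks only the minimum. Once it has passed, the checklist questions above still apply when deciding whether preservation should continue.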

Endnotes

1 Inter-university Consortium for Political and Social Research (ICPSR). 2009. Principles and Good Practice for Preserving Data. International Household Survey Network, IHSN Working Paper No. 003. http://www.ihsn.org/sites/default/files/resources/IHSN-WP003.pdf.
2 Source: http://data-pass.org/sites/default/files/appraisal.pdf.
3 https://www.dpconline.org/handbook/institutional-strategies/standards-and-best-practice.
4 See the RDA Metadata Directory, http://rd-alliance.github.io/metadata-directory/standards/, or the Digital Curation Centre, http://www.dcc.ac.uk/resources/metadata-standards.
5 Carol Ember and Robert Hanisch, “Sustaining Domain Repositories for Digital Data: A White Paper.” Output of the workshop “Sustaining Domain Repositories for Digital Data,” Ann Arbor, MI, June 24–25, 2013, doi:10.3886/SustainingDomainRepositoriesDigitalData.
6 Ann G. Green and Myron P. Gutmann, “Building Partnerships among Social Science Researchers, Institution-Based Repositories, and Domain Specific Data Archives.” OCLC Systems and Services: International Digital Library Perspectives Vol. 23 No. 1, 2007, pp. 35–53. https://doi.org/10.1108/10650750710720757.
7 Directorate for Engineering Data Management Plans Guidance for Principal Investigators. Updated November 2018. https://nsf.gov/eng/general/ENG_DMP_Policy.pdf.

Chapter Checklist

From this chapter, you should be able to

✓ Define essential requirements for research data as described by the U.S. DOT.
✓ Interpret essential requirements for your research organization’s transportation research.
✓ Understand the research organization’s roles and responsibilities for data management and preservation.
✓ Understand the difference between a short-term compliant strategy and going beyond.
✓ Understand the issues involved in building or supporting a local solution for research data preservation.

The U.S. Department of Transportation has essential requirements for researchers and research institutions requesting and receiving transportation-related federal research funds. The U.S. DOT strives to make it easier to publish and communicate scientific knowledge. It is a long-range vision which goes beyond the requirements of the U.S. DOT’s Public Access Plan.

The TRB National Cooperative Highway Research Program's NCHRP Research Report 936: Guide to Ensuring Access to the Publications and Data of Federally Funded Transportation Research is designed to help state DOTs, as well as other organizations that do transportation research, better understand and consider how they will comply with the U.S. DOT policy.

The guide is accompanied by NCHRP Web-Only Document 270: Developing a Guide to Ensuring Access to the Publications and Data of Federally Funded Transportation Research.
