Recent years have seen a growing tendency for social scientists to collect biological specimens, such as blood, urine, and saliva, as part of large-scale household surveys. By combining biological and social data, scientists are opening up new fields of inquiry and are able for the first time to address many new questions and connections. But including biospecimens in social surveys also adds a great deal of complexity and cost to the investigator’s task. Along with the usual concerns about informed consent, privacy issues, and the best ways to collect, store, and share data, researchers now face a variety of issues that are much less familiar or that appear in a new light.
In particular, collecting and storing human biological materials for use in social science research raises additional legal, ethical, and social issues, as well as practical issues related to the storage, retrieval, and sharing of data. For example, acquiring biological data and linking them to social science databases requires a more complex informed consent process, the development of a biorepository, the establishment of data sharing policies, and the creation of a process for deciding how the data are going to be shared and used for secondary analysis—all of which add cost to a survey and require additional time and attention from the investigators. These issues also are likely to be unfamiliar to social scientists who have not worked with biological specimens in the past. Adding to the attraction of collecting biospecimens but also to the complexity of sharing and protecting the data is the fact that this is an era of incredibly rapid gains in our understanding of complex biological and physiological phenomena. Thus the trade-offs between the risks and opportunities of expanding access to research data are constantly changing.
This report, which was funded by the National Institute on Aging (NIA), offers findings and recommendations concerning the best approaches to the collection, storage, use, and sharing of biospecimens gathered in social science surveys and the digital representations of biological data derived therefrom. It is aimed at researchers interested in carrying out such surveys, their institutions, and their funding agencies.
COLLECTING, STORING, USING, AND DISTRIBUTING BIOSPECIMENS
This report’s initial message to social scientists undertaking the collection of biospecimens is that there is no need to reinvent the wheel. Although working in this emerging area may be new and unfamiliar, they will find available a number of existing documents from the biomedical field that offer advice and describe recommended procedures and laboratory practices for dealing with biospecimens. The panel recommends that social scientists who are planning to add biological specimens to their survey research familiarize themselves with existing best practices for the collection, storage, use, and distribution of biospecimens. First and foremost, the design of the protocol for collection must ensure the safety of both participants and survey staff. At the same time, many issues arise when biospecimens are collected as part of a social science survey that are not encountered in biomedical research. Thus it is often necessary to move beyond the biomedical model to find answers and best approaches for the social science context.
The panel notes that there is a growing tendency among social scientists to propose the collection of biospecimens in surveys regardless of whether they are needed to test a specific hypothesis. Yet many social scientists who decide to add biospecimens to their surveys are not fully prepared to provide for the storage and distribution of the specimens they collect. Indeed, the panel concluded that the issues involved in the storage and distribution of biospecimens are too complex and involve too many hidden costs to assume that social scientists without suitable experience can deal with them unassisted. Therefore, the panel recommends that NIA and other relevant funding agencies support at least one central facility for the storage and distribution of biospecimens collected as part of the research they support.
The collection of biological specimens along with the traditional social and behavioral data promises a number of benefits that are likely to extend beyond the original research team. However, advances are continually being made in genetic analysis and the ability to identify individuals through their social and biological data, and the sharing of biospecimens implies the depletion of a nonrenewable scientific resource. For these reasons, sharing biospecimens with other investigators is highly complex, and best practices in this area have yet to be established. Thus the panel recommends that early in the planning process, principal investigators who will be collecting biospecimens as part of a social science survey develop a complete data sharing plan. In general, there is no one best plan for the use and reuse of specimens, but the plan should include a discussion of the adequacy of the storage and retrieval protocols. It should spell out criteria for allowing other researchers to use (and therefore deplete) the available stock of specimens, as well as to gain access to any derived data. The plan should also specify the procedures for accessing the specimens and data. It should include provision for the storage and retrieval of specimens and clarify how the succession of responsibility for and control of the specimens will be managed at the conclusion of the project. Finally, the plan should contain information on how specimens and data derived from them are to be documented and provide for public access to that documentation. To ensure the inclusion of all essential information, the panel recommends that NIA (or preferably the National Institutes of Health [NIH]) publish guidelines for principal investigators containing a list of points that need to be considered for an acceptable data sharing plan. In addition to staff review, Scientific Review Panels should read and comment on all proposed data sharing plans. In much the same way as an unacceptable human subjects plan, an inadequate data sharing plan should hold up an otherwise acceptable proposal.
SHARING DIGITAL REPRESENTATIONS OF BIOLOGICAL AND SOCIAL DATA
Once a survey has been conducted and biospecimens have been collected and analyzed, the survey team is left with a large amount of valuable social and biological data in digital form. Yet given the above-noted advances in genetic analysis and the ability to identify individuals through their social and biological data, a difficult issue facing the field is how to share the digital representations of these data as widely as possible while ensuring the protection of confidentiality. This issue is especially acute when detailed genetic information is generated from survey participants’ biological samples and linked to social science data, which may be as sensitive or even more sensitive in their own right. At present, no data restriction strategy has been demonstrated to protect confidentiality while preserving the usefulness of the data for drawing inferences involving multidimensional interactions among genomic and social variables, which are increasingly the target of research.
For these reasons, the panel recommends that both rich genomic data acquired for research and sensitive and potentially identifiable social science data that do not change (or change very little) with time be shared only under restricted circumstances, such as licensing and (actual or virtual) data enclaves. Making confidential genomic data available for unrestricted public use would require such intense data masking to protect confidentiality that it would distort genomic analyses and sharply limit their usefulness. As a security measure, the panel recommends that genomic data and other individual-level data containing uniquely identifying variables that are stored or in active use by investigators on their institutional or personal computers be encrypted at all times.
At the same time, some digital biosocial data can be shared if first subjected to procedures that alter the original data; restricted access should not be the only mode of data protection. Yet evaluating the specific risks of sharing data and devising ways to protect data from breaches are complex and specialized tasks requiring expertise in disclosure protection methods that most principal investigators and their institutions do not possess. Currently, not enough is known to represent these risks either fully or accurately. Determining the best protection schemes for the sharing of sensitive social and biological datasets also requires a significant investment of resources, and it would be wasteful for individual investigators to expend their resources on such efforts rather than on collecting and analyzing the data. Instead, the panel recommends that NIA (or preferably NIH) develop new standards and procedures for licensing confidential data in ways that will maximize timely access while maintaining security and that can be used by data repositories and by projects that distribute data. The panel also recommends that NIA and other funding agencies assess the strength of confidentiality protections through periodic expert audits of confidentiality and computer security. Willingness to participate in such audits should be a condition for receipt of NIA support. Beyond enforcement, the purpose of such audits would be to identify challenges and solutions.
Further, NIH should consider funding Centers of Excellence to explore new ways of protecting digital representations of data and to assist principal investigators wishing to share data with others. NIH should also support research on disclosure risks and limitations.
OBTAINING INFORMED CONSENT
If participants are to provide truly informed consent to taking part in any study, they must be given a certain minimum amount of information. They should be told, for example, what the purpose of the study is, how it is to be carried out, and what participants’ roles are. In addition, because of the unique risks associated with providing biospecimens, participants in a social science survey that involves the collection of such specimens should be provided with other types of information as well. In particular, they should be given detail on the storage and use of the specimens that relates to those risks and can assist them in determining whether to take part in the study. To this end, the panel recommends that, in designing a consent form for the collection of biospecimens, in addition to those elements that are common to social and biomedical research, investigators ensure that certain other information is provided to participants:
how long researchers intend to retain their biospecimens and the genomic and other biodata that may be derived from them;
both the risks associated with genomic data and the limits of what they can reveal;
which other researchers will have access to their specimens, to the data derived therefrom, and to information collected in a survey questionnaire;
the limits on researchers’ ability to maintain confidentiality;
any potential limits on their ability to withdraw their specimens or data from the research;
the penalties (such as the elimination of research support) that may be imposed on researchers for various types of breaches of confidentiality; and
what plans have been put in place to return to them any medically relevant findings.
Additionally, the panel recommends that NIA locate and publicize positive examples of the documentation of consent processes for the collection of biospecimens. In particular, these examples should take into account the special needs of certain individuals, such as those with sensory problems, the cognitively impaired, or children.
The panel also notes that participants in biosocial surveys are likely to have different levels of comfort with how their biospecimens and data are used. Some may be willing to provide only answers to questions and others to provide specimens as well. Some may be willing for their specimens and data to be used only for the current study, while others may consent to their use in future studies. One effective way to deal with these different comfort levels is to offer a tiered approach to consent, allowing the participant to determine just how his or her specimens and data may be used. Accordingly, the panel recommends that researchers consider adopting a tiered approach to obtaining consent. Tiers might include participating in the survey, providing specimens for genetic and/or nongenetic analysis in a particular study, and allowing the specimens and data (genetic and/or nongenetic) to be stored for future use. Additionally, participants who are willing to have their specimens and data used in future studies should be informed about the process that will be used to obtain approval for such uses.
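As an illustration only, the tiers described above could be recorded per participant so that any later use of specimens or data can be checked against what that participant actually agreed to. The following is a minimal Python sketch, not a procedure prescribed by the panel; the tier names, the `ConsentRecord` class, the `eligible` helper, and the participant identifiers are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Use(Enum):
    """Hypothetical consent tiers, loosely following the examples in the text."""
    SURVEY_ANSWERS = auto()          # answering survey questions only
    NONGENETIC_ANALYSIS = auto()     # specimens used for nongenetic analysis
    GENETIC_ANALYSIS = auto()        # specimens used for genetic analysis
    STORAGE_FOR_FUTURE_USE = auto()  # specimens and data kept for future studies


@dataclass(frozen=True)
class ConsentRecord:
    """One participant's choices; each tier is granted independently."""
    participant_id: str
    granted: frozenset

    def permits(self, use: Use) -> bool:
        # A use is allowed only if the participant explicitly granted it.
        return use in self.granted


def eligible(records, use):
    """Return the IDs of participants whose consent covers the given use."""
    return [r.participant_id for r in records if r.permits(use)]


# Made-up example records, for illustration only.
records = [
    ConsentRecord("P001", frozenset({Use.SURVEY_ANSWERS})),
    ConsentRecord("P002", frozenset({Use.SURVEY_ANSWERS,
                                     Use.NONGENETIC_ANALYSIS,
                                     Use.GENETIC_ANALYSIS})),
    ConsentRecord("P003", frozenset({Use.SURVEY_ANSWERS,
                                     Use.NONGENETIC_ANALYSIS,
                                     Use.STORAGE_FOR_FUTURE_USE})),
]

print(eligible(records, Use.GENETIC_ANALYSIS))
print(eligible(records, Use.STORAGE_FOR_FUTURE_USE))
```

Recording each tier as a separate, explicit grant, rather than as a single yes/no flag, is one way to keep a request for future use from silently inheriting a consent that was given only for the current study.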
As part of the informed consent process, the panel also recommends that NIA direct investigators to formulate a plan in advance concerning the return of any medically relevant findings to survey participants and to implement that plan in the design and conduct of their informed consent procedures. In addition, the panel recommends that NIA, the Office for Human Research Protections (OHRP), and other appropriate organizations sponsor training programs, create training modules, and hold informational workshops on informed consent for investigators; staff of survey organizations, including field staff; administrators; and members of Institutional Review Boards (IRBs) who oversee surveys that collect social science data and biospecimens.
A final issue facing social science researchers who include biospecimens in their surveys is obtaining approval from IRBs. A number of challenges exist, including the fact that few IRBs are familiar with both social and biological science; thus investigators may find themselves trying to justify standard social science protocols to a biologically savvy IRB or explaining standard biological protocols to an IRB that is used to dealing with social science. Another issue is that institutional IRBs are increasingly busy, and they are particularly demanding whenever potential risk to human subjects is at issue. Therefore, the panel recommends that investigators considering collecting biomarkers consult with their IRBs early and often.
The panel believes that, by following the above recommendations and several others offered in the full report, it should be possible to overcome many of the practical issues related to the collection, storage, retrieval, and sharing of biospecimens and derived biodata. The result should be improved access to research data without compromise to appropriate protection for research participants.