1 Introduction
As a major producer of scientific data,1 and as a partner for international cooperative research, China has a great deal to offer to the world’s knowledge base. Although China’s research capabilities are rapidly improving, some significant problems remain. Among the recognized impediments are inadequate digital archiving and access policies and practices that inhibit progress and improved international cooperation. Particularly noteworthy in this regard was the high-level data access initiative announced in February 2003 by Guanghua Xu, the Chinese Minister of Science and Technology (MoST), and supported by the National People’s Congress. This China Scientific Data Sharing Program includes “creating a law to ensure that scientific information is communicated more widely, and coordinating efforts by government departments to develop information centers and databases to facilitate the communication of scientific and technological information.”2 This new policy toward greater openness with publicly funded scientific data is part of a broader effort to modernize the national research and development infrastructure and its management in China.
In 2000, the U.S. National Committee (USNC) for CODATA and the Chinese National Committee for CODATA held two bilateral meetings with senior science officials and data managers from both countries to discuss various data management and policy issues.3 Following these two initial meetings of the U.S. and Chinese CODATA committees, the Chinese side was augmented by other experts from the Chinese Academy of Sciences and MoST who are leaders of the Scientific Data Sharing Program. The USNC for CODATA hosted a delegation of these Chinese data policy experts in the summer of 2002.
The fourth of these bilateral meetings of data experts was held in Beijing in October 2003. Focused on scientific resources sharing policy, that meeting provided some of the advance groundwork for the June 2004 workshop that is the topic of this report. It also re-confirmed the commitment of the Chinese science policy community to promoting greater openness regarding Chinese scientific data and identified the priority areas for additional focus.
The effective long-term preservation of and open access to digital scientific resources in all countries increases in importance as an essential component of the global public research infrastructure, which can now be integrated through the Internet. The challenges in storing and maintaining access to these growing collections of data and information are substantial, even in economically more developed countries. Moreover, although many of the challenges that require sustainable solutions are the same for digital data and information across all disciplines, others are distinct or unique to certain disciplines or types of information. And while all solutions are context dependent, some may be based on extending or emulating existing successful models, and others may require and benefit from entirely new approaches.
China faces substantial hurdles in this regard. Although many of its data resources and especially journal literature still reside in paper formats, China already has significant digital information preservation and access requirements that in many cases are not being successfully addressed. Factual databases and journals can provide an important research and economic tool for China (just as they do in more economically developed countries) for capacity building in science and education, for supporting sustainable development of commerce and industry, and for promoting good governance. Successfully resolving today the many difficulties in preserving digital scientific data resources and making them broadly available will provide great benefits for future generations; the costs of inaction are incalculable, but certain to be substantial. At the same time, it is important to recognize that even the most economically developed countries have encountered various difficulties with preservation and open access issues. Careful consideration is needed to develop long-term plans for sustainable digital archiving in China, as in all other countries.
3 For additional information, see http://www7.nationalacademies.org/usnc-codata/China_US_Data_Seminars.html.
WORKSHOP ON STRATEGIES FOR PRESERVATION OF AND OPEN ACCESS TO SCIENTIFIC DATA
The international Workshop on Strategies for Preservation of and Open Access to Scientific Data was held on June 22-24, 2004, in Beijing. It built on the results of the four previous bilateral CODATA meetings and on the new Chinese scientific data access policy initiative noted above. The workshop explored in detail the various scientific and technical, legal and policy, institutional and economic, and management aspects that need to be addressed in successfully implementing sustainable and accessible archives of digital health and environmental data resources in China. It examined various models of open archiving that might be adopted or adapted for use within the Chinese context. It also provided much-needed high-level attention to these typically underappreciated problems by bringing together scientific information managers, digital archiving experts, national science policy and funding officials, and representatives of development organizations, who will be able to incorporate the results of this project into their future planning.4
The workshop was organized pursuant to the following statement of task:
1. Identify research areas in which preservation of and open access to digital scientific information require high-priority attention in China, and provide the underlying rationales for the areas chosen.
2. Identify and discuss the scientific and technical, institutional and economic, legal and policy, and management factors relevant to providing open access to digital scientific information resources (both the data and the literature), including an examination of different possible models and their potential benefits and shortcomings in China, and drawing on examples of other digital archiving and access regimes in related areas.
3. Review and discuss the current status of access and archiving regimes for the types of scientific information identified in task 1.
4. Identify possible follow-up activities to improve open access and preservation for each major type of digital scientific information selected for discussion in task 1, taking into consideration the results of the discussions under tasks 2 and 3.
4 See Appendix B for the biographical summaries of all the speakers at the workshop.
The workshop addressed these four tasks over two and a half days through a mix of invited presentations, focused panel presentations, and some discussion by all of the participants in both plenary and breakout sessions.5 The primary areas of focus were biomedical data, earth and environmental data, and related scientific, technical, and medical literature. Although many of the issues identified and discussed during the workshop were focused on the Chinese context, they are likely also relevant throughout much of the developing world. Because of significant language barriers and time constraints, the identification of follow-up activities requested in task 4 could not be done through group discussions in the breakout sessions. Any potential follow-up activities were identified only in the context of individual presentations.
STRUCTURE OF THIS REPORT
Because this report is a summary of the workshop, it is limited in scope to the presentations and other information identified during the meeting. Chapter 2 presents, in their entirety, two keynote speeches by high-ranking officials at the Ministry of Science and Technology, who describe the development and status of China’s national Scientific Data Sharing Policy.
Subsequent chapters include summaries of the other speakers’ presentations. Several international perspectives on the preservation of and open access to public scientific data are presented in Chapter 3. Chapter 4 discusses the cross-disciplinary issues (policy and legal, institutional and economic, and management and technical) that affect the preservation of and open access to public scientific information. The report concludes with a discussion of these issues in the areas of life sciences and public health data; earth sciences, environmental, and natural resources data; and scientific information, journals, and digital libraries.
5 See Appendix A for the workshop agenda.
The appendixes to the report provide additional background information, including the workshop agenda and the biographical summaries of the workshop speakers. The presentation materials used by the invited speakers are available in English on the USNC for CODATA Web site at http://www7.nationalacademies.org/usnc-codata/chinese_workshop.html.