National Academies Press: OpenBook

Bits of Power: Issues in Global Access to Scientific Data (1997)

Chapter: 3 Scientific Issues in the International Exchange of Data in the Natural Sciences

Suggested Citation:"3 Scientific Issues in the International Exchange of Data in the Natural Sciences." National Research Council. 1997. Bits of Power: Issues in Global Access to Scientific Data. Washington, DC: The National Academies Press. doi: 10.17226/5504.

Scientific Issues in the International Exchange of Data in the Natural Sciences

Science is the process and the product of discovering the cumulative body of knowledge and understanding through which we humans comprehend the tangible universe. Its cumulative nature is a key to the uniqueness of the knowledge gained in the natural sciences. This knowledge is sometimes reorganized at a profound conceptual level when a field undergoes a shift of paradigm—for example, the change from the caloric fluid to the kinetic theory of heat or from the continuum of classical mechanics to the discreteness and duality of quantum mechanics. Yet the facts of science and the links among them remain; we may change the way we interpret those links, but the body of scientific data continues to accumulate.

Data in science are like bricks, and the theoretical concepts are the mortar that connects them to give a subject its structure. Each new bit of data plays a part: it may be uncovered in efforts to test a hypothesis, estimated from previous information, or collected in observations, experiments, or computations. As an observed or measured new piece of information, it becomes part of our base of knowledge, to share, interpret, and reconcile with the data already in hand. Scientists ask, "Are these new data consistent with what we already know? Are they just what we might have expected, or do they require us to question the results, to repeat the experiment, or to find a new interpretation that accounts for why the data are what they are?" When the scientific community resolves these questions, the new data become part of the foundation on which the next conjectures and experimental plans build. At this stage, also, researchers begin to consider the implications of the new data, both to strengthen and extend basic understanding in the natural sciences and to seek applications that may bring benefits to society and progress in bettering the human condition. Throughout this process, scientific data are the cumulative substance on which all of science builds.

Data in science are universal—they have the same validity for scientists everywhere. The atomic mass of iron, the structure of DNA, and the amount of rainfall in Manaus in 1972 are facts independent of the political views of their user, the time at which we determine them (apart from the evolving, improving accuracy of the determinations), or the user's location. Their utility depends on the precision and accuracy with which they are determined and the units we use to express them. A DNA sequence or a nuclear cross section can be as important to a researcher in Novosibirsk as it is to another in Pasadena. Consequently, except in situations involving national security, the protection of individual privacy,1 or proprietary rights, scientists have developed an ethic of full and open exchange of data, within and across national boundaries. Although infringements occasionally do occur, they typically generate community disapproval. Full and open exchange of information is a fundamental tenet of basic science that scientists regard as essential to optimizing their own work and that of their colleagues, as well as to enabling the advance of science overall.2

Traditionally, scientific data were compilations in lists, tables, and books—essentially all on paper—which circulated like all other scholarly information, through personal exchanges, subscriptions, and libraries. Today, electronic handling of scientific information is becoming the norm. With this evolution has come a dramatic increase in the international scope of scientific cooperation and exchange of information. While basic science has always been largely a collaborative activity that readily crossed national boundaries, electronic communication has made this cooperation much more informal, intimate, instantaneous, and continuous than ever before. Consequently, scientific data now may flow between scientists in different parts of the world as if they were across the street.

Scientists have been, to a large extent, the creators of the means and the environment for the ethical code governing the open exchange of their data. This is as true in the evolving electronic environment as it has been in the past. Now, however, interests outside the scientific community are exerting forces on that environment that could severely restrict this open exchange. Scientists believe that restrictions on data access will slow the progress of science and significantly diminish the potential benefits that science renders to society.

An important consideration in any discussion of exchange of scientific data concerns the "market" in which scientists participate, and particularly what its "goods" and "return" are. Scientists in academia and government are motivated overwhelmingly by the desire to generate ideas that influence the course of science. They want their papers to be read, so much so that they regularly pay page charges to have those papers published. Traditional concepts of copyright, protection of intellectual property, and financial return to the creator of a written work may apply to a scientist who writes a textbook, but become irrelevant to the researcher publishing a paper in a scientific journal. Publishers of such journals, sometimes including professional societies, adhere to traditional motivations for protection of intellectual property and copyright. Scientists are usually delighted when someone wants to photocopy their articles; their publishers are sometimes aghast at the same photocopying act. This tension is often overlooked in considerations of adapting to electronic exchange of scientific information. It becomes especially important when one tries to bring economic and legal thinking to bear on the management of scientific data, and on the behavior and the system of values of scientists. (For more detailed discussion on these issues, see Chapters 4 and 5.)

This report focuses primarily on international access to scientific data for basic research purposes. Nevertheless, in some disciplines, such as meteorology, a significant part of the data is generated to serve the general public by making possible severe weather and flood warnings and associated weather prediction. In formulating policies for international data exchange, the need for data for these applications also must be taken into account. In this chapter the committee broadly characterizes types of scientific data and their use in the laboratory physical sciences, astronomy and space sciences, Earth sciences, and biological sciences; outlines some of the major data trends, opportunities, and challenges in the natural sciences; discusses selected discipline-specific issues; and describes problems of access to data in less developed countries. The chapter concludes with the committee's recommendations for steps to improve access to data in the natural sciences worldwide.

TYPES OF DATA AND THEIR USE IN DIFFERENT DISCIPLINES

There are several ways to characterize scientific data: among others, by form, whether numerical, symbolic, still image, animation, or some other; by the way they were generated or gathered, that is, from experiment, observation, or simulation; by level of quality; by the size or form of the databases that contain them; by the nature of the support for their generation or distribution, that is, public or private, national or international; and, of course, by subject. Perhaps the most obvious differentiation is according to the degree of refinement of data along the path from collection to publication. Several linked levels of data can be distinguished in this hierarchy, beginning with initially collected experimental or observational data.

In the laboratory sciences today, data at this first level are rarely raw readings or counts. Sophisticated means for gathering and manipulating such information have softened the concept of "primary" data. The computer mediating an experiment is likely to extract from several measurements some average of the total signal minus the background noise. Frequently, the first data an experimenter sees appear as a curve or a set of points that represents addition, subtraction, and averaging of several kinds of measurements, all collected and manipulated electronically. While some of these data may be published as tables, most data at this level have limited distribution. They are useful when shared among the participants in a large collaboration, for example, in a high-energy physics experiment. International distribution of data of this kind is normal practice, particularly among collaborating scientists.
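The kind of first-level reduction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `net_signal`, the channel counts, and all readings are hypothetical, and real instruments apply far more elaborate corrections.

```python
from statistics import mean


def net_signal(signal_runs, background_runs):
    """Average repeated scans channel by channel, then subtract the averaged background."""
    avg_signal = [mean(channel) for channel in zip(*signal_runs)]
    avg_background = [mean(channel) for channel in zip(*background_runs)]
    return [s - b for s, b in zip(avg_signal, avg_background)]


# Hypothetical readings: three signal scans and two background scans, four channels each.
signal_runs = [
    [12.0, 15.0, 30.0, 14.0],
    [10.0, 17.0, 28.0, 12.0],
    [11.0, 16.0, 32.0, 13.0],
]
background_runs = [
    [9.0, 10.0, 11.0, 10.0],
    [11.0, 10.0, 9.0, 10.0],
]

# The "curve" an experimenter would first see, one value per channel.
curve = net_signal(signal_runs, background_runs)
print(curve)
```

The individual scans never appear in the output, which is why the notion of "primary" data becomes blurred once the reduction happens inside the instrument's computer.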

The second major level of data in the laboratory sciences is usually published scientific results based on collected data, sometimes including the data and sometimes only providing a pathway by which the data can be obtained. Evaluated data files, the next level in the hierarchy, are compilations of data from several sources created when an "evaluator" has worked to obtain the "best" values of the tabulated quantities. Such files are often broadly disseminated, sometimes in journals established for that purpose, such as the Journal of Physical and Chemical Reference Data; increasingly, these files will be available electronically and, with hypertext, will be linked, so that anyone reading a manuscript will have ready access to on-line data files on which published results are based, just by clicking on the relevant figure or text. When data are structured or compiled in an organized manner, whether in raw form or after thorough evaluation or processing, they become a database.

In the observational sciences, scientific research leads to the generation of data that can be processed and interpreted at different levels of complexity.3 Typically, each level of processing adds value to the original, raw data by summarizing the original product, synthesizing a new product, or providing an interpretation of the original data. The processing of data leads to an inherent paradox that may not be readily apparent. The original unprocessed, or minimally processed, data are usually the most difficult to understand or use by anyone other than the expert primary user. With every successive level of processing, the data tend to become more understandable and better documented for the nonexpert user. One might therefore assume that it is the most highly processed data that have the greatest value for long-term preservation and international exchange, as in the case of the laboratory sciences, because they are more easily understood by a broader spectrum of potential users. In fact, just the opposite is usually the case for observational data, because it is only with the original unprocessed data that it will be possible to recreate all other levels of processed data and data products. To do so, however, requires preservation of the necessary information about the processing steps and ancillary data.
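A minimal sketch may make the point about raw data concrete. It assumes a hypothetical two-step pipeline (a linear calibration followed by a daily mean); the step names, parameters, and counts are invented for illustration and do not describe any particular archive.

```python
# Hypothetical processing steps; a real pipeline would record far more detail.
def calibrate(values, gain, offset):
    return [gain * v + offset for v in values]


def daily_mean(values, per_day):
    return [sum(values[i:i + per_day]) / per_day
            for i in range(0, len(values), per_day)]


STEPS = {"calibrate": calibrate, "daily_mean": daily_mean}


def reprocess(raw, recipe):
    """Rebuild every processed level from the raw data plus the recorded recipe."""
    levels = [list(raw)]
    for name, params in recipe:
        levels.append(STEPS[name](levels[-1], **params))
    return levels


raw_counts = [100, 102, 98, 100, 110, 112, 108, 110]  # level 0: instrument counts
recipe = [
    ("calibrate", {"gain": 0.5, "offset": -40.0}),  # ancillary calibration data
    ("daily_mean", {"per_day": 4}),                 # summarizing step
]

levels = reprocess(raw_counts, recipe)
print(levels[-1])
```

If only the final level were archived, the calibrated series could not be recovered; if the raw counts are kept but the recipe is lost, neither derived level can be regenerated with confidence.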

Another important way to characterize scientific data in general is by quality, as indicated by their degree of acceptance in the scientific community. "Prepublication" data bear no certification whatsoever. Such data would, for example, be considered by most scientists to be inappropriate as legal evidence. Data accepted for publication in a refereed journal carry a certification that they, and the text that accompanies them, contain no obvious error and are admissible topics of scientific discourse. Published data, however, are often challenged, and occasionally the data or their interpretations prove erroneous. When they have been thoroughly validated, data become dogma. Values of natural constants, to some number of decimal places, are firmly established in this way. Steps toward confirmation of the soccer-ball structure of the molecule C60 illustrate this progression in acceptance and endorsement of data and their interpretation. At first the structure was conjectured, prior to publication; then the proposed structure was published and shown to be consistent with other evidence then available. When a method appeared for preparing the substance in macroscopic quantities, new experiments, notably x-ray diffraction, nuclear magnetic resonance, and infrared spectroscopy, gave unassailable proof that the molecule is indeed shaped like a soccer ball. Since then, nobody would think of questioning that structure.

Particular uses of data and characteristics of disciplines in the natural sciences influence needs for and conditions affecting global access to information in those areas of research. Examples of successful international data exchange activities in each of these areas are given in Appendix C.

Laboratory Physical Sciences

The laboratory physical sciences comprise an interrelated set of disciplines that includes chemistry, materials science, physics, and the subdisciplines and applications of each of these. The primary users of most of the data generated and exchanged in these fields are other physical scientists, although data from research in chemistry, materials science, and condensed-matter and polymer physics find heavy secondary use in manufacturing and engineering applications. Recognition of potential new or changed applications often stimulates the generation of new data and concepts from basic science in these areas; the flow of stimuli as well as data runs both ways, between applied and basic sides of these sciences.

The laboratory physical sciences generate data largely from experiments, simulations, or theoretical computations.4 (In the observational sciences, the data typically describe single, unique events, such as the weather on a particular day or the explosion of a supernova.) Although experiments in the physical sciences can be repeated, the results of a single experiment are often adopted and exchanged because of the size of the apparatus, the extent of the collaboration, the rarity or uniqueness of the test material, and the expense involved.5 Instead of simply repeating experiments, scientists in these fields generally learn of a new advance and quickly use it as a steppingstone to go beyond that advance, frequently by modifying the technique or the apparatus. In the case of less complex laboratory research, scientists typically repeat the previous experiments, as much to validate the new approach as to check the previous results. The research results and the underlying data from basic experimental research are not limited by national boundaries. Information presented in an international meeting, a seminar by a foreign visitor, or an electronically circulated preprint is at least as likely as a new publication in an international journal to stimulate a new line of work. When scientists are engaged in international collaboration and exchange data that are not yet ready for publication, the national boundaries separating the collaborators are even more transparent.

Another characteristic of the physical sciences is associated with the established theoretical framework of many of the subdisciplines. The data derived from the theoretical numerical simulations in many cases look like experimental data, and often are replicated. These simulations, particularly animations, may not be part of the conventional manuscripts that report the results, but this kind of information is now exchanged globally on a variety of media. Exchanging data from simulations is a process vulnerable to the congestion problems of the Internet, described in Chapter 2, especially as the volume of such data grows.

Like all other scientific disciplines today, the physical sciences use electronic networks to coordinate, collect, compile, and distribute nonproprietary data through informal and formal means. Projects to evaluate data on particular topics, such as the thermodynamic or spectroscopic properties of a set of closely related substances, typically involve small international collaborations that communicate by Internet. More complex efforts, such as determining the "best" values of natural constants, require more formal cooperative working arrangements and regular data exchange. One such effort is the maintenance of the Evaluated Nuclear Structure Data File (ENSDF), an electronic database of evaluated data on properties of atomic nuclei and on radiation produced by decay of unstable nuclei. The database has existed in electronic form for about 25 years. An international network of individuals carries out its evaluations. Within the United States, this work is coordinated and supervised by the National Nuclear Data Center at Brookhaven National Laboratory; internationally, the International Atomic Energy Agency performs these functions. The ENSDF effort collects data from publications and other sources and then evaluates and distributes the data in a variety of formats as users call for it. Prior to the Internet, these data came to ENSDF on magnetic tapes, but now they arrive via electronic network, primarily by file transfer protocol, a convenient and widely used mode of transferring data electronically at moderately high speed. The dissemination effort is truly worldwide, with active on-line accounts in approximately 40 countries, on six continents. This system is described in more detail in Appendix C.

Physical scientists, in general, seek the most timely, lowest-cost, and most widely effective means for disseminating their results and for obtaining those of others, as long as proper citation is not compromised. Apart from proprietary data associated with commercial products, data in the physical sciences tend to be readily available, through journals, government publications, and books, and increasingly, through electronically available databases. The databases in the laboratory physical sciences may seem large in comparison with, for example, dictionaries; nonetheless, among the four areas of natural science considered in this report, the laboratory physical sciences typically have the smallest databases.

Astronomy and Space Sciences

The primary needs for and uses of data from space are in fundamental research, but there are many collateral applications, such as precise positioning, mapping of the Earth, navigation, education, and even entertainment, as the public interest in Comet Shoemaker-Levy demonstrated. Because astronomy holds such strong public interest, its data must not only be collected, but also be interpreted and made available for formal and informal educational purposes, as well as for the advancement of our knowledge about the universe.

Most data in astronomy and space sciences come from observations made from Earth's surface or from spacecraft;6 a modest fraction of the data comes from laboratory experiments. The data from experiments, terrestrial or in spacecraft, conform closely in character to data in the laboratory physical sciences. Usually, an individual observer or observing project collects the data and distributes them to other individuals as soon as they have been taken. These data frequently have significant value to other researchers and for purposes other than those for which they were gathered. It is useful, for example, to compare data taken by different observers in different wavelength bands or to compare observations at different times in order to interpret variable objects. Hence it is important to store space science data in a form readily available to other researchers. Most astronomical data archives, which are open to all scientists, do so. Use of these archives is limited only by ease and cost of access. Consequently, this community has had to adopt efficient data management practices throughout the life cycle of the data, to permit effective access by the entire community, national and international.

Research in astronomy and space sciences is collaborative and, inherently, deeply international; it requires multinational efforts to collect data and to implement efficient transnational exchange of data. Electronic links now provide the requisite efficient communications and exchange of data. The scientific reasons for this international character include the following:

  • Ground-based observatories must be located at optimal observing sites, such as mountaintops with good observing conditions, which are found only in certain countries;
  • Some experiments, such as long-baseline radio interferometry, require simultaneous observations at several points;
  • Only parts of the sky can be seen from any single location; and
  • Some observations, such as those in the x-ray and far-ultraviolet regions, can be made only from outside the atmosphere, and hence require orbiting observatories, while others require sending probes to other planets, which creates a need for collaboration with scientists in nations that have space programs.

An economic driving force for the internationalization of space science is the high cost of large new facilities; this encourages international collaboration as a means of cost-sharing. Thus, the Hubble Space Telescope was developed and is operated as a partnership of NASA and the European Space Agency (ESA), with access available to astronomers from all over the world. The Gemini project, building two 8-meter telescopes, one in Hawaii and one in Chile, is a partnership of the United States, the United Kingdom, Canada, Chile, Argentina, and Brazil.

Even without explicit or formal collaboration, international sharing of astronomical data generally enhances the field. Recent examples include the impact of Comet Shoemaker-Levy on Jupiter, the International Halley Watch, the observations of Comet Hyakutake, and the observations of Supernova 1987A. Less formally organized research projects are enabled daily by access to archived data for historical and multiwavelength comparisons and by ready communication among collaborating astronomers.

Astronomers and space scientists establish research strategies and priorities for data collection in their subdisciplines. In the United States, this is usually done within the National Research Council, for example, under the decadal Astronomy Survey Committees or the Space Studies Board's planetary and space physics science strategy panels, or by NASA or National Science Foundation (NSF) science working groups or ad hoc science community studies. Other nations and international organizations develop similar research strategies, for example, the ESA's Horizon 2000, plans for the European Southern Observatory, and the international Gemini project. Such planning efforts are becoming more international and effectively identify data needs and policies in support of the projects.

Earth Sciences

In the broadest terms, Earth science data are fundamental to the discovery and creation of knowledge concerning the interactions among matter, energy, and living organisms.7 Development of this knowledge is essential for ensuring the prospect for humanity on our finite planet in the face of rapid demographic and economic growth. Between 1820 and 1992, the world population increased fivefold and the gross domestic product per person grew eightfold, so that the global economy grew 40-fold. World trade grew more than 500-fold.8
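The figure of 40 for the global economy follows from multiplying the two growth factors quoted above, since total output is population times output per person. The short sketch below checks that arithmetic; the implied average annual growth rate it derives is an illustrative calculation and is not a figure given in the report.

```python
# Growth factors quoted in the text for the period 1820 to 1992.
population_factor = 5        # world population
gdp_per_person_factor = 8    # gross domestic product per person

# Total output = population x output per person, so the factors multiply.
economy_factor = population_factor * gdp_per_person_factor

# Derived for illustration only: the constant annual rate that would compound
# to the same overall factor across the whole period.
years = 1992 - 1820
annual_rate = economy_factor ** (1 / years) - 1

print(economy_factor)
print(f"{annual_rate:.2%}")
```

Under this constant-rate assumption the average works out to a little over 2 percent per year, which shows how a modest annual rate compounds into very large growth over 172 years.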

The best estimate at this time is that the increase in population over the next 50 years will be greater in absolute numbers than the increase over the last 170 years, accompanied by further large increases in economic activity and world trade.9 This situation will bring to the fore new environmental issues and problems that will press us ever more urgently to ameliorate the impact of humankind on the environment.

Within the purview of the physical Earth sciences are natural phenomena at all spatial and temporal scales that present major scientific challenges for understanding and prediction. These phenomena include natural hazards such as hurricanes, tornadoes, floods, earthquakes, and volcanic eruptions. Besides the societal impacts associated with climate, natural hazards, and natural resources, there are numerous man-made hazards that are coupled with natural phenomena that are the subject of Earth science research. Examples include the prediction and mitigation of pollution plumes in ground water or the atmosphere (e.g., chemicals or radioactive materials), the factors involved in stratospheric ozone depletion, and the monitoring of treaties that ban underground nuclear testing (e.g., in support of the Comprehensive Test Ban Treaty).

The physical, chemical, and biological processes that shape the world in which we live are complex and interdependent. To understand them requires observations with sufficient spatial and temporal resolution and coverage to characterize the phenomena of interest and to constrain theoretical predictions that are based on conceptual or quantitative models. Therefore, the lifeblood of research in most of the Earth sciences is observational data, sometimes global in coverage, and taken repeatedly over time. Many of these data also must be integrated with data from experimental manipulations, or from other disciplines.

An example is atmospheric circulation, which controls weather over the entire Earth with significant variations on time scales ranging from hours to decades or longer, and spatial scales ranging from less than 1 km to thousands of kilometers. Weather forecasts for more than a day at a time require the rapid and repeated acquisition, processing, and interpretation of very large amounts of synoptic observations on at least a continental scale. Satellite systems that gather the necessary data have been and are being developed, but timely access to the data gathered by different organizations or countries is a major concern. Climate studies require many of the same observations as for weather prediction, but also data on the oceans, land surface, and cryosphere for the entire Earth. Therefore, international sharing of very large volumes of global atmospheric circulation data is essential for meaningful scientific investigation of past and present climates.

Scientific knowledge in the various subdisciplines of the Earth sciences has advanced to the point where important, multidisciplinary global-scale problems can be tackled with insight and scientific rigor, provided that high-quality global observations are available and that computational resources are adequate to process and interpret large and diverse data sets. Major examples of interdisciplinary and integrating research programs in the Earth sciences are the World Climate Research Program of the World Meteorological Organization (WMO) and the International Council of Scientific Unions (ICSU), the International Geosphere-Biosphere Programme organized under the auspices of ICSU, and, nationally, the U.S. Global Change Research Program.10 These are major initiatives, begun in the 1980s, to understand the driving mechanisms (both natural and human) that cause significant changes in the Earth system. These efforts involve collecting and analyzing massive data sets from Earth-observing satellites and integrating them with multiple-area or site-specific data from all over the Earth, including developing countries. Significant progress in these types of complex research programs can be made only if there is effective transnational flow of data and information.

Biological Sciences

The breadth of the kinds of data in the biological sciences is probably the widest among the four areas described in this report.11 The subjects of the data encompass types and modes of propagation of life forms, modes of provision of food and fiber, conservation of the planet's biota, public health and safety, the molecular bases of life processes, and biotechnology. Data in the biological sciences differ somewhat from those in the physical sciences, have some characteristics in common with the other observational sciences, and have some unique characteristics. Biologists have no fundamental constants or periodic table. They do share with chemists the data specifying structures of molecules, such as nuclear magnetic resonance spectra and x-ray diffraction patterns, as well as the inferred structural parameters themselves. However, many biological data specify ranges of incidence or of values of some properties. Such data require textual descriptions, which become part of the databases. Collections of such data require modes of access that are different from and frequently more complex than those that serve well in the physical sciences. Analysis by computing associations and similarities, rather than by direct, experimental, causal assessment, is characteristic of biology.

Concepts are sometimes less well defined in biology than in the physical sciences, and so clarity can be compromised when terms with even slightly different definitions, explicit or implied, are used to classify and describe what should be commonly understood data. Even the concept of "species" causes problems. For example, there are questions regarding the variabilities found within and between species and regarding whether species should be defined according to DNA sequences, with no distinctions within the species, or according to taxonomy, with differentiations made among subspecies. This issue is elaborated below in this chapter.

Biologists use some large databases, particularly those of nucleic acid sequences that form the fast-growing genome databases. Efforts to build, maintain, and distribute the information in these databases are highly international and collaborative. Centers around the world collect data from contributing scientists and immediately share them, incorporating them as they accumulate into a coordinated database. In this respect, biologists share certain problems with the observational sciences. Proprietary concerns probably arise at least as frequently in the biological sciences as in the laboratory physical sciences or the Earth sciences, but much more frequently than in the space sciences.

A distinctive characteristic of many biological data, especially those regarding distributions of species, is that they are very location-specific. Consequently, in order to protect fauna or flora in a given location, or the privacy or property rights of the people who live there, barriers unrelated to the research itself sometimes arise that inhibit the flow, particularly the international exchange, of biological data.

DATA TRENDS, OPPORTUNITIES, AND CHALLENGES IN THE NATURAL SCIENCES

The increasing use of electronic means for data collection, storage, manipulation, and dissemination is one of a number of broad and interrelated trends that have significant implications for access to data in the natural sciences. These trends include the following:

  • Rapid growth of the body of scientific data;
  • Development of large international research programs;
  • Insufficient funding for data management and preservation activities worldwide;
  • Decentralization of data management and distribution;
  • Electronic publication; and
  • Increasing use of simulations and animations as scientific data.

The discussion in this section addresses these broad trends as well as the opportunities and challenges they present. Some discipline- or field-specific issues are discussed in the next section.

Rapid Growth of the Body of Scientific Data

In every area of the sciences, both the volumes and the types of data have grown at rates unforeseeable 30 years ago. This growth has been especially rapid, primarily because of vast improvements in and increasing availability of imaging detector arrays at most wavelength ranges. For example, in the Earth sciences, new technology allows data to be collected repetitively with high spatial resolution. Remote sensing systems are generating immense volumes of data that are pushing the limits of our ability to store, retrieve, and analyze those data. For instance, the introduction of ground-based Doppler radar and new satellite systems is significantly increasing the data volumes within the atmospheric sciences. Table 3.1 shows a selection of land remote sensing data sets and their anticipated volumes that are archived by the Earth Resources Observation Systems (EROS) Data Center operated by the U.S. Geological Survey in Sioux Falls, South Dakota. In seismology, new initiatives both in the United States and in other countries have resulted in continuous, broad-band digital recording at high sampling rates. Special studies using up to 1,000 sensors generate very large data sets for each experiment. Table 3.2 illustrates the actual and projected growth in data volumes at the Incorporated Research Institutions for Seismology (IRIS) Data Management Center. Table 3.3 provides a representative sample of astrophysics data archived by NASA and demonstrates a similar trend in the space sciences.

TABLE 3.1 Projected Volume (in Terabytes) of Satellite Remote Sensing Data Holdings at the U.S. Geological Survey's Earth Resources Observation Systems (EROS) Data Center, 1997 to 2005

| Data Source | By 1997 | By 1998 | By 1999 | By 2000 | By 2001 | By 2005 |
|---|---|---|---|---|---|---|
| Landsats 1-5 | 120.5 | 122.5 | 122.5 | 122.5 | 122.5 | 122.5 |
| AVHRR (a) | 12.5 | 16.5 | 20.5 | 24.5 | 28.5 | 40.0 |
| SIR-C (b) | 20.0 | 90.0 | 90.0 | 90.0 | 90.0 | 90.0 |
| Landsat 7 | | 20.0 | 70.0 | 120.0 | 170.0 | 170.0 |
| SRTM (c) | | | 112.0 | 113.0 | 114.0 | 117.0 |
| MODIS (d) | | 10.0 | 36.0 | 62.0 | 88.0 | 166.0 |
| ASTER (e) | | 15.0 | 60.0 | 110.0 | 160.0 | 310.0 |
| TOTAL | 153.0 | 274.0 | 511.0 | 642.0 | 773.0 | 1,215.5 |

(a) AVHRR: Advanced Very High Resolution Radiometer

(b) SIR-C: Shuttle Imaging Radar-C

(c) SRTM: Shuttle Radar Topography Mission

(d) MODIS: Moderate-resolution Imaging Spectroradiometer

(e) ASTER: Advanced Spaceborne Thermal Emission and Reflection Radiometer

SOURCE: U.S. Geological Survey, National Mapping Division.
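The pace of growth implied by Table 3.1 can be made concrete with a short calculation. The sketch below is illustrative only: it uses the TOTAL row for 1997 through 2001 as printed in the table, and the function names are hypothetical rather than part of any EROS Data Center tooling. It computes the year-over-year change and the constant annual rate that would link the 1997 and 2001 totals.

```python
# Totals (terabytes) copied from the TOTAL row of Table 3.1, 1997-2001.
totals_tb = {1997: 153.0, 1998: 274.0, 1999: 511.0, 2000: 642.0, 2001: 773.0}


def compound_annual_growth(start_value, end_value, years):
    """Return the constant yearly growth rate that links two values."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


def year_over_year(series):
    """Return {year: fractional growth relative to the previous listed year}."""
    years = sorted(series)
    return {
        later: series[later] / series[earlier] - 1.0
        for earlier, later in zip(years, years[1:])
    }


rate = compound_annual_growth(totals_tb[1997], totals_tb[2001], 2001 - 1997)
steps = year_over_year(totals_tb)

if __name__ == "__main__":
    for year, growth in steps.items():
        print(f"{year}: {growth:+.1%} relative to the previous year")
    print(f"Equivalent constant annual growth, 1997-2001: {rate:.1%}")
```

A constant-rate summary hides the uneven arrival of new missions, which is why the year-over-year figures are reported alongside it.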

Technology for data storage and computation continues to improve at a rate that keeps pace with the rapid growth of accumulated data in the observational sciences. Scientists worldwide will have to adapt their research strategies to make effective use of these new data.12 Although state-of-the-art projects can manage the increasingly large data volumes, perhaps with difficulty, other users, especially in developing countries, are unlikely to be able to access or effectively use such data for their own research.

Development of Large International Research Programs

As the previous discussion indicates, basic scientific research has become ever more internationalized as a result of several factors: the expanding capabilities of communication and computation networks, the capabilities for conducting high-quality science in increasing numbers of countries, and the economic driving force for sharing the high costs of large projects. These factors have led to the formation of new organizational paradigms and methods of data management. Consequences of this internationalization have been more and higher-quality science, faster progress, and ever more international involvement. These opportunities have appeared at all levels, from individual investigator and small-group science to large-scale "big science" projects.

TABLE 3.2 Summary of Actual and Projected Data Volumes Archived in the Incorporated Research Institutions for Seismology (IRIS) Data Management Center, 1994 to 2000

Data Volumes (gigabytes/year)

| Data Source | Number of Instruments (a) | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 |
|---|---|---|---|---|---|---|---|---|
| GSN | 100 | 1,159 | 2,359 | 3,959 | 6,003 | 8,047 | 10,091 | 12,281 |
| FDSN | 146 | 370 | 670 | 1,070 | 1,530 | 2,050 | 2,670 | 3,416 |
| JSP arrays | 5 | 1,095 | 2,190 | 3,650 | 5,475 | 7,300 | 9,125 | 10,950 |
| OSN | 30 | 0 | 0 | 15 | 58 | 218 | 498 | 936 |
| PASSCAL-BB | 500 | 1,318 | 2,277 | 3,556 | 5,154 | 7,073 | 9,312 | 11,867 |
| PASSCAL-RR | 500 | 542 | 885 | 1,341 | 1,912 | 2,597 | 3,397 | 4,310 |
| Regional-Trig | 500 | 150 | 290 | 490 | 730 | 1,030 | 1,390 | 1,755 |
| TOTAL | 1,781 | 4,634 | 8,671 | 14,081 | 20,862 | 28,315 | 36,483 | 45,515 |

NOTE: Abbreviations are as follows:

GSN: Global Seismic Network (IRIS)

FDSN: Federation of Digital Seismic Networks

JSP: Joint Seismic Program (with the former Soviet Union) (IRIS)

OSN: Ocean Seismic Network

PASSCAL-BB: Program for Array Studies of the Continental Lithosphere-Broadband (IRIS)

PASSCAL-RR: Program for Array Studies of the Continental Lithosphere-Regional Recordings (IRIS)

Regional-Trig: Regional Triggered Recordings

(a) Projected for the year 2000.

SOURCE: IRIS Data Management Center, private communication, 1994.
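Tabulated holdings such as those in Table 3.2 lend themselves to a simple internal consistency check of the kind a data center might run before distributing a summary. The following is a minimal sketch, assuming the values are transcribed exactly as printed in the table; the variable and function names are hypothetical and do not correspond to any IRIS software.

```python
YEARS = [1994, 1995, 1996, 1997, 1998, 1999, 2000]

# Per-source volumes (gigabytes) as printed in Table 3.2.
volumes = {
    "GSN":           [1159, 2359, 3959, 6003, 8047, 10091, 12281],
    "FDSN":          [370, 670, 1070, 1530, 2050, 2670, 3416],
    "JSP arrays":    [1095, 2190, 3650, 5475, 7300, 9125, 10950],
    "OSN":           [0, 0, 15, 58, 218, 498, 936],
    "PASSCAL-BB":    [1318, 2277, 3556, 5154, 7073, 9312, 11867],
    "PASSCAL-RR":    [542, 885, 1341, 1912, 2597, 3397, 4310],
    "Regional-Trig": [150, 290, 490, 730, 1030, 1390, 1755],
}

# TOTAL row as printed in Table 3.2.
reported_total = [4634, 8671, 14081, 20862, 28315, 36483, 45515]


def column_sums(rows):
    """Sum each year's column across all data sources."""
    return [sum(column) for column in zip(*rows.values())]


def find_mismatches(rows, totals, years):
    """Return {year: (computed, reported)} for columns that do not add up."""
    return {
        year: (computed, reported)
        for year, computed, reported in zip(years, column_sums(rows), totals)
        if computed != reported
    }


mismatches = find_mismatches(volumes, reported_total, YEARS)

if __name__ == "__main__":
    print("All columns consistent" if not mismatches else mismatches)
```

A check of this kind catches transcription errors cheaply, which matters when tables are re-keyed or converted between formats.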

Recent years have seen several new multibillion-dollar international projects, and multimillion-dollar international efforts have become almost commonplace. These "megaprojects" or "megascience" programs have a number of common characteristics. They require long-term funding commitments; they may necessitate the building of new large facilities or instruments, which then require large expenditures for operating funds; they typically involve teams of researchers working on different aspects of the project, with the consequent requirement for international communication and data exchange; and, with the current state of technology, their scientific objectives cannot be fulfilled by using a smaller-scale research format.13

In 1991 the U.S. Congressional Budget Office identified 80 projects funded by the U.S. government that each cost at least $25 million (in 1984 dollars) during the period from 1980 to 1986.14 Many of these involved significant international participation. In contrast, there were only a handful of such nonmilitary large-scale research projects in the 1950s and 1960s.

TABLE 3.3 A Representative Sample of NASA Astrophysics Archives, by Satellite Mission

| | High Energy Astrophysical Observatory 2 | International Ultraviolet Explorer | Infrared Astronomical Satellite | Hubble Space Telescope | Compton Gamma Ray Observatory |
|---|---|---|---|---|---|
| Data type | X-ray data | Ultraviolet data | Infrared data | Optical/Ultraviolet data | Gamma-ray data |
| Year of launch | 1978 | 1978 | 1983 | 1990 | 1990 |
| Duration | 2.5 years | Ongoing | 300 days | Ongoing | Ongoing |
| Total data volume (gigabytes) | ~100 | ~100 | ~150 | ~5,500 by year 2005 | ~10,00 by year 2000 |
| Data center | Einstein Observatory Data Center, Cambridge, Massachusetts | National Space Science Data Center, Greenbelt, Maryland | Infrared Data Analysis Center, Pasadena, California | Space Telescope Science Institute, Baltimore, Maryland | National Space Science Data Center, Greenbelt, Maryland |

SOURCE: NASA; reprinted from National Research Council (1995) Preserving Scientific Data on Our Physical Universe: A New Strategy for Archiving the Nation's Scientific Information Resources, National Academy Press, Washington, D.C.

For the purpose of this discussion, it is useful to identify several types of large international scientific projects and programs:15

  • Experimental facilities for neutron beam, synchrotron radiation sources, lasers, high-energy particle physics, high-field magnet laboratories, and fusion experiments.
  • Fixed observational facilities such as optical and radio ground-based telescopes, environmental remote sensors (e.g., lidars and radars), and deep ocean drilling projects.
  • Space science observational satellites, including space telescopes for astronomy and astrophysics, space physics observatories, and planetary missions.
  • Earth observation satellites for collecting data about Earth's atmosphere, oceans, land surface, and geophysics.
  • Distributed observational programs that collect data in many different locations as part of an internationally organized research program. Examples of this include the global seismic network, the International Geosphere-Biosphere Programme, the Human Genome Project, and the new Biodiversitas project.

It is the last two types of scientific research initiatives that pose the greatest data management challenges for effective international exchange, as discussed later in this chapter.

Insufficient Funding for Data Management and Preservation

Despite the vast increases in recent years in the amount of data collected and stored, and the very large augmentations in the funding allocated to new observatories and experimental facilities in ambitious international research programs, there has not been a commensurate increase in the funding for data management and preservation. At the same time, the costs associated with data retention and distribution are typically far less than the costs of reacquisition (in those cases in which reacquisition is even physically possible). Although the committee did not perform comprehensive research on the actual funding levels worldwide, the members of the committee believe this to be a problem common to most disciplines, in many research programs, on both domestic and international levels.

In the laboratory sciences, funding agencies focus their support on research and tend to overlook the fact that data compilations and data access are key to progress across the research spectrum. Furthermore, the sheer volume of data now available makes it increasingly difficult for individual compilers in the tradition of Ptolemy and Beilstein to fill this need as a pro bono activity, or for the work to be done as merely ancillary to some funded research program. In addition, society fellowship and award committees generally do not place much value on the contributions their applicants may make to the infrastructure of science in the form of data compilation, organization, and evaluation work. Funding agencies have an opportunity to enhance the international aspect of these activities by supporting scientists from developing countries who tend to be well-educated but underemployed. Such a policy could be very cost-effective in that it would not require high initial capital costs for facilities, but only labor and data access costs.

The funding situation in the observational sciences tends to be even more difficult, with potentially more significant negative effects, given the data-intensive nature of such research. Experience indicates that scientists associated with new observatories get much more support than those handling data from old ones, even though the payoff from optimal utilization of existing data sometimes is greater. For instance, according to figures supplied by NOAA, the agency's budget for its national data centers in FY 1980 was $24.6 million, and their total data volume was approximately 1 terabyte. In FY 1994, the budget was only $22.0 million (not adjusted for inflation), while the volume of their combined data holdings was about 230 terabytes! During this same period, the overall NOAA budget increased from $827.5 million to $1.86 billion, mostly to fund the acquisition of new observational data.
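The NOAA figures quoted above can be reduced to two simple ratios: data center funding per terabyte held, and the data centers' share of the overall agency budget. The sketch below is only an illustration of that arithmetic. It uses the nominal dollar amounts and approximate holdings given in the text, makes no inflation adjustment, and the dictionary keys and function names are hypothetical.

```python
# Figures quoted in the text (nominal dollars, not adjusted for inflation;
# the holdings are the approximate values given there).
noaa = {
    1980: {"data_center_budget": 24.6e6, "holdings_tb": 1.0, "agency_budget": 827.5e6},
    1994: {"data_center_budget": 22.0e6, "holdings_tb": 230.0, "agency_budget": 1.86e9},
}


def dollars_per_terabyte(year):
    """Data center budget divided by the volume of data held that year."""
    record = noaa[year]
    return record["data_center_budget"] / record["holdings_tb"]


def data_center_share(year):
    """Fraction of the overall agency budget going to the data centers."""
    record = noaa[year]
    return record["data_center_budget"] / record["agency_budget"]


per_tb = {year: dollars_per_terabyte(year) for year in noaa}
share = {year: data_center_share(year) for year in noaa}

if __name__ == "__main__":
    for year in sorted(noaa):
        print(f"FY {year}: ${per_tb[year]:,.0f} per terabyte, "
              f"{share[year]:.2%} of the agency budget")
    print(f"Funding per terabyte fell by a factor of about "
          f"{per_tb[1980] / per_tb[1994]:.0f}")
```

Part of the decline in funding per terabyte reflects cheaper storage technology, so the ratio by itself does not measure the shortfall; the falling share of the agency budget is the more direct indicator of the trend described in the text.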

An example of insufficient funding for data management and preservation is the National Land Remote Sensing Satellite Data Archive at the EROS Data Center. This national archive was established by Congress16 in 1992 and endorsed by the 1996 National Space Policy without the provision of any new funding to support its expanded mission.

Although improvements and significant cost reductions in data storage and processing technologies have enabled government data managers to keep up with the demands in most cases, the pressures from chronic underfunding occasionally have led to ill-considered attempts to commercialize or privatize the data management and dissemination functions (as discussed in Chapters 4 and 5). In other cases, these financial difficulties have led to inadequate preservation and access provisions, which sometimes result in the partial or total loss of irreplaceable data sets.17 The challenge is to develop data management and archiving infrastructure and procedures that can handle the rapid increases in the volumes of scientific data, and at the same time maintain older archived data in an easily accessible, usable form. An important part of this challenge is to persuade policymakers that scientific data are indeed a precious resource that should be preserved and used broadly to advance science and to benefit society.18

Decentralization of Data Management and Distribution

Data collection, management, and distribution over the long term depend on a variety of institutions, that is, organizations that transcend the interest of individuals or ad hoc groups of scientists. These institutions have various roles, missions, and funding responsibilities that affect use of and access to scientific data. Many have direct responsibilities or objectives regarding data generated by publicly funded research, including, in particular, the following institutions:

  • Government science agencies and academies of science;
  • Intergovernmental scientific organizations and coordinating bodies;
  • Publicly funded data management institutions, including data centers and libraries;
  • Publicly funded research institutions, primarily in academia;
  • National and international nongovernmental organizations, such as scientific and engineering societies, library associations, and information industry associations;
  • Commercial publishers; and
  • Governmental policymaking and regulatory bodies.

The information technology revolution has changed the roles of some of these institutions and brought about the establishment of new entities. For scientific data management at both the national and the international level, the technological changes have supported the development of organizations with the following attributes:19

  • Widely distributed responsibility. New telecommunications, data management, and standardized technologies are leading to highly reliable distributed data management capabilities. The growing availability of information technology professionals (along with the lower technical skill levels actually needed by end users) is enhancing the ability to distribute data more broadly and increase user participation. Such distribution of data and their ownership (whether actual or implied) by user groups improve the utility of the data and help create important support for long-term retention.
  • High-value peer-to-peer communication. With on-line access to data and people, a variety of new collaborative relationships can develop. Information can be broadcast to interested individuals in a timely fashion. Data can be provided directly to field researchers to focus new data collection. Physical proximity and formal lines of communication are no longer vital to effective organizational operation. Indeed, closed, highly structured organizations often will be uncompetitive or unable to take full advantage of innovation.
  • Specialized data functions. When resources and capabilities are distributed, some specific locations can make an effective contribution by specializing. Specialized groups can be created in a scientific discipline or in some aspect of data management, archiving, or standard setting. Such centers can achieve significant economies of scale, reducing overall costs while enhancing the effectiveness of certain functions for the benefit of all.

The National Nuclear Data Center (NNDC) is one of many examples of the evolution from sole source to distributed network. Under an international agreement, NNDC at Brookhaven National Laboratory is the U.S. source for the international distribution of evaluated nuclear data files (see Appendix C). When the primary forms of distribution were hard-copy books or published tables, the sole source was obvious: the printed pages generated at NNDC. When on-line access to the databases became available in the mid-1980s, the files had to be mirrored for easier access overseas. With the advent of the World Wide Web, individuals gained the ability to manipulate the databases, which today still reside at NNDC, using overlay programs that (physically) reside at another data center, possibly in Europe.

In the future the user probably will be unaware of where the data file being accessed physically resides and will be able to link to the journal article (published by a commercial publisher, for example) in which the data originally appeared. The electronic journal article could have links to the original data tables of the authors. NNDC has evolved from a collector of evaluated data files, which were formatted into camera-ready pages for hard-copy publication, to a center that maintains on-line access to a few databases via a variety of overlay programs (no longer is there a single, static format). It is one of many centers around the world, now common in most scientific disciplines, that develops new, electronic forms of networked data dissemination.

Electronic Publication

The development and acceptance of electronic networks as a means of communicating, searching for data and information, and accessing information rapidly and directly have driven the increase in electronic publications of all types, and scientific publications in particular.20 Although not all publishers of scientific journals are moving to completely electronic form, there is a distinct trend to provide alternative paper and electronic versions of many publications. For example, the American Institute of Physics is working to provide its library clients by early 1997 with electronic access to every one of its journals to which the library subscribes. In addition, it now offers some of its journals in CD-ROM form as a space-saving alternative or supplement to its subscribers.21 The Institute of Physics in the United Kingdom also provides subscribers with electronic access to all 33 of its journals.22

NASA is sponsoring an all-electronic peer-reviewed journal, Earth Interactions.23 It is available only electronically and allows color representations of phenomena, including time-lapse video clips of observations to show time variations. Mathematical calculations on subsets of original data can be carried out by the reader as well, providing both in-depth understanding of the material and a check on the validity of the author's results. This new journal is the product of three professional societies (the American Meteorological Society, the American Geophysical Union, and the Association of American Geographers), with additional support from the Ecological Society of America and the Oceanographic Society of America. These societies have a combined membership of approximately 45,000, and so this form of electronic publication will soon be available to a substantial segment of the Earth sciences community. Submission, editing, and peer review will all be done electronically.

There is increased attention to electronic publication of astronomical research papers and data as well (see Box 3.1). Most conference proceedings are collated from electronic submissions in standard (e.g., TeX, LaTeX) formats. Abstracts of papers in most space science disciplines are now available on-line,24 and several journals are publishing electronic versions.

This trend is likely to accelerate and to open new opportunities for communicating research results to all scientists. Electronic publication will not just replace paper, however—it will alter the sociology of science. Writing, refereeing, and reviewing of a publication are now discrete and strictly ordered events, but they need not be in the electronic world. There, annotation, critique, elaboration, and revision can all go on iteratively and indefinitely, and in some instances no doubt they will. Some publications likely will become "living documents," under revision until they are no longer of interest. Even though our current social norms for attribution are based on the static publication model, it is doubtful that the scientific community would retain that model in order to preserve these norms. The value of a dynamic discourse is too great.25

Many electronic journals will not be "printable" in any meaningful sense. Not only will they contain motion and sound; they will also incorporate rich contextual links to the primary materials. Clicking on a graph will give the reader access to the data on which it is based, allowing alternative models and interpretations to be explored. A related important benefit of electronic publications is that results based on observations and modeling can be checked and validated by both reviewers and readers; restrictions on article length in paper journals and limited access to original data and software currently preclude any meaningful checks of the validity of published results based on observational data. A "copy" of the bits in an astronomical "plate" is as good as the original.

Publications also will become "active" agents, rather than passive stacks of paper. The term "program" has some of the wrong connotations, but nonetheless future publications will include executing programs (not ones that can be executed, but ones that are executing), autonomously gathering data, making predictions, and becoming richer and more valuable as time passes.

In short, the Internet and World Wide Web are far more rapid and enabling means of communicating results, ideas, and other aspects of research than paper publications. Many changes in the conduct and dissemination of scientific research, from the individual to the international scale, may be expected to arise from these developments.26
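Two of the linking ideas mentioned in connection with electronic journals (see Box 3.1) can be illustrated in a few lines: links that point to persistent names rather than locations, and "forward references" that list the later articles citing a given one. The sketch below is a toy illustration under stated assumptions. The identifiers, the registry, and the function names are all invented for the example and are not the naming notation or software used by any actual journal or bibliographic service.

```python
from collections import defaultdict

# Hypothetical persistent names mapped to their current locations. If an
# article moves, only this registry changes; links that use the name stay valid.
registry = {
    "urn:example:article-a": "https://host-one.example/articles/a",
    "urn:example:article-b": "https://host-two.example/archive/b",
}

# Hypothetical reference lists: each article and the articles it cites.
reference_lists = {
    "urn:example:article-c": ["urn:example:article-a", "urn:example:article-b"],
    "urn:example:article-d": ["urn:example:article-a"],
}


def resolve(name, name_registry):
    """Return the current location for a persistent name, or None if unknown."""
    return name_registry.get(name)


def build_forward_references(references):
    """Invert {citing: [cited, ...]} into {cited: [citing, ...]}."""
    cited_by = defaultdict(list)
    for citing, cited_articles in references.items():
        for cited in cited_articles:
            cited_by[cited].append(citing)
    return {cited: sorted(citing) for cited, citing in cited_by.items()}


forward = build_forward_references(reference_lists)

if __name__ == "__main__":
    for article, citing in forward.items():
        print(article, "is cited by", citing, "and lives at",
              resolve(article, registry))
```

Regenerating the inverted index whenever a new article is published is what keeps each article's forward references up to date without manual effort.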

BOX 3.1 Space Science Data and Electronic Publishing

The space science community has been at the forefront of electronic publishing. Scientific societies, such as the American Astronomical Society (AAS), have been leaders in this type of information exchange. For example, AAS collects all of its meeting abstracts electronically and publishes the AAS Job Register on-line. In addition, AAS pioneered the development of effective electronic journals with its on-line publication of the Astrophysical Journal. What started out as an experiment has turned into a success story in electronic publication.1

Supported in part by a grant from the National Science Foundation, the AAS in cooperation with the University of Chicago Press first developed an electronic version of the Letters portion of the Astrophysical Journal (eApJ).2 Produced in two versions (HTML for screen reading and PDF for local printing), this journal goes well beyond the electronic delivery of paper manuscripts typical of most “electronic” journals.

The eApJ has references tied into the NASA-supported bibliographic database maintained by the Astrophysics Data System (ADS; see <http://adswww.harvard.edu>), which provides abstracts of most references and is developing an archive of page images of several of the most useful astronomical scholarly journals.

The eApJ uses URNs instead of URLs (names instead of locations) as link targets, and so the links will remain valid indefinitely. Both the ADS and the eApJ use a standardized notation for naming articles, which enables links and pointers to be generated automatically during the publishing process. As part of the sophisticated set of links associated with this journal, the eApJ includes a capacity for forward referencing whereby each article carries with it an updated set of references to articles that refer to it, in effect an automated electronic citation service. Before the full Astrophysical Journal came on-line in November 1996, the AAS made arrangements to establish mirror sites in Great Britain, Europe, Australia, and (possibly) Japan to ensure relatively rapid response times.

The success of the eApJ is propelling astronomical publishers to bring their literature on-line rapidly. Over 95 percent of the world's peer-reviewed astronomical literature is expected to be on-line by mid-1997. Standard protocols, conventions, and procedures will be absolutely critical if this networked system of literature and data is to be effective for the working scientist.

Electronic publishing in this area has also enhanced data access and archiving. The Astronomical Data Center, located in Strasbourg, France, has an agreement with the publishers of Astronomy and Astrophysics “to provide all data files from their publications.” The Astronomical Data Center at Goddard has a similar arrangement with the AAS, which includes Icarus and the publications of the Astronomical Society of the Pacific, as well as AAS publications. This type of arrangement “permits the two centers to archive a major portion of the international astronomical data without individual requests to the authors” of journal publications.3

1 Response by Peter Boyce, American Astronomical Society, to the committee's "Inquiry to Interested Parties" (see Appendix D).

2 The full journal is now available on-line at http://www.journals.uchicago.edu/ApJ/

3 Response by Nancy G. Roman, NASA Goddard Space Flight Center, to the committee's "Inquiry to Interested Parties" (see Appendix D).

Increasing Use of Simulations and Animations as Scientific Data

Related to the trend in electronic publishing is the increasing use of simulations and animations in research.27 Large-scale computation arose in part from a need to simulate reactive hydrodynamic flows. Such problems still help drive the development of increasingly powerful computers. However, during the past three decades, simulations have become integrated elements of the toolboxes of experimentalists and theorists in many of the physical and biological sciences.

The increasing importance of modeling and simulation is evidenced for the materials science field by the recent (1992) inauguration of two new journals devoted exclusively to this topic: Computational Materials Science (Elsevier) and Modeling and Simulation in Materials Science and Engineering (Institute of Physics Publishing). Materials data modeling encompasses two quite different areas: materials R&D (both theoretical and experimental) and data handling and application activities (continuum level design calculations, process modeling, service behavior modeling, and compression, extrapolation, and interpolation of data). Other research areas in which computer simulations have become standard are in the design of optics for electromagnetic radiation and of beams of electrons and ions; flow of fluids; folding of protein molecules; interaction of enzyme molecules with their substrates, the species on which they act; melting and freezing, at the atomic level; the motions of individual atoms during reactive collisions of molecules; and collisions of gaseous atoms and molecules with surfaces.

Many simulations require repeated solution of equations of motion of the system by computer. These equations may be simple or complex, but however simple they are, the ability to solve them over and over, many millions of times, as the system they describe evolves, is a consequence of the power of electronic computers. The simulation results may be reduced to only a few summary numbers, which was the usual practice in the early years of computers. Now it is common for the results to include numerical information about entire time histories, information that can be put into tables and graphs.
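The pattern described above (solve simple equations of motion over and over, then keep either a few summary numbers or the entire time history) can be sketched for the simplest possible system. The example below is an illustration chosen for brevity, not a method drawn from any study cited here: it assumes a one-dimensional harmonic oscillator, uses the standard velocity Verlet integration scheme, and all parameter values and names are arbitrary.

```python
def simulate_oscillator(x0, v0, k=1.0, m=1.0, dt=0.01, steps=10_000):
    """Integrate m*x'' = -k*x with velocity Verlet; return the full time history."""
    x, v = x0, v0
    a = -k * x / m
    history = [(0.0, x, v)]                # (time, position, velocity)
    for i in range(1, steps + 1):
        x += v * dt + 0.5 * a * dt * dt    # advance position
        a_new = -k * x / m                 # force at the new position
        v += 0.5 * (a + a_new) * dt        # advance velocity with averaged force
        a = a_new
        history.append((i * dt, x, v))
    return history


def total_energy(x, v, k=1.0, m=1.0):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * m * v * v + 0.5 * k * x * x


history = simulate_oscillator(x0=1.0, v0=0.0)
energies = [total_energy(x, v) for _, x, v in history]

# Early practice: reduce the run to a few summary numbers.
summary = {
    "max_displacement": max(abs(x) for _, x, _ in history),
    "energy_drift": max(energies) - min(energies),
}

if __name__ == "__main__":
    print(summary)
    # Current practice: keep the whole time history for tables and graphs.
    for t, x, v in history[::2000]:
        print(f"t={t:6.2f}  x={x:+.4f}  v={v:+.4f}")
```

The summary dictionary corresponds to the early practice of reporting a few numbers, while the retained history list is the kind of record that can be tabulated, graphed, or animated.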

Perhaps the most dramatic advance in simulations, however, has been the use of graphics, particularly animations. The information in an animation can give insights into a scientific phenomenon that could not be guessed from individual snapshot images or numerical indicators. Time may serve as a surrogate for a spatial dimension, allowing the investigator to visualize the behavior of a function of three independent variables. Animations are especially useful when an investigator has chosen which indicators to compute on the basis of a preconception and the situation turns out not to correspond to that preconception. For example, one study examined the high degree of solid-like, cooperative, and collective motion of most, but not all, of the atoms in the supposedly liquid surface layer of a cluster of atoms. Instead of an amorphous swarm of atoms swirling on the surface, the outer, "molten" layer of the cluster showed organized, collective (but loose) vibrations by all but a few of the surface atoms. All the quantitative indicators showing liquid-like character arose from a few atoms displaced from the surface so that they were free to float just outside it, in almost-free fashion. The consequence of seeing the animation was the construction of a theoretical model very different from the one the investigators had been planning to use. In short, animations have become real tools of research, not just pedagogic devices.

The data in animations cannot yet be stored or exchanged in conventional journals. However, electronic storage and transmission over the Internet and the World Wide Web make it possible for scientists to share not only their tables and still figures, but also their animations. A few published papers include references to Web addresses that provide animations of material discussed in the papers. Such data are still in a form that is primitive compared with what one can foresee. Now it is possible to play an animation, even a multiwindow display that shows several characteristics evolving simultaneously, and sometimes to stop the motion to study an individual frame. In the future, it will be possible for the viewer to stop an animation and examine the image or images from all sides, perhaps even to carry out manipulative operations on that image that correspond to simulations of physical processes. The capacity for such data manipulations will advance as the bandwidth available for data transmission increases, as data storage becomes cheaper and faster, and as the software for generating, storing, and displaying animations and more elaborate time histories becomes more user-friendly. It is already inevitable that scientists will generate image sets for molecular phenomena analogous to the computer-based "tours" of towns and cities that allow the viewer to choose any path through the area. For example, it is possible for a pharmacological researcher to "fly" a molecule of a potential new drug to a conjectured target receptor site, say in the brain, to judge whether it is a topological fit.

The capability to share images from animations has a number of advantages. However, images require more storage space, and the user must have considerably larger bandwidth capabilities to allow electronic transmission of the animations than typically are required for exchange of numerical or symbolic data.

DISCIPLINE-SPECIFIC DATA ISSUES

Over the past two decades, the National Research Council and other groups have issued numerous reports that have addressed scientific management issues for digital observational data in the Earth and space sciences.28 More recently, several studies have examined such issues in the biological sciences.29 Most of these reports have focused quite narrowly on the data management problems of specific disciplines or agencies; however, many of their recommendations have broader validity and may be applied to other disciplines and institutions in the observational sciences in the international context.

As noted above in this chapter, the very large scale environmental observational research programs in the Earth and biological sciences pose the greatest data management challenges and the most difficult public policy issues. This is so not only because of the complexity of the scientific questions in those disciplines and their data-intensive nature, but also because of their inseparable linkages to major socioeconomic issues, the potential for private-sector exploitation of the data and related research results, and their relevance to other major government concerns such as national security, trade, and foreign policy objectives. Although the focus of this study is on issues in the transnational flow of scientific data for use in basic research and not on the role of scientific data in these other, much broader contexts, it is at the interface of scientific use of data and their broader potential applications that the most vexing public policy issues arise, a topic addressed in some of the discussion below in this report. The sections immediately following, however, focus on some of the more specific discipline-related challenges and opportunities in providing broad international access to scientific data.

Observational Environmental Sciences

Measuring and Monitoring Systems

The data now available on a global basis are inadequate to document and understand many environmental and health problems, or to anticipate problems that may arise because of the increasing influence of human activity. What is required is a comprehensive and long-term effort to observe, understand, assess, and predict the global environment—a World Environmental Watch.

Significant progress has been made toward this end, beginning with a series of National Research Council studies in the 1980s that outlined the rationale and data requirements for a new branch of scientific inquiry called Earth system science.30 This in turn led to the formation of ambitious international global research programs, such as the International Geosphere-Biosphere Programme and the U.S. Global Change Research Program. A set of international complementary observing systems has been proposed, elements of which are in various stages of deployment and development. Included in this set are the Global Climate Observing System (which includes the World Weather Watch and the Global Atmospheric Watch), the Global Ocean Observing System, and the Global Terrestrial Observing System.31 These internationally coordinated efforts will integrate observations from multiple satellites and airborne and in situ sensors deployed worldwide.

The Global Ocean Observing System and the Global Terrestrial Observing System are in earlier stages of development than the World Weather Watch and Global Atmospheric Watch and the associated Global Climate Observing System (see Appendix C for a description of the World Weather Watch). However, for Earth's land surface, substantial observing systems exist within many countries, and for the oceans, observing systems have been developed by countries to a considerable extent for the coastal oceans and to a limited degree for the open ocean through international collaboration.

One of the oldest international observing systems (over 100 years old) is the volunteer ship observing program coordinated by the WMO and the Intergovernmental Oceanographic Commission. Through this program involving thousands of ships from many countries, weather and sea surface temperature observations have been available to all countries in real time. It also has provided data on the climatology of the ocean area for many decades and still does in conjunction with meteorological satellites.

More recently, an extensive observational network for measuring the upper ocean was put in place in the western Pacific Ocean as part of the Tropical Ocean Global Atmosphere Program. The ocean data from this network and the data from the World Weather Watch Global Observing System provide the basis for the development of atmosphere-ocean coupled models, which have formed the foundation for experimental forecasts in seasonal to interannual predictions. Extensive plans also have been formulated for the components of the Global Ocean Observing System to support the study of climate more generally.

The Global Terrestrial Observing System has several major components for the land surface, surface and ground water, and seismology. Most nations have developed hydrologic observational networks for both the surface and the subsurface water. River stage observations are taken in most countries, both for flood forecasting and for water resource management. The further development of these observational systems is essential if the nations of the world are to cope with the wide range of environmental changes that are occurring and can be envisaged.

In the biological environmental sciences, monitoring systems are much less fully developed than in the Earth sciences. Carefully planned and coordinated global monitoring systems for new and emerging diseases and ecological monitoring and biodiversity surveys are needed. An epidemiological system now in place determines which strains of influenza virus are emerging each year. The composition of each year's vaccine depends on effective monitoring and early warning. Recent outbreaks of Ebola virus in Africa indicate the need for more monitoring information that combines epidemiological and ecological data.

An example of a lack of ecological monitoring comes from consideration of the world's island ecosystems. We know a great deal about animals and plants in special habitats such as the Galapagos Islands, but essentially nothing about their microbiota. We do not know the similarities and differences in microbial ecology between the Galapagos Islands and, for example, the Cape Verde Islands, despite their geological similarities.

Given the nature of the regional and global problems and the interdisciplinary nature of the environmental and health sciences, research on a specific problem often requires the use of data from several observing systems. Therefore, important requirements are observational consistency in space and time, with accurate georeferencing to the maximum degree possible; thorough documentation of data attributes; and substantial institutional commitments to the long-term continuity of key observational programs.

Quality Control and Assurance

Quality control operates smoothly and almost transparently in those sciences in which experiments are readily reproducible or lead to subsequent experiments that validate the original ones. In the observational sciences, implementing effective quality control for data requires the use of an audit trail system that includes anomaly detection, reporting, and correction, as well as the rigorous refereeing of manuscripts for publication.
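An audit trail of the kind described above, covering anomaly detection, reporting, and correction, might be sketched as follows. This is an illustration under stated assumptions, not the procedure of any particular data center: the class name, the median-based outlier rule, the threshold, and the sample readings are all hypothetical.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class AuditedSeries:
    """A list of observations plus a log of every flag and correction."""
    values: list
    audit_log: list = field(default_factory=list)

    def detect_anomalies(self, threshold=5.0):
        """Flag values far from the median, in units of the median absolute deviation.

        The rule and the default threshold are illustrative assumptions.
        """
        med = median(self.values)
        mad = median(abs(v - med) for v in self.values)
        flagged = [
            i for i, v in enumerate(self.values)
            if mad > 0 and abs(v - med) / mad > threshold
        ]
        for i in flagged:
            # Reporting step: record what was flagged and its value.
            self.audit_log.append(
                {"action": "flagged", "index": i, "value": self.values[i]}
            )
        return flagged

    def correct(self, index, new_value, reason):
        """Replace a value, keeping the old value and the reason in the log."""
        self.audit_log.append(
            {"action": "corrected", "index": index,
             "old": self.values[index], "new": new_value, "reason": reason}
        )
        self.values[index] = new_value


# Hypothetical temperature readings with one suspect entry.
series = AuditedSeries([14.1, 14.3, 13.9, 41.2, 14.0, 14.2])
flagged = series.detect_anomalies()
for i in flagged:
    series.correct(i, 14.2, "suspected digit transposition (hypothetical)")
```

Because the original value and the reason for each change stay in the log, a later user can see what was altered and can reverse a correction that proves unjustified.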

Quality assurance, the mechanism used by management to assure that the quality of work is as claimed by those doing it, typically plays a far smaller role in basic science than in applied science and especially in manufacturing. However, one can interpret any mechanism to assure scientific integrity as a kind of quality assurance procedure. This concept would thus include the mechanisms to detect and investigate scientific fraud. Such "quality assurance" efforts are carried out in universities and at the National Institutes of Health, for example.

In recent years, some organizations, such as the Carbon Dioxide Information Analysis Center (CDIAC) at Oak Ridge National Laboratory, have devoted significant efforts to producing high-quality global Earth science data sets whose accuracy and reliability have been determined, accompanied by the descriptive (metadata) documentation needed for their use. The CDIAC has quality-assured and documented several key global change databases on such diverse topics as concentrations of carbon dioxide and other greenhouse gases in the atmosphere, carbon fluxes from the terrestrial biosphere to the atmosphere resulting from changes in land use, carbon chemistry in the oceans, and long-term climate trends in the United States.32 These value-added data sets are certified as valid by the primary users who collected the data, or by those who subsequently carried out the quality-control checks of the data. This is a somewhat costly, but successful, approach for assuring secondary users of the quality of relevant data sets.33

Preservation of Historical Data Sets

The trend toward bigger, more complex, and more expensive facilities and programs in the observational sciences, and toward attendant international collaboration, has brought about greater attention to and incentives for effectively archiving data. It also has encouraged the development and maintenance of a curatorial infrastructure necessary to manage the data better, to provide more uniform processing and documentation, and to make retrospective data more easily accessible and usable.34

Research using archived data has grown in scope and importance, especially in enabling the comparison of observations taken at different wavelengths and at different sites. In the space sciences, there are efforts to coordinate data catalogs and indices, facilitating discovery of what data are available (e.g., NASA's Astrophysics Data System and its extragalactic database, SIMBAD). Space astronomy in the United States has been at the forefront of archiving and distributing data electronically. Archives are an integral part of all U.S. space astronomy missions and are now typically being planned for ground-based observatories and for other countries' space missions as well. The archiving technologies are openly available and shared, and data sets from new and different observations are incorporated increasingly in existing archives. Comprehensive catalogs and good user-access tools are recognized as very important, as are properly maintained and preserved duplicate data sets. Some major archives are also duplicated at different sites to reduce communication loads and to promote innovation and allow different uses.

In the Earth sciences, the study of Earth processes involves time-dependent behavior over time scales ranging from seconds to millions of years. For relatively short time scales (years or less), observational data from a common observing platform may be available from a single database. For longer time scales (decades to centuries to many thousands of years), it is necessary to scrutinize all of the retrospective observations available and to use proxy data preserved in the geologic record or in written records. Research on global change and on natural hazards, for example, whose goal is improved prediction of future conditions or events, depends heavily on accurately reconstructing the record of the past. Box 3.2 provides several examples of interesting data reconstruction projects in China.

Because there has been an increasing awareness of the great value of retrospective Earth science data, some conscientious efforts to rescue and preserve older data are being made both nationally and internationally. In the United States, for example, the National Climatic Data Center and the National Geophysical Data Center have devoted time and resources to data rescue in recent years and now have a policy of transferring all their digital data holdings to new storage media at least once every 10 years.

Many specialized observational databases requiring long-term retention exist in biology as well. These include such diverse subjects as agricultural records of many types, including experimental field tests going back to the last century; museum, zoo, herbaria, and microbial culture collection records; hospital and other medical records; ecological data; breeding histories of domestic animals and plants; macromolecular sequences and their accompanying annotations; taxonomic treatments; toxicological information; folk medicine; and characterization of biological products such as food, fiber, and fine and bulk chemicals. Some of these data are in computers, but require normalization for consistency and readability. Others could be transformed into machine-readable form to make them generally accessible. These two tasks are labor intensive and require a large component of highly skilled labor. Such undertakings require careful prior evaluation for potential worth. Selective evaluation and support are necessary not only to enhance the intellectual effort to maintain existing databases, but also to enable the creation of new ones.

BOX 3.2 Examples of Digitizing Historical Environmental Data

In the People's Republic of China, the following efforts have led to the digitizing of these historical records, among others:

  • A joint study between the Chinese Academy of Sciences and the U.S. Department of Energy used proxy records for grain harvests to extend the annual rainfall data for Beijing back to 1260 A.D.1
  • Computer analysis of ancient Chinese sunrise eclipse records showed that the length of the day was 70 milliseconds shorter in 1876 B.C. than now.2
  • Information on medicinal plants from Chinese pharmacopoeia from thousands of years ago, as well as other folk medicine records, are being computerized in an effort to identify active ingredients that might form the basis for modern drugs.3

1 F.A. Koomanoff, Ye Duzheng, Zhao Jianping, M.R. Riches, W-C. Wang, and Tao Shivan (1988), "The United States Department of Energy and the People's Republic of China Academy of Sciences Joint Research on the Greenhouse Effect," Bull. Am. Meteorol. Soc., 69:1301.

2 K.D. Pang, K. Yau, H.H. Chou, and R. Wolff (1988), "Computer Analysis of Some Ancient Chinese Sunrise Eclipse Records to Determine the Earth's Rotation Rate," Vistas in Astronomy, 31:833-847.

3 Senliang Li (1990), "Data Acquisition of the Chinese Medicinal Plant Database," presentation at the CODATA International Conference, Columbus, Ohio.

Once the primary data are analyzed and used to publish research results, the authoring scientist may be reluctant or inattentive about placing the unpublished primary data in a publicly accessible database or archive. Rather, the scientist is likely to concentrate on creating new data to be interpreted and summarized for additional printed publications. No incentives exist in most biological disciplines to encourage the contribution of primary data to databases. The few exceptions mostly involve data on biological macromolecules such as proteins, DNA, RNA, and complex carbohydrates. No crediting mechanism, however, adds to professional standing in the same way as a printed publication. The result is the loss of a great deal of useful data.

The long-term retention of biological databases also is being funded and managed in a haphazard, uncoordinated fashion throughout the world. This chaotic situation is unnecessary. The worldwide cooperation in establishing DNA and RNA genetic sequence databanks pointed out in Appendix C demonstrates what can be done. The world's information science community, together with the world's biologists, now has the combined skills and much of the infrastructure to preserve and to make basic biological information resources broadly available.35 The scientific base and technology exist to produce much needed information structures that are the biological equivalent of the global weather monitoring system. The critical biological problems the world's population faces in such areas as medicine, agriculture, sustainable ecology, food production, and water quality know no political boundaries and require effective information transfer and access to archival or baseline data.

The fact that observational data are unique and not reproducible leads to the conclusion that they should be preserved as part of the historical record of the dynamic behavior of Earth and its inhabitants. The intrinsic, long-term value of observational scientific data was emphasized and discussed in detail in a recent National Research Council (NRC) study.36 A major point made there was that such data are an invaluable national (and by inference international) resource that should be preserved and utilized to advance the state of knowledge about our natural environment.

If this view becomes widely accepted by national and international organizations and governmental bodies, the traditional practices of allowing older data to become increasingly inaccessible or destroying them will be supplanted by policies and procedures to preserve retrospective data in accessible and usable forms. It should be noted that the data volumes of all previously collected data in a given area of the observational sciences typically are modest or insignificant in comparison with the volumes that the current data collection systems produce; if there were a policy of preserving older data indefinitely into the future, all prior data would be transferred to new storage media in compatible formats as new storage and retrieval technology is adopted.

Despite the ability of data storage and computational technology to keep pace with the data volumes being generated, there may be instances in which sufficient resources are not available to preserve all the useful data from a research program or a science agency. In the unlikely event that a suitable long-term repository cannot be found, the decision regarding what data to purge should be made by representatives of the primary discipline and other major user groups.

Data Integration for Interdisciplinary Research

The improvements in technological capabilities have led to new opportunities to address important scientific problems that earlier were either obscure or considered intractable. Although a considerable amount of research in the observational sciences continues to be done by individual investigators in specialty areas, there are now many more multidisciplinary, multiinvestigator studies involving complex natural processes in space and time. Attendant to this evolution is the need to access diverse types of complementary observations made by many different scientists and organizations around the world.

The large international initiatives in global change research mentioned earlier provide good examples of programs in which success is critically dependent on the transnational flow of scientific data and information. The observational data necessary to obtain meaningful results in most areas of investigation include in situ (point or local) measurements, regional observations using various observing platforms (e.g., balloons, remotely piloted vehicles, airplanes), and satellite remote sensing.

A major Earth science initiative involving significant transnational flow of scientific data is the International Decade of Natural Disaster Reduction.37 The focus of this initiative is on understanding the dynamic processes that cause major natural disasters (e.g., earthquakes, volcanic eruptions, tsunamis, floods, hurricanes, tornadoes) and mitigating their effects through enhanced prediction capabilities and precautionary safety measures. All of these phenomena occur globally, and their study involves comparisons of many different types of observations. Commonly, cooperative studies are carried out by scientists from several countries and through agreements among countries or institutions. International sharing of a wide variety of observational data, including demographic data documenting human and economic effects, is essential for improving our knowledge of these natural hazards.

A particularly challenging problem is accessing and merging relatively sparse, lower-resolution, retrospective observations with the higher-resolution current observational data to document changes occurring in the environment. A recent NRC report, Finding the Forest in the Trees: The Challenge of Combining Diverse Environmental Data, provided possible solutions for integrating multiple environmental data sets at different spatial and temporal scales. This report considered in detail six case studies to elucidate the problems of interfacing diverse types of geophysical and ecological data to address important environmental problems in a global context. The lessons learned from these case studies provided the basis for a set of recommendations to overcome barriers deriving from the data themselves, from users' needs, from organizational interactions, and from system considerations. The committee endorses those recommendations and incorporates them here by reference.38

The interfacing of several different environmental data sets in a single research project can be difficult because the data layers are not registered to a universal template and therefore do not "stack" perfectly. A misalignment of only a few tenths of a degree in longitude or latitude creates major problems and leads to misinterpretations. For example, regions of the world with complex coastlines are very difficult to study when co-registration of data layers is not perfect.
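To give a sense of scale for such misalignments, the short sketch below converts an angular offset into an approximate ground distance. It assumes a spherical Earth with a mean radius of 6,371 km, which is a simplification; the function name and arguments are hypothetical. On that assumption, one tenth of a degree of latitude is roughly 11 km on the ground, and the same offset in longitude shrinks with the cosine of the latitude.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def offset_km(delta_deg, latitude_deg=0.0, along="latitude"):
    """Approximate ground distance for an angular misregistration.

    delta_deg    -- size of the misalignment in degrees
    latitude_deg -- latitude at which a longitude offset is evaluated
    along        -- "latitude" or "longitude"
    """
    km_per_degree = math.pi * EARTH_RADIUS_KM / 180.0
    if along == "longitude":
        # Meridians converge toward the poles, so a degree of longitude
        # spans less ground at higher latitudes.
        km_per_degree *= math.cos(math.radians(latitude_deg))
    return abs(delta_deg) * km_per_degree


north_south_error = offset_km(0.1)                      # 0.1 degree of latitude
east_west_error = offset_km(0.1, 60.0, "longitude")     # 0.1 degree of longitude at 60 N
```

An offset of this size can exceed the width of many bays, estuaries, and islands, which is consistent with the difficulty noted above for regions with complex coastlines.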

An important tool is Geographic Information System (GIS) software, which enables all types of data to be correlated or compared geographically. This capability greatly facilitates multidisciplinary research that involves many different types of observational data and disparate scales of sampling ranging from point measurements to repeated, synoptic, high-resolution satellite imagery. Until recently, most GIS software was geared toward spatial data, but now the time dimension is being incorporated more formally through four-dimensional assimilation techniques. GIS is being used routinely both for fundamental scientific research and for applications in the Earth sciences, although many problems remain.39 Box 3.3 summarizes some of the barriers encountered in an international environmental assessment project using GIS.

BOX 3.3 Barriers Encountered in International Environmental Assessments Using Geographic Information Systems

"I have been working for five years on an EPA-supported project developing geographic data and geographic information systems for environmental assessment on the Mexico-U.S. border. This effort has required the acquisition, verification and enhancement of various types of geographic, earth science and demographic data from Mexico, the US and several international agencies.

"During this binational project, I have encountered several major issues associated with geographic data. At the conceptual level, these issues involve the cognitive representation of the landscape used to capture the data and to represent it in the digital domain. Additionally, standardization of data acquisition methods, geographic scale, resolution, spatial accuracy, feature (attribute) definition and metadata presentation are some of the technical issues that lead to a lack of comparability and impair maximum utility in a binational setting. I have found that the definition and resolution of the conceptual and the technical issues are influenced greatly by the differences in culture (scientific, political, philosophical), language, economic development, etc. Seldom is the reality of the earth's surface represented identically in the mental maps of two individuals. The translation of this perceived reality to digital maps distorts the variation even more. Unlike other types of data, the wide variety of distinct applications for individual geographic and earth science data sets compounds the potential for misunderstanding and misuse by various user groups."

SOURCE: George F. Hepner, University of Utah, personal communication, January 18, 1996.

Documentation to Support Secondary Users of Observational Data

As discussed above, research in the observational sciences increasingly involves the integration of multiple, diverse data sets, most or all of which were not collected by the end users. The primary researchers who collect the data often do not make the effort to include the documentation that secondary users need. These secondary users,40 who frequently are less knowledgeable or technically sophisticated, must have sufficient information about the data (i.e., metadata) in order to avoid possible misuse and misinterpretation. Of course, if the data are used improperly, incorrect results and interpretations are likely to result, and these, in turn, may be propagated through the scientific community of secondary users, thereby spawning still more erroneous interpretations.

Therefore, a key component of effective international and interdisciplinary use of scientific data is the associated metadata that describe where and how the data were collected, the calibrations that convert raw data into physical units, corrections that have been made, the quality and reliability of the data, the data format(s), and any other information or caveats concerning the proper use of the data.41
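The metadata elements listed above could be captured in a simple record and checked for completeness before a data set is released to secondary users. The sketch below is one possible arrangement, assuming free-text fields; the class, field names, and sample values are hypothetical and do not correspond to any established metadata standard.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    """One free-text entry per metadata element named in the text (names are assumptions)."""
    collection_site: str = ""      # where the data were collected
    collection_method: str = ""    # how the data were collected
    calibration: str = ""          # conversion of raw data to physical units
    corrections: str = ""          # corrections that have been made
    quality_notes: str = ""        # quality and reliability of the data
    data_format: str = ""          # data format(s)
    usage_caveats: str = ""        # other information or caveats on proper use


def missing_fields(meta):
    """Return the names of metadata elements left blank."""
    return [f.name for f in fields(meta) if not getattr(meta, f.name).strip()]


# Hypothetical record in which the collector documented only part of the story.
record = DatasetMetadata(
    collection_site="hypothetical coastal station",
    collection_method="hourly thermistor readings",
    calibration="raw counts converted to degrees Celsius",
    data_format="CSV",
)
gaps = missing_fields(record)
```

A check of this kind does not judge whether the documentation is adequate; it only makes omissions visible at the point where the primary researcher can still fill them in.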

For example, research on global change is in large part being carried out by secondary users. Past observations of the temperature at Earth's surface, gathered at many locations globally for weather, agriculture, or environmental studies, provide a long-term record that is being investigated to determine whether global warming is taking place. In other cases, one type of observation may be the proxy for another parameter that has not been adequately observed (e.g., cloud cover as a proxy for precipitation). A remote sensing example is the original Landsat, which was designed for looking at vegetation, crops, and other agricultural purposes but has been extensively used for geological studies; likewise, the Seasat synthetic aperture radar imagery was taken to study the sea surface, but it has provided a new type of observational data to study geological features on land, particularly faults and other geological boundaries, surface textures, and soil moisture. Past climates are being inferred from paleontological data on fossil spores and pollens originally collected for biological studies of limited local areas.

The research community does not now have a coordinated effort to index these and other extant data sets and to distribute and update this index in the form of an electronic directory. Not knowing what data are available is clearly a barrier to international research in global change science. Efforts are currently under way to address this issue. For example, the International Geosphere-Biosphere Programme's Data and Information System has proposed identifying data sets at three levels: (1) by directories, identifying the existence of data sets; (2) by guides, containing information about their quality and other characteristics; and (3) by inventories, specifying the individual items that are present.42
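The three levels of identification can be sketched as one catalog entry that is filled in progressively. This is an illustrative assumption about how such an index might be represented, not a description of the Data and Information System itself; all names and sample entries below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    """One data set described at up to three levels of detail."""
    name: str                                               # directory level: the data set exists
    guide: Dict[str, str] = field(default_factory=dict)     # guide level: quality and other characteristics
    inventory: List[str] = field(default_factory=list)      # inventory level: individual items present


def directory(catalog: List[CatalogEntry]) -> List[str]:
    """Directory view: just the names of the data sets known to exist."""
    return sorted(entry.name for entry in catalog)


def deepest_level(entry: CatalogEntry) -> str:
    """Report the most detailed level at which an entry is documented."""
    if entry.inventory:
        return "inventory"
    if entry.guide:
        return "guide"
    return "directory"


# Made-up entries for illustration.
catalog = [
    CatalogEntry("surface temperature records"),
    CatalogEntry("fossil pollen counts", guide={"quality": "local coverage only"}),
    CatalogEntry(
        "radar imagery",
        guide={"quality": "calibrated"},
        inventory=["scene-001", "scene-002"],
    ),
]

for entry in catalog:
    print(entry.name, "->", deepest_level(entry))
```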

Declassification of Environmental Data at the End of the Cold War

Observational data collected for military or espionage purposes are necessarily kept secret for some prescribed period of time, at least until the documented events, or the inherent evidence of the data collection techniques and technological capabilities themselves, can no longer compromise national security. Some of these data sets contain valuable historical data, particularly observations of certain locations or phenomena that are collected on a consistent, repetitive basis for many years, or even decades.

In recent years, both in the United States and especially in the countries of the former Soviet Union, the end of the Cold War has led to the release into the public domain of many of these data. An example is seismic data gathered for the purpose of underground nuclear test monitoring. Until the end of the Cold War, no regional recordings of Soviet nuclear explosions were available, nor were Soviet recordings of U.S. or other countries' test explosions. Now there is access to large amounts of these data by scientists outside the former Soviet Union. Other types of Soviet Earth science data, such as gravity and magnetics observations and Arctic oceanographic observations, also have been made available to the scientific community.43 Likewise, U.S. data from some previously classified observational programs, including reconnaissance satellites44 and undersea sensors,45 have been made publicly available. The international availability of useful Earth science data has increased significantly with these data declassifications, and the committee encourages all governments to undertake similar reviews of classified retrospective data sets.

Improving International Access to Scientific Data in the Observational Environmental Sciences

A striking example of the benefits of extensive data collection and research for international management of environmental problems is evident in agreement by the nations of the world on a clear strategy for mitigating depletion of ozone in the stratosphere.46 Not only was agreement reached in a limited period of time, but substitute substances and technologies also have been developed rapidly without a large economic impact on society.

Unfortunately, not all of the many global environmental and health problems can be confronted in the same way as was done for stratospheric ozone. In fact, many of the underlying research issues are extremely complex and interrelated. In the case of reducing the uncertainties regarding the much publicized global warming trend, extensive geophysical and biological data on the atmosphere, ocean, land surface, and cryosphere will be required on a global basis for long periods of time. The role of the ocean in the global carbon cycles and in the energetics of the atmosphere, the impacts of deforestation and desertification, the full implication of the radiatively active gases, and a host of interrelated natural processes need to be understood. Such understanding can be gained only by acquiring and analyzing comprehensive data sets on a global basis, with the active involvement of most, if not all, nations, and with the best efforts of the world's scientific community. But we do know from the stratospheric ozone problem that international agreement can be reached when adequate data and understanding of the problem are available to policymakers throughout the world. It is therefore essential that environmental and health data and information capable of describing our global atmosphere, ocean, and terrestrial system be fully and openly available. Moreover, making such basic data broadly available is fundamental to ascertaining the veracity and validity of the scientific process and of the resulting conclusions. If the data supporting the conclusions are not readily available to others for independent analysis, then the confidence in the research process and results will be undermined.

This is not to say, however, that all data must be made widely available as soon as they are generated. Indeed, an important reason for some period of delay is to ascertain the accuracy and integrity of the data and to prepare them for broader use, as discussed in the previous sections. The difficulties inherent in the collection and proper documentation of data by field researchers, or in the processing and organizing of large and complex data sets, can make a delay in the release of those data not only justified, but prudent. In addition, it is customary in many cases for a funding agency to provide the principal investigator or originating research group with the right to withhold public release of their data for a prescribed period of proprietary use, not only to adequately prepare the data for broader dissemination and use, but also to give the principal investigator an opportunity to analyze the data and to publish the first results. At the same time, there may be legitimate countervailing public policy reasons for early or even immediate availability of data, for example, data collected in publicly funded government programs such as meteorological satellite systems, which have both immediate operational and longer-term research applications.

While the availability of scientific data as soon as is reasonably possible should be the presumption, a single, uniform time period for the release of all data is neither sensible nor desirable. What is important is that the funding agency, together with the community of scientists, make a thorough evaluation of the competing interests guiding the release of its data prior to the initiation of every major data collection program, establish the terms and conditions of data availability in consultation with the principal research and data user communities, and subsequently enforce compliance. The 1996 NASA Science Policy Guide provides a good example of a data availability policy for nonproprietary scientific data obtained through public funds (see Box 3.4).

From a broader policy standpoint, the committee believes that the U.S. data management policy established for the U.S. Global Change Research Program in 1991 (commonly referred to as "the Bromley Principles" in reference to D. Allan Bromley, the President's Advisor for Science and Technology at that time) provides an excellent model for all major aspects of data availability and access. Box 3.5 contains the main points of that policy. The committee adapts the definition of "full and open exchange of data," subsequently developed by the NRC's Committee on Geophysical and Environmental Data,47 to mean that data and information are made available with as few restrictions as possible, on a nondiscriminatory basis, for no more than the cost of reproduction and distribution.

Unfortunately, the international exchange of data between research groups, government agencies, and scientific data centers, including the World Data Centers, is rapidly becoming more complicated, just at a time when full and open exchange is most needed to make progress on major global environmental problems. A growing number of government data centers outside the United States

BOX 3.4 NASA's Data Availability Policy

Ready access to data from NASA research programs and missions (via modern data archiving and communications technologies) by researchers not directly involved in the program increases the return on NASA research investments. It is therefore NASA policy that nonproprietary scientific data obtained from NASA programs and missions will be made publicly available in usable form as quickly as possible. (Nonproprietary data are data that may be distributed without violating patent, trade secret, or copyright laws or NASA's ability to obtain and protect U.S. government intellectual property rights.) Such data constitute a national resource that can be used by scientists, policymakers, and the public throughout the country to undertake new scientific studies, permit wider assessment of the validity of the results and conclusions from NASA missions, and facilitate broader public understanding of the value of NASA programs and missions.

The issue of data rights is a complex one that involves consideration of a wide range of competing factors including:

  • the right of public access to data which has been obtained at public expense;
  • the need to protect the original ideas which form the basis for competitively selected research (there is a strong tradition and body of law in the United States concerning the protection of intellectual property rights);
  • the principle of fairness to investigators to allow them to pursue original ideas and hypotheses and to carry out the scientific investigations for which they were selected;
  • the need to avoid the premature release of misleading results;
  • the need to verify data prior to public release;
  • the need for early release of data when such early release is critical to national needs or required for overarching public policy reasons;
  • the need for early release of data for educational and public information purposes; and
  • the need to protect data which may have a proprietary commercial application which may confer a competitive advantage, particularly to U.S. industries competing in the international marketplace.

A wide variety of approaches to data rights have been used:

  • Virtually immediate release for scientific and public information purposes.
    • Shoemaker-Levy 9 observations
  • Release of data as soon as practical when data are considered reliable for general use.
    • Earth Observing System
  • Restricted use of data during a limited calibration and verification period, after which verified data are deposited in a public archive.
    • Magellan
  • Restricted use of data for a limited period of time (typically one year), after which verified data are deposited into an archive for general use.
    • Most past solar system exploration missions
    • Compton Gamma Ray Observatory
    • Hubble Space Telescope
    • Life and microgravity sciences research
    • In-space technology experiments (2 years)
  • Restricted use of data for an extended period to carry out investigations requiring the acquisition of data over a long interval of time; data are eventually archived.
    • Cosmic Background Explorer
  • Stringent restrictions placed on access to data on the basis of Privacy Act or other considerations.
    • Human research data
    • Proprietary data obtained through the use of NASA facilities such as wind tunnel test results
    • Restrictions that result from data purchase agreements
It is NASA policy that all nonproprietary scientific mission data be made publicly available after the shortest reasonable time in forms which permit a wide range of users to derive scientific, technical, and other benefits. However, it appears that neither NASA nor the research community would be well served by the rigid adoption of a single uniform policy on the distribution and dissemination of data. Rather, the policy should be established for each mission or research program on a case-by-case basis. Well-understood and widely circulated criteria for making such determinations must be established. The approach to be taken for each program or mission will be spelled out in Announcements of Opportunity, Research Announcements, or other competitive mechanisms so that prospective participants understand the conditions of participation. Mechanisms may also have to be developed to assess adherence to NASA policies concerning the general availability of data. If a change is necessary in a previously agreed-to approach to data rights, NASA will consult the investigators affected to develop a mutually agreeable plan that meets the spirit of the principles set forth here.

    SOURCE: Reprinted from NASA Science Policy Guide (1996), available on-line at <http://dlt.gsfc.nasa.gov/cordova/guide.html>.

    charge high prices for data and impose various dissemination and use restrictions. Ad hoc bilateral agreements between data centers and government science agencies are becoming commonplace. These agreements take on many forms and can lead to a situation in which individual scientists will no longer be able to obtain data for their projects without a major effort or large expense. Additional legal constraints on access to scientific data of all types are currently being implemented or considered, as discussed in depth in Chapter 5. The committee thus recommends that internationally, in both intergovernmental and nongovernmental organizations, the full and open exchange of scientific data from publicly funded research be adopted as a fundamental principle.

    BOX 3.5 The "Bromley Principles" Regarding Full and Open Access to "Global Change" Data

    The overall purpose of the policy statements is to facilitate full and open access to quality data for global change research. They were prepared in consonance with the goal of the U.S. Global Change Research Program and represent the U.S. government's position on access to global change research data.

    • The Global Change Research Program requires an early and continuing commitment to the establishment, maintenance, validation, description, accessibility, and distribution of high-quality, long-term data sets.
    • Full and open sharing of the full suite of global data sets for all global change researchers is a fundamental objective.
    • Preservation of all data needed for long-term global change research is required. For each and every global change data parameter, there should be at least one explicitly designated archive. Procedures and criteria for setting priorities for data acquisition, retention, and purging should be developed by participating agencies, both nationally and internationally. A clearinghouse process should be established to prevent the purging and loss of important data sets.
    • Data archives must include easily accessible information about the data holdings, including quality assessments, supporting ancillary information, and guidance and aids for locating and obtaining the data.
    • National and international standards should be used to the greatest extent possible for media and for processing and communication of global data sets.
    • Data should be provided at the lowest possible cost to global change researchers in the interest of full and open access to data. This cost should, as a first principle, be no more than the marginal cost of filling a specific user request. Agencies should act to streamline administrative arrangements for exchanging data among researchers.
    • For those programs in which selected principal investigators have initial periods of exclusive data use, data should be made openly available as soon as they become widely useful. In each case, the funding agency should explicitly define the duration of any exclusive use period.

    SOURCE: Data Management for Global Change Research Policy Statements, U.S. Global Change Research Program, July 1991.

    The committee believes that such an agreement would significantly improve the ability of researchers to develop an adequate scientific understanding of our natural environment and the human condition, to address major problems facing the world community, and to broaden and enrich the knowledge base of all humanity.

    Given that scientific data in all the disciplines, not just the observational sciences, are widely distributed globally in different archives or databases and that significant changes in data collection, storage, retrieval, and dissemination are steadily taking place, it is clear that some form of distributed data management strategy will be required to assure effective and efficient access by the scientific community. Considering the trends outlined in this chapter, and the imperative for broad and sustained international cooperation in environmental research, the committee concludes that the most viable and effective approach for the transnational flow of scientific data and information is through a system of connected international networks, each of which is the gateway to particular types of information. These data exchange networks, building on the successful models presented in Appendix C, would connect peer institutions for mutually beneficial rewards and collaborations, and provide data access to the research and education communities. The committee recommends the continued evolution of the existing distributed network of data centers as part of the global information infrastructure, with coordinated standards and procedures to provide unrestricted access at zero or low costs to data required for the study of regional and global problems.48 This "network of networks" would provide connectivity to multiple data archives internationally and would serve as a coordinated source for important scientific data and information. Significant savings in research time, effort, and cost, as well as an overall enhancement of results, could be realized by using such a resource.

    Terminology and Nomenclature in Biology and Related Fields

    A significant barrier to sharing of information in the natural sciences is that subfields within disciplines have different languages, jargon, and usage. Without clear means for bridging resulting gaps in understanding, communication can be difficult. Moreover, lack of precision in terms themselves or in their use can lead to fundamental problems in searching for and interpreting data. A biologist, for example, may use the common name of an organism in recording and transmitting data without taking into consideration the limitations of the term or differences in usage. "Mouse" can signify any of a large number of small rodents. Peanuts are not nuts. Hospital records that indicate "atypical E. coli" without including the original observations justifying the label "atypical" do not communicate as much information as may be needed for treatment of a patient or for subsequent studies.

    The complexity of what is being observed can also complicate precise description. Although the largest epidemiological studies describe far fewer events than does a short span of infrared satellite data, for example, the resulting biological database typically will have many more attributes associated with each event than are associated with, say, astronomical events. Likewise, a patient's hospital record often contains hundreds of different kinds of observations generated in the course of a diagnostic workup, and even a routine blood sample can yield data of at least a dozen kinds. Variability in the interpretation of each observation and associative reasoning play key roles in subsequently understanding the information.

    A fundamental problem related to transfer of biological information in particular is the lack of a consultative body to standardize definitions for the words used to describe features of organisms. Dictionaries' coverage often is limited to narrow groups of organisms. There is no universal language of biology. The chemist knows with great precision what the term "sodium chloride" means; determining precise descriptors for aspects of biological entities presents challenges of another magnitude.

    Consider, for instance, the term "spore" as used in a description of microorganisms. Spore-forming microorganisms consistently turn up in microbial biodiversity and bioconservation studies. But the taxonomy of these organisms is complex. Spore formation occurs in bacteria, fungi, protozoa, and algae, and the range of types of spores is also wide. Currently, the definitions and descriptions of spore types are very confusing but must be taken into account. Because of the lack of a comprehensive, authoritative treatment of the description of spores and spore types, many biodiversity studies are either in error or not understandable owing to misidentifications or confusing descriptions. Furthermore, the lack of a consistent vocabulary with respect to spores has legal and regulatory consequences. Intellectual property rights concerning strains of microorganisms require strict definitions and accurate, understandable descriptions of the properties of the strains, especially if a strain or its use is to be patented. Scientists, government regulators, patent officials, and industry lawyers, among others, all require agreed upon definitions and standards for describing the various forms of spores and spore-related anatomical features of microorganisms.

    The difficulties associated with describing and defining spores apply to many other descriptors used in biology. Further, as biologists adapt words from other disciplines and branches of biology, the meanings can drift. Terms developed by botanists to describe the shapes of leaves are also used in describing cell shapes in microorganisms, albeit with subtle or even major changes in meaning. "Obpyriform" signifies a pear-shaped leaf with the stem coming out of the blunt end. Applied to algae, the same term indicates pear-shaped cells with the blunt end in the forward direction in swimming. In addition, although "pear-shaped" assumes as a model the common commercial pear sold in most of the world, Asian "pears" have no obvious narrow end.

    Problems of nomenclature extend to large-scale biological studies as well. For example, the lack of consistent classification schemes for land-cover vegetation and soils hampers international data exchange and can lead to errors of interpretation, especially by non-expert users. Current approaches to classifying land cover include the physiognomic, floristic, and ecosystematic approaches. It is generally agreed that existing land-cover maps cannot provide a globally consistent and up-to-date database for global change studies.49 Satellite remote sensing is the only viable approach to developing a map of vegetation that is useful for global change research, and several satellite remote sensing activities now are addressing data requirements for characterizing land cover.50

    For soils, as for land cover, there is no agreed-upon classification scheme. Soil is a three-dimensional dynamic entity whose properties—physical, chemical, and biological—vary dramatically in time and space. Observations of the causes and effects of these variations provide a valuable historical record of the components and processes that have produced current soil characteristics and conditions. With today's focus on research to understand human-induced changes and rates of change in the Earth system, the challenge to the soil science community is to provide a globally credible, compatible, and usable soil and terrain information system that can be integrated with information about other components of the Earth system. Only with open access to essential terrestrial information can intelligent decisions be derived about the Earth system and what, if any, human intervention is needed to protect it. Box 3.6 provides an overview of some of the nomenclature problems in this complex area.

    The need for improved standardization of terminology, however, is not confined to biological or soil science databases. In any field, use of standardized terminology in a computerized database is vital in structuring the database, in digitizing data captured from printed sources, in accessing the database, and in interchange of data. The inflexibility and inexorable logic of the computer put new emphasis on control of terminology and on value-added system features such as thesauri, synonym files, and expanders for abbreviations and collective terms. Without these aids, valuable information in databases may become inaccessible or incorrect, and incomplete or misleading information may be retrieved.51
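The role of synonym files and abbreviation expanders in retrieval can be illustrated with a small lookup. This is a sketch under stated assumptions: the tables, the record layout, and the function names are hypothetical and far simpler than a real thesaurus system.

```python
from typing import Dict, List

# Hypothetical abbreviation expander: short forms mapped to full terms.
ABBREVIATIONS: Dict[str, str] = {
    "e. coli": "escherichia coli",
}

# Hypothetical synonym file: variant terms mapped to one preferred term.
SYNONYMS: Dict[str, str] = {
    "table salt": "sodium chloride",
    "nacl": "sodium chloride",
}


def normalize(term: str) -> str:
    """Lowercase, collapse whitespace, expand abbreviations, then map synonyms."""
    key = " ".join(term.lower().split())
    key = ABBREVIATIONS.get(key, key)
    return SYNONYMS.get(key, key)


def search(records: List[Dict[str, str]], query: str) -> List[Dict[str, str]]:
    """Return records whose 'term' field matches the query after normalization."""
    wanted = normalize(query)
    return [r for r in records if normalize(r["term"]) == wanted]


# Made-up records entered by different people using different vocabulary.
records = [
    {"id": "1", "term": "NaCl"},
    {"id": "2", "term": "Table salt"},
    {"id": "3", "term": "sodium chloride"},
    {"id": "4", "term": "potassium chloride"},
]

print([r["id"] for r in search(records, "sodium chloride")])
```

Without the two lookup tables, an exact-match search for "sodium chloride" would retrieve only one of the three relevant records, which is the kind of incomplete retrieval the paragraph above warns about.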

    Terminology-related barriers to understanding of data can be addressed only through internationally coordinated actions. Some efforts exist, such as the SOTER project described in Box 3.6. Another notable effort, a workshop to initiate mapping of the correspondence in terms associated with spores across all of microbiology, was supported with modest funding from the United Nations Environment Programme, the Committee on Data for Science and Technology (CODATA), and the U.S. National Science Foundation. The comprehensive Dictionary of the Fungi52 is a widely accepted compendium of terms and their definitions. However, the Systematized Nomenclature of Pathology,53 developed in the 1960s for computer entry of data, has not been widely adopted, either in the United States or in the rest of the world.

    An important activity in the biological area has been CODATA's Commission on Standardized Terminology for Access to Biological Data Banks. The commission's encouragement led to the International Union of Pharmacology's establishment of a body to standardize the nomenclature for receptors for drugs. The International Committee on the Taxonomy of Viruses collaborates with the commission in its efforts to standardize the descriptors for viruses. The commission also is participating with the International Union of Biological Sciences and

    BOX 3.6 Incompatible Soil and Terrain Information Systems

    Today there is a critical need for detailed, universally compatible soil and terrain information for use in such applications as global change modeling, national resource planning and development, and plant breeding. Between 1960 and 1980 the Food and Agriculture Organization (FAO) of the United Nations, working with all nations, generated a Soil Map of the World, at a scale of 1:5,000,000, and published it jointly with UNESCO.1 This major accomplishment provided a basis on which more current and detailed maps suitable for global change modeling might be built. A soil and terrain map at a scale of 1:1,000,000 is critically needed to study and model rates of change related to human activity in terrestrial ecosystems.2

    For the industrialized countries, much more detailed soil maps exist at scales ranging from 1:10,000 to 1:500,000. However, for many of the less developed countries, the FAO-UNESCO Soil Map of the World may be the only soil map available for the whole country, although some countries may have detailed maps for some areas. Because of its lack of detail and the inadequacy of its original data, the use of the Soil Map of the World for national resource planning and development is often questionable. Unfortunately, FAO has no plans to produce a more detailed soil map of the world.

    Since 1950, many agricultural and resource development programs have been implemented throughout the developing world, with technical and financial assistance from the United Nations and governmental and private agencies from industrialized countries. These assistance programs have included a number of localized soil mapping projects. As a result, in many of these countries soils have been mapped according to different soil classification systems. Often no attempt has been made to integrate one system with another. This situation has resulted in great confusion in many countries. The significant differences among classification systems confound the interpretation of these soil maps for national resource management, and make it difficult, if not impossible, to derive a credible national quality assessment of soil and land resources.

    In 1986 the International Society of Soil Science embarked on an ambitious project to develop the World Soils and Terrain (SOTER) digital database at a scale of 1:1,000,000.3 The first task was to develop a universal legend for describing both the cartographic units and the descriptive data for different soil and terrain categories. The second task was to develop a set of procedures that would make it possible to translate and correlate soil and terrain data (cartographic and descriptive) from any soil classification system to the universal SOTER database.4 During 8 years of effort in several countries many improvements have been made in the SOTER procedures, but the number of constraints to progress remains formidable. One of the greatest technical constraints is that there is no universally accepted soil classification system. Three of the major systems in use (i.e., systems that have been applied in relatively large areas) are the Soil Taxonomy System (a U.S. system developed with broad international collaboration), the French system, and the Russian system. These systems have been applied widely in specific projects in countries where the United States, France, and Russia have had collaborative development projects or strong ties. SOTER attempts to address this constraint by providing a universal legend under which any existing soil and terrain map of acceptable quality may be translated and correlated with the SOTER legend and entered into the SOTER database. In this process the data set for any country or classification system can retain its original identity.

    Measurements of physical, chemical, and biological characteristics are essential for quantifying soil quality. One of the technical constraints in the measurement of soil properties is the lack of uniformity or standardization of analytical procedures. Another constraint on the comparison of soil properties from one classification system to another is that, for many soil properties, different class limits are assigned to the ranges of soil parameters used for classification purposes. Examples are the lack of uniformity in the classification of soil texture, that is, the amounts and distribution of different sizes of soil particles, and the absence of a universally accepted definition of the slope classes (none-to-slight, slight, moderate, steep, very steep).
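The effect of differing class limits can be illustrated with a minimal sketch. The two schemes below and all of their numeric limits are hypothetical, invented only for illustration; they are not taken from SOTER or from any national classification system. The sketch assigns the same measured slope to a class under each scheme.

```python
from bisect import bisect_right

SLOPE_LABELS = ["none-to-slight", "slight", "moderate", "steep", "very steep"]

# Hypothetical class limits in percent slope (assumptions, not real standards).
# Each limit is the lower bound of the next class.
SCHEME_A = {"limits": [2, 8, 16, 30], "labels": SLOPE_LABELS}
SCHEME_B = {"limits": [3, 10, 20, 35], "labels": SLOPE_LABELS}


def classify(slope_percent, scheme):
    """Return the label of the class whose range contains slope_percent."""
    index = bisect_right(scheme["limits"], slope_percent)
    return scheme["labels"][index]


if __name__ == "__main__":
    for slope in (2.5, 9.0, 32.0):
        print(slope, classify(slope, SCHEME_A), classify(slope, SCHEME_B))
```

In this sketch a single measurement receives a different label under each scheme, so two maps that record only the label cannot be compared directly. Retaining the measured value alongside the label is one way a data set could be reclassified later under a common legend.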

    Another barrier to access to comparable soil and terrain data sets across international boundaries is the disparity in the conceptual use of quantitative data to delineate mappable soil differences. Some classification systems integrate much more quantitative analytic data into these delineations than do other systems.

    Finally, there are intradisciplinary and interdisciplinary constraints. Within the applied discipline of soil science, there are many "subdisciplines," including soil physics, soil chemistry, soil mineralogy, soil microbiology, soil bioremediation, soil genesis, soil classification and survey, and soil degradation and reclamation. These groups often work in relative isolation from each other and may develop their own jargon, which may hinder access to data within the soil science community itself.

    Efforts have been made in recent years to "connect" soil scientists with specialists in other disciplines, such as crop geneticists. Much more could be done, however, to remove the constraints to better use of soils information by plant breeders in the development of crops more suitable to prevailing soil conditions, such as cultivars of maize or rice that are tolerant of (or resistant to) aluminum toxicity, which is prevalent in acid soils of the tropics. Many other scientists, land use planners, resource managers, civil engineers, environmental engineers, attorneys, resource economists, and others require specific kinds of information about soils and soil properties. Unfortunately, it continues to be very difficult for the non-soil scientist to effectively access and use the data.

    • 1  

      Food and Agriculture Organization (1980), FAO/UNESCO Soil Map of the World, Food and Agriculture Organization of the United Nations, Rome.

    • 2  

      M.F. Baumgardner (1993), "The Critical Need for a World Soils Database for Global Change Modeling" in Proceedings of International Workshop on Soils and Global Modeling, International Geosphere-Biosphere Programme, Stockholm.

    • 3  

      International Society of Soil Science (1988), "World Soils and Terrain Digital Databases at a Scale of 1:1M (SOTER)," project proposal, M.F. Baumgardner (ed.), 18, ISSS, Wageningen, the Netherlands.

    • 4  

      International Soil Reference and Information Center (1993), Global and National Soils and Terrain Digital Databases: Procedures Manual, ISRIC, Wageningen, the Netherlands.

    the International Union of Microbiological Sciences to establish the System 2000 network to assemble a database of scientific names of all biota.

    All these efforts, however, represent a small part of what needs to be done to make data and results more accessible. The mechanisms for establishing standards for words, formats, and storage and retrieval conventions in biological information management need to be improved. To be effective and accepted, this must be done on a truly global basis. In general, standardization of terms in science today is either carried out or validated by the appropriate ICSU body. For example, the CODATA Task Group on Fundamental Constants is the recognized authority in nomenclature for fundamental constants. In biology, the Codes of Nomenclature are promulgated by ICSU components appropriate to the discipline.

    The committee suggests that the CODATA Commission on Standardized Terminology for Access to Biological Data Banks be enhanced into a true consultative body for this purpose. The commission would need funds sufficient to provide effective standard-setting services to the biological community. Expansion of personnel and increased collaboration with other ICSU and outside scientific organizations would be necessary for both functional and political reasons. This should be an ICSU function, coordinated by CODATA, because there is no other established international source of such standard setting in the biological sciences.

    Data Compatibility in the Laboratory Physical Sciences

    The barriers to the international exchange of scientific data in the laboratory sciences generally are not as complex as those in the observational sciences, partly because of the difference in the volumes of data accumulated and used in day-to-day research and partly because of the ways in which the disciplines have evolved. In the physical and the laboratory biological sciences, for example, full compilations of data have always been published in textbooks and in articles in professional journals available throughout the world, whereas the data of the observational sciences in many cases have been accumulated only in government records.

    Barriers to international data exchange in the laboratory sciences concern ease of access to data and the use of those data. Today, the effective exchange of virtually all scientific data requires that they be in electronic format. For manuscripts, exchange can be straightforward if scientists adopt a common word processing language such as TeX or LaTeX, now used worldwide in many communities in the physical sciences. Scientists need to be able both to generate a computer-readable manuscript that can be decoded and read on all computer platforms and to decode and read whatever other scientists may similarly provide. Establishing a common set of tools to ensure such compatibility can be difficult, more so for simulations and animations than for text. Because of the volume of data they involve, simulations and animations need to be compressed for storage and transmission, a problem that should be easy to resolve since they invariably originate in computer format. However, making them accessible to arbitrary platforms requires either considerable sophistication or standardization, or both.

    Converting databases created in hard copy to electronic format can be a costly enterprise, but is nevertheless far cheaper than erecting library buildings. Considerable care is needed to ensure that the original data are not compromised in the process of generating the electronic version.54 In recent years, most data transferred automatically from paper to computer have been captured and stored as images of the printed pages. The alternative is to store the data as text, apart from components that are true images, such as molecular structures; the increasing availability of optical character-reading software is making textual storage practical and economical. The large number of databases in the physical sciences that have been developed by the National Institute of Standards and Technology (NIST) have traditionally been available only in relatively expensive, hard-copy books. NIST plans to provide on-line access to all data it collects and compiles and recently has developed on-line access, with search capabilities, to databases critical to research in chemistry and physics.55 There is a great difference between data stored electronically as images, which cannot be manipulated by the user, and data stored as digitized alphanumeric information, which can be treated as normal text and tables. Data in most modern databases in the physical sciences are generally not static, especially when the databases are stored electronically and hence can be updated as the information improves. Thus there is always a continuing responsibility and expense to maintain and disseminate those databases as they evolve. If the data are stored only as images, such maintenance is difficult and costly. Storage, maintenance, and distribution become vastly easier and more efficient if the information is in the form of a true relational database, in alphanumerics, with user-friendly search capabilities, qualities that require expense and technical sophistication to implement.
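The practical difference between image storage and alphanumeric storage can be sketched briefly. In the minimal example below, the table layout, the column names, and every sample value are hypothetical placeholders, not drawn from any NIST database, and Python's built-in sqlite3 module merely stands in for whatever database system an agency might actually use. The point is that values held as alphanumerics in a relational table can be searched and revised in place, which a scanned page image does not allow.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table of property values (all values are made up)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE property ("
        "substance TEXT, name TEXT, value REAL, unit TEXT, source TEXT)"
    )
    rows = [
        ("substance-A", "melting_point", 100.0, "K", "hypothetical compilation"),
        ("substance-B", "melting_point", 250.0, "K", "hypothetical compilation"),
    ]
    conn.executemany("INSERT INTO property VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def search(conn, name, min_value):
    """Find all substances whose named property is at least min_value."""
    cursor = conn.execute(
        "SELECT substance, value, unit FROM property "
        "WHERE name = ? AND value >= ? ORDER BY value",
        (name, min_value),
    )
    return cursor.fetchall()


def revise(conn, substance, name, new_value, source):
    """Update one stored value when a better evaluation becomes available."""
    conn.execute(
        "UPDATE property SET value = ?, source = ? "
        "WHERE substance = ? AND name = ?",
        (new_value, source, substance, name),
    )
    conn.commit()


conn = build_demo_db()
print(search(conn, "melting_point", 200.0))
revise(conn, "substance-A", "melting_point", 210.0, "hypothetical re-evaluation")
print(search(conn, "melting_point", 200.0))
```

The second search reflects the revised value without any page having to be rescanned, which is the maintenance advantage the paragraph above describes. A production database would also need provenance, uncertainty, and versioning fields that this sketch omits.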

    In some areas of the physical sciences, notably materials science and chemistry, the fragmentation of the data into numerous, autonomous, and often incompatible databases continues to be a considerable barrier to access. Many small data files exist, often maintained by individuals, with a plethora of formats and a range of quality levels. When there are several databases, many means of access to them, and inadequate directories to locate and search them, it is difficult to know what information a particular system of databases includes, how to locate sources for information that they do not cover, and how to assess the quality of the data. The problem is further exacerbated when some data are in journal form, others in hard-copy manuals, and still others in a variety of electronic databases, each of which may be on a different platform, often with limited search capabilities. This is in contrast to situations in areas such as atomic and nuclear physics, in which data have traditionally been compiled and disseminated from a single source, or at least in a standardized format. The dissemination of materials science and chemistry databases remains fragmented, and the broad range of researchers in these fields still need better access to them.

    Most of the data in the databases of the physical sciences are needed to carry on the basic research that the funding agencies support. Just as the funding agencies support the hardware necessary to do research, these agencies also carry a responsibility to support the data components of the infrastructure necessary to conduct research. Also, just as support for basic research needs to be protected because of the likelihood of long time intervals between the conduct of the research and its eventual applications, so should the development and maintenance of databases be protected from short-term fluctuations in budgets or varying needs for the data in industrial applications. The development of databases includes the compilation and evaluation of data from their various sources. Once databases are developed, it is critical that they be maintained and continuously updated as new, relevant data become available. Dissemination should be via a variety of platforms and in user-friendly forms, with cross-referencing to files maintained by other agencies or available via other electronic media.

    The committee believes that science agencies should maintain responsibility either directly or under subcontract for the development, management, retention, and dissemination of electronic databases that are the product of their research programs. Within the United States, the Office of Science and Technology Policy should develop an overall policy for the long-term retention of scientific data, including a contingency plan for protecting those data that may become threatened with the loss of their institutional home.56

    ACCESS TO SCIENTIFIC DATA IN DEVELOPING COUNTRIES

    The international exchange of scientific data has a scope beyond that of the large scientific communities in the technically and economically developed parts of the world. While much of this report reflects the research atmosphere in which its contributors work, it is especially important to address aspects of the subject associated with disparities of wealth and resources among nations, the cultural differences with which nations address their societal problems, and the varying ways nations assign their priorities.

    The differences in priorities are especially marked in the spectrum of ways in which nations, from one end to the other of the scale of development, consider scientific and technical matters. More industrialized and wealthier nations choose to invest discretionary public funds in basic sciences, such as high-energy physics and astronomy, as well as in applied and developmental science and technology. Nations toward the other end of the developmental scale put little emphasis on sciences with long-term public-good payback and put most of the resources they have into applied sciences such as agriculture, aquaculture, medicine, and, recently, biotechnology.

    As a consequence of the availability of discretionary resources, scientists in more industrialized nations traditionally have been able to obtain reliable, up-to-date research equipment, computers, communications infrastructure, and information resources. The scientific communities in developing countries have not had such advantages. In the context of this report, this has meant that the scientists in developed nations have had much better access to data and to the underlying means of communication than their colleagues in other nations, who consequently have not been able to take full advantage of their talents.

    One of the great challenges in the advancement of science that now faces the international community is to use electronic acquisition, management, storage, and distribution of scientific data to reduce the gap between those who have had easy access to the fruits of scientific progress and those who have not. Because of the decreasing costs of electronic technology, compared with the rising costs of traditional means of storing and transmitting scientific data, the opportunity is now opening to make advances in bringing scientists in developing countries much more deeply into the circle of their colleagues in developed countries. There will be problems and outright barriers to confront in the process of reducing this gap, but the situation now offers brighter possibilities than at any time since science became a major, worldwide enterprise.

    In this section, the committee reviews some of the issues in data access in the context of this asymmetrical relationship between the developed and developing world. The constraints to data access within developing countries are considered first. These include both the limited capability to generate new scientific data and the problems facing indigenous scientists who want ready access to data from outside sources. Such limitations lead to underutilization of the talents of those scientists because they cannot easily stay abreast of advances in their fields. The committee then examines the ability of scientists in developed countries to obtain useful data based on work in developing countries.

    Constraints on Data Access Within Developing Countries

    Basic to any consideration of constraints on access to data is the economic situation in developing nations. Economic limitations on access to scientific data are manifest primarily in the inadequacy of communications infrastructure and related research equipment (pointed out in Chapter 2), as well as in insufficient resources for training and education. Another important set of constraints not as deeply affected by the lack of resources is organizational inadequacies.

    In many developing countries today, gaining and maintaining access to international sources of scientific data and literature are very difficult. University libraries and research institutes in these countries cannot afford to subscribe to the major scientific journals, publications that tend to be readily available to scientists in wealthy nations. Databases that are available even at very low rates, such as the marginal cost of reproduction, can be prohibitively expensive. Contact and sharing by scientists in non-industrial nations with scientific colleagues in other countries can be extremely limited. Although there have been some notable efforts on the part of organizations such as the American Association for the Advancement of Science and UNESCO to provide scientists in developing countries with printed copies of scientific data and information,57 much more could and should be done to improve such sharing of information. The committee therefore recommends that until affordable and ubiquitous electronic network services are available, national and international scientific societies and foreign aid agencies should establish or improve their existing efforts to send extra stocks of scientific publications to libraries and research institutions in developing countries that need them.

    Training and Education Considerations

    The governments of most countries recognize that education, particularly higher education, is vital for the creation of a solid national base for scientific endeavors and economic growth. The poorest nations typically send their students abroad for advanced education and specialized training, often in applied disciplines deemed most useful upon the students' return. Following completion of their postgraduate education and research abroad, however, a large number of these highly skilled scientists do not return to their home countries, effectively negating for the home country the immediate broader benefits of their training.58 Many of these countries cannot provide a sufficiently supportive environment, including the necessary research infrastructure and funding, to attract and keep scientists. Further, lack of ready access to current information leads to professional obsolescence. The "brain drain" from the poorer to the wealthier nations is a serious constraint to the generation of new knowledge in the developing countries.

    In addition to the limitations of the available data management and communications technology, training in the use of available technology is limited as well. The growing sophistication of both hardware and software tends to make their use more efficient and eases the training burden in some ways, but it increases it in others. Basic functions of the computer system are becoming increasingly automated. However, the functional power of the systems increases the demand for and use of more complicated techniques for management, analysis, and dissemination. An important related problem is a lack of adequately trained personnel for servicing such complex equipment.

    At the most basic level is the lack of instructional support for the neophyte computer user. For example, in courses taught under United Nations auspices on topics such as use of computers in microbiology, the students in developing nations overwhelmingly request supplemental training in the use of computers for data management and analysis, and in on-line access to data and information resources. Generally speaking, much more instructional outreach in basic computer data management and communication skills is needed.59

    The committee recommends that international development organizations, together with professional societies, provide targeted training programs for scientists in the use of computers, with emphasis on the management of digital data in specific disciplines.

    Organizational Issues

    There are many organizations that provide bilateral and multilateral assistance to scientists in developing countries, although few are focused primarily or exclusively on scientific data issues. These organizations support scientists through a variety of mechanisms. Some provide scientific data and services directly to researchers in developing countries, others provide access to data through journal subscriptions and travel grants to international scientific conferences, some provide Internet connections and information technology services, and others promote and provide training and education. Examples of national and international government institutions, nongovernmental organizations, not-for-profit organizations, professional societies, and private-sector firms that provide these types of services are described briefly below:60

    • U.S. government. Within the United States, the federal government, primarily through the U.S. Agency for International Development (USAID), provides foreign assistance for activities of scientists and engineers in less developed nations.61 Other federal agencies such as NASA and the Department of Agriculture assist scientists by providing data resources and data management services.62 Finally, the Department of State, through its Bureau on Oceans, Environment, and Science, indirectly provides assistance through negotiating and monitoring environmental agreements and conventions that have significant cooperative research and data exchange provisions.63
    • Intergovernmental organizations. Many intergovernmental organizations provide assistance to scientists and researchers in developing countries by providing data and information, training and education, and assistance with information technology. The lead player in this arena is the United Nations, primarily through the United Nations Educational, Scientific, and Cultural Organization (UNESCO), United Nations Development Programme (UNDP), United Nations Environment Programme (UNEP), Food and Agriculture Organization (FAO), World Health Organization (WHO), World Meteorological Organization (WMO), and the World Bank.64

    Regional organizations, such as the Organization of American States (OAS)65 and the Pan American Health Organization, also promote science and technology in developing countries through regional activities. The European Community pursues scientific and technological cooperation with developing countries as well, particularly with the aim of generating knowledge and technologies needed to help achieve sustainable development.66

    Finally, various ad hoc intergovernmental groups and committees have been organized to coordinate activities related to major international research programs as discussed above in this chapter. Many of these groups have subgroups devoted to different data management issues, including activities focused on developing countries. For example, the Committee on Earth Observation Satellites (CEOS) coordinates all spaceborne Earth observation missions among the spacefaring nations. CEOS has established a "Plan of Action for Support to Developing Country Activities by CEOS Participants."67 Box 3.7 presents a number of useful "lessons learned" by CEOS participants in providing support to developing countries.

    • Nongovernmental organizations (NGOs). International NGOs, such as the Third World Academy of Sciences,68 the International Council of Scientific Unions,69 and the International Foundation for Science,70 collaborate with U.N. programs and agencies to provide scientific and technological support to developing countries.

    The Consortium for International Earth Sciences Information Network (CIESIN) is an example of a national NGO, with broad international scope, that provides data and services to scientists in the developing countries. In addition to providing "global and regional network development, science data management, decision support, and training, education, and technical consultation services," CIESIN is the World Data Center A for Human Interactions with the Environment.71

    Many national and international not-for-profit organizations also assist scientists in developing countries via different mechanisms. The Sabre Foundation's Scientific Assistance Project provides educational materials in the form of books and journal subscriptions and an Internet-based technical assistance program to institutions and individuals in the former Soviet Union and Eastern Europe.72 The International Science Foundation was established by George Soros in 1992 to assist scientists in the former Soviet Union and the Baltic States by promoting contacts with the international scientific community, providing access to scientific data and information, and establishing international communications links.73 The International Research and Exchange (IREX) Board promotes academic exchanges between the United States and the former Soviet Union and provides professional training, technical assistance, and policy programs.74 Other organizations, such as the Volunteers in Technical Assistance (VITA), contribute information services and technology to developing countries to improve their quality of life.75

    Box 3.7 CEOS “Lessons Learned” Regarding Support to Developing Countries

    The Committee on Earth Observation Satellites (CEOS) compiled the following list of principles based on the experiences of its members in providing technical assistance to developing nations:

    • Development projects should be planned in a partnership between donors and local institutions in response to real needs of in-country decision-makers. Decision-makers need to be convinced of the utility of such activities in order to create the appropriate environment for sustainable operation.
    • Projects supported by joint efforts of space agencies with development assistance organizations can benefit from combining important skills in both science/technology and sustainable development.
    • Pilot projects should be selected with their later operational requirements in mind. To be considered successful, a pilot project will provide the foundation for ongoing routine application of the demonstrated capability. This suggests the use of affordable technology and readily available data. Projects aimed at improving indigenous capability to perform already ongoing operations are more likely to succeed.
    • Documentation prepared for use by developing country users should be available in a language readily understood locally, using minimal technical jargon, to be easily understandable by the target audience.
    • Data and information for developing countries should be on media appropriate for the users, avoiding electronic formats requiring equipment that is not available. Easily reproducible text and imagery will often be more readily usable than sophisticated digital products. At the same time, consideration should be given to improving local infrastructure so that media such as CD-ROMs can progressively be used in developing country applications.
    • Expertise in developing countries must not only be created but also be sustained. This suggests holding local training courses and emphasizing “training the trainer.” Improving existing educational institutions rather than creating new training centers can enhance the sustainability of the educational process.
    • Local reception of satellite data can be an effective tool in identifying practical applications and demonstrating the value of the data in the local environment. Equipment installed in developing countries must be designed to be rugged and easily maintained with locally available capabilities.
    • Satellite data alone will not contribute to development unless it is transformed into useful information and disseminated.
    • Countries that have successfully applied satellite technology to development problems can serve as examples and share their experience and expertise within their regions and more broadly. Their experience may be more relevant and more applicable to other developing country situations than the approaches used in industrialized countries.
    • Development assistance projects should be structured with sufficient flexibility to respond to the unexpected events that often occur.

    SOURCE: CEOS; available on-line at <http://gds.esrin.esa.it:80/0xcafc3d_0x000291f2>;international$sk=041858E7.

    National and international scientific and engineering societies and associations play an important role as well. For example, in addition to the African libraries program described above, the American Association for the Advancement of Science has promoted regional collaborations between scientists in developing countries.76 Some professional organizations provide travel grants to allow individual scientists from developing countries to attend international scientific and technical conferences. Other efforts include the American Society of Mechanical Engineers' partnership with the Mechanical Engineering Research Institute of the Russian Academy of Sciences to promote the application of environmental and energy-related technologies and to establish a technology transfer mechanism between the two organizations.

    • Private sector. A number of private sector organizations also provide assistance to scientists in developing countries. This assistance is usually indirect, through the financial support of international NGOs such as the Third World Academy of Sciences and ICSU.

    Many of the problems cited above in this section are exacerbated by a lack of effective organizational structures or institutional mechanisms for involving scientists within developing countries in the decision-making process regarding scientific research, much less data access issues. However, foreign aid agencies in the developed countries and intergovernmental development organizations are known not to involve scientists in their decision-making process either. For example, U.N. funding agencies respond almost exclusively to requests from the foreign ministries of member countries. The foreign ministries in developing countries almost never utilize scientists in decisions. The result is a dearth of funding applications for scientific infrastructure capacity building, which is essential not only to support indigenous scientific research efforts, but also to encourage economic development. An analogous situation is evolving in USAID, where science once flourished, but where the involvement of scientists in internal planning and funding decisions is eroding rapidly.

    Of course, some success stories do exist. For example, Vietnam, concerned about environmental pollution as well as the need to build biotechnology capacity, arranged for scientists at many levels to collaborate in developing a national plan in microbiology and biotechnology infrastructure capacity building for submission to the Global Environment Facility of the U.N. through the United Nations Industrial Development Organization.

    With regard to improving access to scientific data in developing countries, the committee makes the following recommendations:

    • Scientists in developing countries should be encouraged to organize to promote the policy of full and open access to scientific data in their own countries, as well as to make their data available internationally.
    • Foreign aid agencies should (i) make available to individual scientists in developing countries more direct, peer-reviewed grants that include support for access to data, and (ii) facilitate the involvement of scientists in such nations in their own countries' capacity-building initiatives, research policy decisions, and national database construction efforts.

    Constraints on Access to Data from Developing Countries

    The constraints caused by inequities among nations in access to scientific data are felt most severely in those sciences concerned with inherently international issues, such as food production, biodiversity, the prevention and cure of communicable diseases, global climate change, and other Earth system processes. Each of these areas of concern requires international research and approaches to problem solving. As discussed above in this chapter, developing this essential understanding requires the generation of globally compatible, accessible, and usable data sets related to terrestrial ecosystems, the physical environment, and human activities. Collaboration of the scientific community in every nation, rich and poor, in the generation of global observational data sets and the subsequent full and open transnational flow of those data is imperative; its need cannot be emphasized too strongly.

    For example, in the Earth and environmental sciences, particularly in global change research, it is essential to integrate remote sensing data with "ground truth" in situ observational data in the creation of consistent and valid data sets. Without this integration, the value of the data products and research results can be undermined considerably. The in situ data are generated by individual workers and organizations in many different countries. Maintaining cooperative activities through which the in situ data are reliably supplied is essential for the success of international research projects. Many of the gaps in the collection and dissemination of in situ data occur in developing countries, where a lack of resources and other barriers make such cooperative activities difficult.

    As this report documents, the more wealthy industrialized nations have developed a broad range of international research initiatives, largely supported by a policy of full and open access to scientific data. Although significant problems remain and new barriers to effective collaboration continue to arise, there are sufficient incentives and resources for sharing of data by scientists in the developed world. However, the sharing of scientific data—particularly data for fundamental research—tends to be a much lower priority for many of the less wealthy, nonindustrialized nations. Success in ensuring full and open transnational flow of scientific data among these nations may depend on the degree to which the industrialized nations are responsive to the needs of the nonindustrialized nations and can provide incentives for their participation. An example of such a need involves the inability of developing nations to pay for large-scale disease treatment programs. Box 3.8 describes the mutual benefits possible in collaborative data sharing.

    As discussed above, incentives for participation by developing nations might include assistance in the development of human resources, the provision of equipment, and the general improvement of the research and communications infrastructure. Significant efforts should be made by the scientific community of the wealthier nations to include scientists from the nonindustrialized nations as partners in the global scientific enterprise, particularly in research initiatives that are fundamentally dependent on the availability of global data sets or in studies addressing basic needs such as disease control and prevention.

    BOX 3.8 Examples of Successful Transnational Data Collaborations

    The World Data Centers (WDCs) have sponsored several "data rescue" projects in developing countries. In some cases, modest funds and sometimes equipment have been provided to help local scientists digitize older time-series data as part of the effort to build the International Geosphere-Biosphere Programme's and other studies' global digital databases to document trends and changes. Local scientists thus get their own data back in digital form on diskettes or CD-ROMs, depending on the technology they have. These efforts further not only their own work, but also the work of global change researchers internationally. Also, the WDCA Oceanography (NODC/NOAA) data rescue project provides the opportunity for local research groups to help to produce for the first time historical and highly useful analyses and maps of global ocean climatic changes.

    For example, some attention is being paid to the searching of genomic information useful for preventing tropical diseases,77 and some of this research is being carried out in developed nations. However, greater emphasis on understanding such diseases would follow from enhancement of the infrastructure for expertise in biology and biotechnology in developing nations. Developed countries' promotion of such advances would not be purely altruistic. Leishmaniasis, a disease usually associated with the tropics, infected troops in Desert Storm and is present on both sides of the Texas-Mexico border. Tuberculosis usually is not perceived as a tropical disease, per se, yet resistant strains of the bacteria from the tropics have found their way into populations in the developed nations. The motivation for trying to identify and locate genes possibly conferring resistance in populations where the diseases are common would be deepened by researchers' proximity to and familiarity with the effects of such diseases, if the resources and personnel were available in the affected countries. Economic constraints might also be lessened for studies in developing nations, where labor costs, even for highly trained research scientists, are much lower.

    Another barrier to the collection of data is evident in field studies related to biodiversity in developing countries. It is well known that the greatest concentrations of the planet's biodiversity occur in developing countries. However, the resources to study and exploit the diverse gene pools for biotechnology lie largely in the developed nations. In this area, as in all of science and technology, professionals in developing countries generally lack access to all the data and information needed to support their work. Further, considerations regarding intellectual property are more complicated in biology than in most other disciplines, because the biological materials themselves are repositories for scientific information. For this reason, bioprospecting for new gene pools in tropical countries by commercial and other interests from industrialized nations has become a contentious issue on a global scale.78 For example, Brazil will no longer allow the sampling of biota by non-Brazilians and will not allow export of biota.79 In such cases, the study of these materials is limited to what the country can do with local resources. Data that are produced in this way are sequestered rather than shared with the general scientific community. Other unanticipated problems can arise in this context as well, as Boxes 3.9 and 3.10 illustrate.

    With regard to in situ data collection efforts in developing countries, the committee recommends the following actions: the ICSU, together with funding agencies and nongovernmental bodies, should strengthen its efforts to assist developing countries in undertaking their own scientific studies and encourage scientists engaged in such studies to take active roles in the international scientific community, where their efforts can be appreciated and used. Legal and procedural protocols must be developed to provide for fair and equitable sharing of any resulting intellectual property. This would not only help create indigenous data resources and promote a greater interest within nations of the developing world in obtaining a more thorough understanding of their own resources, but also lead to more fruitful international cooperative research.

    BOX 3.9 A Hobson's Choice

    The following example of a trade-off between two unpalatable options was provided by the Consortium for International Earth Sciences Information Network (CIESIN) in response to the committee's "inquiry to Interested Parties":

    One unexpected experience is in the balancing of data access privileges with the access of researchers to pursue their research in specific countries. Our experience includes an instance where a multi-year program to collect and integrate socioeconomic and environmental data in an African country was successfully completed, the data conveyed to CIESIN for sustaining access, then the government of the subject nation was ousted through a violent and protracted coup. The successor government did not agree with the predecessor government, in terms of allowing open access to those data collected and provided by its agencies to the CIESIN-sponsored researchers. Thus, they wanted to prohibit future release of data already out of their physical custody.

    The clear implication was that failure to comply with these newly implemented restrictions would cause further restrictions of follow-on research projects of the type CIESIN initially supported with UNEP (United Nations Environment Programme) and others. The trade-off between restricting data access and restricting research access for future collection is an unsavory and unforeseen challenge that is likely to recur in that region and elsewhere, as political instability ensues. Future governments may decline to honor the information sharing policies of their country. This dilemma threatens the free and open access of data on a sustaining basis and raises significant questions about where the locus of ownership of data is after governments are replaced, peacefully or through violent actions.

    BOX 3.10 Can Data Be Too Accurate?

    The following is an excerpt from a message that is part of a discussion on the Internet list server, <biodiv-1@bdt.org.br>. This discussion group emphasizes global biodiversity, conservation of habitat and biota, and information regarding these areas. The author of this message is Jeff Waldon. The message illustrates an important but little appreciated aspect of the tension between free dissemination of information, and commercial and nonscientific private interests:

    The debate is whether release or restriction of sensitive locational information is the best thing for conservation. There are cases of collectors using such information to decimate rare and endangered species at a site (e.g., the recent arrest of butterfly poachers that targeted National Parks in the western United States). On the other hand there are other examples of species protection because the landowner was informed of the existence of a rare animal or plant. I have been involved in the development of information systems for about 10 years, and I have heard both sides argued strenuously. My personal feeling is that the "boogie man" collector is real, but in most cases we overreact to his presence. We are losing many more populations of threatened and endangered species because of ignorance rather than malice.

    We have developed a compromise in our systems whereby we release sensitive information on species, but the locational data accuracy is reduced to help reduce the likelihood that a collector might successfully collect at that site. If a development project for the public is reviewed, and more accurate information is required, that information is provided at the discretion of the biologist working with the requester. I come from the academic school of thought that relies on the free interchange of information, and this compromise strikes me as still too restrictive at times. On the other hand, government employees are bound by laws and policies that make them accountable for their actions including the consequences of releasing information on the location of threatened and endangered species, and I see their dilemma.

    SOURCE: Jeff Waldon, personal communication, 1995, used with permission.
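The compromise described in Box 3.10, releasing species records with reduced locational accuracy and giving precise locations only at a biologist's discretion, can be illustrated with a minimal sketch. It is not the system the box's author describes: the function names, the record fields (`lat`, `lon`), the 0.1-degree grid cell, and the `requester_is_vetted` flag are all assumptions made for illustration. The sketch snaps each point to the centre of its grid cell, so every record in a cell is released with the same coordinates, and it does not handle edge cases at the poles or the antimeridian.

```python
import math


def generalize_location(lat, lon, cell_deg=0.1):
    """Snap a point to the centre of the grid cell that contains it.

    cell_deg is a hypothetical cell size in decimal degrees; a larger
    cell means a coarser released location.
    """
    if cell_deg <= 0:
        raise ValueError("cell_deg must be positive")
    lat_c = (math.floor(lat / cell_deg) + 0.5) * cell_deg
    lon_c = (math.floor(lon / cell_deg) + 0.5) * cell_deg
    # Round away floating-point noise so equal cells compare equal.
    return round(lat_c, 6), round(lon_c, 6)


def release_record(record, requester_is_vetted=False, cell_deg=0.1):
    """Return a copy of a species record that is safe to release.

    Precise coordinates are kept only when the requester has been
    vetted (standing in for the biologist's discretion in Box 3.10).
    The input record is never modified.
    """
    released = dict(record)
    if requester_is_vetted:
        released["location_generalized"] = False
    else:
        released["lat"], released["lon"] = generalize_location(
            record["lat"], record["lon"], cell_deg
        )
        released["location_generalized"] = True
    return released


if __name__ == "__main__":
    sighting = {"species": "example butterfly", "lat": 37.2296, "lon": -80.4139}
    print(release_record(sighting))
    print(release_record(sighting, requester_is_vetted=True))
```

Snapping to a cell centre, rather than simply truncating digits, keeps the released point inside the area it stands for. The right cell size would depend on the species and the terrain, which the box leaves to the judgment of the data holder.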

    RECOMMENDATIONS ON DATA ISSUES IN THE NATURAL SCIENCES

    The recommendations set forth below are addressed to all individuals and organizations with responsibilities for managing scientific data acquired with public funds.

    1. Governmental science agencies and intergovernmental organizations should adopt as a fundamental operating principle the full and open exchange of scientific data. By "full and open exchange" the committee means that the data and information derived from publicly funded research are made available with as few restrictions as possible, on a nondiscriminatory basis, for no more than the cost of reproduction and distribution.
    2. The International Council of Scientific Unions (ICSU), together with the scientific Specialized Agencies of the United Nations, the Organization for Economic Co-operation and Development Megascience Forum, and the national science agencies and professional societies of member countries, should consider developing a distributed international network of data centers. Such a network should draw on the strengths of successful examples of international data exchange activities as described in Appendix C of this report, including, in particular, the ICSU World Data Centers, and become a prominent part of the global information infrastructure that has been proposed by the "Group of Seven" nations. To facilitate the international dissemination and interdisciplinary use of scientific data, all public scientific data activities, including the network of data centers, should plan for and commit to providing the human and financial resources sufficient for carrying out the following functions:
      1. Involve experts from the relevant disciplines, together with information resource managers and technical specialists, in the active management and preservation of the data;
      2. Develop and maintain up-to-date, comprehensive, on-line directories of data sources and protocols for access;
      3. Provide documentation (metadata) adequate to ensure that each data set can be properly used and understood, with special attention given to making the data usable by individuals outside the core discipline area. This problem is particularly acute within the biological sciences, in which imprecision and variations in taxonomic definitions and nomenclature pose significant barriers to communication, even among the biological subdisciplines. The committee suggests that the CODATA Commission on Standardized Terminology for Access to Biological Data Banks be enhanced into a true international consultative body and that similar mechanisms be developed for other disciplines, as needed;
      4. Incorporate advances in technology to facilitate access to and use of scientific data, while overcoming incompatibilities in formats, media, and other technical attributes through vigorous coordination and standardization efforts;
      5. Institute effective programs of quality control and peer review of data sets; and
      6. Digitize all key historical data sets and ensure that every important condition for the long-term retention of data be met, including the adoption of appropriate retention and purging criteria and the timely transfer of all data sets to new media to prevent their deterioration or obsolescence.
    3. The ICSU and other professional scientific societies should encourage the study of, and publication of peer-reviewed papers on, effective data management and preservation practices, as well as promote the teaching of those practices in all institutions of higher learning.
    4. All scientists conducting publicly funded research should make their data available immediately, or following a reasonable period of time for proprietary use. The maximum length of any proprietary period should be expressly established by the particular scientific communities, and compliance should be monitored subsequently by the funding agency.
    5. As a corollary to recommendation 2.a above, publicly funded scientific databases should be maintained either directly or under subcontract by the government science agencies with the requisite discipline mission and need. In the United States, the Office of Science and Technology Policy should develop an overall policy for the long-term retention of scientific data, including a contingency plan for protecting those data that may become threatened with the loss of their institutional home.80
    6. With regard to improving access to scientific data in developing countries, the committee makes the following recommendations:
      1. International development organizations, together with professional societies, should provide targeted training programs for scientists in the use of computers, with emphasis on the management of digital data in specific disciplines.
      2. Foreign aid agencies should (i) make available to individual scientists in developing countries more direct, peer-reviewed grants that include support for access to data, and (ii) facilitate the involvement of scientists in such nations in their own countries' capacity-building initiatives, research policy decisions, and national database construction efforts.
      3. Scientists in developing countries should be encouraged to organize to promote the policy of full and open access to scientific data in their own countries, as well as to make their data available internationally.
      4. The ICSU, together with funding agencies and nongovernmental bodies, should strengthen its efforts to assist developing countries in undertaking their own scientific studies and encourage scientists engaged in such studies to take active roles in the international scientific community, where their efforts can be appreciated and used. Legal and procedural protocols must be developed to provide for fair and equitable sharing of any resulting intellectual property.
      5. Until affordable and ubiquitous electronic network services are available, national and international scientific societies and foreign aid agencies should establish or improve their existing efforts to send extra stocks of scientific publications to libraries and research institutions in developing countries that need them.
    7. Finally, the ICSU, together with the principal national and international scientific organizations mentioned in Recommendation 2 above, should convene a series of major international meetings to initiate meaningful action on these recommendations.

    NOTES

    1.  

    Privacy issues, which become especially important in the social sciences and clinical research, were judged to be of only tertiary concern in the context of most of the disciplines examined in this study, and thus are not addressed in any detail in this report.

    2.  

    In some areas of the experimental sciences, it is standard practice for researchers to publish general results, such as structures of protein molecules, but retain details, such as precise coordinates of the atoms, for some limited period of time, during which they may pursue the implications of their own measurements. In many instances, particularly in the observational sciences, principal investigators are allowed to keep data sets proprietary for some specified period of time in order to be able to analyze them and publish their results first. This issue is discussed below in this chapter.

    3.  

    National Research Council (1995), Preserving Scientific Data on Our Physical Universe: A New Strategy for Archiving the Nation's Scientific Information Resources, National Academy Press, Washington, D.C.

    4.  

    Cosmic-ray research is an exception here. While it is based largely on observations rather than experiments, it has been classified traditionally in physics, rather than astronomy or space science. It overlaps all of these, of course.

    5.  

    For a more detailed discussion of the differences between experimental and observational data, see National Research Council (1995), Preserving Scientific Data, note 3.

    6.  

    For a comprehensive listing of most internationally available data sets from space missions, see the NASA Goddard Space Flight Center's National Space Science Data Center home page at <http://nssdc.gsfc.nasa.gov/>.

    7.  

    For a broad listing of international WWW servers covering all aspects of Earth science data and information, see the NASA Global Change Master Directory at <http://gcmd.gsfc.nasa.gov/cgibin/pointers/>; see also <http://gds.esrin.esa.it:80/>.

    8.  

    A. Maddison (1995), Monitoring the World Economy: 1820-1992, OECD, Paris, 255 pp.

    9.  

    T.F. Malone (1995), "Reflections on the Human Prospect," in Annual Review of Energy and the Environment (R.H. Socolow, ed.) 20:1-29, Annual Reviews, Palo Alto, California.

    10.  

    See <http://www.usgcrp.gov> for additional information on the U.S. Global Change Research Program and related data activities, and <http://www.igbp.kva.se/index.html> for information on the International Geosphere-Biosphere Programme.

    11.  

    See the WWW Virtual Library for a comprehensive index of biological data and information at <http://golgi.harvard.edu/biopages>. See also a listing of sources of international biological information on the Internet on the Web site of the U.S. Geological Survey's Biological Resources Division at <http://www.its.nbs.gov/nbii/iao/ibii.html>; and the Biotechnology Industry Organization's compilation of biotechnology databases at <http://www.bio.org/educ/dbasef.html>.

    12.  

    See, for example, National Research Council (1996), Statistical Challenges and Possible Approaches in the Analysis of Massive Data Sets, National Academy Press, Washington, D.C.

    13.  

    Genevieve J. Knezo (1994), "Major Science and Technology Programs: Megaprojects and Presidential Initiatives, Trends Through the FY 1995 Request," Congressional Research Service, Washington, D.C., March 29, p. 1.

    14.  

    Congressional Budget Office, July (1991), "Large Non-Defense R&D Projects in the Budget: 1980-1986," CBO, Washington, D.C. Unfortunately, more recent statistics are not available.

    15.  

    For a detailed review of the various large international research projects and programs currently under way, see Organization for Economic Co-Operation and Development, OECD Megascience Forum (1993), Megascience and Its Background, Paris. See also the OECD Megascience Forum Web site at <http://www.oecd.org/dsti/mega/>.

    16.  

    15 United States Code, Section 5652 (1992).

    17.  

    See General Accounting Office (1990), Environmental Data-Major Effort Is Needed to Improve NOAA's Data Management and Archiving, Washington, D.C.; and General Accounting Office (1990), Space Operations-NASA Is Not Archiving All Potentially Valuable Data, Washington, D.C. It should be noted that both agencies have taken significant measures to rectify these past problems.

    18.  

    National Research Council (1995), Preserving Scientific Data, note 3.

    19.  

    National Research Council (1995), Preserving Scientific Data, note 3, at pp. 47-48.

    20.  

    See Gary Taubes, (1996), "Science Journals Go Wired," Science 271(February 9):764; and UNESCO Expert Conference on Electronic Publishing in Science (1996), ICSU Press at <http://www.lmcp.jussieu.fr/-fabrice/icsu/information/index.html>. See also, Steve Hitchcock, Leslie Carr, and Wendy Hall, "A Survey of STM On-line Journals 1990-95: The Calm Before the Storm," at <http://journals.ecs.soton.ac.uk/survey/survey.html>.

    21.  

    <http://www.aip.org:80/>.

    22.  

    <http://www.iop.org/>.

    23.  

    See <http://eij.gsfc.nasa.gov>.

    24.  

    See, for example, the NASA-funded Astrophysics Data System Abstract Service at <http://adswww.harvard.edu/ads_abstracts.html>.

    25.  

    See Richard T. Kouzes, James D. Myers, and William A. Wulf (1996), "Collaboratories: Doing Science on the Internet," IEEE Computer 29(8):40-46.

    26.  

    For additional insights in this area, see the Proceedings of the '96 UNESCO Conference on Electronic Publishing in Science, held at the UNESCO Headquarters in Paris, February 19-23, 1996. A summary of the results from that conference was presented by D.F. Shaw (1996), "Electronic Publishing in Science," Science International, ICSU Paris, May, pp. 1-3.

    27.  

    See Nahum Gershon and Judith R. Brown (1996), "Computer Graphics and Visualization in the Global Information Infrastructure," a Special Report in IEEE Computer Graphics and Applications, March, pp. 60-75; and Robert Braham (1995), "Math & Visualization: New Tools, New Frontiers," a Focus Report in IEEE Spectrum, November, pp. 19-65.

    28.  

    Canadian Global Change Program (1996), "Data Policy and Barriers to Data Access in Canada: Issues for Global Change Research," The Royal Society of Canada, Ottawa; National Research Council (1993), 1992 Review of the World Data Center A for Rockets and Satellites, National Space Science Data Center, Board on Earth Sciences and Resources, National Academy Press, Washington, D.C.; National Research Council (1992), Toward a Coordinated Spatial Data Infrastructure for the Nation, Board on Earth Sciences and Resources, National Academy Press, Washington, D.C.; National Academy of Public Administration (1991), The Archives of the Future: Archival Strategies for the Treatment of Electronic Databases, A report for the National Archives and Records Administration; General Accounting Office (1990), Environmental Data-Major Effort Is Needed to Improve NOAA's Data Management and Archiving, Washington, D.C.; General Accounting Office (1990), Space Operations-NASA Is Not Archiving All Potentially Valuable Data, Washington, D.C.; National Research Council (1990), Spatial Data Needs: The Future of the National Mapping Program, Board on Earth Sciences and Resources, National Academy Press, Washington, D.C.; National Research Council (1988), Geophysical Data: Policy Issues, Committee on Geophysical Data, National Academy Press, Washington, D.C.; National Research Council (1988), Selected Issues in Space Science Data Management and Computation, Space Science Board, National Academy Press, Washington, D.C.; National Research Council (1986), Atmospheric Climate Data: Problems and Promises, Board on Atmospheric Sciences and Climate, National Academy Press, Washington, D.C.; National Research Council (1986), Issues and Recommendations Associated with Distributed Computation and Data Management Systems for the Space Sciences, Space Science Board, National Academy Press, Washington, D.C.; J.K. Haas, H.W. Samuels, and B.T. Simmons (1985), Appraising the Records of Modern Science and Technology: A Guide, Massachusetts Institute of Technology, Cambridge, Mass.; National Research Council (1984), Solar-Terrestrial Data Access, Distribution and Archiving, Space Science Board and Board on Atmospheric Sciences and Climate, National Academy Press, Washington, D.C.; National Research Council (1982), Selected Issues in Space Science Data Management and Computation, Space Science Board, National Academy Press, Washington, D.C.

    29.  

    Committee on the Future of Long-term Ecological Data (FLED), (1995), Final Report of the Ecological Society of America, Katherine L. Gross, Chair, Ecological Society of America, Washington, D.C.; National Research Council (1993), A Biological Survey for the Nation, Committee on the Formation of the National Biological Survey, National Academy Press, Washington, D.C.

    30.  

    National Research Council (1986), Toward a Geosphere-Biosphere Program, National Academy Press, Washington, D.C.; National Research Council (1988), Toward an Understanding of Global Change: Initial Priorities for U.S. Contributions to the International Geosphere-Biosphere Program, National Academy Press, Washington, D.C.; and National Research Council (1990), Research Strategies for the U.S. Global Change Research Program, National Academy Press, Washington, D.C.

    31.  

    Additional information on the Global Climate Observing System can be found at the World Meteorological Organization's Web site at <http://www.wmo.ch/web/gcos/gcoshome.html>; and see <http://www.wmo.ch/web/www/www.html> for the World Weather Watch, and <http://www.wmo.ch/web/arep/gaw.html> for the Global Atmosphere Watch. See <http://www.unesco.org/ioc/goos/iocgoos.html> for the Global Ocean Observing System and <http://www.wsl.ch/wsidb/gtos/gtos.html> for the Global Terrestrial Observing System.

    32.  

    See the Carbon Dioxide Information Analysis Center's Web site at <http://cdiac.esd.ornl.gov/cdiac>.

    33.  

    For extensive discussion of data quality control and assurance procedures and recommendations in the context of interdisciplinary environmental research, see National Research Council (1995), Finding the Forest in the Trees: The Challenge of Combining Diverse Environmental Data, National Academy Press, Washington, D.C.

    34.  

    For a general overview of issues and requirements in archiving digital data and information, see Task Force on Archiving of Digital Information (1995), Preserving Digital Information, the Commission on Preservation and Access and the Research Libraries Group, Inc., at <http://lyra.rlg.org./ArchTF/>.

    35.  

    See, for example, Committee on the Future of Long-term Ecological Data (FLED), (1995), Final Report of the Ecological Society of America, Katherine L. Gross, Chair, Ecological Society of America, Washington, D.C.

    36.  

    National Research Council (1995), Preserving Scientific Data, note 3.

    37.  

    National Research Council (1994), Facing the Challenge: The U.S. National Report to the IDNDR World Conference on Natural Disaster Reduction, Yokohama, Japan, May 23-27, 1994, U.S. National Committee for the Decade for Natural Disaster Reduction, National Academy Press, Washington, D.C.; National Research Council (1991), A Safer Future: Reducing the Impacts of Natural Disasters, U.S. National Committee for the Decade for Natural Disaster Reduction, National Academy Press, Washington, D.C.

    38.  

    National Research Council (1995), Finding the Forest in the Trees, note 33.

    39.  

    See, generally, National Research Council (1992), Toward a Coordinated Spatial Data Infrastructure for the Nation, Board on Earth Sciences and Resources, National Academy Press, Washington, D.C.

    40.  

    Secondary users, such as researchers in other fields, policymakers, educators, and the general public, do not collect or create data sets, but they analyze, interpret, and otherwise work with the data. For a discussion of distinctions between user categories, see National Research Council (1995), Study on the Long-term Retention of Selected Scientific and Technical Records of the Federal Government: Working Papers, National Academy Press, Washington, D.C.

    41.  

    See National Research Council (1995), Finding the Forest in the Trees, note 33, and Preserving Scientific Data, note 3. For general information on metadata issues, see the Lawrence Livermore National Laboratory Metadata and Data Management information page at <http://www.llnl.gov/liv_comp/metadata/metadata.html>.

    42.  

    For more information regarding IGBP-DIS data activities, see <http://www.cnrm.meteo.fr:8000/igbp/outline.html>. Additional information is provided in the "Summary Report on the 7th IGBP-DIS Scientific Steering Committee Meeting Manual" (1996) at <http://www.cnrm.meteo.fr:8000/igbp/meetingssummary_repsscfeb96_verhtml.html>. See also the NASA Global Change Master Directory for another example of a successful on-line indexing effort at <http://gcmd.gsfc.nasa.gov/cgi-Bin/pointers/>.

    43.  

    Michael Carlowicz (1997), "New Data from Cold War Treasure Trove," EOS, American Geophysical Union, Vol. 78, no. 9, March 4, p. 93.

    44.  

    See "Corona: America's First Satellite Program" (1995), Kevin C. Ruffner, ed., Central Intelligence Agency History Staff Center for the Study of Intelligence, Washington, D.C.; and Robert A. McDonald (1995), "Opening the Cold War Sky to the Public: Declassifying Satellite Reconnaissance Imagery," Photogrammetric Engineering and Remote Sensing, pp. 385-390. For a listing of declassified satellite data products, including information about missions, dates, and resolution, see the United States Geological Survey's EROS Data Center Web site at <http://edcwww.cr.usgs.gov/glis/hyper/guide/disp>.

    45.  

    See William J. Broad (1996), "Anti-Sub Seabed Grid Thrown Open to Eavesdropping," New York Times, July 2, p. C1.

    46.  

    See National Research Council (1988), Ozone Depletion, Greenhouse Gases and Climate Change: Proceedings of a Joint Symposium by the Board on Atmospheric Sciences and Climate and the Committee on Global Change, National Academy Press, Washington, D.C.

    47.  

    National Research Council (1995), On the Full and Open Exchange of Scientific Data, National Academy Press, Washington, D.C. The appendix to the report also presents a collection of other similar supporting policy statements.

    48.  

    See Information Infrastructure Task Force (1994), The Global Information Infrastructure: Agenda for Cooperation, Washington, D.C., at <http://www.iitf.nist.gov/documents/docs/gii/giiagend.html>.

    49.  

    J.R.G. Townshend, C. Justice, W. Li, C. Gurney, and J. McManus (1991), "Global Land Cover Classification by Remote Sensing: Present Capabilities and Future Possibilities," Remote Sensing of Environment 35:243-255.

    50.  

    See T.R. Loveland, J.W. Merchant, D.O. Ohlen, and J.F. Brown (1991), "Development of a Land Cover Characteristics Database for the Conterminous U.S.," Photogrammetric Engineering and Remote Sensing 57:1453-1463.

    51.  

    These points have been discussed in some detail for the field of materials databases, where the high volume and critical importance of metadata, the broad scope of the materials field, the rich vocabulary of materials technology, and the international character of materials information give special importance to the subjects. See J.H. Westbrook and W. Grattidge (1992), "Terminological Standards for Materials Databases," in Computerization and Networking of Materials Databases, Vol. 3, T.J. Barry and K.W. Reynard, eds., American Society for Testing and Materials, Philadelphia, pp. 15-33.

    52.  

    D.L. Hawksworth, B.C. Sutton, and G.C. Ainsworth (1983), Dictionary of the Fungi (including the Lichens), seventh edition, Commonwealth Mycological Institute, Kew, Surrey, England.

    53.  

    College of American Pathologists, Committee on Nomenclature and Classification of Disease (1965), Systematized Nomenclature of Pathology, first edition, American Cancer Society and American Medical Association, Chicago.

    54.  

    A reference with numerous examples from the field of chemistry is J.H. Westbrook (1993), "Problems in the Computerization of Chemical Information: Capture of Tabular and Graphical Data," Journal of Chemical Information and Computer Sciences 33:6-17.

    55.  

    See <http://www.nist.gov/srd/>.

    56.  

    See the recommendations in National Research Council (1995), Preserving Scientific Data, note 3.

    57.  

    For example, the American Association for the Advancement of Science sponsors the Project for African Research Libraries in partnership with U.S. scientific societies to provide subscriptions for core scientific and technical journals in 35 sub-Saharan African countries (see <http://www.aaas.org/international/ssa-l.htm> for general information on international programs). For UNESCO's programs on the advancement, transfer, and sharing of knowledge in the natural sciences, see also <http://www.unesco.org/ch-intern/programmes/science/highlights.html>.

    58.  

    These statistics vary over time and according to country and discipline, and are available for only a few major countries. See Science & Engineering Indicators (1996), National Science Board, Washington, D.C., pp. 2-28 to 2-30. The statistics indicate that 35 to 75 percent of foreign graduate students surveyed intend to stay in the United States upon completion of their studies.

    59.  

    For an overview of potential educational activities to improve the management of scientific and engineering data, see National Research Council (1986), Improving the Treatment of Scientific and Engineering Data Through Education, National Academy Press, Washington, D.C.

    60.  

    This is not an exhaustive list of organizations that provide assistance to scientists in developing countries. Several organizations, such as the International Development Research Centre Library, provide extensive links to Internet sites related to international development (see <http://www.idrc.ca/library/world/world.html>).

    61.  

    USAID focuses on regional activities, such as the African Data Dissemination Service (ADDS), which is conducted in conjunction with several private organizations, the Office of Arid Land Studies at the University of Arizona, NASA, and NOAA. An example of an ADDS project is the Famine Early Warning System, which, with the help of the USGS EROS Data Center, provides information about potential famine situations so that preventive action can be taken (see <http://edcsnw4.cr.usgs.gov/adds/general/> for additional information on ADDS). Other USAID activities, such as AfricaLink and the Leland Initiative, provide network connections and information management to Africa (see <http://www.info.usaid.gov/regions/afr/> for a description of these and other regional programs in Africa). The USAID-sponsored U.S.-Russian NGO Cooperation Project provides small grants and equipment for individuals and institutions to link to an environmental e-mail network in Central Asia and the West Newly Independent States (see <gopher://gaia.info.usaid.gov:70/00...enis_reg/nis. factsheet/enviro3.txt>).

    62.  

    For example, the NASA Pathfinder project uses Landsat images to determine forest land cover and change for three quarters of the world. The USDA Foreign Agricultural Service provides support to the Consultative Group on International Agricultural Research (CGIAR) through the prediction of global production of major grains. For additional information on both programs, see the Proceedings of a Workshop on the Use of Remote Sensing Technologies and GIS Database in CGIAR Centers (1995), Environment and Natural Resources Information Center (see <http://www.info.usaid.gov/environment/enric/special/cgiar.htm>).

    63.  

    See <http://www.state.gov/www/global/oes/envir.html>.

    64.  

    See <http://www.unsystem.org> for the official listing of the United Nations system of organizations' Internet servers. Numerous initiatives within the U.N. programs and specialized agencies directly assist scientists in developing countries through a variety of mechanisms. One example is the Sustainable Development Network Programme of the UNDP, which links government organizations, the private sector, universities, NGOs, and individuals in developing countries through electronic networks for the purpose of exchanging information on sustainable development (see <http://www3.undp.org>). Refer to the programs' and agencies' home pages for further details on other U.N. initiatives.

    65.  

    For example, the OAS RedHUCyT program is a hemisphere-wide interuniversity scientific and technological information network created in 1991 with the objective of connecting OAS member countries to the Internet, "integrating an electronic network for the exchange of scientific and technological information among professors, researchers, and specialists, at different universities in the member states" (for additional information, see <http://www.oas.org/EN/PROG/RED/covere.htm>). OAS also sponsors a regional scientific and technological development program, which carries out a number of multinational and national projects that provide member states "with an opportunity to share experiences, to provide . . . mutual support and to engage in joint activities to further the advancement of science and technology and to promote integral development" (see <http://www.oas.org/EN/PROG/pa26e.htm>).

    66.  

    See the Institute for Baltic Studies' Web site at <http://www.ibs.ee/dollar/fw4/wp/dev.html>.

    67.  

    <http://gds.esrin.esa.it:80/559DE416/TOxclcce622_0x00029290>.

    68.  

    The Third World Academy of Sciences (TWAS) was founded in 1983 to support scientific research in developing countries through provision of research grants, spare parts for scientific equipment, books and journals, and fellowships. See <http://www.ictp.trieste.it/TWAS.html/> for a description of TWAS activities and programs. The organization not only is closely coupled with the U.N., but also collaborates with the International Council of Scientific Unions (see next note).

    69.  

    The International Council of Scientific Unions (ICSU) was founded in 1933 to "bring together natural scientists in international scientific endeavor." ICSU works closely with UNESCO, WMO, FAO, and UNEP through formal or ad hoc collaborations (see <http://www.lmcp.jussieu.fr/-fabrice/icsu/> for additional information on ICSU). ICSU's Committee on Science and Technology in Developing Countries (COSTED) was created in 1966 to stimulate international scientific and technological cooperation in developing countries. It is a joint initiative co-sponsored by UNESCO and was merged with the International Biosciences Network, an activity with similar objectives, in 1994. For additional information on COSTED and its activities, see G. Thyagara (1995), "Cooperative Research for Development Is COSTED's Aim," The Hindu On-line (<http://www.webpage.com/hindu/960113/22/0820a.html>). ICSU also works to assist scientists in developing countries through its scientific unions and interdisciplinary committees; for example, CODATA recently established the Task Group on Outreach, Education, and Communication, which promotes collaboration, scientific information exchange, and technology transfer for individual scientists and technologists in developing nations.

    70.  

    Founded in 1972, the International Foundation for Science (IFS) provides support (in the form of research grants, equipment, regional workshops and training courses, and travel grants) to young scientists in developing countries in the following research areas: aquatic resources, animal production, crop science, forestry/agroforestry, food science, and natural products. See <http://ifs.plants.ox.ac.uk/ifs/> for additional information about IFS activities and programs.

    71.  

    See <http://www.ciesin.org> for additional information about CIESIN's programs and services.

    72.  

    See <http://www.std.com/sabre/SAP/sap.info.html> for additional information on the Sabre Foundation.

    73.  

    See <http://www.isf.ru/index-isf.html> for additional information about the International Science Foundation and its various programs that assist scientists, such as its Library Assistance Program and the Telecommunications Program.

    74.  

    See <http://info.irex.org> for additional information about IREX programs.

    75.  

    See <http://vita.org>.

    76.  

    See American Association for the Advancement of Science, Science and Technology in the Americas: Perspectives on Pan American Collaboration (1994), 2nd ed., E. Jeffrey Stann, ed., AAAS, Washington, D.C.

    77.  

    See James M. Musser (1996), "Molecular Population Genetic Analysis of Emerged Bacterial Pathogens: Selected Insights," EID, 2(1), January-March.

    78.  

    See Biodiversity Prospecting: Using Genetic Resources for Sustainable Development (1993), Reid et al., World Resources Institute, Washington, D.C., 350 pp.; and "Bioprospecting/Biopiracy and Indigenous Peoples" (1994) RAFI Communique, Nov./Dec.

    79.  

    Personal communication from a member of the staff of the Brazilian Embassy, Washington, D.C., 1995.

    80.  

    See the recommendations in National Research Council (1995), Preserving Scientific Data, note 3.

    Page 107
    Suggested Citation:"3 Scientific Issues in the International Exchange of Data in the Natural Sciences." National Research Council. 1997. Bits of Power: Issues in Global Access to Scientific Data. Washington, DC: The National Academies Press. doi: 10.17226/5504.
    ×
    Page 108
    Suggested Citation:"3 Scientific Issues in the International Exchange of Data in the Natural Sciences." National Research Council. 1997. Bits of Power: Issues in Global Access to Scientific Data. Washington, DC: The National Academies Press. doi: 10.17226/5504.
    ×
    Page 109
    Bits of Power: Issues in Global Access to Scientific Data

    Since Galileo corresponded with Kepler, the community of scientists has become increasingly international. A DNA sequence is as significant to a researcher in Novosibirsk as it is to one in Pasadena. And with the advent of electronic communications technology, these experts can share information within minutes. What are the consequences when more bits of scientific data cross more national borders and do it more swiftly than ever before? Bits of Power assesses the state of international exchange of data in the natural sciences, identifying strengths, weaknesses, and challenges. The committee makes recommendations about access to scientific data derived from public funding. The volume examines:

    • Trends in the electronic transfer and management of scientific data.
    • Pressure toward commercialization of scientific data, including the economic aspects of government dissemination of the data.
    • The implications of proposed changes to intellectual property laws and the role of scientists in shaping legislative and legal solutions.
    • Improving access to scientific data by and from the developing world.

    Bits of Power explores how these issues have been addressed in the European Community and includes examples of successful data transfer activities in the natural sciences. The book will be of interest to scientists and scientific data managers, as well as intellectual property rights attorneys, legislators, government agencies, and international organizations concerned about the electronic flow of scientific data.
