In response to a request from the National Institutes of Health (NIH), the Food and Drug Administration (FDA), the U.S. Department of Energy (DOE), and the National Science Foundation (NSF), the National Research Council convened a committee to assess the importance and impact of glycoscience, explore the landscape of current research, and identify the challenges that will need to be addressed to enable the field to move forward. The committee was charged to “articulate a unified vision for the field on glycoscience and glycomics” and to “develop a roadmap with concrete research goals to significantly advance [the field]” (see Statement of Task, Box 1-5). The committee’s consensus findings, conclusions, and recommendations in addressing this charge are summarized below.
Glycans are one of the four fundamental classes of macromolecules that make up living systems, along with nucleic acids, proteins, and lipids, and consist of individual sugar units linked to one another in a multitude of ways. Understanding the structures and functions of glycans is central to understanding biology. Photosynthesis, one of the most common reactions on the planet, uses energy from sunlight to ultimately combine carbon dioxide and water into polymers of sugars such as starch, glycogen, or cellulose. These glycans supply energy through our metabolic pathways, provide structural support in such materials as wood, and serve as energy sources for other animals.
BOX S-1 Carbohydrate, Glycan, Saccharide, or Sugar?
Carbohydrate: A generic term used interchangeably in this report with sugar, saccharide, or glycan. This term includes monosaccharides, oligosaccharides, and polysaccharides as well as derivatives of these compounds.
Glycan: A generic term for any sugar or assembly of sugars, in free form or attached to another molecule.
Saccharide: A generic term for any carbohydrate or assembly of carbohydrates, in free form or attached to another molecule.
Sugar: A generic term often used to refer to any carbohydrate, but most frequently to low molecular weight carbohydrates that are sweet in taste.
Glycans (see Box S-1) are ubiquitous. All living cells are coated on their cell membranes with glycans or include glycan polymers as integral components of their cell walls. Glycans play diverse roles, including critical functions in the areas of cell signaling, molecular recognition, immunity, and inflammation. They are the cell surface molecules that define the ABO blood groups, influencing an individual’s ability to receive another’s blood. Glycans are attached to specific locations on many proteins, modulating aspects of their biological activity through molecular recognition or affecting their circulation time in blood. The difference between the glycan molecules added by humans when they naturally produce the protein erythropoietin, which affects red blood cell production, and the glycan molecules present when this protein drug is produced commercially in cell culture serves as the basis for antidoping tests in athletes. Glycans are also central components of plant cell walls, which enable plants to grow upright and to resist degradation from the environment and from microbes.
Advances in the life sciences over the past several decades have led to a greater understanding of many of the basic mechanisms present in biological systems. Stimulated by the Human Genome Project, there have been improvements in understanding the central dogma of molecular biology. Sequences of DNA (genes) are transcribed into RNA, which in turn is translated to form proteins. This basic understanding, along with advances in the tools used to study biology, underpins the expansion of both genomics and proteomics. The wide array of posttranslational modifications that occur on proteins is also part of this increasingly clear picture. Protein glycosylation, one of the most common forms of posttranslational modification, is important for many biological processes and often serves as an analog switch that is capable of carefully modulating protein activity.
Relatively little attention has been paid to glycans, however, and glycoscience remains an understudied field. It is hard to predict what advances in glycoscience will bring as the contributions from the life sciences and chemical sciences to numerous areas of applied science continue to expand. This report provides an overview of the current knowledge and state of glycoscience and illustrates why glycoscience is central to multiple avenues of research. An expanding understanding of glycan functions and structures will complement and strengthen other areas of research, building on advances made in such fields as genomics, proteomics, chemical synthesis, materials science, and engineering. Understanding glycans and applying this knowledge can help find problem-driven solutions to a diverse set of challenges. Examples include the early detection of cancer and other diseases through identification of disease biomarkers, protection against infectious diseases such as influenza through increased understanding of the role of glycans in host-pathogen interactions and the immune response, and creation of products and fuels derived from carbohydrate raw materials.
Much of the fundamental biology and chemistry being explored in glycoscience has the ability to influence what are often viewed as disparate fields. Researchers in health, energy, and materials science can leverage discoveries in each other’s disciplines to help strengthen the field as a whole. For example, efforts to understand the biochemical pathways of glycans and the roles of carbohydrate polymers inside cells are of use to scientists working to better understand cancer biology and plant biology alike. The conversion of biomass into novel starting materials can have implications both for materials scientists working to develop new plastics based on renewable resources and for synthetic chemists working to synthesize novel drug targets. This report provides a holistic vision for glycoscience by suggesting a research roadmap for the scientific community that, while undoubtedly challenging, may ultimately help democratize the field and help realize the broad benefits from this important area. This roadmap will make the tools needed to address glycoscience questions available to scientists and engineers who wish to incorporate them into their research. To address the roadmap goals, glycoscience will require input from researchers not currently working in this field, and glycoscientists will need to reach out to bring these researchers into their community.
While genomics and proteomics have advanced rapidly, glycoscience and glycomics have made strides that are enabling scientists to better understand the role that glycans play in biological systems. Glycoscience researchers have already developed a fundamental knowledge base that can be utilized to help address many of today’s major research problems. This knowledge base, when combined with the current set of tools available to probe glycan structure and function, is a powerful resource to better understand human, plant, and microbial biology.
Glycoscience has, until recently, been explored by a small group of experts, working with a more limited set of information and resources than are available in fields such as genomics and proteomics. What is known about glycoscience and glycomics, the study of the complete set of glycans in an organism, is still incomplete. But current knowledge now makes it possible to integrate glycoscience broadly into the fields of health, energy, and materials science, and the set of available tools, while not perfect, provides a base to enable further development and discovery.
A CENTRAL FIELD WITH LINKS TO MANY DISCIPLINES
Glycoscience is a highly interdisciplinary field that aims to better understand the structures and functions of glycans and how they can be used. It is a global field with a dedicated community of researchers in the United States and abroad. Glycoscientists do not share a single training or educational background. They come from various fields, including physiology and developmental biology, where glycans are involved in processes such as cell movement and tissue development. They are in medicine, where glycans are involved in the development and progression of chronic and infectious diseases. In microbiology, glycans are key players in interactions among and between microbes and host cells. Glycoscientists are chemists developing new synthetic and analytical methods for glycans, and biochemists working to understand glycan synthesis and metabolism. In materials science, glycans can be used as polymeric materials having a wide range of properties. In computational science and informatics, modeling studies and the effective analysis of large amounts of experimental data are also necessary to better understand glycans.
CONTRIBUTIONS TO IMPROVING HEALTH, DEVELOPING ALTERNATIVE FORMS OF ENERGY, AND CREATING NEW MATERIALS
This report focuses on three areas in which glycoscience can make significant contributions: health, energy, and materials science. The committee identified these three areas because they illustrate the diverse roles played by glycans and because glycoscience is relevant to researchers from a range of backgrounds. These focus areas demonstrate how improved understanding of glycans can make concrete impacts in society, particularly as part of the development of a bio-enabled innovation economy, as recently articulated by both the Organisation for Economic Co-operation and Development and the White House. This report does not address the roles of carbohydrates as food sources and nutritional supplements. Although these are also important areas to be explored, they are outside the scope of this study and outside the expertise of the study committee.
In human health, glycans are involved in myriad processes that are part of normal physiology, development, and cell signaling, along with the development of both chronic and infectious diseases. For example, glycans on cell surfaces are important in molecular recognition: they help direct the movement of white blood cells through the body to a site of infection, enabling the immune system to respond where needed. Much of the information content in cells is encompassed in the glycome. Glycans contain key biological information that complements the information stored in DNA to help complete the link between genotype and phenotype or between the genome and expressed traits. Many advances in understanding human health and diseases are the result of current knowledge about nucleic acids, proteins, and glycans and how these vary in different circumstances and in different people. However, much is still unknown. Continued advances in understanding the biological roles played by glycans, along with the factors that influence or alter their functions, will have consequences for the fundamental understanding of biology and will contribute to the development of new therapeutic medicines.
Carbohydrates are fundamental to plant biology. Constituents of plant cell walls include glycans such as cellulose and hemicellulose combined in a matrix of other biopolymers. As society explores sources of energy that can provide alternatives to fossil fuels, harnessing the energy stored in these plant carbohydrates is one attractive option. Effectively converting plant glycans into liquid biofuels requires breaking down the structures of plant cell walls in order to release the constituent carbohydrate molecules for subsequent processing. Advances in understanding the glycans that make up the cell wall, the enzymes that help assemble and degrade it, and how it can be altered to improve the degradation process can all make significant contributions to improving the feasibility of this energy source.
Glycans such as cellulose, starch, chitin, and others also provide the basis for creating new materials with useful physical and chemical properties. Such materials can take the form of bulk polymers or be processed into forms such as nanoparticles. In addition, other molecules can be attached to the glycans to alter the functional properties of the material or to affect how the polymer interacts in biological systems. These glycan-based materials provide potential substitutes for many petroleum-based plastics and have wide-ranging uses in medicine and industrial applications. For example, they can serve as carriers to encapsulate and deliver drugs and as scaffolds for tissue engineering, and they can be used in flexible coatings and films.
A TOOLKIT THAT INCLUDES MANY COMPONENTS BUT THAT ALSO HAS KEY GAPS
Because glycans are made of different types of individual sugar units linked in multiple ways, large numbers of different glycan structures can be created from the same constituent carbohydrate molecules. Unlike DNA and proteins, glycans are not created by following a sequence template but rather through enzymatic reactions that depend on several factors, including the concentrations of many different enzymes and many different substrates. The diversity of possible glycan structures makes them scientifically interesting. The large number of structures and the various ways in which glycans interact with other biological molecules create diversity beyond what can be encoded in an organism’s genome alone. However, these characteristics also pose challenges to probing glycan structure and function and to being able to control and manipulate them in research.
The explosion in genetic research and understanding of gene functions that has occurred over the past 25 years was enabled by the development of new tools, such as high-throughput DNA sequencers and synthesizers. Tools to study DNA are now part of the repertoire of many biologists and chemists. Glycoscience, too, relies on a toolkit of techniques that enable key questions to be explored and answered. Although much can be accomplished by using existing tools, large gaps remain in such areas as the chemical and enzymatic synthesis of glycans and analytical techniques to determine glycan structures and functions. Glycoscience also lacks accessible, integrated, and well-annotated databases similar to those that exist for proteins and nucleic acids. New tools and techniques will be needed to enable glycoscience to live up to its potential to contribute to areas in health, energy, and materials. Creating these new tools and techniques will require engaging scientists and engineers from multiple disciplines who can bring new ideas and solutions to the field to help fill these identified gaps.
Glycoscience is a broad field, and the committee’s findings capture only an overview of the information the committee considered in making its recommendations and developing a roadmap for the field. The findings are organized under the topics of human health, energy, materials science, and the toolkit needed to advance the field.
Human health:
- Glycans are directly involved in the pathophysiology of every major disease.
- Additional knowledge from glycoscience will be needed to realize the goals of personalized medicine and to take advantage of the substantial investments in human genome and proteome research and its impact on human health.
- Glycans are increasingly important in pharmaceutical development.
Energy:
- Plant cell walls, made mostly of glycans, represent the planet’s dominant source of biological carbon sequestration, or biomass, and are a potentially sustainable and economical source of non-petroleum-based energy.
- Understanding cell wall structure and biosynthesis and overcoming the recalcitrance of plant cell walls to conversion into feedstocks that can be transformed into liquid fuels and other energy sources will be important to achieving a sustainable energy revolution. Glycoscience research will be necessary to advance this area.
- Glycoscience can contribute significantly to bioenergy development by advancing the understanding of how to increase biomass production per hectare and how to increase the yield of fermentable sugar per ton of biomass.
Materials science:
- By fostering a greater understanding of the properties of glycans and of plant cell wall construction and deconstruction, glycoscience can play an important role in the development of non-petroleum-based sustainable new materials.
- Glycan-based materials have wide-ranging uses in such areas as fine chemicals and feedstocks, polymeric materials, and nanomaterials.
- There are many pathways to create a variety of functionalities on a glycan, creating a wide range of options for tailoring material properties.
Based on the above, the committee makes the following findings regarding the toolkit needed to advance glycoscience:
- Scientists and engineers need access to a broad array of chemically well-defined glycans.
- Over the past 30 years, tremendous advances have been made in chemical and enzymatic synthesis of glycans, but these methods remain relegated to specialized laboratories capable of producing only small quantities of a given glycan. For glycoscience to advance, significant further progress in glycan synthesis is needed to create widely applicable methodologies that generate both large and small quantities of any glycan on demand.
- A suite of widely applicable tools, analogous to those available for studying nucleic acids and proteins, is needed to detect, describe, and fully purify glycans from natural sources and then to characterize their chemical composition and structure.
- Continued advances in molecular modeling, verified by advanced chemical analysis and solution characterization tools, can generate insights for understanding glycan structures and properties.
- An expanded toolbox of enzymes and enzyme inhibitors for manipulating glycans would drive progress in many areas of glycoscience.
- A centralized accessible database linked to other molecular databases is needed to fully realize advancements in knowledge generated by an expanded effort in glycoscience. Glycan information is not currently accessible to the research community in an integrated and centralized manner similar to other biological information.
A ROADMAP TO ADVANCE GLYCOSCIENCE
Based on these findings, the committee makes the following recommendations in order to achieve a more complete understanding of glycoscience and to realize its impacts on health, energy, and materials science. Each recommendation is followed by a series of roadmap goals.
The capabilities created by the achievement of these recommendations will ensure that all interested researchers can efficiently and effectively incorporate glycoscience into their work.
1. The committee recommends that the development of transformative methods for the facile synthesis of carbohydrates and glycoconjugates be a high priority for NIH, NSF, DOE, and other relevant stakeholders.
Within 7 years, have the synthetic tools to synthesize all known carbohydrates up to octasaccharides in size, including substituents (e.g., acetyl, sulfate groups). This goal encompasses human glycoprotein and glycolipid glycans and proteoglycans, which are currently estimated to be 10,000 to 20,000 structures, along with plant and microbial glycans and polymers.
Within 10 years, have the synthetic tools to synthesize uniform batches, in milligram quantities, of all linear and branched glycans. These batches will enable glycan arrays for identifying protein binding epitopes, provide standards for analytical methods development, and support improved polysaccharide materials engineering and systematic studies in all fields. This goal includes methods for synthesis of structures with isotopic enrichment of specific desired atoms that may be needed for a wide variety of studies.
Within 15 years, be able to synthesize any glycoconjugate or carbohydrate in milligram to gram quantities using routine procedures. Community access should be available through a web ordering system with rapid delivery.
2. The committee recommends that the development of transformative tools for detection, imaging, separation, and high-resolution structure determination of carbohydrate structures and complex mixtures be a high priority for NIH, NSF, DOE, FDA, and other relevant stakeholders.
Over the next 5 to 10 years, develop the technology to purify, identify, and determine the structures of all the important glycoproteins, glycolipids, and polysaccharides in any biological sample. For glycoproteins, determine the significant glycans present at each glycosylation site. Develop agreed-upon criteria for what constitutes the acceptable level of structural detail and purity.
Within 10 years, have the ability to undertake high-throughput sequencing of all N- and O-linked glycans from a single type of cell in a single week.
Within 10 years, have the ability to routinely determine the complete carbohydrate structure of any glycan or polymer repeat sequence including branching, anomeric linkages between glycans, and substituents.
Within 15 years, have the ability to determine glycoforms (a complete description of molecular species within a population that have the same polypeptide sequence) of any glycoprotein in a biological sample.
For example, one specific achievable step could be to apply the tools developed in the roadmap to characterize the set of glycomes in blood, including those of blood cells and plasma.
3. The committee recommends that the development of transformative capabilities for perturbing carbohydrate and glycoconjugate structure, recognition, metabolism, and biosynthesis be a high priority for NIH, NSF, DOE, and other relevant stakeholders.
Within 5 years, identify the genes involved in glycan and glycoconjugate metabolism in any organism whose genome has been sequenced, and identify the activities of at least 1,000 enzymes that may have utility as synthetic and research tools.
Within 10 years, be able to use all glyco-metabolic enzymes (e.g., glycosyltransferases, glycosidases), as well as other state-of-the-art tools for perturbing and modifying glyco-metabolic pathways (knockouts, siRNAs, etc.), that are of utility to the biomedical and plant research communities.
Within 10 years, develop methods for creating specific inhibitors to any human, plant, or microbial glycosyltransferase suitable for in vitro and in vivo studies in order to perturb the biology mediated by these enzymes.
Within 15 years, develop imaging methods for studying glycan structure, localization, and metabolism in both living and non-living systems.
4. The committee recommends that robust, validated informatics tools be developed in order to enable accurate carbohydrate and glycoconjugate structural prediction, computational modeling, and data mining. This capability will broaden access to glycoscience data for the entire scientific community.
Within 5 years, develop an open-source software package that can automatically annotate an entire glycan profile (such as from a mass spectrometry experiment) with minimal user interaction.
Within 5 years, develop the technology to perform computer simulations of carbohydrate interactions with other entities such as proteins and nucleic acids.
Within 10 years, develop the software to simulate a cellular system to predict the effects of perturbations in glycosylation of particular glycoconjugates and polysaccharides.
5. The committee recommends that a long-term-funded, stable, integrated, centralized database, including mammalian, plant, and microbial carbohydrates and glycoconjugates, be established as a collaborative effort by all stakeholders. The carbohydrate structural database needs to be fully cross-referenced with databases that provide complementary biological information (e.g., PDB and GenBank). Furthermore, there should be a requirement for deposition of new structures into the database using a reporting standard for minimal information.
Within 5 years, develop a long-term-funded, centralized glycan structure database with each entry highly annotated using standards adopted by the community and all the world’s repositories of glycan structures. The database should be cross-referenced and open source so that the community can develop database resources that draw on it and improve its utility to investigators who wish to incorporate glycoscience in their work.
Within 5 years, employ an active curation system to automatically validate glycan structures deposited into a database so that journals can provide authors with an easily accessible interface for submitting new glycan structures to the database.
To achieve the roadmap goals articulated in its recommendations, the committee notes that it will be of critical importance for the field to reach agreement on the standards of evidence and the nature of the assumptions that will be used to annotate and validate glycan structures within the next 2 to 3 years.
Finally, the committee notes that there is widespread lack of understanding and appreciation of glycoscience in the scientific and medical communities and among the general public. Glycans are integral components of living organisms, whether human, animal, plant, or microbe, and glycan products have applications in health, energy, and materials science.
The committee concludes that integrating glycoscience into relevant disciplines in high school, undergraduate, and graduate education, and developing curricula and standardized testing for science competency would increase public as well as professional awareness.
Within 5 years, integrate glycoscience as a significant part of the science curriculum, in both lecture materials and hands-on experiments or activities.
Within 10 years, integrate and teach glycoscience at every level wherever it is relevant to understanding the scientific content. Competency in glycoscience could also be included in all standardized testing wherever relevant (for example, as part of the SAT and GRE Subject Tests, the MCAT, and Medical Board Exams).
Glycoscience is a vibrant field filled with challenging problems. It can make contributions toward understanding and improving human health, creating next-generation fuels and materials, and contributing to economic innovation and development. Now is the time for glycoscience to be embraced broadly by the research community. Drawing in members from the full spectrum of chemistry, biology, materials science, engineering, medicine, and other disciplines will be needed to address the technical challenges described here. Although these challenges are substantial and complex, the results of achieving these goals have the potential to impact science in exciting ways.