
Toward Sequencing and Mapping of RNA Modifications: Proceedings of a Workshop–in Brief (2023)

Chapter: Toward Sequencing and Mapping of RNA Modifications: Proceedings of a Workshop - in Brief

Suggested Citation:"Toward Sequencing and Mapping of RNA Modifications: Proceedings of a Workshop - in Brief." National Academies of Sciences, Engineering, and Medicine. 2023. Toward Sequencing and Mapping of RNA Modifications: Proceedings of a Workshop–in Brief. Washington, DC: The National Academies Press. doi: 10.17226/27149.

Toward Sequencing and Mapping of RNA Modifications

Proceedings of a Workshop—in Brief


Cells have a set of intricately coordinated mechanisms that they use for regulation and homeostasis. One strategy cells use for regulation is modifying proteins, DNA, and RNA to control their structure, function, and stability. For years, research has focused on reversible modifications to proteins and DNA. However, RNA can also be highly modified, and more than 170 types of RNA modification have been identified so far. Current methods for mapping and sequencing RNA and its modifications are limited, partly because available sequencing technologies can detect only a small number of these modifications. This limits the understanding of different molecular processes and leaves a gap in knowledge related to human diseases and disorders.

To address these limitations and develop a roadmap for sequencing RNA with all of its modifications (known as the epitranscriptome), the National Academies of Sciences, Engineering, and Medicine convened an ad hoc committee to provide a consensus report and recommendations on whether and how best to map and sequence the epitranscriptome. A workshop held on March 14–15, 2023, was one part of an information-gathering effort by the committee and is summarized in these proceedings. This Proceedings of a Workshop—in Brief provides the rapporteurs’ high-level summary of the topics addressed at the workshop. It should not be viewed as consensus conclusions or recommendations of the National Academies or the study committee.

Throughout the workshop, several key themes were discussed and expanded upon over the course of the talks, panels, and breakout sessions. These key themes include:

  • Improving methods for identification of modified bases during different types of RNA sequencing. This includes better detection, standards, algorithms, and compatible datatypes.
  • The importance of computation and modeling in all aspects of RNA modification determination.
  • Understanding the specifics of any large-scale project before one is initiated. This includes goals, metrics of success, governing bodies, and potential impact for different stakeholders.
  • The ability to translate results into medical needs such as diagnostics and treatment.

FRAMING THE WORKSHOP

The workshop opened with remarks from committee co-chair Brenda Bass (University of Utah School of Medicine). Bass posed the question: Is there a scientific need to map and sequence the epitranscriptome? During the workshop, many speakers and panelists answered this question by explaining the importance of such an effort. Bass also posed a series of questions that were echoed and addressed by workshop speakers. What is the state of current experimental and computational methods for sequencing and mapping RNA modifications? What do current RNA databases provide, and how will they need to be organized and curated to meet the needs of an RNome project?1 What are the challenges to applying the knowledge gained for scientific, clinical, and public health needs? What are the policy and workforce challenges?

Following Bass, Fred Tyson of the National Institute of Environmental Health Sciences (NIEHS) provided highlights from a multiday NIH workshop that took place in 2022 entitled Capturing RNA Sequence and Transcript Diversity, From Technology Innovation to Clinical Application.2 The NIH workshop explored many of the same challenges, rationales, benefits, and needs for comprehensively characterizing the human transcriptome and epitranscriptome.

Following this, Kristin Koutmou (University of Michigan) expanded on the questions raised by Bass by providing four “grand challenges” for the RNA modification field:

  • Establishing the chemical diversity of modifications in both noncoding RNAs and protein-coding mRNAs.
  • Mapping where those modifications reside and determining how they are added enzymatically to RNA.
  • Quantifying how much of each modification is present on a transcript or in the transcriptome.
  • Determining the function of individual sites so that there is an understanding of how RNA modifications contribute to biology in health and disease.

Koutmou pointed out that in some cases, recent deep sequencing technologies have revealed more than 10,000 RNA modification sites in both noncoding and protein-coding RNA molecules, comprising at least 150 types of nucleoside modifications. These modifications have the potential to impact every step in the post-transcriptional RNA life cycle. She also noted that RNA modifications can redistribute in response to perturbations that lead to stress in cells. Koutmou connected these dynamic RNA modifications to disease, noting that perturbations in noncoding RNAs and redistribution of mRNA modifications are linked to a wide variety of diseases and negative outcomes for human health, including neurological diseases, cancers, mitochondrial disorders, and vascular diseases.

TECHNOLOGIES FOR MAPPING AND SEQUENCING RNA MODIFICATIONS

A better understanding of RNA modifications may benefit from a workflow of collecting samples, identifying de novo and known modifications in a transcript, and direct sequencing, followed by storage, curation, and analysis of the sequencing data. While many workshop participants suggested nanopore technologies will eventually be the best way to sequence all RNAs, it was noted that mass spectrometry (MS), as well as many indirect sequencing methods, is excellent for providing foundational datasets on shorter and more abundant RNAs, like tRNAs. These data can be used to enhance the nanopore algorithms by improving the reliability of nanopore sequencing and by providing sequencing information that is currently not accessible to nanopore.

Direct Sequencing Technologies

Nanopore Sequencing

Nanopore technology utilizes a nanopore embedded in a synthetic, electrically resistant membrane. The applied current that is passed through the membrane is disrupted when a single molecule passes through its pore. Sequencing is accomplished when differences in chemical structure cause different shifts in the ionic current. The current shifts are recorded and a basecalling algorithm converts them into a sequence. The conversion of current information to sequence information, a process referred to as “basecalling,” requires a basecalling model, which can be trained using synthetic and/or biological sequencing data.

__________________

1 The RNome refers to all RNA species in a cell at a given time (Arora, 2018).

2 https://www.niehs.nih.gov/news/events/pastmtg/2022/rnaworkshop2022/index.cfm.

Eva Novoa (Centre for Genomic Regulation) noted that nanopore sequencing has numerous advantages over earlier RNA sequencing methods. Nanopore sequencing has the potential to detect all modifications using the same series of steps, so it is theoretically not necessary to create modification-specific protocols. This, together with the fact that no PCR is required, eliminates two sources of bias. Using some of the specific methodologies that Novoa outlined, one can determine the percentage of bases that have been modified, as well as isoform-specific RNA modification information, poly-A tail length, and RNA modification co-dependencies.
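As a rough illustration of what "the percentage of bases that have been modified" means in practice, the sketch below computes a per-site modification fraction from per-read calls. It is a minimal example under stated assumptions, not a method presented at the workshop: the function name `modification_fraction`, the input layout of `(position, is_modified)` pairs, and the sample values are all hypothetical.

```python
from collections import defaultdict


def modification_fraction(read_calls):
    """Return the fraction of reads called as modified at each position.

    read_calls is an iterable of (position, is_modified) pairs, one pair
    per read that covers the position (hypothetical input layout).
    """
    covered = defaultdict(int)   # number of reads covering each position
    modified = defaultdict(int)  # number of those reads called as modified
    for position, is_modified in read_calls:
        covered[position] += 1
        if is_modified:
            modified[position] += 1
    return {pos: modified[pos] / covered[pos] for pos in covered}


# Illustrative calls: four reads cover position 101, two cover position 250.
calls = [
    (101, True), (101, False), (101, True), (101, True),
    (250, False), (250, False),
]
fractions = modification_fraction(calls)
print(fractions)
```

Because nanopore reads are individual native molecules, a per-read tally of this kind could in principle be grouped by isoform as well, which is one way to think about the isoform-specific information mentioned above.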

Nonetheless, several challenges need to be overcome to realize the full potential of nanopore sequencing of RNA modifications, Novoa said. She focused on one: the lack of a basecaller for RNA modifications. Without basecalling algorithms trained on a wide range of RNA modifications, nanopore sequencing generates a significant number of false positives. Novoa said what is missing are tools that accurately basecall RNA modifications de novo, which she calls a modification-aware basecalling model: one trained to detect modified bases along with the four canonical bases. Novoa and her colleagues have created a model that recognizes m6A.3 While the results were promising for de novo detection of m6A, training data for other modifications are very limited. The lack of data for other modifications makes the construction of models for nanopore very challenging, she said.
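The distinction between a canonical-only basecaller and a modification-aware one can be pictured with a deliberately simplified toy. The sketch below assigns each observed current level to the nearest reference level. This nearest-level rule is a stand-in chosen for illustration, not how production basecallers work (they use trained neural networks and the signal depends on several neighboring bases at once), and every current value and name here is a made-up assumption.

```python
# Hypothetical mean current levels in arbitrary units. Real nanopore
# signals depend on sequence context, so these numbers are illustrative only.
CANONICAL_LEVELS = {"A": 80.0, "C": 95.0, "G": 110.0, "U": 125.0}

# A "modification-aware" model adds a class for a modified base (here m6A).
MOD_AWARE_LEVELS = {**CANONICAL_LEVELS, "m6A": 87.0}


def call_base(level, model):
    """Return the label whose reference level is closest to the observation."""
    return min(model, key=lambda base: abs(model[base] - level))


def basecall(levels, model):
    """Call one label per observed current level."""
    return [call_base(level, model) for level in levels]


signal = [81.0, 86.5, 109.0, 124.0]
canonical_calls = basecall(signal, CANONICAL_LEVELS)
aware_calls = basecall(signal, MOD_AWARE_LEVELS)
print(canonical_calls)
print(aware_calls)
```

In this toy, the second observation is forced into a canonical class by the first model simply because no modified class exists, while the second model can report it. The same logic suggests why a model can only recognize the modifications represented in its training data.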

Another limitation of nanopore sequencing technology, according to Shuo Huang (Nanjing University), is the lack of sufficient spatial resolution to resolve each individual base in a sequence. The current output signals are complicated, requiring a lot of bioinformatics processing to deconvolute and resolve the sequence. When nucleosides are modified, Huang noted, this exacerbates the problem by further complicating the signal output. To address this problem, Huang and colleagues engineered a novel type of nanopore that uses some of the same principles as those from Oxford Nanopore Technologies (ONT), but is fundamentally different and still in early-stage research. Applying his novel approach to improve the read resolution of RNA nucleoside monophosphates, Huang and colleagues created a hetero-octamer of the MspA porin and inserted a reactive site in the pore using 3-(maleimide) phenylboronic acid. The phenylboronic acid reacts reversibly with the ribose ring of RNA nucleoside monophosphates. They showed their engineered nanopore could differentiate among adenine, guanine, cytosine, and uridine, as well as pseudouridine, inosine, dihydrouridine, and multiple methylated bases with an accuracy of ~99.6%.4 The next step, he said, is to develop ways to use this technology to completely sequence RNA with all its modifications.

Marcus Stoiber (ONT) discussed the epi-basecalling algorithm, Remora, that he and his colleagues are developing at ONT. He emphasized, as have others, that the choice of data types used for training basecalling models is extremely important for the utility of the algorithm. He uses five data types: synthetic oligonucleotides, synthetic randomers, native biological sequences, biologically derived sequences that have been “doped” with modifications, and biologically derived sequences with specific nucleosides that have been enzymatically converted to a modified base. Stoiber noted that ONT is working on basecalling model training and on new nanopores and motor proteins that result in faster reads, use less starting material, and have read accuracies of modified RNA sequences around 95%. He noted, though, that much work remains in developing basecallers to accurately and routinely detect modified RNA bases in any sequence read and under a variety of conditions.

Mass Spectrometry Sequencing

The “gold standard” for quantifying modifications in RNA is to inject digested RNA samples into a mass spectrometer, permitting identification of known modifications, discovery of new modifications, and quantification of the modifications relative to the total sample, Koutmou said. One downside to this approach is that information about where the modification occurs in the sequence, i.e., its sequence context, is lost. RNA can be directly sequenced in a mass spectrometer, though it is much more difficult than sequencing, for example, proteins by MS. Currently, the best that can be done reliably is sequencing purified short RNAs, like tRNA molecules, though some efforts are underway to sequence semi-complex mixtures of short RNA species.

__________________

3 Novoa manuscript in preparation.

4 Wang et al. (2022) Nature Nanotechnology 17: 976–983.

Benjamin Garcia (Washington University in St. Louis) pointed out that MS can theoretically detect any modification in an RNA molecule. Although this can be done while sequencing RNA, there are some practical challenges to overcome that make the sequencing of large transcripts difficult. RNA is difficult to separate by liquid chromatography, a technique that is often coupled to MS, and RNA is relatively unstable. Additionally, there is a dearth of computational resources in the MS field devoted to detecting RNA modifications.

Garcia presented research and methodologies for overcoming some of these challenges. To address the challenge of liquid chromatography separations, Garcia and his colleagues developed a permethylation strategy that attaches deuterated iodomethane to oligo- or mononucleosides. This change increased the hydrophobicity of the nucleosides so that they spread out during liquid chromatography, which improved MS quantification and led to the detection of 60 to 80 modifications for most cell types.5 In addition, Garcia’s lab developed protocols to separate RNA with polygraphite carbon chromatography using electrochemical elution, which enhanced the ability to identify many more RNA modifications within the sequences. To address the paucity of computational tools for analyzing RNA shotgun sequencing experiments, Garcia and colleagues have developed software that matches MS/MS oligonucleotide spectra with a database.6

Mass spectrometry and nanopore technologies are somewhat overlapping methods, but largely complementary. Several speakers and participants emphasized that these two orthogonal approaches can be used to support one another. For example, when asked about developing basecalling training sets for a larger number of RNA modifications, Novoa responded that the training sets need to be validated by mass spectrometry to confirm the stoichiometry of modifications in a sample of oligo standards.

INVESTIGATING MODIFICATIONS BY PROBING RNA STRUCTURE

Many speakers noted that there have been significant successes in determining the structures of RNA molecules using X-ray crystallography, cryogenic electron microscopy (cryo EM), and nuclear magnetic resonance (NMR). Each of these techniques has advantages and disadvantages, and the workshop explored how these and other techniques can be used for structural determination.

Cryogenic Electron Microscopy

The development of cryogenic electron microscopy (cryo EM) has revolutionized the ability to determine structures of macromolecules and large macromolecular complexes, such as ribosomes. At resolutions below 2.0 angstroms, it is possible to locate RNA modifications from the cryo EM density maps. Jeffrey Kieft (University of Colorado School of Medicine) and, separately, Ada Yonath (Weizmann Institute) determined the structure of the Giardia lamblia 80S ribosome using cryo EM. The 0.25-angstrom difference in the resolution of their structures enabled Kieft to assess at what resolution modifications can be accurately detected from cryo EM density maps. Additionally, Kieft described some of his work with the group I intron RNA of Tetrahymena thermophila. Using this solved structure as an example, he noted that cryo EM maps typically have higher resolution in the core than on the exterior of the molecule, because exterior regions exhibit greater dynamic conformational changes and mobility. This makes RNA chemical modifications in the core much more accessible to structural determination. Kieft’s group also noted the difficulty in getting high-resolution structures of smaller RNAs; to solve this problem, they have been appending the T. thermophila structure to smaller RNA molecules to obtain higher-resolution cryo EM images.

__________________

5 Xie et al. (2022) Analytical Chemistry; doi 10.1021/acs.analchem.2c00471.

6 Wein et al. (2020) Nature Communications 11:926; doi 10.1038/s41467-020-14665-7.


Danica Fujimori (University of California, San Francisco) described the software tool called quantifying posttranscriptional modifications (qPTxM), developed by Iris Young (Lawrence Berkeley National Laboratory), that automates the interrogation of a cryo EM map to detect RNA modifications based on modification geometry. Automated detection narrows the search space of possible modifications, making it simpler to manually inspect the RNA structure at the identified sites to determine whether they are in fact modified.

At the end of his talk, Kieft noted that while cryo EM can be used to detect modifications, the technique does not currently allow that to be done routinely and may be limited to molecules that are larger and well folded. For cryo EM to be a tool that contributes to mapping the epitranscriptome, Kieft posed several questions: What is the gold standard for resolution? How good do the maps need to be before the data are placed into a database? What do different modifications look like in a cryo EM density map? What orthogonal techniques will be best to validate modifications identified from cryo EM?

Artificial Intelligence

Raphael Townshend (Atomic AI) pointed out that many traditional techniques for structural determination can take a long time and have a high cost. Thus, Townshend said, the ability to quickly and accurately predict RNA structures has emerged as an important need. Artificial intelligence (AI) predictors, like AlphaFold, have been successful in the realm of proteins, largely because there are hundreds of thousands of experimentally determined protein structures and much protein evolution data available to train models. The same is not true for RNA because there are only hundreds to a few thousand experimentally determined RNA structures. Townshend and his colleagues have instead developed a deep neural network called ARES, or the Atomic Rotationally Equivariant Scorer, which was trained on 18 RNA structures. Analysis of the trained network revealed that ARES was able to find tertiary motifs such as T loops, base triplets, and triple helices, and that it recovered key RNA features such as the amount of hydrogen bonding per base and the percentage of Watson-Crick base pairing, despite never being trained on them. Townshend noted that though this geometric deep learning approach has excellent predictive power, there remains a need for additional experimental data to improve the training of ARES and similar AI methods for three-dimensional RNA structure prediction.

Related to the importance of additional data, a question arose about the milieu in which structures are determined. Townshend said that it is important to acknowledge that crystal structures may not fully represent the in vivo structure of an RNA. He extended that by saying that being able to model the structure across different cell contexts, such as in the presence of proteins, within macromolecular complexes, and so on is a key need. He noted this is a major active area of research at Atomic AI, and in some cases these complexities are being built into their algorithms.

Mary McMahon (ReviR Therapeutics) discussed how her company uses computational biology and AI to (1) identify small molecules that modulate RNA splicing and regulate posttranscriptional gene expression and (2) identify RNA drug targets. By identifying the target site for a genetically defined disease, AI is used to build an RNA-focused compound library. Their ultimate goal is to identify and predict potential functional sites within the transcriptome that could be targeted with drugs. When a hit is obtained, the dynamics of the small-molecule binding site on the RNA can be probed, as can the binding parameters and some of its in vivo properties within a cell system. In summary, McMahon said that their computational methods and machine learning algorithms are advancing their drug discovery workflow by expanding the chemical space that can be explored.

Computational Investigation of RNA-Protein Interactions

Reversible interactions between RNA and proteins play essential roles in cellular functions and regulation, and these interactions can be affected by RNA modifications, said Phanourios Tamamis (Texas A&M University). Thus, it is important to understand the dynamic interplay among modified RNA and reader, writer, and eraser proteins. Tamamis’ group, in collaboration with the Contreras group, brings together computational and experimental methods, respectively, to probe these interactions. One example looked at the spectrum of RNA modifications that could be present that would enhance binding between the RNA and a particular protein. To answer this question, Tamamis and colleagues developed a computational tool that uses short molecular dynamics simulations of an RNA-protein complex and energy calculations to search trees of RNA modifications, selecting modifications that prove energetically favorable while discarding those that are not, and subsequently investigating the selected modifications using longer simulations. The output is a set of RNA modifications that are predicted to enhance interactions between the RNA and the protein, along with an ensemble of three-dimensional structures of the modified RNA nucleotides complexed with the protein.7 Tamamis noted that their computational approach requires information on molecular mechanics parameters for RNA modifications as well as the structure of an RNA-protein complex. When answering a question on the use of computational tools to assist with mapping RNA modifications, Tamamis suggested that results from his approach could potentially lead to the ability to engineer proteins that act as modification sensors by detecting the presence of specific RNA modifications.
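The select-and-discard search described above can be sketched, very loosely, as a tree search with pruning. The code below is only a schematic under strong assumptions and is not the published tool: a lookup table of made-up energy changes (`TOY_DELTA`) stands in for the short molecular dynamics simulations and energy calculations, energies are assumed to be additive across positions, and all names and values are hypothetical.

```python
# Made-up energy changes (kcal/mol) standing in for short simulations.
# Negative values mean the modification is predicted to strengthen binding.
TOY_DELTA = {
    (0, "m6A"): -1.2, (0, "I"): 0.8,
    (1, "Psi"): -0.5, (1, "m5C"): 0.3,
}

# Candidate modifications to try at each RNA position (hypothetical).
CANDIDATES = {0: ["m6A", "I"], 1: ["Psi", "m5C"]}


def toy_energy_change(position, modification):
    """Placeholder for a short-simulation energy estimate."""
    return TOY_DELTA[(position, modification)]


def search_modifications(candidates, energy_change):
    """Grow modification sets position by position, pruning unfavorable ones.

    Each tree node is (assignment, energy), where assignment is a tuple of
    (position, modification) pairs. Returns surviving modified assignments
    sorted from most to least favorable.
    """
    frontier = [((), 0.0)]  # root node: unmodified RNA
    for position in sorted(candidates):
        next_frontier = []
        for assignment, energy in frontier:
            # Leaving this position unmodified is always allowed.
            next_frontier.append((assignment, energy))
            for modification in candidates[position]:
                delta = energy_change(position, modification)
                if delta < 0:  # keep only energetically favorable branches
                    extended = assignment + ((position, modification),)
                    next_frontier.append((extended, energy + delta))
        frontier = next_frontier
    hits = [node for node in frontier if node[0]]
    return sorted(hits, key=lambda node: node[1])


results = search_modifications(CANDIDATES, toy_energy_change)
for assignment, energy in results:
    print(assignment, round(energy, 2))
```

In the workflow summarized above, the surviving candidates would then be examined with longer simulations; that step is omitted here.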

OLIGONUCLEOTIDE STANDARDS AND OTHER QUALITY CONTROL CONCERNS

Mark Lowenthal (National Institute of Standards and Technology (NIST)) provided the definitions of standards and reference materials as they are used in a metrology center, such as NIST. The three types of standards relevant for RNA research are quantitative, identity, and data standards. A quantitative standard is a specific measure with a precisely known mass fraction or molar concentration, including an expressed uncertainty and a description of its degree of purity and its impurities, if any. In the context of reference materials, an identity standard is a material whose composition is known to a high degree of certainty, though a roster of its impurities may not be well defined. Data standards are frequently libraries of data that are searched or used for comparisons with experimental data, such as NIST’s mass spectral library. Lowenthal said that reference standards should be homogeneous, stable, commutable, and traceable. But the key to developing reliable standards is understanding the standard’s application and ensuring it is fit for its purpose.

Lowenthal’s group and others at NIST are trying to understand what a reference material in the RNA world looks like. He also outlined the hurdles to establishing an RNA reference standard. The first step is defining the measure you are interested in and characterizing it sufficiently based on its fit for purpose, determining its stability, and identifying its impurities. Other factors of importance are the cost of producing standards and the time it takes to have them ready to go to market.

Chanfeng Zhao (TriLink Biotechnologies) outlined the three methods of synthesizing RNA oligonucleotides: solid-phase chemistry, liquid-phase chemistry, and non-template-based enzymatic synthesis, noting that solid-phase phosphoramidite chemical synthesis remains the method of choice for the community. Important to obtaining high yields of full-length oligos is the synthesis efficiency of each addition cycle. The efficiency depends on, among other things, the quality of the reagents, the technical prowess of the synthesizer machines, the ability to minimize side reactions, the efficiency of purification, and the quality of available modified phosphoramidite nucleotides. Longer oligos are more challenging to work with at every step of this process, and at this time, 100mers up to 150mers are the limit of RNA oligonucleotide synthesis. Zhao believes that future developments of chemical and enzymatic synthesis methods could push the upper bound of RNA oligo lengths from 100mers to 1000mers.
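Why per-cycle efficiency matters so much for long oligos can be seen with a short calculation. The sketch below uses the common approximation that the full-length fraction of an n-mer is the per-cycle coupling efficiency raised to the number of couplings (n - 1). The 99% efficiency is an assumed illustrative value, not a figure from the workshop, and the function name is hypothetical.

```python
def full_length_yield(length, coupling_efficiency):
    """Approximate fraction of full-length product for an oligo of `length`.

    Assumes every addition cycle succeeds independently with the same
    probability, so the yield is efficiency ** (number of couplings).
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return coupling_efficiency ** (length - 1)


# Assumed per-cycle efficiency of 99% (illustrative only).
for n in (20, 100, 150, 1000):
    print(f"{n}-mer: {full_length_yield(n, 0.99):.2e}")
```

Under this assumption a 100mer is obtained at roughly 37% full-length yield, and the yield falls off exponentially with length, which is consistent with the point that reaching 1000mers would require advances in synthesis methods rather than incremental tuning.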

Jia Sheng (State University of New York at Albany) discussed the synthesis of a variety of RNA oligonucleotides containing modifications, including single- and dimethylcytidines at the N4 position. To create these modified RNA oligos, Sheng and his colleagues synthesized both m4C and m42C phosphoramidites to use in solid-phase oligonucleotide synthesis. Their approach of combining synthesis of modified oligos with structural analysis may continue to be a fruitful approach for understanding how RNA modifications impact RNA structure, and hence RNA function. Sheng noted that one of the challenges in their approach is the scarcity of modified RNA mononucleotides.

__________________

7 Orr et al. (2018). Methods, 143, 34–47. doi:10.1016/j.ymeth.2018.01.01.

Gene Yeo (University of California, San Diego) discussed the logistics of data standards, particularly in the case of producing RNA-protein interactome maps. Their five-lab consortium developed standardization procedures that addressed validating reagents; creating consistent biological, experimental, and data processing pipelines; developing consistent data quality standards; and resolving problems like batch effects. Based on the Encyclopedia of DNA Elements consortium effort to standardize a chromatin immunoprecipitation schema for profiling RNA binding protein-RNA targets,8 Yeo and colleagues standardized an enhanced crosslinking and immunoprecipitation (eCLIP) schema that included a standardized eCLIP protocol, standards for analysis, and threshold criteria for datasets to be considered a reliable resource to use with eCLIP. Numerous types of experiments performed by the different labs using shared reagents meant implementing extensive quality control, meticulous recordkeeping, and various types of standardization and validation processes. Each step of their pipeline was formed out of discussions among consortium members, working groups, and the larger community.

In closing, Yeo emphasized the importance of creating rigorous standards for both experimental and computational workflows. The standards that their consortium has developed are being adopted by other academic labs and by the biotechnology and pharmaceutical industries. In addition, these users and the broader research community provide welcome feedback that allows further refinement of the standards.

UNDERSTANDING THE SCALE NEEDED FOR A PROGRAM ON RNA MODIFICATIONS

Robert Cook-Deegan (Arizona State University) started the discussion on large-scale projects by providing a brief history of the Human Genome Project (HGP).

Cook-Deegan shared several questions raised about the project: Was the technology up to the task? At such an enormous cost, was it worth spending the money on a human genome reference sequence? Prior to the start of the HGP, large-scale biology research projects were not common in the US, and, as Cook-Deegan said, many people thought it would be better to fund a thousand smaller grants and “let a 1000 flowers bloom,” leading eventually to a reference sequence.

In 1988, two reports were published that helped the scientific community and the federal government coalesce around supporting a human genome project. The U.S. Office of Technology Assessment (OTA) produced Mapping Our Genes, Genome Projects: How Big, How Fast?, and the National Academies published Mapping and Sequencing the Human Genome.9 The OTA report was requested by Congress, which wanted guidance on its policy options for funding genome projects, the rationale for such projects, coordination of such efforts, and their impacts on the bioeconomy. The National Academies report served as an initial roadmap for the federal government to support, fund, evaluate, and maintain the many components of the HGP. The National Academies proposed that a single federal agency be the lead agency and work closely with a scientific advisory board. They proposed that an additional $200 million in research funding be made available annually for 15 years. The National Academies committee also coalesced around the ideas of sequencing the genomes of model organisms along with the human genome, building a genetic linkage map along with a reference genome, advancing sequencing and other molecular biology technologies, and creating databases and the supporting computational infrastructure to house all the generated data.

In 1996, a meeting was held in Bermuda to discuss policies for how the awarded labs were going to coordinate and collaborate on this project. One result of this meeting was what is now known as the Bermuda Principles, which set out rules for the rapid and public release of the sequencing data the labs were going to be generating. The rules required that sequencing data be shared at the end of each day. Cook-Deegan said that this radical departure from the practice of hoarding one’s data until after publication expanded the ethos of open and transparent science. It also enhanced the degree of accountability that was needed among the labs to ensure quality control of the generated sequences, and it facilitated incremental improvements in the project’s technology.

__________________

8 ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74; and Landt SG, Marinov GK, Kundaje A, Kheradpour P, Pauli F, Batzoglou S, Bernstein BE, Bickel P, Brown JB, Cayting P, et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Res. 2012;22:1813–31.

9 U.S. Congress, Office of Technology Assessment, Mapping Our Genes-The Genome Projects. How Big, How Fast? OTA-BA-373 (Washington, DC: U.S. Government Printing Office, April 1988). National Research Council. 1988. Mapping and Sequencing the Human Genome. Washington, DC: The National Academies Press. https://doi.org/10.17226/1097.

The HGP created a path for genomics to be ubiquitous in basic and clinical biology, the biotechnology ecosystem, and all different life sciences research sectors, Cook-Deegan said. However, as numerous speakers pointed out, the human genome is essentially a single static entity, while the transcriptome and the epitranscriptome are diverse and dynamic. Thus, an epitranscriptome project would start with a much broader set of goals than obtaining a sole reference sequence.

To continue the conversation about large-scale research efforts, Mark Helm (Johannes Gutenberg University) discussed the RMaP (RNA modification and processing) project, an example of collaborative research among a number of research centers in Germany. As with the consortium described by Yeo, the researchers working in RMaP invested time and resources in technology and quality control to support their research. Technology development included mass spectrometry, data management and data science, and sequencing methods, especially nanopore sequencing. Regarding databases, the German funding agency, DFG, requires that data be placed in open repositories available to all.

In a later workshop session on “Major Concerns and Possible Pitfalls” of the field, Todd Lowe (University of California Santa Cruz) noted that any proposed project would require a large effort, and since it is not possible to do everything, sample selection decisions will be critical. Lowe praised RMaP for its practice of producing data and then pausing to evaluate the results to determine what gaps to fill and what priorities to shift. He also suggested integration of sequencing and computational technologies. Lowe said that the goal should be the ability to predict function based on having annotated the entire transcriptome. Accomplishing this would likely involve mapping multiple organisms and looking at the evolution of both modifications and their regulation.

Wendy Gilbert (Yale University) cautioned against focusing too much on cataloging RNA modifications without sufficiently rigorous functional studies, a point that was raised by several other speakers. Functional studies are important because evidence from studies of RNA-modifying enzymes has shown that they are critical for human health, and thus the RNAs targeted by the enzymes are also likely critical in health and disease. Gilbert also emphasized that because the research community knows so little about which modified RNAs are critical for health and disease, it is vital that any large-scale project look at all transcripts, whatever their expression levels, since expression level does not necessarily correlate with biological significance. Schraga Schwartz (Weizmann Institute) posed a similar question, asking how abundant a modification must be to be included in a mapping project.

Rachel Green (Johns Hopkins University School of Medicine) took a more expansive view of studying the transcriptome, wanting to know more than just its modifications. She noted, for example, a need to understand what happens in the untranslated regions and to identify all the splicing isoforms. In her remarks, she also noted that it is important to evaluate which modifications were most likely to play an important role on different RNAs. She and others also raised concerns about the cost of an RNome project and how it might divert funds from individual lab funding.

Schwartz said that an RNomics project is rich with opportunities to build community, reach agreement in the field, and establish a set of standards that would bolster both the computational and experimental sides of research on RNA and RNA modifications. Schwartz raised the question, “When we’re talking about the epitranscriptome, what are we specifically talking about?” He noted that posing the question has practical implications for a large-scale project because how the epitranscriptome is defined will affect what to prioritize, what approaches to take, and how success will be measured. He punctuated that point by saying, “every modification is a world of its own and has a set of techniques of its own.”

RNA IN TRANSLATIONAL SCIENCE AND ITS IMPACTS ON SOCIETY

One of the driving forces for mapping and sequencing RNA modifications is to expand the understanding of how the modifications are connected to health and disease. To what extent and by what avenues can this increase in knowledge translate into medicines, therapeutics, and diagnostic tools, and ultimately improve quality of life for all people?

Eckhard Jankowsky (Moderna) discussed RNA modifications in mRNA medicines. Jankowsky noted that one of the hurdles in using exogenous mRNA as a drug was a need to suppress the resulting host immune response. Researchers reported that modifying mRNA reduced the innate immune response triggered through toll-like receptor (TLR) signaling.10 Subsequent work showed that mRNA containing N1-methylpseudouridine suppressed the immune response better than mRNA containing pseudouridine and, furthermore, showed enhanced protein expression.11

Jankowsky posed and answered the question of how advancing our understanding of RNA modifications could best aid in designing the next generation of mRNA medicines. It would be most helpful, he said, if insights gained from mapping, sequencing, and revealing the functional impacts of RNA modifications were able to provide detailed RNA engineering specifications that could be used in pharmaceutical development to create improved mRNA medicines.

__________________

10 Kariko et al. 2005 Cell 23:163-175.

11 Andries et al. 2015 Journal of Controlled Release 217:337-344.

Kathy F. Liu (University of Pennsylvania Perelman School of Medicine) talked about RNA modifications in tRNA, rRNA, and mRNA and the associated writers and erasers implicated in human disease. For example, the methyltransferase TRMT10A modifies tRNA via its canonical function and modulates mRNA by interacting with the mRNA demethylase FTO. TRMT10A is abundant in the brain and pancreas. A nonsense mutation in TRMT10A abolishes TRMT10A protein production, leading to alterations in the stability of more than ten tRNAs and possibly altering mRNA modification patterns, resulting in microcephaly, mild intellectual disability, and young-onset diabetes.12 Liu asked the workshop participants to consider whether the research community observes and studies the changes in canonical substrate RNA modifications when it focuses on the reader and eraser proteins. When looking at modified mRNA, she said, there is a need to improve both mapping at single-nucleotide resolution and the ability to quantify the percentage of methylation at a given site, with the aim of measuring the thresholds that trigger a disease phenotype. Liu said it is time to look at how the patterns of tRNA, rRNA, and mRNA modifications are coordinated and how they collectively affect translation parameters. Finally, Liu asked if the power of different RNA modifications could be harnessed in a combinatorial fashion to treat disease.

Dylan Simon (EveryLife Foundation for Rare Diseases) and Philip Yeske (United Mitochondrial Disease Foundation, UMDF) suggested turning knowledge about the roles of RNA modifications in health and disease into diagnostics, medicines, treatments, and cures, especially for people with rare diseases.13 Simon noted that ~10,000 rare diseases, 70% to 80% of which have a genetic basis, affect ~30 million people in the US, but only about 500 of these diseases have an FDA-approved treatment. Both Simon and Yeske said that patients with rare diseases are looking for cures, and they are also often looking for treatments that will manage their symptoms and improve their quality of life, even when it means accepting a certain amount of risk. Both speakers highlighted the importance and value of patient-focused drug development, in which people with these rare diseases talk with researchers, policy makers, funders, and regulators so that the community understands what treatments patients want and what symptoms they want to alleviate.14 This knowledge can focus basic and translational research efforts. Yeske emphasized that success in developing therapeutics for rare diseases will occur only in an environment of regulatory flexibility, such as accelerated FDA approvals of relevant drugs. Both Yeske and Simon said that patient advocacy groups can play an important role in funding research. UMDF, for example, has awarded more than $15 million in seed research grants, designed to generate sufficient early-stage data that can be leveraged into applications for funding from the NIH, National Science Foundation, or Department of Defense.

EXAMPLES OF IMPORTANT TAKEAWAYS FROM BREAKOUT SESSIONS

Breakout groups were organized by topical area to discuss specific parts of the committee’s statement of task. In a group that was exploring different goals for mapping and sequencing RNA modifications, the group members discussed several near-term goals, including obtaining new tools like nucleases to cut RNA at defined sites, engaging with core facilities to map and sequence epitranscriptomes, and getting more researchers working on this project.

Another group, covering topics related to education and workforce, discussed expanding and focusing undergraduate training, such as by integrating computational biology into undergraduate biology majors, to increase the number of researchers in the pipeline.

The group discussing databases, standards, and infrastructure first noted that the field may benefit from completing many small projects in order to identify the breadth of what may be needed in databases. They also discussed having multiple databases available for areas such as transcriptomics, mass spectrometry, and RNA structure, and how these could be integrated with one another. Additionally, the group members emphasized the importance of accessibility for biologists working on a broad array of subjects. This group also discussed who might be responsible for such an effort around databases, noting that there are several options, including the National Center for Biotechnology Information. The group noted the importance of standards in the field and discussed how entities such as funding bodies or journals could enforce good practices, while groups such as NIST, industry, or an international body like the International Union of Pure and Applied Chemistry could assist with standard setting.

__________________

12 Zung et al. 2015. Am J Med Genet Part A 167A:3167–3173; doi: 10.1002/ajmg.a.37341.

13 A rare disorder is a disease or condition that affects fewer than 200,000 Americans. [National Institute on Deafness and Other Communication Disorders; https://www.nidcd.nih.gov/directory/national-organization-rare-disorders-nord (accessed 4-26-2023)].

14 Voice of the Patient Report 2019 https://www.umdf.org/voice-of-the-patient/.

One group explored potential metrics of success, including, for example, standard experimental and computational workflows that have been adopted by industry partners; low-cost sequencing and mapping technology that reliably works on any particular modification with an established accuracy and bias; an operational framework for integrating technologies for combinatorial modification detection; a framework for sharing data; and established protocols and quality control procedures that ensure reproducibility among users. Several group members also noted that success can be measured by an external review board, during annual reviews by program managers, and through internal reviews by project participants.

DISCLAIMER This Proceedings of a Workshop—in Brief was prepared by Steven Moss and Michael Zierler as a factual summary of what occurred at the workshop. The statements made are those of the rapporteurs or individual workshop participants and do not necessarily represent the views of all workshop participants; the planning committee; or the National Academies of Sciences, Engineering, and Medicine.

WORKSHOP PLANNING COMMITTEE1 Brenda L. Bass, University of Utah (Co-Chair); Taekjip Ha, Johns Hopkins University (Co-Chair); Nicholas M. Adams, Thermo Fisher Scientific; Juan D. Alfonzo, Ohio State University; Jeffrey Baker, NIIMBL; Susan Baserga, Yale University; Lydia M. Contreras, University of Texas at Austin; Markus Hafner, NIH; Sarath C. Janga, Indiana University Purdue University at Indianapolis; Patrick A. Limbach, University of Cincinnati; Julius B. Lucks, Northwestern University; Mary A. Majumder, Baylor College of Medicine; Nicole M. Martinez, Stanford University; Kate D. Meyer, Duke University; Keith Robert Nykamp, Invitae Corporation; and Tao Pan, University of Chicago.

REVIEWERS To ensure that it meets institutional standards for quality and objectivity, this Proceedings of a Workshop—in Brief was reviewed by Brenda L. Bass, University of Utah; David Garcia, University of Oregon; and Jin Billy Li, Stanford University. Lauren Everett, National Academies of Sciences, Engineering, and Medicine, served as the review coordinator.

SPONSORS This workshop was supported by The National Institutes of Health and the Warren Alpert Foundation.

For additional information regarding the workshop, visit https://www.nationalacademies.org/our-work/toward-sequencing-and-mapping-of-rna-modifications.

SUGGESTED CITATION National Academies of Sciences, Engineering, and Medicine. 2023. Toward Sequencing and Mapping of RNA Modifications: Proceedings of a Workshop—in Brief. Washington, DC: The National Academies Press. https://doi.org/10.17226/27149.

__________________

1This workshop was planned as one part of information-gathering efforts for the consensus study on Mapping and Sequencing Chemical Modifications of RNA. Nicholas Adams has a conflict of interest in relation to his service on the consensus committee because he is an employee of Thermo Fisher Scientific. The National Academies has determined that the experience and expertise of Dr. Adams is needed for the consensus committee to accomplish the task for which it has been established. For additional information, see https://www.nationalacademies.org/our-work/toward-sequencing-and-mapping-of-rna-modifications.

Division on Earth and Life Studies

Copyright 2023 by the National Academy of Sciences. All rights reserved.
