Appendix B
Database Descriptions
Examples of Genome, Protein, Genetic Variant, and Toxicology Databases, as of the Beginning of 2000:
Protein, Nucleotide, 3D Structures, Genomes, Taxonomy, and PubMed Literature. The Entrez Browser at the National Center for Biotechnology Information of the National Library of Medicine and National Institutes of Health
http://www.ncbi.nlm.nih.gov/Entrez/
Established in 1988 as a national resource for molecular biology information, NCBI creates public databases, conducts research in computational biology, develops software tools for analyzing genome data, and disseminates biomedical information, all for the better understanding of molecular processes affecting human health and disease. The Entrez browser links databases of protein sequences, nucleotide sequences, 3D structures of proteins and nucleic acids, genomes, taxonomy, and the PubMed literature.
Biochemical Nomenclature
http://alpha.qmw.ac.uk/~ugca000/iupac/jcbn
The Biochemical Nomenclature Committee Web site has links to the International Union of Biochemistry and Molecular Biology and the International Union of Pure and Applied Chemistry.
Enzyme Nomenclature
The Enzyme Nomenclature Database. This site is a repository of information relative to the nomenclature of enzymes. It is primarily based on the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) and it describes each type of characterized enzyme for which an EC (Enzyme Commission) number has been provided.
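As an illustration of the EC numbers mentioned above: an EC number consists of four dot-separated fields (class, subclass, sub-subclass, and serial number), so EC 1.1.1.1 (alcohol dehydrogenase) falls in class 1, the oxidoreductases. The following is a minimal sketch in Python of how such a number might be parsed. The names `ECNumber`, `EC_CLASSES`, and `parse_ec_number` are hypothetical and are not part of the Enzyme Nomenclature Database or of any published software.

```python
import re
from typing import NamedTuple

# Top-level EC classes as they stood in 2000.
# (A seventh class, the translocases, was added in a later revision.)
EC_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}


class ECNumber(NamedTuple):
    """Hypothetical container for the four fields of an EC number."""
    enzyme_class: int
    subclass: int
    sub_subclass: int
    serial: int


# Accepts "1.1.1.1" or "EC 1.1.1.1"; all four fields must be present.
_EC_PATTERN = re.compile(r"^(?:EC\s*)?(\d+)\.(\d+)\.(\d+)\.(\d+)$")


def parse_ec_number(text: str) -> ECNumber:
    """Split a complete EC number into its four integer fields."""
    match = _EC_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a complete EC number: {text!r}")
    return ECNumber(*(int(field) for field in match.groups()))


if __name__ == "__main__":
    ec = parse_ec_number("EC 1.1.1.1")  # alcohol dehydrogenase
    print(ec, EC_CLASSES.get(ec.enzyme_class, "unknown class"))
```

The sketch rejects incomplete numbers such as "1.1.1" rather than guessing at the missing field, since only a complete four-field number identifies a single enzyme entry.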
Protein Sequences
http://www.expasy.ch/sprot/sprot-top.html
Home of the SWISS-PROT Annotated Protein Sequence Database. SWISS-PROT is a curated protein sequence database that strives to provide a high level of annotation (such as the description of the function of a protein, its domain structure, post-translational modifications, variants, etc.), a minimal level of redundancy, and a high level of integration with other databases.
Mendelian Inheritance in Man
http://www.ncbi.nlm.nih.gov/omim/
This database, developed by the National Center for Biotechnology Information (NCBI), is a catalogue of human genes and genetic disorders authored and edited by Dr. Victor A. McKusick and his colleagues at Johns Hopkins and elsewhere. It contains textual information, pictures, and reference information as well as links to NCBI’s Entrez database of MEDLINE articles and sequence information.
Nomenclature for Human Gene Mutations
http://www.interscience.wiley.com/jpages/1059-7794/nomenclature.html
This Internet site contains recommendations for a Nomenclature System for Human Gene Mutations.
Human DNA Polymorphisms
http://research.marshfieldclinic.org/genetics
This site contains a significant amount of information on human DNA polymorphisms and their analysis.
Phosphodiesterase (PDE) Gene Family
http://depts.washington.edu/pde/
Recent phosphodiesterase (PDE) gene family nomenclature recommendations. Includes updates for nearly all of the gene families and many of the subfamilies.
Human Gene Nomenclature
http://www.gene.ucl.ac.uk/nomenclature/
Web site for the Human Gene Nomenclature Committee. It includes a nomenclature database and guidelines. In addition, the site contains information on gene families and access to other relevant links.
The Mouse Genome
http://www.informatics.jax.org/support/nomen
The Mouse Genome Database (MGD) contains information on mouse genetic markers, molecular segments, phenotypes, comparative mapping data, experimental mapping data, and graphical displays for genetic, physical, and cytogenetic maps. MGD is updated daily.
Mouse Knockout Mutants
http://www.bioscience.org/knockout/knochome.htm
The Knockout Mouse Database at this Web address presents information on the phenotypes rendered by the knockout of various molecules. Gene knockouts are classified according to the viability of the mice: (1) gene knockouts that are compatible with viability; (2) gene knockouts that result in prenatal mortality; (3) gene knockouts that result in postnatal mortality; and (4) gene knockouts that result in perinatal mortality.
Drosophila Genome
http://flybase.bio.indiana.edu/
FlyBase is a comprehensive database of information on the genetics and molecular biology of Drosophila. It includes data from the Drosophila Genome Projects and data curated from the literature. FlyBase is a joint project with the Berkeley and European Drosophila Genome Projects.
Rat Genetics and Genes
http://ratmap.gen.gu.se/ratmap/wwwnomen/nomen.html
The International Rat Genetic Nomenclature Committee (RGNC) is dedicated to developing an internationally accepted standard genetic nomenclature for rats and to bringing this nomenclature to the attention of scientists working in the field of rat genetics. Its Web site provides information on rat gene symbols, DNA symbols, chromosome nomenclature, and a brief summary of rat locus symbol nomenclature rules.
Chicken Genetics and Genes
http://www.ri.bbsrc.ac.uk/chickmap/nomenclature.html
This Web site is the home of Chickmap, a chicken gene mapping project, maintained by the Roslin Institute. It includes information on nomenclature for naming loci, alleles, linkage groups, and chromosomes to be used in poultry genome publications and databases.
Zebrafish Development
http://zfishstix.cs.uoregon.edu/
Home of Fish Net, a gateway to Zebrafish Research Databases, provided by the Institute of Neuroscience at the University of Oregon. Its links include information on embryonic and larval anatomy, genomics, genetic staging, molecular probes, and ZFIN, the on-line database of zebrafish information.
Zebrafish Gene Nomenclature
http://zfish.uoregon.edu/zf_info/zfbook/chapt7/7.1.html
An excerpt from Chapter 7 (Genetic Methods) of The Zebrafish Book. It addresses conventions for naming zebrafish genes, including genes identified by mutation, as well as the use of abbreviated names and alleles. It also discusses priority in naming.
Nematode (C. elegans) Genes, Transcripts, and Proteins
http://www.sanger.ac.uk/Projects/C_elegans/blast_servers.html
Home of the Sanger Centre’s C. elegans BLAST server. This site allows the searching of a DNA database containing sequence data from both the Cambridge and St. Louis sequencing groups. You can also search the current database of C. elegans ESTs and the C. elegans protein database wormpep.
American Type Culture Collection (ATCC)
ATCC is a global nonprofit bioscience organization that provides biological products, technical services, and educational programs to private industry, government, and academic organizations around the world. The mission of the ATCC is to acquire, authenticate, preserve, develop, and distribute biological materials, information, technology, intellectual property, and standards for the advancement, validation, and application of scientific knowledge.
Plant Genome Nomenclature
The Mendel Bioinformatics Group at the John Innes Centre in the UK maintains this database of plant genome nomenclature.
Yeast (S. cerevisiae) Genome, Genes, and Proteins
http://genome-www.stanford.edu/saccharomyces/
Home of the Saccharomyces Genome Database (SGD), covering the molecular biology and genetics of the yeast Saccharomyces cerevisiae, commonly known as baker’s or budding yeast.
Human Cytochrome P450 Polymorphisms
http://www.imm.ki.se/CYPalleles/
Recommended nomenclature for the polymorphisms of human cytochrome P450 enzymes.
Signal Transduction
Sponsored by the journal Science and Stanford University, this site provides information on all signaling pathways, the variety of ligands, the protein intermediates, the variety of protein kinases, and the cross-talk of pathways.
Proteomics
http://expasy.nhri.org.tw/ch2d/
The Swiss database for identifying proteins by two-dimensional (2D) polyacrylamide gel electrophoresis. The site also offers information on techniques and related links.
Mouse Transgenic and Targeted Mutation Database
The TBASE database attempts to organize information on transgenic animals and targeted mutations generated and analyzed worldwide.
Expressed Sequence Tag Web Site
http://www.ncbi.nlm.nih.gov/dbEST/index.html
The expressed sequence tag database contains sequence data and other information on expressed sequence tags from a number of organisms.
TOXNET (Toxicology Data Network)
TOXNET is a collection of databases on toxicology, hazardous chemicals, and related areas. The databases include TOXLINE, which contains citations from 1965 to the present on the pharmacological, biochemical, physiological, and toxicological effects of drugs and other chemicals; the Developmental and Reproductive Toxicology (DART) Database; and the Hazardous Substances Data Bank (HSDB).
Cancer Genome Anatomy Project
http://www.ncbi.nlm.nih.gov/ncicgap/
The Cancer Genome Anatomy Project (CGAP), administered by the National Cancer Institute, was established to generate the information and technical tools needed to study the molecular anatomy of the cancer cell. Much of the information generated is presented in several databases maintained by the National Center for Biotechnology Information. These databases can be accessed through the CGAP Web site.
Environmental Genome Project
http://www.niehs.nih.gov/envgenom/
The goal of the Environmental Genome Project (EGP), administered by the National Institute of Environmental Health Sciences, is to understand the impact and interaction of environmental exposures on human disease. Specifically, polymorphisms of environmental disease susceptibility genes are being identified and a central database of polymorphisms for these genes is being developed. This database, developed in conjunction with the University of Utah Genome Center, integrates gene sequence and polymorphism data; see www.genome.utah.edu/genesnps/. The EGP Web site also offers links to databases maintained by other organizations, such as Gene Map 99 (a map of more than 30,000 human genes) and the Oak Ridge National Laboratory Human Chromosome Map.