Currently Skimming:

2 Potential Value of a Digital Mathematics Library
Pages 28-54

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 28...
... There is a largely unexplored network of information embedded in the connections of mathematical objects, and formalizing this network -- making it easy to see, manipulate, and explore -- holds the potential to vastly accelerate and expand current mathematical research. This network would consist of information from traditional resources, such as research papers published in journals, and content dispersed in other Internet-based resources and databases.
From page 29...
... It also shows how useful it would be to be able to pull much of this information into a unified source and make additional connections to other, lesser known resources and aspects of the literature. The DML could aggregate and make available collections of ontologies, links, and other information created and maintained by human contributors and by curators and specialized machine agents with significant editorial input from the mathematical community.
From page 30...
... If a similar search is done via Google Scholar, a list of research articles and books on the subject appears, ordered by "popularity," which usually reflects some version of page ranking. While some of the references provided by Google Scholar can be viewed, including some books on Google Books, others are behind paywalls or are books that must be purchased before reading.
From page 31...
... The committee identified a number of basic desired library capabilities, including aggregation and documentation of information, annotation, search and discovery, navigation, and visualization and analytics. Properly implemented across the domain of mathematics research literature, these capabilities and resulting enhanced functionalities would not only facilitate better and more efficient search and discovery, but also allow mathematicians to interact with the research literature in new ways and at new levels of granularity.
From page 32...
... Aggregation and Documentation
Mathematicians want to be able to make searchable and sharable collections or lists of various kinds of mathematical objects easily, including bibliographies of the mathematical literature, perhaps with annotations. This is an area where it should be very easy to make rapid progress.
From page 33...
... 6  RDF is a standard model for data interchange on the Web and facilitates data merging even in the case of differing underlying schemas. See W3C Semantic Web, "Resource Description Framework (RDF)
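The data-merging property described in this footnote can be illustrated with a minimal sketch. It models each RDF statement as a plain (subject, predicate, object) tuple and assumes there are no blank nodes, in which case merging two graphs reduces to set union. All identifiers (`paper:123`, `ex:provesTheorem`, and so on) and the helper `describe` are hypothetical and chosen only for illustration; a real system would use full IRIs and an RDF library.

```python
# Each statement is a (subject, predicate, object) triple.
# All identifiers here are hypothetical examples, not real DML data.
source_a = {
    ("paper:123", "dc:title", "On continued fractions"),
    ("paper:123", "dc:creator", "author:jones"),
}
source_b = {
    # A second source using a different vocabulary for the same paper.
    ("paper:123", "ex:provesTheorem", "thm:42"),
    ("author:jones", "foaf:name", "A. Jones"),
}

# Assuming no blank nodes, merging two graphs is simply set union:
# neither source needs to know the other's schema in advance.
merged = source_a | source_b


def describe(graph, subject):
    """Return every (predicate, object) pair recorded for a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


for predicate, obj in describe(merged, "paper:123"):
    print(predicate, "->", obj)
```

Because both sources refer to the paper by the same identifier, the merged graph answers questions that neither source could answer alone.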
From page 34...
... The markup languages used by automatic theorem provers could also be useful because they are sufficiently flexible to encode many important theorems, but they might not do enough to encourage reuse of terms. The theorem and lemma repository would benefit from being accessible to programs via an application programming interface, which is a protocol used to allow software components to easily communicate with each other and may include specifications for routines, data structures, object classes, and/or variables.
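As a rough sketch of what programmatic access to a theorem and lemma repository might look like, the following defines a small in-memory stand-in. The class `TheoremRepository`, its methods, the record fields, and the identifiers are all assumptions made for illustration; the report does not specify any particular interface.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class Theorem:
    identifier: str          # hypothetical repository-wide identifier
    name: str
    statement: str           # statement encoded as LaTeX
    tags: list = field(default_factory=list)


class TheoremRepository:
    """Hypothetical in-memory stand-in for a repository API."""

    def __init__(self):
        self._items = {}

    def add(self, theorem):
        self._items[theorem.identifier] = theorem

    def get(self, identifier):
        """Return one record as JSON, as a web API might."""
        return json.dumps(asdict(self._items[identifier]))

    def search(self, tag):
        """Return identifiers of all records carrying the given tag."""
        return sorted(t.identifier for t in self._items.values() if tag in t.tags)


repo = TheoremRepository()
repo.add(Theorem("thm:pythagoras", "Pythagorean theorem",
                 r"a^2 + b^2 = c^2", ["geometry"]))
repo.add(Theorem("thm:fermat-little", "Fermat's little theorem",
                 r"a^p \equiv a \pmod{p}", ["number theory"]))

print(repo.search("geometry"))
print(repo.get("thm:fermat-little"))
```

Returning JSON from `get` mirrors how separate software components could exchange records without sharing internal data structures.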
From page 35...
... Links to discussions and comments on research papers and theorems could be a way to expand research discussions to a new level. Senior mathematicians could provide some general background information to research papers, such as a basic prerequisite for understanding the paper and some suggested readings; this would assist students and people starting out in a new direction.
From page 36...
... 14  Andrew W. Mellon Foundation, Bamboo DiRT, http://dirt.projectbamboo.org, accessed January 16, 2014.
From page 37...
... But the committee sees first steps toward realizing such capabilities in the innovative work of Wolfram|Alpha in the restricted domain of continued fractions.15 Wolfram|Alpha prototyped and built a technological infrastructure for collecting, tagging, storing, and searching mathematical knowledge of continued fractions and presents it through a Wolfram|Alpha-like natural language interface. The main types of knowledge provided in this work are theorems, mathematical identities, definitions and concepts, algorithms, visualizations and interactive demonstrations, and references.
From page 38...
... The committee sees enormous potential for developments in this area by some concerted research effort involving a team of people with complementary expertise in machine learning, natural language processing, human-computer interaction, and mathematical knowledge representation.
From page 39...
... FIGURE 2-1  An example of complex mathematical typography.
From page 40...
... For any Lie group $G$, we have simplicial manifolds $NG$, $N\bar{G}$ and simplicial $G$-bundle $\gamma : N\bar{G} \to NG$ as follows: $\overbrace{\;\cdots\;}^{q\text{-times}}$ $NG(q)$
From page 41...
... . Graphical representations of chemical substances are automatically converted into unique InChI labels, which can be created independently of any organization, built into any chemical structure drawing program, and created from any existing collection of chemical structures (Heller et al., 2013)
From page 42...
... While this system is currently under revision due to the widespread availability of genomic information, it is still a useful example of how ontologies can play an important role in establishing structures that promote the ease of comparing and searching. As compared to some scholarly domains, mathematics is fortunate to have a significant de facto standard ontology around which its research literature is organized.
From page 43...
... In the context of linked data and the Semantic Web, this database also can be used (and has been used by MathSciNet) to generate such things as collaboration graphs, which are useful (for example)
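The collaboration graphs mentioned in this excerpt can be sketched as follows: authors are nodes, two authors are joined when they share a paper, and a breadth-first search gives the collaboration distance between them. The author lists below are invented placeholders, and the function names are hypothetical; this is a minimal illustration, not how MathSciNet computes its graphs.

```python
from collections import defaultdict, deque
from itertools import combinations

# Hypothetical author lists, one per paper.
papers = [["A", "B"], ["B", "C"], ["C", "D", "E"], ["F"]]


def build_collaboration_graph(papers):
    """Map each author to the set of their co-authors."""
    graph = defaultdict(set)
    for authors in papers:
        for author in authors:
            graph[author]  # make sure single-author papers add a node
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def collaboration_distance(graph, start, goal):
    """Length of the shortest co-authorship chain, or None if unconnected."""
    if start not in graph or goal not in graph:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


graph = build_collaboration_graph(papers)
print(collaboration_distance(graph, "A", "E"))
```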
From page 44...
... In this sense, the LaTeX code for a formula would seem to fall short as a direct template for a putative international mathematical formula identifier (as discussed 18  "TeX," Wikipedia, last modified January 7, 2014, http://en.wikipedia.org/wiki/TeX. 19  LaTeX -- A document preparation system, last revised January 10, 2010, http://www.
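One way to see why raw LaTeX code falls short as a formula identifier is that the same formula can be typed in several ways. The sketch below hashes three LaTeX spellings of one half; the function `naive_formula_id` is a hypothetical strawman, not a proposed identifier scheme.

```python
import hashlib


def naive_formula_id(latex):
    """Strawman identifier: a truncated hash of the raw LaTeX source."""
    return hashlib.sha256(latex.encode("utf-8")).hexdigest()[:12]


# Three LaTeX spellings that all typeset the fraction one half.
spellings = [r"\frac{1}{2}", r"{1 \over 2}", r"\frac12"]

ids = {naive_formula_id(s) for s in spellings}
print(len(ids))  # number of distinct identifiers for one formula
```

Each spelling receives a different identifier even though a reader sees a single formula, so an identifier built this way would need some canonical form first.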
From page 45...
... This LaTeX metadata search, while not quite a LaTeX formula search, is fairly successful in dealing with dynamic notation and terminology change in the literature of special functions. An option for semantic representation of mathematical formulas can be provided by MathML,21 which allows for mathematics to be described for machine-to-machine communication and is formatted so that it can easily be displayed in webpages.
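As a small illustration of MathML, the sketch below uses Python's standard `xml.etree.ElementTree` module to build two encodings of x squared: Presentation MathML, which describes layout, and Content MathML, which describes meaning. The variable names are arbitrary, and the example is a sketch rather than a recommended DML encoding.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Presentation MathML: "x with superscript 2" (how it looks).
presentation = ET.Element("math", {"xmlns": MATHML_NS})
msup = ET.SubElement(presentation, "msup")
ET.SubElement(msup, "mi").text = "x"   # identifier
ET.SubElement(msup, "mn").text = "2"   # number
presentation_markup = ET.tostring(presentation, encoding="unicode")

# Content MathML: "apply power to x and 2" (what it means).
content = ET.Element("math", {"xmlns": MATHML_NS})
apply_el = ET.SubElement(content, "apply")
ET.SubElement(apply_el, "power")
ET.SubElement(apply_el, "ci").text = "x"
ET.SubElement(apply_el, "cn").text = "2"
content_markup = ET.tostring(content, encoding="unicode")

print(presentation_markup)
print(content_markup)
```

The content form is the one better suited to machine-to-machine communication, because a superscript in the presentation form could denote a power, an index, or something else.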
From page 46...
... is not yet possible, significant benefit can be realized by utilizing existing scalable methods and algorithms to assist human agents in identifying important mathematical concepts contained in the research literature -- even while fully automated recognition remains something to aspire to.
Navigation
Mathematicians want the ability to navigate and explore the corpus of mathematical documents available to them, be it through institutional library services or through free services.
From page 47...
... But with modern browser extension capabilities, such as those provided by Scholarometer,28 which harvests data from Google Scholar, it is straightforward to write a dedicated browser extension for mathematical search 27  Elsevier B.V., Scopus, http://www.scopus.com/home.url, accessed January 16, 2014. 28  Indiana University, Scholarometer, http://scholarometer.indiana.edu/, accessed January 16, 2014.
From page 48...
... ,29 and can be obtained from open search systems such as Lucene30 or ElasticSearch.31 Because today's widespread availability of all kinds of data is increasing attention on the need for better visualization tools, the committee anticipates that greatly improved open-source tools for graphical displays will become widely available and easily deployable to demonstrate interesting and novel features of the graphical relations in bibliographic 29  "Scalable Vector Graphics," Wikipedia, http://en.wikipedia.org/wiki/Scalable_Vector_Graphics, accessed January 16, 2014. 30  Apache Software Foundation, "Welcome to Apache Lucene," http://lucene.apache.org/, accessed January 16, 2014.
From page 49...
... (This type of linking 32  Collaboration graphs are already attractively viewable on Microsoft Academic Search with the proprietary Microsoft Silverlight software. 33  Altmetrics, "Altmetrics: A Manifesto," v 1.01, September 28, 2011, http://altmetrics.
From page 50...
... Such incremental improvements may not be very interesting from the perspective of machine learning research, but they are potentially useful in production applications of machine learning algorithms that the DML could provide.
The Mathematical Concept Graph
Mathematical research can also be aided by considering mathematical objects other than papers, through exploration of their connections in a directed graph.
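A directed graph of mathematical objects can be sketched with a plain adjacency mapping. In the example below an edge points from a concept to a concept its definition relies on; the choice of concepts, the edge meaning, and the function `prerequisites` are assumptions made for illustration only.

```python
# Hypothetical concept graph: each concept maps to the concepts it depends on.
depends_on = {
    "Lie group": ["group", "smooth manifold"],
    "smooth manifold": ["topological space"],
    "group": ["set"],
    "topological space": ["set"],
    "set": [],
}


def prerequisites(graph, concept):
    """Collect every concept reachable by following dependency edges."""
    found = set()
    stack = list(graph.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(graph.get(current, []))
    return found


print(sorted(prerequisites(depends_on, "Lie group")))
```

Traversals like this are one way such a graph could support exploration, for example by listing what a reader needs before approaching a given object.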
From page 51...
... Visualization and Analytics
One way to help mathematicians learn from the large, complex, and rapidly growing and evolving literature base is to employ tools that are being developed to analyze data in a wide variety of settings, including both visualization tools and other analytical and statistical approaches. These tools could exploit the natural graphical structure of co-authorship and citation graphs and the relations among various kinds of mathematical objects and the parts of the literature that discuss these objects (as described in the previous section)
From page 52...
... Computational Capabilities
The committee wishes to promote cooperation between the DML and computational service providers to give users functionality such as cutting a formula out of a mathematical document and pasting it into a computing environment. This can already be done to some extent for simple formulas by cutting, massaging, and pasting a formula into Wolfram|Alpha, which uses natural language processing methods to match natural language queries with more formal knowledge representations.
From page 53...
... These could serve as a framework for research programs to explore promising technologies and services, including extraction and identification of mathematical objects and applications of tagging or classification (including, perhaps, community-sourced approaches)
From page 54...
... Gjunter on Polynomial Ideal Theory.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.