Artificial Intelligence to Assist Mathematical Reasoning Proceedings of a Workshop (2023) / Chapter Skim
Currently Skimming:

4 Current Challenges and Barriers
Pages 23-48

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.
Select key terms on the right to highlight them within pages of the chapter.


From page 23...
... DATA FOR ARTIFICIAL INTELLIGENCE AND MATHEMATICS: CHALLENGES AND NEW FRONTIERS Sean Welleck, University of Washington, discussed the development of datasets specific to the mathematical sciences. He focused on the intersection of language models (LMs)
From page 24...
... . He and his team asked experts to evaluate proof steps based on correctness and usefulness and to identify errors, resulting in a dataset of annotated proof steps.
From page 25...
... Srinivas and Welleck discussed the interaction between LMs and interactive theorem provers. Welleck expressed that LMs excel in translating between informal and formal expressions, whereas a traditional tool might have more difficulty generating natural language.
From page 26...
... Machine Learning, Formal Methods, and Mathematics Avigad outlined formal methods in mathematics and the benefits and challenges of collaboration between mathematics and computer science. Before introducing formal methods, Avigad distinguished between two fundamentally different approaches in AI: symbolic methods and neural methods.
From page 27...
... Digital technologies like proof assistants provide new platforms for researchers to cooperate and collaborate. The Liquid Tensor Experiment exemplifies this collaboration: formalization was kept in a shared online repository; participants followed an informal blueprint with links to the repository; participants communicated through the chat software Zulip; and a proof assistant ensured that pieces fit together, so each participant could work locally on individual subproblems.
From page 28...
... He suggested that mechanisms to assess new kinds of mathematical contributions are required for advancement in the field, and better institutional support should be built for collaborative, cross-disciplinary work. Challenges and Ingredients for Solving Mathematical Problems with Artificial Intelligence Fawzi discussed using AI to solve mathematical problems from his perspective as an ML researcher.
From page 29...
... Discussion Macbeth led a discussion on collaboration between mathematics and computer science and how the domains may bolster one another. Noticing that both speakers' presentations mentioned differences between mathematicians and computer scientists, Macbeth prompted them to discuss the cultural differences they perceived further.
From page 30...
... He also described his work on ProofNet, which gathered problems from undergraduate mathematics textbooks to serve as benchmarks for automatic theorem proving (Azerbayev et al.
From page 31...
... Martin described the Polymath Project, which demonstrated a new, collaborative way of developing mathematics instead of employing the traditional publication model. Started by Timothy Gowers in 2009, the Polymath Project invites people to collaborate online to solve a mathematical problem (Gowers 2009)
From page 32...
... She concluded that mathematics is not just about writing proofs; acknowledging that mathematics lives in a broader context shaped by human factors will better allow stakeholders to take initiative and collaborate to create new mathematical practices -- for example, using AI to assist mathematical reasoning. Can Neural Networks Mimic Human Mathematical Intuitions?
From page 33...
... In the study, a convolutional NN could capture baboon behavior but generally not human behavior; a symbolic model fit human behavior much better. Sable-Meyer and colleagues believe that humans rely on both an intuitive processing of shapes, like other primates, as well as a unique symbolic model.
From page 34...
... Martin highlighted that the current mathematical knowledge base is so large that it is easy to imagine new mathematics being generated from what already exists. A few questions prompted Dehaene to delve further into the limits of current AI and mathematical reasoning.
From page 35...
... CONCENTRATION OF MACHINE LEARNING CAPABILITIES AND OPEN-SOURCE OPTIONS Stella Biderman, Booz Allen Hamilton and EleutherAI, presented in the next session on the concentration of ML capabilities within few institutions and the potential for open-source practice. She focused on large-scale generative AI technologies, explaining that her work is primarily on training very large, open-source language models such as GPT-Neo-2.7B, GPT-NeoX-20B, and BLOOM, as well as other kinds of
From page 36...
... Biderman asserted that research conducted with privately held ­models creates a bottleneck for AI research for the larger scientific com munity, because research with these models does not have practical appli cations for most people due to lack of access to the models. Even open-source models are highly concentrated within a few companies, Biderman recognized.
From page 37...
... The community's assistance and support are critical for maintenance Datasets Training Finetuning Evaluation Deployment • C4 • Megatron‐ • Transformers  • LM Evaluation  • HuggingFace  • The Pile DeepSpeed Library Harness Hub • Red Pajamas • GPT‐NeoX • PEFT • BIG Bench • AutoGPT • MetaSeq • trlX • DeepSpeed  Inference FIGURE 4-2 Various open-source resources for developing large language models. SOURCE: Stella Biderman, presentation to the workshop.
From page 38...
... She invited a broader range of people to join open-source AI efforts, emphasizing that much of the infrastructure and work underpinning large-scale AI systems is standard software engineering and data science -- one does not need to be an AI expert to contribute. A workshop participant asked whether there are LLMs that have been trained on mathematical material and are widely available to mathematicians.
From page 39...
... She explained that a naïve way to estimate risk could be to study how many examples in the training data were mis­ classified; however, this systematically underestimates risk. Instead, one of the most common methods used to estimate risk in traditional ML is the K-fold cross-validation method, which splits data into some number of sections K
From page 40...
... Mathematical Foundations of Machine Learning To highlight the importance of theory, Willett began by stating that investing in ML without understanding mathematical foundations is like
From page 41...
... Willett first shared success stories of mathematical foundations improving ML practice. She explained that ML often requires training data to set parameters of a model (e.g., NN weights)
From page 42...
... Since work in such areas still lacks mathematical foundations, leaving many open problems, developing foundations can solidify this work. Finally, Willett pointed to how advancing the frontiers of ML can benefit broader science because of the overlap between fields; learning from simulations requires understanding data assimilation and active learning, for example, and studying these benefits both ML research and other sciences.
From page 43...
... Tao began by asking each panelist to discuss the main challenges for the community using AI to assist mathematical reasoning. Explaining that the community is comprised of few tenured professors and more postdoctoral, graduate, and undergraduate students, Avigad identified the main challenge as determining how to create space for the younger people to succeed.
From page 44...
... Biderman answered that the issue is about the political willingness to spend rather than about a lack of funding; winning grants for large-scale academic AI research is difficult. In response to another workshop participant's question about where funding for interactive theorem proving can be found, Biderman named Google in addition to Microsoft, as well as the Defense Advanced
From page 45...
... Biderman indicated that students could start with smaller AI models that require less expertise, but above all she urged mathematics students to take introductory computer science courses and learn Python. Workshop planning committee member Talia Ringer, University of Illinois at Urbana-Champaign, expressed that collaborating with AI researchers can be intimidating because of how large and powerful the community is, and one often ends up needing to work on their terms.
From page 46...
... Another question sparked discussion of the ethical issues involved when researching the interplay of AI and mathematical reasoning. Martin highlighted potential ethical issues of bias in the training of models, and Biderman highlighted similar issues in model development.
From page 47...
... 2022. "Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs." arXiv preprint arXiv:2210.12283.
From page 48...
... 2021. "MiniF2F: A Cross-System Benchmark for Formal Olympiad-Level Mathematics." arXiv preprint arXiv:2109.00110.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.
More information on Chapter Skim is available.