Suggested Citation:"Appendix D: Capability Technology Tables." National Academies of Sciences, Engineering, and Medicine. 2017. Challenges in Machine Generation of Analytic Products from Multi-Source Data: Proceedings of a Workshop. Washington, DC: The National Academies Press. doi: 10.17226/24900.

D

Capability Technology Tables

TABLE D.1 Current Capabilities

Presenter Current Capabilities
Tom Dietterich, Oregon State University
  • Apply deep learning to signals-type data if they are stationary and if there is enough training data
  • Adapt quickly to new problems using fine tuning and, to some extent, anomaly detection
  • Participate in rapidly growing Kaggle challenges across a huge range of problem domains
Joseph Mundy, Vision Systems, Inc.
  • Global-scale object detection and recognition in satellite imagery, enabling broad-area search anywhere on the Earth’s surface
  • End-to-end deep net production of three-dimensional stereo reconstructions yielding significant improvement in disparity accuracy and completeness
  • Application of the AlphaGo game search strategy to other rule-driven processes such as geometric modeling, a promising avenue to inject theory into data-driven methods
Rama Chellappa, University of Maryland, College Park
  • Object detection with much improved performance for thousands of objects
  • Face detection with much improved performance for frontal faces and faces with >15 pixels
  • Object recognition with much improved performance for well-defined objects
  • Face recognition/verification with a 91 percent true accept rate (TAR) at a 10⁻⁴ false accept rate (FAR) on the JANUS–CS4 data set; can handle a gallery of 1 million entries
Dragomir Radev, Yale University
  • Integrating deep learning and reinforcement learning (cf. AlphaGo)
  • Automating architecture learning
  • Hyperparameter search
  • Improving generative models
  • Memory-based networks and neural Turing machines
  • Semi-supervised learning
  • Mix-and-match architectures
  • Privacy-preserving models
Peter Pirolli, Institute for Human and Machine Cognition
  • Commercial off-the-shelf interactive visual analytics now pervasive for big data analysis
  • Mature research platforms for visualization grammars and toolkits
  • Early versions of interactive visualization for machine learning experts and learners (e.g., TensorFlow Playground)
  • Emerging Standard Model of Cognition for constrained, reasonably well-defined tasks/domains (limited transfer/generalization)
  • Cognitive models of users/students in long-term tutoring with capabilities for induction/fine-tuning of student models
  • User modeling and personalization is a mature field for well-defined tasks/domains
  • Models of joint interdependent activity in non-learning tasks (e.g., human-robot interaction)
  • Interactive task learning for constrained robot tasks and contexts
  • Programming by demonstration
  • Mobile phone digital assistants
Chris Callison-Burch, University of Pennsylvania
  • Recent emergence of start-ups focused on annotation such as CrowdFlower, which is being leveraged by the Department of Defense and the intelligence community
Mark Riedl, Georgia Institute of Technology
  • Value alignment is an unsolved problem—no examples of aligned agents except in toy worlds
  • Virtually no understanding of sociocultural conventions that reduce human–human conflict
  • Can learn commonsense knowledge on curated corpora in limited a priori known contexts
  • Data is currently the bottleneck when employing curated corpora; deep learning is needed to learn human values/conventions from stories on the Internet
  • Curated data sets do not currently exist, and researchers still do not know how to solve these problems using deep neural networks
Panel on Evaluation of Machine-Generated Products; Anthony Hoogs, Panel Moderator
  • Robust activity from Intelligence Advanced Research Projects Activity in creating data challenges
  • Widespread public awareness of and participation in data annotation and labeling (e.g., Mechanical Turk)
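
The face verification figure in Table D.1 is a true accept rate (TAR) measured at a fixed false accept rate (FAR). As a minimal sketch of how such a figure can be computed from similarity scores, the hypothetical helper `tar_at_far` below picks the strictest threshold that keeps the FAR at or below the target and then reports the fraction of genuine pairs accepted. The function name, the score lists, and the convention that higher scores mean "more similar" are assumptions for illustration, not the evaluation protocol used by the presenter.

```python
from typing import Sequence


def tar_at_far(genuine: Sequence[float], impostor: Sequence[float], far: float) -> float:
    """Return the true accept rate at a threshold whose FAR does not exceed `far`.

    `genuine` holds similarity scores for same-identity pairs and `impostor`
    holds scores for different-identity pairs (assumed: higher = more similar).
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    # Number of impostor pairs that may be wrongly accepted at the target FAR.
    allowed = int(far * len(impostor))
    ranked = sorted(impostor, reverse=True)
    # Accept only scores strictly above the (allowed + 1)-th highest impostor score,
    # so that at most `allowed` impostor pairs pass.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    accepted = sum(1 for score in genuine if score > threshold)
    return accepted / len(genuine)


# Toy scores, invented for illustration only.
genuine_scores = [0.5, 0.8, 0.95, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.9]
print(tar_at_far(genuine_scores, impostor_scores, far=0.25))
print(tar_at_far(genuine_scores, impostor_scores, far=0.0))
```

Estimating a TAR at a FAR of 10⁻⁴ in this way needs at least 10,000 impostor pairs; with fewer, `allowed` is zero and the threshold is simply the highest impostor score.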

TABLE D.2 Short-Term (3–5 years) Capabilities

Presenter Short-Term (3–5 years) Capabilities
Tom Dietterich, Oregon State University
  • Open category classification
  • Automatic detection of biased and untrusted sources
  • Anomaly detection on time-varying and network data
  • Initial methods for validation and system monitoring
Joseph Mundy, Vision Systems, Inc.
  • The rapid growth of web-linked data sources based on semantic ontologies such as geographic and functional knowledge should be exploited to label training data and to provide context for test time
  • A major effort in incorporating theoretical knowledge into data-driven decision making
  • Using machine learning to drive logical and grammatical structure formation is a promising first step
  • Extend current deep learning three-dimensional modeling algorithms to four dimensions and combine with object recognition networks to achieve functional designs such as building information models
Rama Chellappa, University of Maryland, College Park
  • Handle pose, illumination, and low-resolution challenges in unconstrained scenarios
  • Handle full motion videos
  • Sharing/transfer learning of features for different but somewhat related tasks
  • Model/understand context and occlusion
  • Handle “reasonable” amount of noise in data
  • Limited robustness to adversarial data
  • Vision and language
  • Other multi-modal data
Anthony Hoogs, Kitware, Inc.
  • Generative adversarial networks effectively applied to video
  • Increased model transfer between video domains
  • Human-level accuracy for action recognition in single-action, temporally clipped videos
    • UCF 101
    • Human-level accuracy achieved for ImageNet object recognition in primary-subject images (1,000 categories)
  • Major improvements in action and complex activity recognition in surveillance video
    • Person tracking and re-identification
    • Defense Intelligence Agency’s IARPA project
  • Super-human performance in video search and retrieval for objects and primary subjects in internet (consumer) videos
    • Cataloguing of large, common objects and events
    • Similar accuracy to humans but much faster
    • Free-form text-based queries with limited syntax and vocabulary
  • Large graphics processing unit farms required to keep up with video generation
Dragomir Radev, Yale University
  • Language data will be more and more in speech form
  • Massive multilinguality
  • One-shot learning (e.g., Tigrinya and Oromo)
  • Understanding short texts
  • Interpretability of models
  • Deep learning personalization
  • Trust analytics
  • Ethical language processing
Mikel Rodriguez, MITRE
  • More work needs to be done on short- and long-term prediction; recent models such as long short-term memory (LSTM) networks and other recurrent neural networks have just begun to address this gap
  • More work needs to be done on complex activities, temporal sequences, rare targets, and modalities unique to the Department of Defense/intelligence community, as well as counter artificial intelligence assurance
Peter Pirolli, Institute for Human and Machine Cognition
  • Multi-level models across cognitive, rational, and social bands of interactive phenomena for well-defined, “stationary” sensemaking tasks (but still intelligence community-relevant)
  • Explainable artificial intelligence for visual analytics and simulated drone operations
  • Interactive task learning for usable soft bots for well-defined tasks (clear goal, well-defined constraints, clear operators)
  • Improved visual analytics for machine learning programming
Chris Callison-Burch, University of Pennsylvania
  • Tighter integration of crowdsourcing with machine learning
    • Correct/confirm output from models
    • Active learning
    • Domain adaptation
  • New crowdsourcing platform for natural language processing
    • Remove hassles of Mechanical Turk
    • Cultivate groups of language experts
    • Create standing pools of language workers
    • Deployable inside the intelligence community
Mark Riedl, Georgia Institute of Technology
  • Agents expected to start using commonsense knowledge and world knowledge to address human needs
  • Conversational agents + expectation; need to engage in longer conversations and rely on computational imagination
  • Agents expected to differentiate behavior based on cultural context
  • Computational creativity as part of mainstream content creation
Panel on Evaluation of Machine-Generated Products; Anthony Hoogs, Moderator
  • Open source tools for efficient annotation
  • Semi-automated labeling to efficiently fuse computed labels with manual adjudication
  • Large-scale annotation of operationally representative data sets in domains of interest, made available to researchers, particularly multi-modal data sets

NOTE: IARPA, Intelligence Advanced Research Projects Activity.
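
Table D.2 lists active learning as one way to integrate crowdsourcing more tightly with machine learning. One common form is uncertainty sampling: the model's least confident predictions are routed to human annotators first. The sketch below assumes a binary classifier exposed as a hypothetical `predict_proba` callable and a fixed annotation `budget`; these names and the toy probabilities are invented for illustration and do not describe any presenter's system.

```python
from typing import Callable, List, Sequence, Tuple


def select_for_annotation(
    items: Sequence[str],
    predict_proba: Callable[[str], float],
    budget: int,
) -> List[Tuple[str, float]]:
    """Pick the `budget` items whose predicted probability is closest to 0.5."""
    scored = [(item, predict_proba(item)) for item in items]
    # For a binary model, uncertainty is highest when the probability is near 0.5.
    scored.sort(key=lambda pair: abs(pair[1] - 0.5))
    return scored[:budget]


# Toy model output, invented for illustration only.
toy_probabilities = {"a": 0.95, "b": 0.55, "c": 0.10, "d": 0.40}
picked = select_for_annotation(list(toy_probabilities), toy_probabilities.__getitem__, budget=2)
print([item for item, _ in picked])
```

In a crowdsourcing loop, the selected items would be sent to annotators, the returned labels added to the training set, and the model retrained before the next round of selection.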


TABLE D.3 Long-Term (>5 years) Capabilities

Presenter Long-Term (>5 years) Capabilities
Tom Dietterich, Oregon State University
  • Defense against adversarial examples
  • Integration of large knowledge bases with machine learning
  • Use of meta reasoning to develop and test hypotheses about data source reliability
  • Multi-scale machine learning
Joseph Mundy, Vision Systems, Inc.
  • Drive towards global continuous learning from the emerging constellations of overhead data, including aerial video, incorporating adaptive formation of new relations and concepts not present in the original training data. The resulting ontological knowledge will support all forms of intelligence analysis.
Rama Chellappa, University of Maryland, College Park
  • Theoretical investigation of why and when deep learning works
  • Minimum training set needed given empirical distributions of training set in the absence/presence of domain shift; error bounds; incorporating invariances
  • Image formation, blur, geometry, shape, motion, texture, and occlusion models that can generalize using data
  • Develop machine learning methods that generalize from small data
    • Humans generalize well from small data
  • Develop machine learning methods that can deal with parts, interconnections of parts, adversarial data
  • Incorporate domain knowledge, common sense reasoning
Anthony Hoogs, Kitware, Inc.
  • Structure learning of deep networks on a large scale
  • Human-level accuracy for action and complex activity recognition in surveillance video
    • DVA data
  • Super-human performance in video search and retrieval for complex activities and abstractions in internet (consumer) videos
    • Cataloguing of all objects, scene elements, and events
    • Similar accuracy to humans but much faster
    • Free-form text-based queries with open syntax and vocabulary
  • Large graphics processing unit farms required to keep up with video generation
    • Computation speed increases offset by data growth
Kathy McKeown, Columbia University
  • Interpretation, analysis, and generation from informal text (social media, online discussion, online narrative), multi-modal sources, and streaming data
  • Learning without big data (low resource)
  • Understanding machine learning: explanation/linguistics
  • Discourse, context
  • Applications in multi-lingual environments without machine translation
  • Robust and flexible language generation
  • Fake news, cyber bullying, inappropriate content
Amanda Stent, Bloomberg
  • Explore and understand: sponsor research on machine learning for conversational analytics
    • Identifying “problematic” dialogues and when dialogues go wrong
  • Produce: challenge the community to move beyond (single-domain) slot filling and (context-free) chat
    • Situated dialogue
    • Negotiation dialogue
    • Complex and flexible tasks
    • Ongoing/perpetual dialogue work at Carnegie Mellon University
  • Encourage methods that combine advanced machine learning with programmatic control or otherwise infuse machine learning with human knowledge
  • Encourage methods that include “human-in-the-loop”
    • Active learning
    • Just-in-time human computation
  • Provide the community with shared portals for evaluation (and crowdsourcing of data)
  • Encourage the development of machine learning as a service rather than a puzzle or an art
    • Where should machine learning live in relation to data and services?
    • How can machine learning be reproducible?
  • Sponsor work on world knowledge induction and situational awareness
  • Collaborate on ameliorating data availability issues
    • Privacy-preserving methods for mining
Peter Pirolli, Institute for Human and Machine Cognition
  • Foundational science of Human-Autonomy Collaboration
    • Transition from “programming” to “learning to work together”
    • Sensemaking, autonomous vehicles, precision behavioral medicine, workforce
    • Education, pervasive personalized assistants
    • Multi-level models of sensemaking that include dynamically adapting humans and machine learning in joint activity
  • Continuous in-the-loop explainable artificial intelligence and interactive task learning
    • Usable, scalable, generalizable, dynamic interactive task learning
  • Evaluation frameworks on joint human–autonomy tasks
    • Move beyond current machine learning-centric optimization metrics
  • Open source data and code for tasks that have some relevance and validity to the intelligence community
Mark Riedl, Georgia Institute of Technology
  • Increasing presence of autonomous systems in social contexts
    • Chicken-and-egg problem; the problem of humans understanding artificial intelligence and artificial intelligence understanding humans will be far from solved
    • Explainable artificial intelligence
  • Broader artificial intelligence safety concerns and “big red button problem”
  • Use of military teammates could become more common
  • Computational creativity: Autonomous creation of images and video that are difficult to detect

The Intelligence Community Studies Board of the National Academies of Sciences, Engineering, and Medicine convened a workshop on August 9-10, 2017 to examine challenges in machine generation of analytic products from multi-source data. Workshop speakers and participants discussed research challenges related to machine-based methods for generating analytic products and for automating the evaluation of these products, with special attention to learning from small data, using multi-source data, adversarial learning, and understanding the human-machine relationship. This publication summarizes the presentations and discussions from the workshop.