Why Do Ontologies Matter?
Why worry about behavioral ontologies, when scientists and those who apply scientific knowledge may have many other pressing concerns? To understand why this seemingly arcane idea is so important, it is helpful to consider what problems arise when ontologies are not in place. For example, imagine a mental health professional who treats adolescents in a practice where productivity demands are high. This professional knows there is a wealth of research related to their work but struggles to identify answers to specific questions that arise in the course of their practice. Having very limited time to keep up with journals that document the latest research findings, health care providers still want to be able to determine what evidence-based treatment approaches are new and relevant, which ones could help their patients, and how their patients might respond to these approaches. Seeking answers in the literature, they usually encounter a bewildering array of ideas, measures, and treatments.
Researchers have developed many terms and models for studying mental health disorders, as well as possible treatments. But the resulting evidence is spread across thousands of research papers published every month, which may be classified in varying ways and venues, using a wide array of possible search terms. If one wishes to retrieve and consider even a nearly complete list of all clinical trials relevant to an adolescent with a diagnosed DSM-5 major depressive disorder, for instance, the task is nearly impossible without a strategy to link the inconsistent definitions of depression. Detecting patterns across depression-relevant clinical trials (i.e., being able to make a statement about the features that effective treatments for younger youth with depression tend to have) would similarly require ontological tools. Statistical aggregation of relevant results would depend on both a shared conceptualization of the variables of interest from those trials and an indication of which inferences were of interest. Without such a shared conceptualization, it is difficult for anyone to draw on an established and evolving evidence base relevant to this particular domain. Although this example highlights the predicament of a busy mental health professional, researchers and others seeking literature relevant to many kinds of questions in many domains routinely face similar dilemmas.
There are no easy answers for challenges such as these, but they partly reflect limitations in the way ontologies are used—or not used—in the behavioral sciences. This chapter will examine how ontologies can help to address these challenges and facilitate the synthesis and application of knowledge produced by scientists.
CHALLENGES WITH SYNTHESIZING AND APPLYING KNOWLEDGE
Behavioral scientists produce a vast amount of research every year, but the publication of the results is only an initial step in the process by which scientific knowledge can bring benefits to society. For new knowledge to benefit patients, clinicians, investigators, professionals in education, business, law, and others, it has to be tested and reproduced, and the findings have to be managed, synthesized, disseminated, and applied. Without ontologies, all of these functions are more difficult.
Conscientious scholars and clinical practitioners are expected to keep abreast of literature in their field. But is this expectation realistic? There are 23,000 scientific journals that collectively publish more than two million peer-reviewed scientific articles each year. In the United States alone, an estimated 422,000 papers were published in 2018. No human being could sift through the volume of literature relevant to even a single domain to retrieve the information they need to stay current or identify nuances and trends in research findings. Without ontologies to frame the scientific discourse, it is practically impossible for stakeholders to reliably identify the most important developments in their fields.
Figure 3-1 illustrates the scope of the challenge, showing the growth of scientific output since the mid-1600s, based on data from the Web of Science and the number of cited references identified per year. Acceleration has been most rapid over the last 70 years, with the greatest inflection within the most recent decade; there are no signs the trend will decelerate.
One illustration of the challenges facing the behavioral sciences was provided by a recent historical review of treatments for mental health problems that affect young people (including anxiety, disruptive behavior, and depression) over the last 50 years. The review examined unique treatment protocols (i.e., treatment manuals) across numerous clinical trials. Such manuals specify the nature of clinical interventions; they have long been used as a written codification of psychotherapy procedures and as a tool for defining the different approaches being studied in clinical research trials. The researchers identified the component practices, or “practice elements,” defined within the treatment manuals.
The relationship between practice elements and manuals is basically one of component membership. Although they include other material, manuals are collections of practice elements, in the same way that a playlist represents a collection of songs. Figure 3-2 shows that while the number of treatment manuals (referred to as “protocols” in the figure) for disruptive behavior grew sharply in the period examined, the number of new practice elements remained relatively flat over the last half of the period.
In other words, many new manuals have been developed and tested, but these manuals largely appear to be new combinations of existing practice elements, rather than conceptually distinctive approaches to intervention.
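The membership relationship described above can be made concrete with a small sketch. It treats each manual as a set of practice elements and reports, in chronological order, which elements had not appeared in any earlier manual. The manual names, years, and elements below are invented purely for illustration and are not drawn from the review.

```python
# Toy data: every name, year, and element here is hypothetical.
manuals = [
    ("Manual A", 1990, {"praise", "time out"}),
    ("Manual B", 1995, {"praise", "problem solving"}),
    ("Manual C", 2005, {"time out", "problem solving", "praise"}),
    ("Manual D", 2010, {"problem solving", "time out"}),
]


def novelty_by_manual(manuals):
    """List, for each manual in date order, the elements no earlier manual used."""
    seen = set()
    report = []
    for name, year, elements in sorted(manuals, key=lambda m: m[1]):
        new_elements = elements - seen  # set difference: genuinely new elements
        report.append((name, year, sorted(new_elements)))
        seen |= elements
    return report


report = novelty_by_manual(manuals)
for name, year, new_elements in report:
    print(year, name, "new elements:", new_elements)
```

In this toy data the later manuals contribute no new elements, which is the pattern the text describes: more manuals, but few new building blocks. Such a count is only meaningful if the same element carries the same label in every manual, which is exactly what a shared ontology would provide.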
It has been difficult to perceive this lack of progress because of the ways the field describes and classifies the elements being studied. In short, a lack of ontological clarity makes it difficult to perceive significant trends and highlight valuable developing knowledge about effective combinations of practice elements.
This example also illustrates two key challenges that are relevant to most, if not all, areas of behavioral science. The first is the potential for inefficiency in ongoing research. For example, without an ontology that explicitly specifies relationships among different entities within a domain, researchers run the risk of engaging in empirically or conceptually siloed retesting of the same research questions. The second challenge is a manifestation of the more general and ubiquitous “wealth of information” problem: as a body of knowledge grows, its retrievability and actionability decrease. Research in the behavioral sciences is producing a body of knowledge so extensive that it is increasingly difficult to apply efficiently and effectively.
CHALLENGES WITH GENERALIZING RESEARCH FINDINGS AND BUILDING AND STRUCTURING KNOWLEDGE
Ideally, researchers hope to design studies that yield results that are equivalently applicable across different sets of study participants and can be generalized to apply much more broadly. Problems with generalizability can arise from a number of factors, several of which are relevant to a potential role for ontologies. One difficulty is in generalizing results obtained using a particular method or measure to other contexts or populations where different methods or measures of the same constructs were used. The challenge of identifying comparable measures for phenomena is in part an ontological one.
Another challenge is that a causal or predictive association may be found in one population or set of circumstances (e.g., predominantly White undergraduate students at a large state university) but not in others (e.g., Black adults in Chicago), even if the same measures are used. In such cases it may be unclear what conclusion to draw. Perhaps there was some sort of critical flaw or limitation in the original study. Alternatively, perhaps the original study was valid for the study population but there are additional, unrecognized moderating or contextual variables that affect other populations differently. Another possibility is that although the results from the original study were replicable under the same laboratory conditions, they were formulated in terms of treatments and measures rarely present in real-world settings. Ontologies that define the entities that are important in a field of study will support the development of measures and research designs that can be replicated and generalized.
Ultimately, problems with generalizability create difficulties for consumers of research—whether other researchers, health care providers, or other stakeholders. Ontologies can help by providing a framework for accurately describing and comparing conclusions, describing how measures are related to conclusions, identifying moderating variables, and distinguishing domains or regimes in which relationships hold from those in which they do not.
A basic function of science is labeling and classifying the phenomena that are observed and organizing them for study in a particular domain. Even in situations in which there is no sanctioned ontology, scientists still rely on classification (grouping of phenomena) as the basis for the organization of knowledge through the formation of hypotheses, the design of experiments, the modeling and interpretation of data, and the integration of findings. As a relatively young group of disciplines, the behavioral sciences are still developing and refining many of the sets of concepts and classifications on which they are based (and by which each organizes the knowledge that is created), as well as the constructs useful for scientific study. Further, the challenge of discerning whether two constructs actually describe the same entity or phenomenon has been an issue in the behavioral sciences for more than a century. Ontologies that explicitly identify agreed-upon definitions for constructs can support clear conclusions drawn from disparate research into them and their features.
HOW ONTOLOGIES FACILITATE SCIENCE
Ontologies are essential to science because they identify and clarify the entities and concepts that people want to talk about and study, and they identify the key relationships among those concepts. Understanding of these entities may change over time, but identifying shared names for phenomena is an essential basis for all scientific work, as it is for any constructive communication. Psychologists today do not investigate the ego and the id as they were defined by Sigmund Freud in 1923, but the labeling of these terms opened the door to new ways of talking about psychological phenomena that had not previously been topics of study.
The capacity to accurately refer to behavioral phenomena is a basic pillar of the behavioral sciences, allowing researchers to be precise about what they are studying and how they are conceptualizing their domain. A shared ontology is particularly salient for behavioral scientists, who rely heavily on constructs to guide research, because many of the phenomena they study are challenging to organize and investigate. Ontologies also need to have pragmatic features that allow them to support the particular goals that researchers have, such as classification, communication, data integration and sharing, bibliographic retrieval, and the comparison and analysis of data.
Ontologies are used to sort individuals, objects, and events into different groups. For example, an ontology might classify psychiatric disorders as various types of mental illness or classify organisms as representing particular species and higher taxonomic categories. As ontology developers encode the classifications that they are aware of directly into the structure of an ontology, they also discover new classifications through the application of reasoning systems that determine the logical implications of the ways in which the entities have been defined. An ontology would allow investigators to test hypotheses derived from the logical structure of individual constructs and their relations, though as far as the committee could determine, this benefit of developing and using ontologies has yet to be widely realized in the behavioral sciences.
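As a minimal sketch of what "determining the logical implications" of a classification can mean, the following code stores only direct is-a assertions and derives the indirect ones by following the chain upward. The class labels and the `IS_A` table are illustrative assumptions, not an excerpt from any published ontology, and real reasoners handle far richer logic than this transitive lookup.

```python
# Hypothetical direct is-a assertions (child -> list of parents).
IS_A = {
    "major depressive disorder": ["depressive disorder"],
    "depressive disorder": ["mental disorder"],
    "generalized anxiety disorder": ["anxiety disorder"],
    "anxiety disorder": ["mental disorder"],
}


def ancestors(term, is_a=IS_A):
    """Return every class the term belongs to, whether asserted or inferred."""
    found = set()
    stack = list(is_a.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(is_a.get(parent, []))  # keep climbing the hierarchy
    return found


def is_kind_of(term, candidate):
    """True if the term is the candidate class or a descendant of it."""
    return candidate == term or candidate in ancestors(term)


print(ancestors("major depressive disorder"))
print(is_kind_of("generalized anxiety disorder", "depressive disorder"))
```

Nothing in the table states that major depressive disorder is a mental disorder; that classification follows from the two assertions that are present. This is the simplest case of a classification that is discovered rather than encoded.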
By enumerating phenomena of interest, an ontology allows people to communicate clearly and efficiently about the ideas that are represented. For scientists and researchers, a shared ontology makes it possible to accurately describe and express their constructs, theories, experiments, and methods. In particular, shared ontologies are critical for comparison of results from different experiments and observations. Scientific communities and scientific progress depend on investigators being able to evaluate one another’s theories and to build on one another’s work, but these fundamental aspects of the scientific process cannot be achieved in the absence of shared terms and the ability to communicate in a consistent manner. Ontologies also support communication about theories, experiments, knowledge, and insights among disparate communities of practice. Shared terms provide a standard mechanism for workers from many different stakeholder groups—who have their own customs, their own jargon, their own world views—to embrace a piece of common information and to act on it accordingly.
Shared ontologies make it possible for researchers to integrate their data with those of other scientists, pooling results and making it possible to explore hypotheses with larger sample sizes and different sets of subjects. Such pooling is possible only if investigators use the same terms to describe the same phenomena or if there are clear mappings between the idiosyncratic terms that one investigator may use and the standard terms used generally in the scientific community. Unfortunately, the absence of widely shared ontologies in the behavioral sciences has been at the root of debates among investigators that focus on differences in measurement and operationalization. Ontology development could help to move research domains toward more broadly accepted nomenclature for a topic of interest.
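The idea of mapping idiosyncratic terms onto standard ones before pooling can be sketched as follows. The variable names, the `TERM_MAP` table, and the scores are hypothetical. The sketch also assumes the mapped measures are already on a comparable scale, which in practice would have to be established separately.

```python
# Hypothetical mapping from study-specific variable names to shared terms.
TERM_MAP = {
    "dep_score": "depressive symptom severity",
    "depression_total": "depressive symptom severity",
    "anx": "anxiety symptom severity",
}


def harmonize(records, term_map=TERM_MAP):
    """Pool values under shared terms; collect local names that have no mapping."""
    pooled = {}
    unmapped = set()
    for record in records:
        for local_name, value in record.items():
            standard = term_map.get(local_name)
            if standard is None:
                unmapped.add(local_name)  # flag for review rather than guess
                continue
            pooled.setdefault(standard, []).append(value)
    return pooled, unmapped


# Two made-up studies that name the same construct differently.
study_a = [{"dep_score": 12}, {"dep_score": 7}]
study_b = [{"depression_total": 15, "mood_idx": 3}]

pooled, unmapped = harmonize(study_a + study_b)
print(pooled)
print(unmapped)
```

Variables without a mapping are reported rather than silently dropped or guessed at, since deciding whether two labels name the same construct is a scientific judgment that the mapping table records but cannot make.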
Optimal searching of the scientific literature (and datasets) is facilitated by indexing of the contents of bibliographic databases using controlled terms, which enables search engines to use those terms to find appropriate content. Use of controlled terms is important for bibliographic retrieval, as authors frequently describe their research in inconsistent ways, and there is no way for an individual searching the literature to know all the idiosyncratic ways in which researchers describe and refer to their work. In general, formal ontologies support searches that can be more tailored to the user’s needs because they allow more abstract or more granular terms related to an initial term of interest to be easily identified and used. Such ontologies may also be able to incorporate external ontologies, making it easy to identify new search terms and easing the maintenance of the search engine as the external ontologies evolve.
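One way to picture how a formal ontology supports tailored searching is query expansion: a single term of interest is widened to include its more granular terms and their synonyms. The sketch below assumes two small lookup tables, `NARROWER` and `SYNONYMS`, whose contents are illustrative and are not taken from any actual controlled vocabulary.

```python
# Hypothetical vocabulary: broader term -> narrower terms, and term -> synonyms.
NARROWER = {
    "depression": ["major depressive disorder", "persistent depressive disorder"],
}
SYNONYMS = {
    "major depressive disorder": ["MDD", "clinical depression"],
}


def expand_query(term, narrower=NARROWER, synonyms=SYNONYMS):
    """Return the term plus all narrower terms (recursively) and their synonyms."""
    visited = set()   # vocabulary terms already traversed
    expanded = set()  # every string to include in the search
    stack = [term]
    while stack:
        current = stack.pop()
        if current in visited:
            continue
        visited.add(current)
        expanded.add(current)
        expanded.update(synonyms.get(current, []))
        stack.extend(narrower.get(current, []))
    return sorted(expanded)


query_terms = expand_query("depression")
print(query_terms)
```

A searcher who types only "depression" would then also retrieve items indexed under the narrower terms or described with the listed synonyms, without needing to know those variants in advance.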
Comparison and Analysis of Data
Researchers can take advantage of the hierarchies inherent in an ontology to assist with data analysis and interpretation. An ontology allows researchers to categorize observations and to identify the general principles that those points represent. The formal specification of both the essential elements of a scientific discipline and the key relationships among them enables its practitioners to clarify their shared world view and to communicate with one another with the clarity needed to advance scientific knowledge.