National Academies Press: OpenBook

Empowering the Defense Acquisition Workforce to Improve Mission Outcomes Using Data Science (2021)

Chapter: 5 Data Life Cycle Mindset, Skillset, and Toolset: Roles and Teams

Suggested Citation:"5 Data Life Cycle Mindset, Skillset, and Toolset: Roles and Teams." National Academies of Sciences, Engineering, and Medicine. 2021. Empowering the Defense Acquisition Workforce to Improve Mission Outcomes Using Data Science. Washington, DC: The National Academies Press. doi: 10.17226/25979.

5

Data Life Cycle Mindset, Skillset, and Toolset: Roles and Teams

The Department of Defense (DoD) acquisition community has a history of incorporating data analytics but recognizes the further potential in today’s rapidly evolving data science environment. In Chapter 3, the committee introduced the data life cycle and how its full utilization would better prepare the defense acquisition workforce to extract value from data. While not all inclusive, Chapter 4 identified a number of data-rich opportunities still available to further inform decisions in the acquisition community.

Extracting value from data requires a collective data life cycle mindset, skillset, and toolset. In this chapter, the committee explores the constantly evolving ways industry, government, and academia are shaping the mindsets, skillsets, and toolsets of their employees and data science teams, as well as how these best practices and trends yield opportunities for defense acquisition.

MINDSET

Businesses are adopting a new collective mindset that recognizes the inherent value of data. According to the NewVantage 2019 Big Data and Artificial Intelligence (AI) Executive Survey, “92% of the respondents are increasing their pace of investment in big data and AI.” Of the 65 leading finance, healthcare, and manufacturing firms that responded, 88 percent felt a greater sense of urgency to invest in big data and AI, and 75 percent attributed their urgency to fear of disruption by new entrants in their marketplace (NewVantage 2019). These responses represent a growing sentiment in industry that every company is (Yueh and Bean 2018; Orad 2020), or will eventually be, a “data company” where data will be a central element of the way it does business (Orad 2020).

Similar to industry, successful U.S. government-affiliated efforts have found that a strong, visible leadership commitment has begun to help overcome institutional inertia regarding new data tools and applications. In remarks given at the National Academies of Sciences, Engineering, and Medicine in March 2020, Sezin Palmer articulated challenges and lessons learned as a Johns Hopkins Applied Physics Lab data science team created a Precision Medicine Analytics Platform. The new platform promised to revolutionize patient diagnosis, prognosis, and treatment; yet, along with hurdles in technology, data quality, data privacy and security, there were cultural barriers. Per Palmer, visible senior leadership buy-in and commitment from inception through implementation of this innovative data tool were required to help overcome organizational challenges.

In 2018, Michael Conlin became DoD’s first-ever chief data officer (CDO) followed by CDOs in place in each military department by Fall 2019. With a focus on data and a sense of urgency given the near-peer threats to U.S. national security, Mr. Conlin declared in his April 2020 presentation to the committee that he wanted DoD to move more rapidly in enabling digital operations and decision making, adopting a collective data mindset with the expectation that every DoD employee eventually be digitally savvy (Conlin 2020).1 In addition, and as was noted in Chapter 1, DoD released its new data strategy in October 2020 with an emphasis on decision making at the senior level.

In 2019, Anton et al. found that “[s]ome of the biggest barriers to expanding and refining the use of data analytics in the acquisition sphere include the lack of data sharing because of cultural, security, and micromanagement concerns; inconsistent data access across DoD and for FFRDCs [federally funded research and development centers] and support contractors; and difficulty installing modern analytic software because of security concerns.” In addition, they note that “[l]ong-term investments and strategic planning are needed—both for data governance and for analytic capabilities—as well as concerted efforts by Congress and DoD to address the culture of not sharing data” (Anton et al. 2019a).

To remedy the issues identified by Anton et al., in the 2020 Defense bill Congress mandated that the DoD CDO “shall have access to all Department of Defense data, including data in connection with warfighting missions and back-office data,” giving the CDO responsibility for “providing for the availability of common, usable, Defense-wide data sets.” In response to this Congressional direction, in early 2020, the DoD Chief Information Officer (CIO), Dana Deasy, assumed responsibility for the CDO and released a memo stating that “the CIO’s office will take charge of all of the department’s existing data governance bodies and create a new Data Governance Board” (Serbu 2020), a strong shift toward addressing the complexities introduced by having similar data referenced with different terminologies across DoD.

___________________

1There is no single definition of what it means to be digitally savvy, but “savviness” relates to practical knowledge and ability (Cambridge Dictionary, https://dictionary.cambridge.org/us/dictionary/english/savvy). There are digital literacy frameworks that, like those for data literacy, help convey the essential skills in understanding how to use and apply digital technologies (see, e.g., Van Deursen et al. 2014; MediaSmarts 2016; Hadziristic 2017; Huynh and Do 2017; Kelly 2018).

Simply put: businesses, institutions, and government agencies are transforming in response to a data-rich world; the transformations are ongoing and challenging; and barriers routinely include cultural or collective mindset.

However, even within this group mindset, we often face an additional barrier in the mindset of the individual. In the NewVantage survey foreword, Thomas H. Davenport and Randy Bean note that “[i]t is particularly striking that 77% of respondents say that ‘business adoption’ of big data and AI initiatives continues to represent a challenge for their organizations.” Further, Davenport and Bean say, “Respondents clearly say that technology isn’t the problem—people and (to a lesser extent) processes are. We hear little about initiatives devoted to changing human attitudes and behaviors around data. Unless the focus shifts to these types of activities, we are likely to see the same problem areas in the future that we’ve observed year after year in this survey” (NewVantage 2019).

Similar attitudes and individual mindsets are reflected among members of the defense acquisition workforce. As in industry, defense acquisition professionals use data in their everyday work. They, for example, monitor financial data, evaluate testing data, and create and process contracting data. Unfortunately, some people may not recognize their often significant roles and value within the data life cycle of a program as anything different from their usual tasks in the acquisition process, or inaccurately believe that data science is only the purview of data scientists with extensive technical backgrounds. The bottom line is that an acquisition professional does not have to be a data scientist to have a significant role in the data life cycle.

Acquisition professionals also tend to hold a static view of data, relying on the data they use today and have always used instead of seeking new data that may help improve acquisition processes and insights. In addition, an acquisition professional who is not aware of his or her role in the data life cycle is unlikely to be aware of the other roles in the data life cycle, and may not recognize that other roles are not being performed or are being performed inadequately.

As explained in Chapter 3, the data life cycle has 10 different phases: question, define, coordinate, generate, collect, curate and manage, analyze, visualize, disseminate and interpret, and assess. A gap or deficiency in any one of these phases can impact the quality of performance in all other phases. For this reason, even a lack of simple awareness of the data life cycle has the potential to undermine the quality of decision making in the acquisition process.
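One way to make this dependency concrete is to treat the 10 phases as an ordered list and check whether each one has an owner. What follows is a minimal sketch in Python, not a method taken from this report. The phase names come from the list above; the `uncovered_phases` helper, the assignment dictionary, and the staff label `analyst_a` are hypothetical and introduced only for illustration.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The 10 data life cycle phases, in the order listed in the text."""
    QUESTION = 1
    DEFINE = 2
    COORDINATE = 3
    GENERATE = 4
    COLLECT = 5
    CURATE_AND_MANAGE = 6
    ANALYZE = 7
    VISUALIZE = 8
    DISSEMINATE_AND_INTERPRET = 9
    ASSESS = 10


def uncovered_phases(assignments):
    """Return the phases that have no one assigned, in life cycle order.

    `assignments` is a hypothetical mapping from Phase to a list of staff
    labels; a missing key or an empty list counts as a gap.
    """
    return [phase for phase in Phase if not assignments.get(phase)]


# Hypothetical project: every phase has an owner except curation.
assignments = {phase: ["analyst_a"] for phase in Phase}
assignments[Phase.CURATE_AND_MANAGE] = []

gaps = uncovered_phases(assignments)
print([phase.name for phase in gaps])  # prints ['CURATE_AND_MANAGE']
```

A check of this kind only reports that a phase is unowned. It says nothing about how well an assigned phase is being performed, which is the harder problem the text describes.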

Having a sense of how data matures at each step of its life cycle and who may or may not have responsibility for the data at any given point in its life cycle will likely be a new mindset for the broader acquisition community to understand and appreciate. Without an understanding of how they fit in a broader data life cycle and how their actions affect other parts in the cycle, the defense acquisition workforce may be missing or slowing opportunities to improve acquisition as well as advance their careers.

Connected to these challenges in institutional and functional culture, and lurking in the background, is a perceived distinction between those that have data science skills and those that do not. “I am not a math person,” is a common refrain and is indicative of a substantial barrier in STEM education. As was noted in 2016, “The culture of science, technology, engineering, and mathematics (STEM) education has an effect on many students’ interest, self-concept, sense of connectedness, and persistence in these disciplines” and “STEM ‘gateway’ courses continue to negatively impact STEM student persistence” (NASEM 2016 p. 59). Thus, members of the current and future DoD acquisition workforce may exhibit discomfort with data science due to cultural, institutional, and educational barriers. Many may also not be aware of how they already interact with and use data regularly to make data-informed decisions. And, unfortunately, even individuals with an aptitude and/or initial interest may face data-centric deficiencies in their education and other circumstances beyond their control that affect their mindset toward data and its use; it is important to recognize that DoD may need to help them overcome these barriers. These barriers especially affect those who identify as a woman or an ethnic or racial minority (NASEM 2016).

SKILLSET

As DoD contemplates preparing its workforce to more fully embrace and incorporate data science, it is worth considering the types of skills needed to develop and shape decision-informing data insights. As shown in Appendix B, there is no acquisition career field named “data scientist” (DoD 2019). There is also no federal career field identified as a data scientist, although in 2019 the Office of Personnel Management (OPM) did allow agencies to add data science titles to a number of positions within their organizations (Wagner 2019). Essentially, the data science tag could apply to jobs within existing federal job types, including operations research analysts, statisticians, cost analysts, or IT specialists. This “job title” approval set the stage and expectation that data scientists do not work for themselves or operate in isolation looking for “good ideas,” but are part of a team and will be embedded with a “domain” of some kind needing or wanting to make better data-informed decisions for the questions they have.

With this permission to establish “unofficial” data scientist job titles, OPM Associate Director for Employee Services Mark Reinhold wrote that “[d]ata scientist work is multifaceted and requires talent from interdisciplinary backgrounds.” He went on to say that “[d]ata scientists are defined as practitioners with sufficient knowledge in the areas of business needs, domain knowledge, analytical skills and software and systems engineering to manage the end-to-end data processes in the data life cycle” (Wagner 2019). That said, there is a distinction between the “data scientist” role and the range of skills needed throughout the data life cycle that could be provided by other functions in the acquisition community.

Before delving into the broader data life cycle skillset, the committee highlights two critical considerations from Chapter 3.

  • The data life cycle is a bi-directional workflow or process.
  • For any given project, full utilization of all the phases of the data life cycle requires varied skills, and a single person is unlikely to have all of those skills.

From these, the committee concludes the following:

Data Literacy

In response to the growing importance and ubiquity of data and data-related tasks, the private sector is increasingly prioritizing data literacy for all employees, not just those in data analytics or data science roles. When it comes to the U.S. workforce, “data isn’t used in a vacuum: it touches many other roles, and those employees need the literacy to handle it effectively,” according to Laurence Bradford, creator of Learn to Code with Me. Amy O’Connor, chief data and information officer at Cloudera, conveyed to Bradford in a 2018 interview that “organizations need a broad set of data skills, and they need to be in various different roles across the organization.” CDO roles, such as O’Connor’s, are now common in the workplace, and their responsibilities have grown over time. CDOs often report that “poor data literacy” is among the top challenges in achieving company goals (Bradford 2018).

Similarly, higher education is focusing on defining data literacy and developing related training since data-centric skills contribute to almost every discipline. Some institutions of higher education are, in general, shifting to systems wherein students demonstrate core competencies (as opposed to completing specific courses). Data and information literacy tend to appear on these short lists of core competencies along with more common competencies such as reading/writing literacy and quantitative literacy.

Using slightly different language, the National Academies 2018 report on Data Science for Undergraduates: Opportunities and Options recommended that some form of data science be taught to all undergraduate students:

To prepare their graduates for this new data-driven era, academic institutions should encourage the development of a basic understanding of data science in all undergraduates. (Rec 2.3)

Exposure to the data life cycle now occurs in introductory general education courses, undergraduate and master’s degrees in data science and related topics, and online courses and certificates for a rapidly growing number of students in two-year and four-year colleges (NASEM 2018).

What these skills are is more difficult to define. Because we are in the early stages of curriculum and program development for data science, it is difficult to build consensus. Borrowing from NASEM (2018), here the committee refers to data literacy as a collection of baseline data science skills similar to those that would be taught at the undergraduate level (Box 5.2).

Data literacy can be achieved in a variety of ways; ideally, there are multiple pathways to data literacy that accommodate all learners and that include concepts common to many introductory data science and data literacy courses. For example, during the April 2020 workshop on “Improving Defense Acquisition Workforce Capabilities in Data Use,” Matthew Rattigan (University of Massachusetts Amherst) stated that data literacy includes exploratory data analysis skills, data visualization and summarization skills, familiarity with experimental design, conceptual comparison of prediction vs. causality, the scientific method, and an understanding of data ethics, fairness, and transparency. While this differs slightly from the National Academies definition in Box 5.2, there remains a strong emphasis on how data are collected, stored, and used as well as how to communicate with data.

As data literacy continues to proliferate and relevant training becomes more widely available in higher education, or becomes available within the DoD education and training community, the acquisition workforce should be given similar opportunities.

Six Roles for the Data Life Cycle

Workforce-wide data literacy is necessary but not sufficient for improved data use within the defense acquisition workforce. To see why, it is important to understand the data, analyses, and, most importantly, decisions that are made in acquisition as discussed in Chapter 4. For example, Anton et al. (2019a) argue that teams are required to make these decisions and that members of the team have different levels of technical skills. In January 2020 testimony to the National Academies, Sallie Ann Keller noted that “data science is a team sport” and “there are many levels of data acumen.” Further discussion of team structures and roles is in the section below titled “Team Structures for Data Science.” Across industry and government, and as taught in academia, executing the data life cycle generally requires multiple roles with varying skillsets. Keller and Bethany Blakey alerted the committee to notable data science roles at the National Institutes of Health in their testimony in January and March 2020.

The defense acquisition workforce may not find the following six job titles in its workforce, but the six roles that follow describe the data-related skillsets required to support a project utilizing the data life cycle.

Data Engineers

Data engineers are specialists with a technically specialized skillset. Typically, there are few data engineers within an organization. They tend to be matrixed across teams or deployed to support critical domain teams. Data engineers are broadly responsible for supporting the best methods, tools, and interfaces to prepare data and make it accessible. They establish and curate data sets so that they are maintained and available for subsequent analysis and decision making. They create the data, computing, networking, memory, and storage platforms (often called the “data enterprise”). They support cloud and on-premises storage and computing, manage data security, develop systems specifications, implement system interfaces to access, retrieve, and process data, and support the data architecture implementation using the big data tools identified for the data analytics platform that allows users to access, integrate, and store data.

Data Scientists

Data scientists are also specialists. They sometimes lead a data science team and have a technically specialized skillset and deep analytics knowledge. Their skillset spans the data life cycle, with emphasis on advanced techniques for data collection, curation, management, analysis, and visualization. Like data engineers, organizations employ few data scientists, and they tend to be matrixed across teams or deployed to support critical domain teams. Because debate continues regarding skills needed to be identified as a data scientist, data science degrees from different institutions can equip students with very different (but useful) skills. In general, students from data science programs will learn mathematical, computational, and statistical foundations; data management and curation; data description and visualization; data modeling and assessment; workflow and reproducibility; communication and teamwork; domain-specific considerations; and ethical problem solving. Their academic experience will typically require exposure to real-world problems, data, tools, and ethical considerations with different programs sometimes aligning with a specific domain expertise.

The committee emphasizes that the role of data scientists will likely not be ubiquitous within the acquisition workforce. Expectations for skills for data scientists have evolved significantly over the past decade. Early on, as is noted in Chapter 3, there were discussions about what skillsets defined data science and whether it was a unique field of study or a natural evolution of data-intensive domains such as statistics or computer science. With these discussions came a reckoning (and a slow acceptance) that the expectations for data scientists were different and often more interdisciplinary than what members of the U.S. workforce saw in traditional academic programs. Appendix D details the skillset needed to achieve mastery of data acumen presented in the National Academies 2018 report Data Science for Undergraduates: Opportunities and Options.

Finding 5.4: Data scientists are experts across the data life cycle, with special emphasis on advanced techniques for collection, curation, management, analysis, and visualization.

Data Analysts

Across the acquisition workforce, staff use data to conduct analyses in support of acquisition functions. These analysts are commonly embedded in domain teams. In acquisition, a domain team might be an aircraft cost estimating team, or a contractor-specific quality assessment team, or a missile test and evaluation team, for example. Data analysts on these teams have skills in the analysis, visualization, interpretation, and communication of data. They obtain, clean, and transform data in preparation for specific analysis. Analysts compute summary and descriptive statistics (e.g., measures of frequency, central tendency, variation, and position), perform statistical modeling (including contingency tables and regressions), and apply more advanced statistical and machine learning methods to calculate estimates and uncertainties. They create both static and time-dependent visualizations. They implement sampling strategies, design surveys and statistically efficient experiments, and communicate analytic results in a specific domain. Within DoD, data analysts are often aligned with specific acquisition functions such as cost estimating and pricing, test and evaluation, and logistics.
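The summary and descriptive statistics named above can be illustrated with a short sketch using only the Python standard library. This is an illustration under stated assumptions, not an analysis from this report: the `describe` helper is hypothetical, and the input values are invented numbers standing in for something like schedule slips measured in months.

```python
from collections import Counter
import statistics


def describe(values):
    """Summarize a list of numbers with the four families of measures
    named in the text: frequency, central tendency, variation, position."""
    return {
        "count": len(values),
        "frequency": Counter(values),                   # frequency
        "mean": statistics.mean(values),                # central tendency
        "median": statistics.median(values),            # central tendency
        "stdev": statistics.stdev(values),              # variation (sample)
        "quartiles": statistics.quantiles(values, n=4)  # position
    }


# Invented illustrative data, e.g., schedule slips in months.
slips = [2, 4, 4, 5, 7, 9, 11]
summary = describe(slips)

for name, value in summary.items():
    print(name, value)
```

Note that `statistics.stdev` computes the sample standard deviation and `statistics.quantiles` uses its default "exclusive" method; an analyst would choose population or sample formulas, and a quantile method, to suit the data at hand.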

Data Users

Staff across the acquisition workforce utilize data in their day-to-day work, whether they are generating, collecting, or interpreting data to create value for the organization. In the context of their specific roles, data users require a baseline set of data science skills, that is, data literacy, including familiarity with basic terminology, understanding data limitations and their effect on decisions, and the ability to read charts and graphs. Acquisition program managers, contracting officers, and negotiators are just a few of the many acquisition professionals who use data on a regular basis.

Domain Experts

Domain experts have mission-related subject-matter expertise, with skills contextual to their area of interest. Domain experts are key team members who work with data scientists, engineers, and analysts to tailor collection, management, and analyses to support key decisions. Skills also include interpreting, disseminating, and communicating results from the data life cycle. Domain experts understand the context of the data in their domain, where it comes from, what it means, and how it is used. For example, a contract or program expert will understand schedule or contractor cost performance data associated with their program. A logistician may understand data associated with supply chains or distribution channels. A test and evaluation expert will know how to instrument a missile in order to collect telemetry during missile performance.

Leaders and Decision Makers

Leaders and decision makers have or need data-related skills that fall into three broad categories: building a culture that values data; governing, managing, and protecting data; and promoting efficient and appropriate data use. Additionally, leaders and decision makers need skills in data visualization, interpretation, and communication. Leadership skills also include identifying data needs, championing data use and shaping institutional cultures to embrace data, prioritizing and funding data governance, recognizing the value of data assets, aligning quality with intended use, increasing the capacity for data management and analysis, and understanding limitations and uncertainty. The section below titled “Team Structures for Data Science” addresses the skills for supervision and management of data science projects and teams.

TOOLSET

For the DoD acquisition system to achieve the benefits of an empowered workforce in data use, it must have access to the full range of data science capabilities—from data generation, collection, and curation to data analysis, visualization, and dissemination.

DoD has capabilities in data analysis and analytic tools, especially through analytic organizations such as the Office of Cost Assessment and Performance Evaluation (CAPE); the Center for Army Analysis; the Office of the Chief of Naval Operations Assessment Division (N81); the Air Force Chief Analytics Officer (A9); the Office of Acquisition, Analytics and Policy in the Office of the Under Secretary for Acquisition and Sustainment; the Office of People Analytics in the Office of the Under Secretary for Personnel and Readiness; and the Analytics Center of Excellence in the Defense Logistics Agency. Analysts are embedded with leaders in military and civilian organizations and entities throughout DoD, including acquisition commands and program offices. In addition, DoD has ready access to additional analytic support through outside organizations, including FFRDCs, University Affiliated Research Centers (UARCs), and contractors in the defense industrial base and the academic research community.

While DoD has had a chief information officer for decades, DoD has not had enterprise-level leaders dedicated solely to the collection, storage, or curation of data. In the past few years, as is noted earlier, DoD has established new positions for CDOs, with responsibility for managing DoD data assets, including the standardization of data format, the sharing of data assets, and the development of common, usable, defense-wide data sets. However, the acquisition data needs of a department in which more than 150,000 designated acquisition personnel (along with others in support of acquisition) in hundreds of organizations spend more than $300 billion annually2 are unlikely to be addressed by a single official who is largely limited to trying to persuade others to invest in data resources and data sharing. Significant investment, on the order of the billions of dollars a year that DoD currently spends trying to achieve an auditable financial statement, will likely be needed to develop the structured databases and modern analytic tools needed to build a modern data-centric environment inside DoD. Estimates of total spending on modern data-centric environments or data systems are difficult to obtain because of how budgets are tracked. However, Anton et al. (2019b) estimated that DoD budgeted about $0.5 billion for major acquisition IT systems and about $2.5 billion for logistics and supply-chain management systems in FY 2017 based on public Select and Native Programming—Information Technology (SNaP-IT) budget-request exhibits. This constitutes a reduction from the $4 billion or so budgeted for these systems going back to FY 2008 (Anton et al. 2019b). DoD recently established a new ADVANA platform for assembling critical data sets and making them available for analysis, but far more investment is still needed to bring comprehensive data sets into the system and to make them available for data analysis.

___________________

2The enacted DoD budget in FY 2020 included $248 billion for Research, Development, Test, and Evaluation (RDT&E) and Procurement (Office of the Under Secretary of Defense (Comptroller)/Chief Financial Officer 2020). In addition, acquisition includes spending a significant portion of the $290 billion budgeted for Operation and Maintenance and $17 billion for Military Construction for FY 2020.

TEAM STRUCTURES FOR DATA SCIENCE

From the section “Six Roles for the Data Life Cycle,” we know that executing the data life cycle is a collaborative effort. Indeed, a data science project generally requires teamwork and coordination among the team’s six data roles. Within a single organization, teams will differ—often dramatically. So, what constitutes a team for a data science project? How are they structured? How are they led?

When leveraging data as a strategic asset, it might be tempting to think that everyone needs to be a data scientist or data engineer. These technical roles are indeed critical for executing the data life cycle. However, data scientists and data engineers require advanced training, and there is and will continue to be a shortage of both (Davenport and Patil 2012) in the U.S. workforce broadly. Accordingly, organizations struggle to find data scientists and data engineers within their workforces or to hire them. Although it may not need many, the defense acquisition system, like every organization, will need data scientists and data engineers, and it will encounter challenges in employing them. Options for increasing the number of data engineers and data scientists within an organization include upskilling, hiring, and contracting. Outside contractors can provide added capabilities, but data restrictions and inherently governmental functions could limit their use.

While all six roles are critical to executing the data life cycle in defense acquisition communities, the vast majority of the data-centric workforce will not need complex technical skills. Typically, organizations of all types and sizes should have many data analysts, data users, and domain experts and fewer data engineers, data scientists, and leaders/decision makers. For DoD, the vast majority of the defense acquisition workforce can be identified as data analysts, data users, or domain experts, though these identifications may change from project to project or with additional training.

In Table 5.1, the committee maps the data life cycle from Chapter 3 to the six roles and their skills from this chapter. Here, the committee notes that data scientists often oversee and support the entire process. The committee also notes that two or more roles overlap at every phase, and that data users, domain experts, and leaders/decision makers are involved in similar phases, albeit at the level of data literacy rather than at a deeper level of acumen or mastery.

TABLE 5.1 Workforce Roles within the Data Life Cycle

Phases in the Data Life Cycle (column headers): Question | Define | Coordinate | Generate | Collect | Curate and Manage | Analyze | Visualize | Disseminate and Interpret | Assess

Role | Entries under the phase columns | Data Science Capability Level
Data engineer | implementation specialist | acumen
Data scientist | assess/refine; science and theory | deeper capabilities
Data analyst | assess/refine | deeper capabilities
Data users | assess/refine | literacy
Domain experts | pose |
Leaders, decision makers, and managers | pose |

Leaders and decision makers, including those with non-technical backgrounds or training, can effectively manage data science projects by using common strategies and approaches for managing collaborative, cross-functional projects that include technical aspects. However, supervising and managing data science projects also requires:

  • Valuing data and analysis in decision making;
  • Valuing and understanding the power and limitations of data use and analysis;
  • A basic familiarity with the data life cycle;
  • An ability to argue, make a case, or tell a story using data or analytic results (visualization);
  • An ability to understand and interpret data and analytic results, and to communicate them to various audiences; and
  • An aptitude for asking the right questions of the team, both to determine which data and analytics are viable and to inform a decision.

Leaders and decision makers should routinely ask questions such as the following: What are the key questions that need to be answered? What data do we need? What data do we have available to us? How can we access these data and what are the related challenges? Are there data ethics, privacy, and security concerns, and if so, how might we address them? What is the quality of our collected data? What are the data’s limitations and opportunities? What are the data telling us, and how do you know? Are there limitations to inferences that we make from this project? What is the uncertainty involved in this analysis? If the questions are not answered, what additional data are needed to answer those questions?

Leaders must identify the critical data skills needed for their organization and each project by assessing current staff capacity, performing a data skills gap analysis, identifying ways to meet those needs, and making investments to right-size the team.
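
A data skills gap analysis of the kind described above can be sketched in a few lines of Python. The sketch is illustrative only: the role names follow the six roles in this chapter, but the `skills_gap` function, its input format, and all staffing numbers are hypothetical assumptions, not figures or methods from the committee.

```python
from collections import Counter

# The six data roles named in this chapter.
ROLES = [
    "data engineer",
    "data scientist",
    "data analyst",
    "data user",
    "domain expert",
    "leader/decision maker",
]


def skills_gap(current_staff, required):
    """Return, per role, required head count minus current head count.

    A positive value is a shortfall; a negative value is a surplus.

    current_staff: list of role names, one entry per staff member
                   (hypothetical input format).
    required:      dict mapping role name -> head count the project needs.
    """
    have = Counter(current_staff)
    return {role: required.get(role, 0) - have.get(role, 0) for role in ROLES}


# Hypothetical project team and requirement, for illustration only.
team = [
    "data analyst",
    "data analyst",
    "domain expert",
    "data user",
    "leader/decision maker",
]
need = {
    "data engineer": 1,
    "data scientist": 1,
    "data analyst": 2,
    "data user": 1,
    "domain expert": 2,
    "leader/decision maker": 1,
}

gap = skills_gap(team, need)

# Keep only the roles where the team falls short of the requirement.
shortfalls = {role: n for role, n in gap.items() if n > 0}
print(shortfalls)
```

Roles with a positive gap are candidates for the options noted earlier in this chapter: upskilling, hiring, or contracting.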

There are a variety of structures for teams executing the data life cycle, each intended to foster collaboration and the creation and dissemination of knowledge throughout the organization. As outlined in Box 5.3, the optimal team structure will depend on the structure and goals of the enterprise or organization.

In a centralized model, data scientists and data engineers are grouped together and often serve at least one of the following two functions: (1) an innovation or research hub that supports the development of new prototypes or processes, or (2) an applied support center whose members are assigned to different functions for a fixed period of time or for a project. In these roles, data scientists and data engineers may need to adjust to several domains and business contexts and to emphasize communication skills. However, being centralized with other data scientists and data engineers can better support leveraging data for enterprise-wide innovation.

In an embedded model, each division or group within an organization hires its own data scientist(s). Here, data scientists generally have domain knowledge acquired over time spent in the division, which increases their effectiveness and the quality of their support. However, tools, resources, and practices tend to be heterogeneous across the organization, which can make it more difficult to innovate in data practices or to optimize the use of data when siloed data scientists are in limited communication with one another. Hybrid models, of course, exist, and organizations customize models to meet their needs.

However DoD or individual military departments or acquisition programs choose to organize their data science teams, the acquisition workforce should have a collective data life cycle mindset, skillset, and toolset. Each member of the workforce will need a well-defined role within the data life cycle for any given project, and a set of accompanying skills for that role. Teams will need to be established, customized, nurtured, and managed such that data use is embedded in all acquisition processes. Leaders will need familiarity with the data life cycle, management skills that optimize their data science talent, and a commitment to data-informed decision making. Data science has evolved and will continue to evolve at a rapid pace, and so too will best practices and approaches for supporting a workforce empowered to use data. These recommended workforce characteristics and structures reflect current best practices and approaches in industry, government, and academia.

This chapter concludes with Box 5.4, which revisits the defense acquisition examples first introduced in Chapter 1, with additional focus on the data-related skills and roles necessary in each situation that contributed to its success.

REFERENCES

Anton, P.S., M. McKernan, K. Munson, J.G. Kallimani, A. Levedahl, I. Blickstein, J.A. Drezner, and S. Newberry. 2019a. Assessing Department of Defense Use of Data Analytics and Enabling Data Management to Improve Acquisition Outcomes, Santa Monica, CA: RAND Corporation, RR-3136-OSD, August. https://www.rand.org/pubs/research_reports/RR3136.html.

Anton, P.S., T. Conley, I. Blickstein, A. Lewis, W. Shelton, and S. Harting. 2019b. Baselining Defense Acquisition, Santa Monica, CA: RAND Corporation, RR-2814-OSD. https://www.rand.org/pubs/research_reports/RR2814.html.

Bradford, L. 2018. “Why All Employees Need Data Skills in 2019 (and Beyond).” Forbes. October. https://www.forbes.com/sites/laurencebradford/2018/10/11/why-all-employees-need-data-skills-in-2019-and-beyond/?sh=6329b8c6510f.

Conlin, M. 2020. Presentation at the Workshop on Improving Defense Acquisition Workforce Capability. Virtual. April 14.

Davenport, T.H., and D.J. Patil. 2012. “Data Scientist: The Sexiest Job of the 21st Century.” Harvard Business Review. https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century.

“Defense Acquisition Workforce Key Information, Overall,” briefing, Washington, D.C., FY20Q3 (30 June 2019). https://www.hci.mil/docs/Workforce_Metrics/FY20Q3/FY20(Q3)OVERALLDefenseAcquisitionWorkforce(DAW)InformationSummary.pptx.

Hadziristic, T. 2017. “The State of Digital Literacy in Canada: A Literature Review.” Brookfield Institute for Innovation + Entrepreneurship. April.

Huynh, A., and A. Do. 2017. “Digital Literacy in a Digital Age.” Brookfield Institute for Innovation + Entrepreneurship. August. https://brookfieldinstitute.ca/wp-content/uploads/BrookfieldInstitute_DigitalLiteracy_DigitalAge-1.pdf.

Kelly, W. 2018. “Being Digitally Savvy in a Digital World.” Rural Development Institute. February 26. https://medium.com/@rdi_77976/being-digitally-savvy-in-a-digital-world-b7bb291be85f.

MediaSmarts. 2016. “Digital Literacy Fundamentals.” http://mediasmarts.ca/digital-media-literacy-fundamentals/digital-literacy-fundamentals.

NASEM (National Academies of Sciences, Engineering, and Medicine). 2018. Data Science for Undergraduates: Opportunities and Options. Washington, DC: The National Academies Press. https://doi.org/10.17226/25104.

NASEM. 2016. Barriers and Opportunities for 2-Year and 4-Year STEM Degrees: Systemic Change to Support Students’ Diverse Pathways. Washington, DC: The National Academies Press. https://doi.org/10.17226/21739.

NewVantage Partners. 2019. “Big Data and AI Executive Survey 2019: Executive Summary of Findings.” NewVantage Partners. http://newvantage.com/wp-content/uploads/2018/12/Big-Data-Executive-Survey-2019-Findings.pdf.

Orad, A. 2020. “Why Every Company Is a Data Company.” Forbes. https://www.forbes.com/sites/forbestechcouncil/2020/02/14/why-every-company-is-a-data-company/?sh=6ce301ef17a4.

Serbu, J. 2020. “Pentagon Racing to Establish New Chief Data Officer Within CIO’s Office.” Federal News Network. January 28. https://federalnewsnetwork.com/defense-main/2020/01/pentagon-racing-to-establish-new-chief-data-officer-within-cios-office/.

Van Deursen, A.J.A.M., E.J. Helsper, and R. Eynon. 2014. Measuring Digital Skills. From Digital Skills to Tangible Outcomes Project Report. https://www.oii.ox.ac.uk/research/projects/?id=112.

Wagner, E. 2019. “OPM Announces New ‘Data Scientist’ Job Title.” Government Executive. July 1. https://www.govexec.com/management/2019/07/opm-announces-new-data-scientist-job-title/158139/.

Yueh, J., and R. Bean. 2018. “Every Company is a Data Company.” Forbes. https://www.forbes.com/sites/ciocentral/2018/09/26/every-company-is-a-data-company/#25f21bcd5cfc.
