The workshop devoted a session to the role of individuals and teams in the innovative process, which has been less studied than the role of institutions and organizations. How people’s education, entrepreneurial talents, and other human capital characteristics result in innovation is a vast and largely unresolved research area. Linkages among individuals and between institutions are also important to consider.
Rajshree Agarwal (University of Maryland) spoke on improving understanding of how new product innovations interact with the career profiles of the individuals who create them. She commented that entrepreneurship is not a destination, but a step in a longer career lifecycle, and thus the effect of entrepreneurial firm fates on individual career lifecycles needs more attention if it is to be fully understood. In her view, future research would benefit from a focus on human capital markets, which requires combining demand and supply factors to examine life-cycle choices and also considering selection effects as they relate to optimal allocation and reallocation of talent. She also suggested merging individual-level career and knowledge innovation datasets to identify systematic sources of bias and to answer questions about what kinds of people firms hire, keep, and let go, and how these decisions affect career profiles.
Agarwal observed that few studies have investigated how entrepreneurship affects long-term career lifecycles (beyond the new venture directly). There are serial entrepreneurs, individuals who choose to leave and then return to a firm as an employee, and founding teams composed of individuals. One key research question is how the fates of new ventures affect long-term career outcomes of entrepreneurs and founding team members.
Research often focuses on the product market, where primacy is given to demand conditions, and on individual career choices, where primacy is given to preferences and incentives. A singular focus on either one of these ignores the fact that human capital markets need to clear. More work is needed to understand the role of mobility and entrepreneurship in the allocation and reallocation of talent. Agarwal noted that several factors, on both the demand side and the supply side, impact mobility and entrepreneurship. There are protection mechanisms and costs associated with mobility across organizations, even startups. Mobility and entrepreneurship are often impacted in the same direction, but sometimes the effects are divergent. For example, collusion among high-tech firms not to poach talent may reduce mobility, but it may enhance entrepreneurship. Little is known about how other factors (listed in Table 4-1) impact mobility and entrepreneurship. For example, she queried, how does family composition affect individuals’ mobility and entrepreneurship decisions, and the associated wage outcomes? Where do other social processes factor in?
TABLE 4-1 Factors Affecting Mobility and Entrepreneurship in Human Capital Markets
Competition versus cooperation in firm interactions
Knowledge contexts: entrepreneurship by users, employees, and academies
Individual preferences for job attributes
Individual preferences for/against entrepreneurship
SOURCE: Workshop presentation by Rajshree Agarwal, May 19, 2016.
Agarwal argued that mobility, entrepreneurship, and innovation datasets need to be linked to study how entrepreneurship and innovation relate to career lifecycles. Figure 4-1 provides an example of the kind of linkages she said would be helpful in answering a range of important questions, such as:
- What types of bias impact patent-based measures of mobility?
- Are inventors more productive when they take co-workers with them?
- Is noninventor mobility a source of knowledge diffusion?
- How much are inventors able to appropriate from a patent their employer owns? Do they see long-run wage/career impacts?
- Is there evidence that inventors are “stealing” ideas?
- Are inventors filing patents soon after they leave for a startup?
Lee Fleming (University of California, Berkeley) continued the discussion of patent data, new metrics, and data linkages by asking the question, “How can we be more clever in using our data?” He offered several observations to motivate this question:
- Patent data are overused and abused; they are very easy to observe, but not necessarily the right thing.
- Researchers need to stop relying solely on patent counts and citations to measure innovation—richer measures are not that hard to calculate.
- Advances in machine learning and natural language processing are useful, though they need thoughtful application.
- Newly available data and tools provide opportunity to advance understanding of innovation.
In developing these themes, Fleming posed several research questions, the first of which was what happens when an individual changes fields. One viewpoint, standard among economists, is that a person gives up expertise in his or her old field, which is typically harmful to earnings. A second viewpoint, emphasized by Thomas Kuhn (1962, pp. 89-90), is that “almost always the men who achieve these fundamental inventions of a new paradigm have been either very young or very new to the field whose paradigm they change.”
Fleming argued that to answer his question, novelty must be separated from value. Some products may be quite novel yet provide relatively little in terms of market value. He hypothesized that the patents invented by those who change fields are likely on average to be more novel and less valuable. One alternative for testing this hypothesis is to use citations, but because they have been used to measure both value and novelty, they are unlikely to separate the two; another alternative is to use a machine learning technique that enables hundreds of papers to be traced to inventions and inventors.
Easily accessible data on patent renewals can be used to create measures of value. But, Fleming asked, what about novelty? Looking back through the patent corpus, one can identify where key words first appear and weight them by their future use. This can be done, for example, with terms like “nontransitory,” a limitation put in mainly in the software patent field, and other important words, such as “browser and computer executable,” “http,” “java,” and so on. Combining information on new words that show up in patents with person-level data, it is possible to demonstrate that new entrants are more likely to invent patents with new words, and less likely to invent patents that are renewed. This kind of research question would be very difficult to answer using only traditional kinds of data and analytic methods.
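The word-based novelty idea described above can be illustrated with a minimal sketch. It is not the method used in the research Fleming described: the input format (tuples of patent identifier, year, and text), the function names, and the simple "count of later patents using the word" weighting are all assumptions made for illustration, and a realistic version would at least filter out common words.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Return the set of lowercase alphabetic words in a patent text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def novelty_scores(patents):
    """Score each patent by the later use of the words it introduced.

    patents: iterable of (patent_id, year, text) tuples (hypothetical format).
    A word is credited to the earliest patent containing it; ties within a
    year are broken by patent_id, which is an arbitrary assumption.
    """
    ordered = sorted(patents, key=lambda p: (p[1], p[0]))
    first_use = {}                 # word -> patent_id that introduced it
    later_use = defaultdict(int)   # word -> number of later patents using it
    for patent_id, _year, text in ordered:
        for word in tokenize(text):
            if word not in first_use:
                first_use[word] = patent_id
            else:
                later_use[word] += 1
    scores = {patent_id: 0 for patent_id, _year, _text in ordered}
    for word, patent_id in first_use.items():
        scores[patent_id] += later_use[word]
    return scores
```

Joining such scores to person-level data, for example by inventor identifier, would be the step needed to compare new entrants with incumbents in a field; that join is not shown here.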
Fleming next posed a research question about how governance influences innovation. The decade of the 2000s provided a fertile case study period for considering this question because governance was strengthened through legislation, such as the Sarbanes-Oxley Act of 2002, that forced firms to adopt more independent oversight. Stronger governance could increase innovation, because it requires greater focus and effort; on the other hand, it could inhibit creativity and risk-taking. In the finance literature, Fleming stated, patents, patent counts, and patent citations have not been utilized very successfully to address this question.
Research by Balsmeier et al. (forthcoming) using Sarbanes-Oxley as an instrument finds that the signals are clear: Increased governance makes firms more productive by some measures—for example, they get more patents, although these tend to be in areas of technology where they have previously been patenting. But, the authors found, increased governance appears to lead to cutbacks on exploration and willingness to try new things—fewer patents are issued to firms in new technology classes.
Analyses relying purely on counts and citations miss the kind of details described above, Fleming asserted. The advance by him and his collaborators came with the realization that an instrument could be developed using a principal components analysis to break out two components of innovative activity: exploitation and exploration. Exploitation captures portfolio measures of firms patenting in known classes and staying close to their previous technological positions (with inventors getting older). Exploration captures firms entering new patent classes (with inventors getting younger). The measure subsequently produced in this research allows for a much deeper and richer look at firms’ patenting portfolios than would be allowed by simply counting patents.
Looking at all of the firms in their study (see Figure 4-2), on average, exploration appears to slow considerably over the firms’ lifetimes. They inevitably evolve toward patterns of more exploitation as they age.
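To make the exploration concept concrete, the sketch below computes a much simpler proxy than the principal components measure described above: for each firm and year, the share of patents filed in technology classes the firm had not patented in during any earlier year. The record format and the function name are assumptions made for illustration, and this proxy should not be read as the measure used in the cited research.

```python
from collections import defaultdict


def exploration_share(patents):
    """Share of each firm-year's patents that fall in classes new to the firm.

    patents: iterable of (firm, year, patent_class) tuples (hypothetical format).
    In a firm's first observed year every class is new, so the share is 1.0.
    """
    by_firm_year = defaultdict(list)
    for firm, year, patent_class in patents:
        by_firm_year[(firm, year)].append(patent_class)

    seen = defaultdict(set)   # firm -> classes patented in during earlier years
    shares = {}
    for firm, year in sorted(by_firm_year):
        classes = by_firm_year[(firm, year)]
        new_count = sum(1 for c in classes if c not in seen[firm])
        shares[(firm, year)] = new_count / len(classes)
        seen[firm].update(classes)
    return shares
```

A share that declines as a firm ages would be consistent with the drift from exploration toward exploitation described above, although a proxy this coarse ignores technological distance between classes and inventor characteristics.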
A third research question posed by Fleming was whether changes in innovation components in the larger economy and the business cycle can be seen. He noted the literature indicating that during downturns in the business cycle, there should be more innovation. The reasoning is that firms are busy making money during the upturns, which increases the opportunity cost of taking risks and exploring. However, in analyses of research and development (R&D) and patenting data, the signals are not clear. In Fleming’s view, greater detail is required to discern more definitively that exploration goes up in a downturn and exploitation becomes stronger during an upturn.
Fleming concluded that (1) data on patent counts are (almost) dead as a means to measure innovation; (2) advances in machine learning and natural language processing are valuable for this purpose, although their use requires thoughtful application—crowdfunding is an example of a fertile area; and (3) newly available data and tools provide opportunity—for example, studying real-time inventor networks or exploiting natural language processing to see parts of the innovation economy that are not visible through patenting—particularly when analyzed in collaboration with computer scientists.
Alfonso Gambardella (Bocconi University) discussed the relationship between managerial practices and incentives for research and innovation, and what could be learned from experiments, from large and systematic computerized databases, and from surveys. Researchers have known for a long time that managerial practices are important in terms of performance and productivity (e.g., Bertrand and Schoar, 2003; Bloom and Van Reenen, 2007).1 Less is known specifically about the impact of management on innovation. Gambardella referred to preliminary work by Manso (2011), Azoulay et al. (2011), and Ederer and Manso (2013), which suggests the relationship is also important. Consistent with Fleming’s observation about the business cycle, Gambardella explained that Manso found evidence that short-term rewards and penalties discourage exploration. However, managerial practices have the capacity to offset this by creating an environment that tolerates short-term failures and favors long-term compensation schemes, which suggests that special managerial practices are needed to nurture exploration as opposed to exploitation. Manso and colleagues argue that most pay-for-performance schemes are not ideal for promoting creative work because they penalize short-term failures and favor short-term success. Implementing such schemes for a CEO may not be conceptually difficult, because compensation can be linked to longer-term performance of the firm. But for midrange employees such as researchers, who actually produce the innovations, it is harder to create verifiable measures of their long-term performance; variables such as stock performance of the firm are too distantly related. He said other instruments and practices are needed.
Gambardella noted that very little is known about what managerial practices are effective for promoting innovation, in part because of some fundamental conceptual hurdles to overcome. It is well established that creative people like independence (Gagné and Deci, 2005; Bartling et al., 2014) and dislike pay for performance (Amabile, 1996). For example, scientists “pay to be scientists” (Stern, 2004), in the sense of accepting lower paying jobs that promise scientific work, although to a varied extent (Sauermann and Roach, 2014). It is also known that motivation matters for innovation (Sauermann and Cohen, 2010). Gambardella said findings from this literature suggest that policies about the independence and autonomy of individuals are candidate tools for improving innovation performance at the level of managerial practice. Nicola Lacetera (2009), for example, found that autonomy is a powerful device for increasing scientists’ incentives to supply productive effort, especially when their objectives and priorities are not aligned with those of top management. Using the National Center for Science and Engineering Statistics (NCSES) Scientists and Engineers Statistical Data System (SESTAT),2 Sauermann and Stephan (2013) found that 61 percent of respondents in scientific and engineering industries value independence highly versus 81 percent in academia.
Given that inventors value autonomy, Gambardella et al. (2015) argue that firms can use autonomy as a tool to motivate their employees, especially when output cannot be used to measure innovation performance. Data from the PatVal European patent inventor survey (Torrisi et al., 2016) provide consistent evidence. In particular, Gambardella et al. (2013) show that firms provide more autonomy to inventors on projects in which they are less motivated.
... practice explains 10 to 23 percent of the interquartile difference in total factor productivity across firms.
To better understand the role of management practices in innovation, Gambardella concluded that more data are needed that follow cohorts of workers at science and engineering (S&E) firms so that the role of practices related to tolerance for failure, long-term versus short-term rewards, and autonomy on innovation and performance can be assessed. Scenario-based experiments—for example, asking informed parties how they would respond under different scenarios—and field experiments are also needed.
Paula Stephan (Georgia State University) spoke about measuring the flows of highly skilled individuals to firms. It is common for researchers to track the contribution of universities to innovation in terms of patents, licenses, and startups; it is less common to focus on the placement of highly trained individuals with firms, despite the fact that individuals are a powerful way of transmitting information, especially tacit knowledge. She quoted J. Robert Oppenheimer, who wrote in 1948, “The best way to send information is to wrap it up in a person.”
While difficult in the past, it is now possible to identify the placements of highly trained people, such as Ph.D.s, with individual firms by matching university-maintained records and U.S. census data. The approach was demonstrated in a “proof of concept” paper in Science (Zolas et al., 2015). Under strict confidentiality protocols, researchers matched data from UMETRICS to census data on employers for a sample of recent Ph.D. graduates from eight universities (Indiana, Iowa, Michigan, Minnesota, Ohio State, Purdue, Penn State, and Wisconsin) who had been supported by grants as students.3 Linking student grant data with data on dissertations (ProQuest) and census records, the researchers were able to track placements of 1,983 recent Ph.D.s. Additional linkages could, in principle, be made. For example, matching the NCSES Survey of Earned Doctorates with census data could allow stay rates of graduates in firms to be estimated.
3 The UMETRICS (Universities: Measuring the Impacts of Research on Innovation, Competitiveness, and Science) project collected administrative data on federal research funding and private funding (on some campuses) for 11 universities. In January 2015, the Institute for Research on Innovation and Science (IRIS) was established at the University of Michigan to manage and expand UMETRICS; see http://iris.isr.umich.edu/about/ [August 2016].
Results indicate that recently minted Ph.D.s who go into industry tend to be employed at larger, high-wage firms. As shown in Figure 4-3, the distribution of payroll per worker at the establishments where these graduates work is much more skewed than it is for all U.S. establishments or R&D establishments. The placement of doctoral recipients supported by grants also varies by field. As expected, engineers are most likely to go into industry, followed by math and computer scientists; only a very small percentage go to young firms. The data also permit annual earnings distributions to be estimated (from Unemployment Insurance Earnings Records) and compared across government, academic, and industry sectors (industry has the highest earnings), or across disciplines (engineering is highest).
Stephan observed that these kinds of data-linking projects open up a wide range of research opportunities—for example, they allow modeling of how knowledge stocks embedded in human capital contribute to innovation and performance at the establishment level. Research can examine such questions as how different types of support received by Ph.D. students relate to employment outcomes and the role in job placement of social networks between universities and establishments, or across firms and establishments.
Stephan also noted the importance in the global economy of measuring the international mobility of highly trained individuals. Research suggests that internationally mobile scientists and engineers contribute disproportionately to innovation. But, she cautioned, very little is known about how various outcomes are impacted by international mobility. For the United States, the NCSES Survey of Doctorate Recipients (SDR) can be used to follow Ph.D.s who stay in the country by field. A major effort is now under way to follow individuals who receive Ph.D.s in the United States and move, but she said it is much more difficult to follow individuals who arrive for postdoctoral training with a Ph.D. in hand. It is also extremely difficult to compare mobility of scientists and engineers in the United States with patterns outside the United States. Virtually no country has data on emigrant scientists, and there is little empirical evidence to compare the performance of mobile scientists across countries. This, Stephan suggested, is an area where improvement of data should be a priority.
To begin addressing this data gap, Stephan (along with colleagues Chiara Franzoni and Giuseppe Scellato) developed a survey of Italian scientists in four fields who had migrated to 16 countries. Details of the survey, called GlobSci, are summarized in Van Noorden (2012). Stephan said the major advantages of the survey over existing alternatives include the following:
- It is possible to track mobile researchers who returned to their country of origin (if included in the 16 core countries) or who emigrated to a core country.
- Data on “entry point” of foreign born (e.g., Ph.D., postdoc, faculty) are captured.
- Numerous individual-level controls are in place.
- Bibliometric measures of focus articles are generated.
Limitations include a lack of data for China and South Korea, a questionnaire that covers only four fields, and a snapshot for only 1 year (2001). In addition, statistics reflect outbound mobility only to the 16 countries, and there is no information on “quality” of scientists.
The findings from the GlobSci survey indicated that patterns of mobility vary considerably across the 16 countries studied. The major reasons individuals return to their home country are family and personal, not disparate opportunities. Additionally, mobile scientists were found to be more likely to establish international links, have links with larger numbers of countries, and exhibit superior performance on international collaborations. They were also more productive than nonmobile scientists and returnees, and results persist even after instrumenting for mobility. Finally, graduate students and postdoctoral fellows are drawn to study in the United States disproportionately because of reputation of institutions, financial support, and perceptions of how U.S. study affects their career paths. Lifestyle issues were found to be a discouraging factor for studying in the United States.
Stephan closed by noting that surveys can take the research into international mobility of scientists and engineers only so far. She encouraged NCSES to continue to work with agencies in other countries to develop systematic ways to collect data providing consistent longitudinal information about internationally mobile scientists and engineers. More generally, she added, more thought needs to be given to how to benchmark U.S. data with data from other countries on the production and movement of human capital.
During open discussion, Rosemarie Ziedonis (Boston University) added that using individual-level data was also important for tracking the mobility of scientists and engineers, both career-wise and employment level-wise, across states and regions within the United States.
Jason Owen-Smith (University of Michigan) elaborated on some of Stephan’s themes with a presentation on the use of university administrative data for studying the research process, specifically the role of network creation in enhancing productivity. His focus was on people as the “vectors of innovation, science, entrepreneurship” and on the structures and processes underlying scientific discovery and training in university settings. He explained that the Institute for Research on Innovation and Science (IRIS), of which he is executive director, serves as a “global source for data to support fundamental research on the results of public and private investments in discovery, innovation, and education.” He noted IRIS offers useful opportunities for expanding the kinds of administrative data discussed by Stephan and others.
A standard theory of innovation and discovery is that these processes involve a combining of knowledge or techniques that have not been combined before. Advances occur when ideas come into proximity with one another, even when there is no reason a priori to expect that such a network would include these particular pieces. Every university maintains administrative data, compiled for purposes other than research, that can be exploited to advance the understanding of these processes. Owen-Smith provided a visualization of how different kinds of data (whether on sponsored projects, human resources, procurement, or a host of other areas) can be combined to map a collaboration network. In a network constructed from UMETRICS data, individuals can in principle be shown as nodes tied to one another, for example, by being paid by the same sponsored project in the same year, as coauthors, or as members of co-patenting networks. Similarly, different parts of the network associated with the dominant topics can be mapped.
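One of the ties mentioned above, two people being paid by the same sponsored project in the same year, can be sketched as a simple edge-building step. The record layout (person, project, year) and the function name are assumptions made for illustration and do not reflect the actual UMETRICS schema.

```python
from collections import defaultdict
from itertools import combinations


def collaboration_edges(payroll_records):
    """Build weighted ties between people paid on the same project in a year.

    payroll_records: iterable of (person, project, year) tuples
    (a hypothetical layout, not the UMETRICS format).
    Returns a dict mapping an alphabetically ordered pair of people to the
    number of project-years they share.
    """
    members = defaultdict(set)   # (project, year) -> people paid that year
    for person, project, year in payroll_records:
        members[(project, year)].add(person)

    edges = defaultdict(int)
    for people in members.values():
        for a, b in combinations(sorted(people), 2):
            edges[(a, b)] += 1
    return dict(edges)
```

Coauthorship or co-patenting ties could be added in the same way by feeding records keyed on papers or patents instead of project-years.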
Owen-Smith asked what contributes to the growth of these idiosyncratic networks that put topics together and people together. An example of a research topic around which a network has formed is hepatitis C, which is a big problem for homeless teenagers. Clinicians and consultants work on one dimension of the problem, while specialists on liver cancer and organ transplants work on another dimension. This is an instance where social science expertise and bench science expertise are bridged by clinical expertise. With the right administrative data, Owen-Smith showed how it is possible to map the knowledge and collaborative space of a whole campus and to examine implications of the positions of individuals and teams (or even departments and programs) for scientific training and later career outcomes. This is interesting because there is dramatic campus-to-campus variation, and it has implications for actionable planning policies, he said.
One question that the IRIS team has asked is how to use the expansive set of university administrative data to explore the relationships between where people are situated in physical space and how networks evolve. Using administrative directory data to position every investigator in a two-building space, the team mapped out how close investigators were to one another and calculated what was called a functional zone for each. For every pair of individuals, the number of linear feet of overlap between their two paths was established as a naïve proxy for how likely they were to bump into each other and talk as a matter of course during their daily work.
Next, the team matched the physical space data to data on grant applications, both funded and unfunded, drawn from sponsored projects, institutional review board (IRB) applications, and animal use applications, over approximately a 15-year period for everyone in the two buildings. They also identified instances in which these activities were done jointly over a 3-year period. The research found that a 100-foot increase in functional overlap between two investigators was associated with a 14-19 percent increase in the likelihood of them forming a new collaboration (e.g., an IRB, animal use, or grant submission). Similarly, conditional on grant submission, the same 100-foot increase was associated with an 18-20 percent increase in the likelihood of a funded joint project. Such findings are indicative of how the idiosyncrasies of space and capital investments shape networks that produce findings. In addition, Owen-Smith noted, the entire approach is based on an administrative data platform that serves as a model for exposing interesting research questions that can be addressed.
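The pairwise overlap measure described above can be sketched under strong simplifying assumptions. Here each investigator's functional zone is represented as a list of named corridor segments with known lengths in feet; that representation, the segment names, and the function names are all hypothetical and are not taken from the IRIS analysis.

```python
from itertools import combinations


def overlap_feet(zone_a, zone_b, segment_length_ft):
    """Total length, in feet, of the corridor segments two zones share."""
    shared = set(zone_a) & set(zone_b)
    return sum(segment_length_ft[segment] for segment in shared)


def pairwise_overlap(zones, segment_length_ft):
    """Overlap in feet for every pair of investigators.

    zones: dict mapping investigator -> list of segment names (hypothetical).
    segment_length_ft: dict mapping segment name -> length in feet.
    """
    return {
        (a, b): overlap_feet(zones[a], zones[b], segment_length_ft)
        for a, b in combinations(sorted(zones), 2)
    }
```

The resulting pairwise values are the kind of variable that could then be related to joint applications in a regression; the estimation itself is not shown.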
During open discussion, Agarwal reiterated the value of combining data. The dream, she said, would be to integrate proprietary datasets with sources such as the LEHD and SESTAT, or IRIS. A big advantage of SESTAT is that it includes rich data on motivations and preferences, which are absent from most of the secondary data from censuses or on patents. She argued that, if the notion that both abilities and aspirations matter in determining innovation is taken seriously, then combining data that capture factors about managerial practices, as described by Gambardella, and networks of physical space, as described by Owen-Smith, is needed. From her perspective, it would represent a big step forward to move away from product-based innovation concepts toward measures that capture where the innovative activity is occurring and why it is occurring.
Stephan pointed out that some of the data improvements proposed during the session are not that expensive and that linking across sources can be an efficient way to expand information. For example, the Survey of Earned Doctorates covers practically the entire population of new Ph.D.s in the United States. It needs to be matched to other sources, such as the SDR or census data, in order to study what happens to innovative workers over time. Owen-Smith added that many costs of data linkage projects are social, not technical. They materialize in the form of negotiating memoranda of understanding, maintaining security, and managing data. Currently, he said, people do this individually, and effort is being repeated in different ways that are slightly incompatible and sometimes not well documented. One benefit of taking on these tasks as a community, he asserted, is to streamline linkages, systematize data use, and broker between data producers and parts of the federal statistical and science system to maximize the value of investments in data.
One of the greatest challenges to creative data linking is the lack of stable individual identifiers outside of the context of the Census Bureau’s Protected Identification Keys (PIKs), which can only be used under very restricted conditions in Federal Statistical Research Data Centers (FSRDCs). Regulatory and other problems arise in trying to uniquely identify scientific and other academic authors and contributors in such sources as student record transcript data or workforce information. Microdata that are shared and curated by a community require development of persistent identifiers. Lucia Foster (Census Bureau) noted that her organization was in the process of developing unique identifiers for the projects taking place within the FSRDCs and that, as they bring other outside agencies into the FSRDCs, they plan to develop datasets with unique identifiers to identify people in datasets over time.