An Academic Career in ECSE
William Blake's dream "to see the world in a grain of sand" almost becomes reality for those choosing a career in experimental computer science and engineering (ECSE). Building ever more amazing machines out of silicon and alternate worlds in software challenges the intellect and rewards creativity. Although the academic setting provides enormous intellectual freedom to choose problems and the independence to follow one's curiosity, an academic career in ECSE demands not so much the traits of a dreamer as it does the skills of the entrepreneur. One must organize implementation efforts, formulate goals, build a team, find funding (Box 2.1), and more, as well as handle the traditional academic demands that entrepreneurs do not have, such as teaching, directing dissertations, staying abreast of the literature, and serving on deans' committees. Successful experimentalists not only dream the future; they also implement it.
In this chapter an academic career in ECSE is characterized from the research point of view. The chapter begins with a discussion of the goals of ECSE research and proceeds to consider the infrastructure and support requirements for achieving those goals.
GOALS OF RESEARCH IN ECSE
The purpose of research is to contribute to the knowledge base of the field. In the natural sciences, creating something new (e.g., lawrencium or rubella vaccine) is unquestionably a contribution, because of the constraints imposed by the physical world and the creation's relationship to other physical phenomena.
BOX 2.1 Ideas Require Technology, Funding, and Management
Had it been built, the analytical engine of Charles Babbage would have been the first general-purpose computing machine, although, belonging to the mid-1800s, it would have been implemented mechanically rather than electronically. For many years, historians of computing believed that the analytical engine was never built because the engineering techniques of the time could not support its construction (perhaps due to insufficiently precise manufacturing tolerances). In this view, the first general-purpose computing machine had to wait until the advent of electronic technology.
Recent evidence suggests that this view is inaccurate. Indeed, in 1991, a working model of Babbage's difference engine No. 2 was built, using only parts that could have been manufactured in the 1840s. The engineers responsible for building this working model argue that Babbage was unable to complete his engines not because of the lack of an appropriate implementing technology, but because of his inability to keep costs under control.
SOURCE: Swade, Doron. 1993. "Redeeming Charles Babbage's Mechanical Computer," Scientific American 268 (February):86–91.
In a synthetic discipline such as ECSE, however, where it is straightforward to create something new, novelty is not enough to establish a contribution. Frederick Brooks of the University of North Carolina points out that the evaluation of scholarly work in synthetic fields is subject to an obligation that is not characteristic of natural fields. In particular, he observes that:
When one discovers a fact about nature, it is a contribution per se no matter how small. Since anyone can create something new [in a synthetic field], that alone does not establish a contribution. Rather one must show that the creation is better.1
This task—establishing that a creation is better and a contribution has been made—is intimately connected with the artifact in ECSE.
Subdisciplines of computer science such as theoretical computer science and, to some extent, computational mathematics establish that a contribution has been made by using criteria that are also employed in mathematics. For example, a complexity bound is "better" if it is tighter, and a theorem solving a long-standing open question is prima facie a contribution. Intangibles such as "elegance," "depth," and "mathematical sophistication" also figure into the evaluation. The key question in these subdisciplines is, Has the proposition been proved? Moreover, there is often considerable consensus that a given theoretical result is or is not new, although theoreticians may disagree over its importance or significance.
In ECSE, computational concepts and phenomena are judged to be better through studying and measuring the artifacts that implement them. Thus, the artifact can be the subject of the study, the apparatus for the study, or both. Relevant questions are, Is the implementation faster or more efficient in other ways? Does the idea provide greater functionality? and, Does the idea materially improve the process of creating artifacts? The criteria for recognizing when a contribution has been made depend first on whether the artifact is acting in a proof-of-performance, proof-of-concept, or proof-of-existence role.
In a proof-of-performance role, the artifact is usually the apparatus, and better can mean more efficient. Efficiency metrics include higher speed, smaller memory requirements, less frequent disk references, and so on. Better in this sense is determined by direct measurement and is quantitative. When Fraser's peephole optimizer saved more in linkage editing time than it cost in added code generation time, quantities that were measured in seconds, the optimizer was self-evidently better.
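The flavor of this kind of directly measurable improvement can be suggested with a toy sketch. The code below is not Fraser's optimizer (its stack-machine instruction names and rewrite rules are invented purely for illustration); it merely shows the local pattern matching a peephole optimizer performs, whose payoff can then be measured quantitatively in code size or running time.

```python
# Illustrative sketch only: a toy peephole optimizer over a hypothetical
# stack-machine instruction list. The instruction names are invented;
# real peephole optimizers work on actual object or assembly code.

def peephole(instrs):
    """Remove adjacent instruction pairs that have no net effect."""
    out = []
    for ins in instrs:
        # A PUSH immediately followed by a POP cancels out.
        if out and out[-1][0] == "PUSH" and ins[0] == "POP":
            out.pop()
            continue
        # A LOAD of a location just STOREd is redundant.
        if out and out[-1][0] == "STORE" and ins == ("LOAD", out[-1][1]):
            continue
        out.append(ins)
    return out

code = [("PUSH", "r1"), ("POP", None), ("STORE", "x"), ("LOAD", "x"), ("ADD", "r2")]
print(peephole(code))  # [('STORE', 'x'), ('ADD', 'r2')]
```

Whether such rewriting is "better" is then an empirical question: the seconds saved at run time must exceed the seconds added at code-generation time, exactly the measurement Fraser made.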
In the proof-of-performance context, better can also mean more functional. Enhanced functionality is interpreted broadly and includes being more expressive, as in programming languages; having a larger vocabulary, as in a speech understanding system; and being more robust to errors, as in data transmission or storage systems. A common form of "more functional" is having greater generality, either in terms of admitting more cases or removing assumptions about operating context. Interestingly, it is even possible to be better by being less general, if doing so can be argued not to be significantly restrictive and there is an opportunity for greater efficiency in some other dimension. An example would be relaxing strict memory consistency to allow shared memory multiprocessors greater latitude in hiding memory latency.
For artifacts whose behavior is not well understood, some proof-of-performance research seeks to understand specific properties of
their behavior. In such cases, the experimentalist investigates an artifact produced by someone else. In contrast to the instances above, in which the purpose of experimentation is often to demonstrate the superiority of the experimental artifact over some other artifact, the value of the experimental artifact is accepted as a premise and the research goal is to understand specific properties of it. For example, networks are ubiquitous and their value is indisputable, but careful study of the behavior of different networks (e.g., load-balancing on the Arpanet or the fairness of Ethernet's exponential back-off protocol) is essential to understanding networking. In these cases, research may demonstrate that one or another implementation of a concept may be better under particular sets of circumstances.
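To give a concrete sense of the protocol behavior such studies examine, the sketch below models the binary exponential back-off rule of Ethernet's CSMA/CD protocol. The slot time and 16-attempt limit are the classic 10-Mb/s parameters; the function is a simplified illustration for this discussion, not a faithful network simulator.

```python
# Illustrative sketch only: binary exponential back-off as used by Ethernet.
# After the n-th consecutive collision, a station waits a random number of
# slot times drawn uniformly from [0, 2^min(n, 10) - 1].
import random

SLOT_TIME_US = 51.2  # classic 10 Mb/s Ethernet slot time, in microseconds

def backoff_slots(attempt, rng=random.Random()):
    """Return the number of slot times to wait after `attempt` collisions."""
    if attempt > 16:
        raise RuntimeError("excessive collisions: frame dropped")
    k = min(attempt, 10)            # exponent is truncated at 10
    return rng.randrange(2 ** k)    # uniform on [0, 2^k - 1]

# The expected wait roughly doubles with each collision; this doubling is
# the source of the fairness questions mentioned above, since a station
# that just won the channel tends to win again (the "capture effect").
for n in (1, 2, 3):
    print(n, backoff_slots(n) * SLOT_TIME_US, "microseconds")
```

Measuring how this rule behaves under load, rather than building a new artifact, is precisely the proof-of-performance research described above.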
In a proof-of-concept role, an artifact is usually the subject of the research. In many of these cases, the dimensions along which a given artifact may be better may be heavily weighted toward the intangible. For example, better may mean "makes a programmer more productive," which must be determined by use. Utility can be difficult to establish, because the value of a new capability in computing is not always evident. For the natural frog, catching a fly is obviously beneficial. Would it be useful to build a robotic frog for catching a fly? Similarly, how can one show that a new programming methodology or a new system for computer-assisted chip design is more useful?
The computer-aided design (CAD) tool research of John Ousterhout illustrates how such evaluations are often accomplished. His work on Caesar2 provided a graphic capability for chip design that did not exist previously. Ousterhout did not run experiments comparing the old design approach to Caesar, measuring for a suite of circuits the design time, number of errors, and so on.3 Instead, he distributed the software, and designers voted for the system, and its successor Magic,4 by obtaining a copy and using it. Their voluntary use demonstrated more convincingly than any number of controlled studies could that the system was better. For such reasons, when—in ECSE—better means more useful, the number of users may well be evidence of impact.
In a proof-of-existence role, artifacts are often obviously better because they provide a never-before-thought-of capability. Indeed, the idea may be so useful that it is quickly absorbed into the consciousness of the field, and explicit credit for the contribution is no longer given. The principal issue in determining whether a proof-of-existence artifact is a contribution concerns how quickly its worth is recognized. By definition, these artifacts offer a never-before-thought-of capability, which may be "ahead of its time." It may take a while for the community to appreciate the concept fully.
In addition to improving hardware and software, much of the field is concerned with the technology of producing hardware and software more efficiently. This motivates research into CAD tools and more expressive programming languages. What is better is determined indirectly in these cases (i.e., the artifact is evaluated to infer information about the efficacy of the tool or method used to produce it). One challenge for experimentalists is to find the proper metrics for artifacts that will imply useful information about those tools or methods.
Experimental projects—especially large ones—often commingle old and new ideas. Thus, in some cases the true contribution may not be so much the presence of new ideas per se, but rather a novel synthesis of ideas, whether new or old. The UNIX operating system is a case in point. Many of its features (e.g., support for time-sharing, hierarchical file systems, pipes for routing input/output) had been implemented in previous systems; nevertheless, UNIX was a major contribution to ECSE because of its simplicity and ease of modification and use.
Finally, consider the question, Under what circumstances can implementation be considered research? It should be obvious that creating a computational artifact, be it a program, digital hardware, graphic image, or the like, is not synonymous with conducting experimental computer science research. ECSE researchers often program, but programming (even programming of a system that has never before been written5) is not necessarily ECSE research.
Constructing an artifact is research when it contributes directly or indirectly to our understanding of computing. This general formulation implies two specific requirements:
The artifact must embody some computational phenomenon in a manner that reveals new information. Thus the artifact will serve in one of the standard roles, or a similar capacity, and it must be constructed in a way that conveys the information reliably (i.e., it is stable and methodologically sound).
The new information is extracted from the artifact and conveyed in a suitable medium and scholarly manner. If the person constructing the artifact is the only person obtaining new and useful information from it, it is not research. Rather, to be research the implementor must teach others. The research community must learn of the discovery in a way that connects it to the existing knowledge base.
Thus, clear exposition and explanation of innovations are as critical to research as having new ideas or even building a new artifact. A good example is provided by Richard Stallman's EMACS editor. This work made use of concepts that had been known previously—dynamic binding and dynamic loading—but it was not until Stallman explained and demonstrated their significance in the editing context that these concepts became widely applied in this setting.
The above questions have little to do with whether or not the researcher has a particular application in mind when he or she undertakes the research. Put differently, the traditional distinction between "basic" and "applied" research does not hold up under close examination.6 However, efforts devoted solely to making an innovative artifact usable by others not in the research team (e.g., writing documentation) do not constitute research in any sense of the word, although such efforts may be indispensable if an artifact is to be disseminated widely and its contribution evaluated.
RESOURCES FOR ECSE RESEARCH
Equipment and Software
The creation of, or experimentation with, computational artifacts requires equipment. It follows, therefore, that infrastructure resources—equipment and related support facilities—are not optional for an academic research career in ECSE.
In the CRA-CSTB survey of ECSE graduate students (described in Appendix A), a substantial majority of respondents cited lack of adequate infrastructure as the primary drawback for them in seeking or taking an academic position. Similarly, a majority of students who preferred industry jobs cited a better infrastructure for research as an important reason for their preference.
Experimental computer science and engineering research requires hardware, software, or both. The following subsections describe some of the difficulties related to building and maintaining adequate equipment and software facilities at the research frontier.7
Staying on the Cutting Edge in Equipment
Equipment is essential to any laboratory science. However, laboratory equipment for ECSE has an extraordinarily short lifetime at the cutting edge; in a National Science Foundation (NSF) survey conducted in 1985–1986, administrators from computer science departments regarded research instrumentation and equipment that was more than one year old (on average) as not "state of the art."8 Over the years, advances in the hardware state of the art have been truly dramatic, improving by factors of more than 100 in speed and memory capacity in the last decade. This exceeds the speed improvement in aircraft between the Wright brothers' airplane and the SR-71. At the same time, there have been dramatic reductions in hardware cost. It is possible to conduct some meaningful experimental research without having equipment that is on the absolute cutting edge, but with rates of improvement this dramatic, equipment quickly becomes antiquated to a degree that affects research.
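The hundredfold figure is consistent with a doubling of hardware capability roughly every 18 months (one common statement of Moore's law, introduced here only as a cross-check, not a claim made above); compounding that rate over a decade gives:

```python
# Back-of-the-envelope check: one doubling every 1.5 years, compounded
# over a decade, yields roughly the hundredfold improvement cited above.
factor = 2 ** (10 / 1.5)
print(round(factor))  # prints 102
```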
Software is generally as important as hardware. The base computing environment for ECSE is the UNIX operating system, for which an enormous amount of software is available. Tools from UNIX, such as Lex and YACC, are widely used building blocks and so standard as not to occasion explicit mention by most researchers in enumerating their software needs. Examples of less ubiquitous, but nevertheless widely used, software include CAD tools for chip design, specialized language translators (e.g., Common Lisp) that support prototyping and symbolic computation, building blocks such as InterViews and the Synthesizer Generator, dozens of simulators for computer systems, and so on. To this list can be added specific tools and systems specialized for a particular research area, which are typically exchanged gratis and unsupported.
While the committee is most familiar with the demands of ECSE, it does not wish to claim that the field's requirements for resources are necessarily greater than those of other fields with defining characteristics similar to those used to describe ECSE in Chapter 1. Such fields include, for example, biotechnology and materials science.
See National Science Foundation. 1988. Academic Research Equipment in Selected Science/Engineering Fields: 1982–1983 to 1985–1986. SRS 88-D1. NSF, Washington, D.C., Table B-5, p. B-14.
Dedicated Computing Systems
Experimental software research often requires dedicated systems (and, on occasion, special-purpose systems) and cannot use the general-purpose computing environment that already exists in the department, school, or university. Much experimental work (e.g., in network and operating systems work and, in some cases, databases) cannot be conducted on shared resources (e.g., campus computing facilities, teaching facilities, or department-wide resources).
The reason is that experimental software is more than just an application running on top of an operating system; the experimental software may be the operating system, communications, and/or data storage system itself. In a general-purpose computing environment, these components are vital to all users and so must run flawlessly; users (as opposed to researchers) simply want a computing task performed, without regard for how it is performed, and would be harmed by the failures that experimental systems inevitably suffer.
For example, researchers who are evaluating a new mechanism for managing the storage of data on large disk arrays will probably construct a functional prototype, test it under controlled conditions to see if it exhibits good performance, and characterize the trade-offs between various design choices. Only then will they attempt to debug the system fully so that it operates well under the wide range of conditions that a general-purpose system experiences. While the system is in the exploratory stages, the development and debugging process would not be tolerated by other users of the system because it would disrupt their ability to carry out their computing work. Computing is central to many research, teaching, and administrative activities. Consequently, the university's or department's general computing facilities (e.g., computing center) are not a realistic option for supporting experimental work. Similarly, non-ECSE users cannot risk using experimental systems.
Experimental hardware systems research requires access to hardware production and testing facilities or services. In some fields
such as computer vision and robotics, a researcher needs special-purpose interfaces to cameras and other devices. Even if the hardware can be bought without constructing special-purpose interface hardware, it still requires modifications of the operating system software (e.g., special device drivers) in order to be incorporated into the desired experimental system. In other computationally intensive fields such as graphics, computing power and speed assume the utmost importance, and the necessary power and speed are rarely delivered by off-the-shelf hardware. In still other fields (e.g., VLSI design), the proof of a chip design is the actual fabrication and demonstration of the chip; simulations, although helpful, do not constitute the final proof. Although the Advanced Research Projects Agency (ARPA) provides to academic researchers a Metal-Oxide Semiconductor Implementation Service (MOSIS) and access to (shared) foundry facilities, thus solving the problem of fabricating the chip, considerable equipment is needed to design and test the chip.
The above-mentioned special needs for equipment lead to the often equally problematic need for space, both for the equipment itself and for the students and staff involved in developing and maintaining it. However, CS&E departments have traditionally found space in short supply. In some instances, this is simply a consequence of the rapid growth of the field, and in other cases it is the result of historical accident, in which computer science departments growing out of mathematics had no laboratory tradition.
Project-specific laboratory space is essential: it provides a location for shared laboratory equipment, a site for constructing physical artifacts, and the meeting site or "community" where the corporate knowledge of the implementation effort is disseminated. General-access laboratories (e.g., terminal rooms) do not generally suffice for these purposes, nor does the alternative suggested for software projects of placing a workstation on individual graduate students' desks. Moreover, in many cases the space must be contiguous to be used effectively, because it is difficult to maintain control over, and run experiments on, equipment that is dispersed throughout a department or building. For this reason, ECSE research is similar to research in chemistry or physics in its need for dedicated laboratory space.
A final complication is that laboratory space must be specially equipped with power and air-conditioning capacity that is not found in standard office or teaching space. Although less expensive than a
wet laboratory or facilities suitable for laboratory animals, ECSE laboratory space often involves significant institutional impediments because of the widespread space crunch found in most engineering schools, if not universities in general. It is rare to find currently unused space, and even if one does, it often is unusable as a laboratory without a significant investment in upgrading power and air conditioning.
Providing such space and perhaps other auxiliary services for equipment may offer an additional advantage with respect to fundraising: such facilities and services are often regarded by potential industry or government sponsors as significant evidence of a department's or university's commitment to a research project, and may thus encourage equipment grants or donations.
Maintaining the Research Environment
A research laboratory requires more than state-of-the-art workstations and air conditioning to be a productive environment. Keeping it current requires such common activities as installing and configuring workstations, hardware maintenance at the board-swap level, installing software, upgrading software, interfacing locally produced artifacts with standard facilities, preparing locally produced software for distribution, and so on. None of these is a research activity, but they are all essential to research productivity. Two special cases are hardware and software maintenance.
Equipment is not useful, at least not for very long, without maintenance support. The cost of hardware maintenance alone can equal that of the hardware itself over the lifetime of the equipment. This is particularly true for facilities that are too small to experience economies of scale. Although simple maintenance (e.g., replacing one circuit board with another) can be performed by laboratory members, special contracts are usually needed to ensure continued operation of special-purpose or large-scale machines (e.g., parallel computers).
Perhaps the most time-consuming and least understood laboratory maintenance task is to propagate software changes: often when one system is improved, systems that use it must be changed to take advantage of the improvements or to accommodate revisions in representation. For example, when a windows package is revised, systems using the package may have to be revised to provide access to the new facilities. When staff are not available to perform these functions, the duties fall to the research staff, the faculty, and graduate students. Because these tasks are time-consuming, they diminish the available research time substantially.
Graduate students, although they occasionally disparage their role as "slave labor," are in fact critical to experimental research, as they are for work in other fields. In ECSE, they are the highly skilled creators of new artifacts. Artifacts are extremely labor intensive to construct, and although every faculty experimentalist would happily return to the laboratory to work on and experiment with the artifacts, there simply is not enough time in a professor's day. (This is an oft-heard lament of academic experimentalists.)
Construction of artifacts is labor-intensive not only because they are large and complex, involving a great deal of low-level detail, but also because many artifacts are ill-specified when first created and so require technically sophisticated builders capable of working from concepts rather than detailed blueprints. Representations must be created; algorithms must be invented. Moreover, it is often the case that the artifact under study is incorporated into another, existing artifact. Because the two systems will essentially never be "plug-to-plug compatible" with each other, the host system will have to be understood. Development of such understanding often requires a substantial intellectual effort unrelated to a better understanding of the artifact under study, although such effort occasionally has educational value.
Because graduate students are creating artifacts from concepts, they must have suitable background, skills, and knowledge to be successful. Further, as a result of having spent time in the laboratory, studying, building, and experimenting with artifacts, graduate students not only acquire important technical information about their research area, but also learn experimental methodologies. Such practical experience is essential to becoming a successful experimentalist.
Although graduate students are an important component of infrastructure without which ECSE faculty cannot be productive, many experimental systems projects reach a point at which it is difficult to make progress on the basis of graduate student labor alone. When such a point is reached, technical support staff (including technicians and other paraprofessionals) are necessary to assist with laboratory maintenance and implementation.
Even when only a single technical staff member is available, staff can play a significant role in ensuring that the laboratory remains a productive place to work. Keeping the software current is extremely
important. Also, there are significant portions of most implementation efforts that, although essential to the success of the project, are usually peripheral to the main subject of the research. Examples for a compiler project might be linking and loading routines, library routines, interfacing to vendor input/output packages, and so on. Staff members can contribute importantly to the success of a project by implementing these more standardized components.
When technical support staff are available, graduate students are freed to focus more time on the interesting parts of the development with greater "research value" and less on the more routine or lower-level (although necessary) components of the system building. Support staff also provide more continuity in implementation because they are not subject to the vagaries of student course load, attrition, studying for qualifying examinations, and so on. Finally, they often have greater low-level system expertise than do graduate students. (In a recent NSF workshop, Research in Experimental Computer Science,9 it was noted also that technical support staff have learned to value the simplicity of a system.)
For large ECSE projects, a considerable degree of administrative support is also necessary. Such support coordinates and manages communications and information flow between collaborators, between the research team and other institutions, and among vendors, technicians, and the research team itself.
Access to Collaborators and Other Experimental Systems
Although faculty collaborators are not essential to all experimental research, and a single investigator with graduate students may be sufficient for many projects, especially those of modest scale, larger-scale systems research is rarely done in isolation; junior faculty members undertaking large-scale systems research are poorly served when they are advised to refrain from collaboration. A good project builds on the work of others for reasons related to productivity, evaluation, dissemination, and impact.
In particular, by collaborating or building on the work of others, an individual researcher can have more impact than by working alone and/or starting from scratch. As an associate professor at a major private university noted in response to the CRA-CSTB survey:
Today, many tools are publicly available (with source code). To avoid the long start-up penalty usually encountered in experimental work, start with one of these systems—like the Gnu compilers or Fraser's lcc system. While they won't necessarily do things the way you think they should be done, they will allow you to begin working, publishing, and refining your intuitions. You can implement your own system from scratch later.
Faculty can save valuable time and resources by using public domain and even commercial system components, even though such use may result in an increased dependence on access to the most current research (and sometimes commercial) software and hardware components and the people that create them.
Another motivation for building on the work of others is related to evaluation. A researcher who develops systems completely unfamiliar to other researchers is at a considerable disadvantage, because these other researchers will find it more difficult to provide meaningful feedback. As a full professor at a major public university put it, "Work on common architectures, systems, and languages so other researchers will be interested in using your prototypes."
In order to evaluate whether a technical innovation is "good," or to quantify "how good" it is in relation to other approaches, a researcher must demonstrate how the new mechanism compares in supporting the range of intended functions. Daily use of an innovation by collaborators is frequently a good way to obtain feedback on its advantages and disadvantages, especially as word of the innovation is disseminated beyond the local user community.
Finally, collaboration is essential for large system-building efforts because the subsystems of an artifact are often so specialized that expertise beyond that of a single researcher is needed. A typical parallel computer research project—involving hardware design, computer architecture, operating systems, programming languages, and applications—requires correspondingly diverse skills. The Internet itself emerged from a collaboration (at times formal and at other times informal) of perhaps 150 researchers both in the United States and abroad. Common Lisp, a computer language for artificial intelligence, was the result of collaborative efforts among more than 60 researchers in industry, government, and academia.10
If a researcher is not at one of a very few universities that have large-scale, multi-investigator projects, another way of contributing to
a development that will have significant impact is to join a group of people working on a common problem.11 Consequently, working with colleagues is a matter not only of getting access to results and artifacts quickly, but also of being a part of a collaborative effort. Yet collaboration is not just a matter of sharing words and a white board; one must be "tied in" enough to share development environment, tools, and sometimes equipment. Young faculty starting careers at institutions where they do not have colleagues need to rely on advisers and mentors to help establish and solidify these connections.
All of the infrastructure components referred to so far—equipment, graduate students, staff, and even access to collaborators and to experimental and commercial software—require money. By and large, these are expenses that are not incurred in as substantial a degree by more theoretically inclined computer scientists. Moreover, given the long time horizons of many experimental projects, sustained funding is as important as adequate levels of funding. As an associate professor at a public university commented:
The NSF "small science" model does not work for my kind of research. I need to replace equipment more often, my work cries for staff programmer support, I need more like 6 to 8 graduate students rather than 1. . . . I would estimate that I need on the order of $300,000 per year funding to carry out the kind of quality experimental systems building and measurement I know I am capable of. Not having the resources implies a distinct waste of talent, especially when you multiply out by all the researchers affected.
Experimentally oriented programs have tended to thrive in recent years, and in today's application-oriented, task-oriented environment it is often easier to obtain funding for ECSE than for theoretical work. For example, among the agencies that fund ECSE (principally NSF, the Office of Naval Research (ONR), and ARPA), there have been some especially successful experimentally oriented programs. The Microelectronics Information Processing Systems (MIPS) Division of NSF's Computer and Information Science and Engineering (CISE) Directorate has had an experimental systems program that provides sufficiently large and sustained funding to permit a serious implementation effort. ARPA created and sustains an implementation service for metal-oxide semiconductors (MOSIS), making it possible for academic researchers to design silicon chips near the cutting edge of technology. ARPA also funds projects at a high enough level to build substantial artifacts. ONR has been effective at tracking emerging technological trends and funding exploratory projects.
Although the greater funding needs of experimentalists have been recognized to a considerable degree, a number of problems remain. One of the most vexing for junior faculty members is the difficulty of obtaining external "start-up" funding. A junior faculty member will receive funding in his or her second year only if fortunate enough to get a successful proposal through the system in the very first year on the faculty. Such a faculty member would be fortunate indeed, because beginning assistant professors have not, in general, established their professional reputations. For these individuals, only the agencies that accept unsolicited proposals provide realistic funding options, of which NSF is the most prominent.12
Two NSF programs, in particular, have been essential to junior ECSE faculty: the National Young Investigator (NYI; formerly the Presidential Young Investigator (PYI)) program and the Research Initiation Awards (RIA) program.13 Whereas the former is a foundation-wide program, the RIA is special to the CISE directorate and to engineering. The NYI/PYI program has allowed junior ECSE faculty to support several graduate students and rudimentary equipment for a long enough time that significant work can be accomplished, and the RIA program has provided summer salary or support for one or two graduate students.
At the same time, both programs are highly competitive, meaning that only a few faculty will succeed in winning an award in their first year after graduation, and fewer than half will ever receive such an award during their eligibility period.14 In addition, although there is no explicit prohibition against new graduates being funded by other agencies, the reality is that in most cases one needs a research track record to be successful.15
Research supported by other mission-oriented agencies such as ARPA is more concentrated in a smaller number of institutions with well-established reputations. It is a common perception at universities with less established reputations that such agencies fund mainly large projects with senior people as principal investigators at a few large schools, making it particularly difficult for the single ECSE faculty member at a small school to obtain funding. One faculty member (an associate professor at a large public university) noted that seeking support from mission-oriented agencies entails an additional set of barriers:
I am seeking support from DARPA [Defense Advanced Research Projects Agency] and other similar agencies, but that comes with a whole additional set of problems (somewhat closed communities that are hard to break into, having to be more responsive to very specific agency desires, and generally more personal "overhead" in dealing with them).
Industrial support exhibits a similar pattern. Although industry has often collaborated in research with academic partners (e.g., Intel's successful collaborations with Carnegie Mellon, Caltech, and the Massachusetts Institute of Technology), industry leaders also tend to concentrate their research expenditures at a very few top schools, and personal contacts are important. Moreover, unless they are carefully chosen, projects of direct interest to industry may have too little scholarly content to contribute positively to a faculty member's career.
Finally, as the economy has tightened, industry donations of equipment appear to have declined, and there is little evidence that a turnaround will be forthcoming. Software donations are even more problematic. Although software plays a critical role in almost all types of experimental projects, software donations to universities are relatively rare, even compared with hardware donations.16
Even when industry donations of equipment can be obtained, they do not by themselves solve the infrastructure problem. In fact, some "free gifts" end up being very costly in terms of expended time and labor. As described elsewhere in this report, a single research project attempts to innovate in only one particular aspect of a system. To develop, test, and evaluate the mechanism or concept, the system will rely on existing software and hardware as much as possible. If the software support is inadequate (e.g., for compilers, operating system, device drivers, communication), the researcher must expend considerable time and effort filling in the missing pieces. Moreover, "free" equipment rarely comes with free maintenance. Maintaining most equipment is costly, very time-consuming, or both.17 A new faculty member with minimal financial support may not have the funds to maintain donated equipment.
The conclusion from this rather grim funding description is that new ECSE faculty members are not likely to have research support based on their own research ideas during the early years of their careers. This presents them with the major challenge of finding the equipment, graduate student funding, and infrastructure support needed to conduct a credible experimental research program.
Informal inquiries by the committee among potential donors of software suggest that the underlying reason for the paucity of software donations is related, at least in part, to the lack of tax incentives for such donations. Charitable contributions of merchandise entitle the donor to deduct from income only the manufacturing or production cost of such artifacts (without any mention of associated R&D costs). Of course, the manufacturing cost of a software artifact (the cost of copying some tapes or disks and some manuals) is nearly zero when R&D costs are ignored. Thus, software donations seem not to have a significant benefit to the donor.
For example, in NSF infrastructure grants, cumulative maintenance costs are often 50 percent or more of the equipment costs.
Almost all schools adhere to a six-year probationary period, after which a tenure decision on junior faculty members must be made. Given the character of ECSE work, this constraint places particular burdens on academic experimental computer scientists and engineers, as compared to their theoretical colleagues. The delays inherent in ECSE work, described below, make it rare for junior ECSE faculty to produce enough in that time to become widely known throughout the CS&E community. Except in the rarest of cases, the tenure candidate will be recognized only among his or her direct community of researchers.
One solution to this problem would be to extend the probationary period for ECSE faculty members. However, a serious exploration of that solution would have required the committee to address larger political issues beyond the scope of its charge or resources. Instead, the committee chose to identify the issues that make ECSE particularly time-consuming, in the hope that tenure and promotion committees would take these issues into account when considering junior ECSE faculty members for tenure.
Building Complex Artifacts
Artifacts are complex, and it may take years to design and implement an artifact with which one can experiment. The sheer effort of producing a 100,000-line program or a 200,000-transistor chip design may consume a substantial amount of an assistant professor's probationary time.
Building a Research Laboratory
A new faculty member whose research interests in ECSE differ from those already represented in his or her department must build a research team from scratch (i.e., recruit and train graduate students). In general, it takes several semesters to attract talented students and train them in experimentally oriented systems courses. Graduate students must usually complete several smaller projects before they have the background, skills, and knowledge to tackle dissertation-scale work.
Laboratory development is also time-consuming for the beginning assistant professor, except in those rare cases in which a department has an existing faculty member with very similar laboratory requirements and the willingness and capacity to share existing laboratory facilities. If the institution provides no seed funding for the beginning assistant professor, the process may well take much more than one year, because acquiring funding alone takes at least a year.
At a minimum, developing a new laboratory involves raising the necessary seed funding, contacting and negotiating with vendors, negotiating maintenance agreements with the vendor or university support staff, negotiating and paying for software licenses, funding upgrades of software and hardware, and training graduate students to carry out daily management of the system (e.g., performing regular backups) and to use it effectively. Even after the laboratory is established, its upkeep remains an ongoing management activity for the faculty member, who must deal with issues such as laboratory organization, facility enhancement, repairs and upgrades, and student supervision.
Building Industrial Relationships
In ECSE, much of the most advanced work is being done in industry, and cooperation between industry and academia is essential to the well-being of the field. This fact introduces several problems for the young faculty member. It takes a long time to develop industrial contacts because, in general, industry prefers to work with a few well-known people at well-established schools, and in some cases industrial laboratory managers are quite intolerant of academic research. Consider the following comment from an assistant professor at a public university:
After I am tenured I will be willing to work on longer-term projects. Currently I only begin a research project if I am confident that I can have it sufficiently completed that I can publish a conference paper within a year. Many of my ideas cannot be completed in that timeframe. After tenure I'll also be willing to put more time into developing a relationship with industry. It takes time that I cannot afford now to develop those relationships, though I think they would be very valuable both for my research and for industry.
Graduating Doctoral Students
The best available data indicate that the average time to complete a Ph.D. in computer science is 6.4 years.18 Even if this figure characterized the time to Ph.D. for graduate students in experimental computer science (it most likely understates the time), in the six-year probationary period a junior ECSE faculty member might have only one completed Ph.D. student. In fact, very good experimentalists often have no completed Ph.D. students at the time of the tenure decision, although it is not unreasonable to expect them to have several Ph.D. students nearing graduation. When universities consider only the number of completed Ph.D.s as an important criterion for tenure (and do not take into account students in the pipeline), ECSE faculty are placed at a serious disadvantage.
Recovering from Wrong Turns and Dead Ends, and from Being Scooped
It is a natural consequence of any research that occasionally a dead-end path is pursued or an unfortunate trade-off is made. A mistake may well be the result of one of several "nontechnical" factors unrelated to the basic idea being studied: a hardware vendor does not deliver or does not perform as expected; funding runs out; key project participants leave; the technology proves inadequate to the task. An equally frustrating event is having one's work "scooped" by another researcher (i.e., published or otherwise publicly released before the first researcher has had time to announce the result).
In the normal course of events, the researcher, having made an error or having been scooped, must back up and proceed with a corrected decision, or simply turn his or her attention to a new problem. For the ECSE researcher, however, the consequences of bad decisions or of being scooped are particularly severe, because of the large amount of time that may have been invested without productive and creditable results.
Building a Reputation
As noted in Chapter 1, a great deal of ECSE research is conveyed to the community through the diffusion of artifacts. In terms of time, the diffusion process for artifacts is much more costly than the usual journal publication route, in which the entire relevant community learns of a significant article when it first appears in print. As a result, reputations in ECSE tend to take longer to establish.
In addition, it is traditional in the biological and physical sciences for both experimentalists and theoreticians to take postdoctoral positions for several years after receipt of the Ph.D. Individuals in these fields use this time to concentrate on their research and thereby get a "head start" on establishing their reputations in the relevant research community before the tenure decision is made. ECSE has largely lacked such a tradition (although theoreticians in computer science as well as specialists in artificial intelligence are beginning to develop one), and new Ph.D.s in ECSE often take assistant professorships upon graduation. They therefore do not receive the benefit of a comparable period in which to establish their reputations. The situation may be changing, however, as regular faculty jobs in ECSE become more scarce.
THE RELATIONSHIP OF RESEARCH SCALE TO INFRASTRUCTURE NEEDS
Infrastructure needs are determined largely by the scale of the research to be supported. Research in ECSE can be conducted at different scales of funding and effort. A small-scale project could be funded at the level of perhaps $100,000 for two years and require one or two person-years to complete; the research "team" might consist of a single investigator at almost any university and a part-time project secretary or assistant. A large-scale project might cost several million dollars per year for several years and require dozens of person-years to complete. It is inherently collaborative, and the research team might consist of several principal investigators, a dozen graduate students, a few technical staff members, and a full-time administrative officer. Such large-scale projects can usually be housed at only a few select universities with the necessary institutional resources and capabilities. Box 2.2 contains examples of small-, medium-, and large-scale ECSE research projects.
Obviously, large-scale ECSE research makes greater demands on infrastructure than small-scale research; it also inevitably requires the presence of collaborative teams. Thus, it is clear that not all types of ECSE can flourish equally well in all academic environments. Scaling project size to resources and facilities available at any particular institution is an important consideration for every researcher. Indeed, the ability to choose significant problems appropriately when faced with such constraints may be a distinguishing mark of creativity and thoughtfulness in a faculty member.
Similar considerations also apply to the question of time. It is undeniable that large-scale systems projects take a long time to complete. However, ECSE researchers also have the option of choosing smaller-scale experimental problems that do not take as long to complete. Undertaking large-scale ECSE research that is carefully structured so that meaningful intermediate outputs can be obtained is also an option in many cases.
BOX 2.2 Scales of ECSE Research
Small-scale ECSE research. The program synthesizer, undertaken in the early 1970s, was the forerunner of the programming environments that are in use in most modern software engineering projects today, and yet it was performed at a scale of perhaps six to seven person-years (one faculty member and one graduate student) with total funding of about $150,000 over its lifetime.
Medium-scale ECSE research. The Sprite operating system, described in Chapter 1, was undertaken in the 1980s. It lasted four to five years, involved two full-time-equivalent faculty and several graduate students, and consumed perhaps $1 million over its entire lifetime.
Large-scale ECSE research. The Multics project was a large-scale systems research project undertaken in the 1960s to develop a scalable time-shared computer utility. Over its eight-year R&D lifetime, its ARPA-supported budget was on the order of $2 million per year; in addition, Bell Telephone Laboratories and General Electric (later Honeywell) contributed comparable resources during this period. At MIT the development effort involved, in addition to staff, about a dozen faculty members and perhaps two dozen graduate students. Although commercialization of Multics was only moderately successful—a peak of 77 sites worldwide—concepts researched and developed through the Multics project (such as virtual memory, mapped files, dynamic linking, and protection mechanisms) play key roles in many operating systems today. The UNIX operating system in particular built heavily on the Multics experience.
Without adequate infrastructure, many ECSE faculty are not able to fulfill their true potential. There are many facets to this infrastructure. The availability of general computing environments in the form of workstations has improved immensely over the past 10 years. However, as described in this chapter, a workstation alone is not sufficient to carry out interesting and important experimental research in software systems, let alone hardware.
The bottom line is that on the basis of infrastructure considerations alone, most ECSE faculty who are trying to pursue important work cannot hope to achieve the same publication or completed-Ph.D. records as their theoretical colleagues: they encounter unavoidable delays before start-up, the work is more time-consuming along the way, and their unavoidable dependence on factors such as graduate students and external vendors can add significant delays or drag to the process.