Stress and Cognitive Workload
The integration of very high-tech equipment into standard operations is a radical change in the challenges faced by the infantry soldier. In addition, the battlefield, in all its forms, remains a place of extreme stress. Coupled with this stress comes the new burden of cognitive workload associated with the operation and management of new technological systems. The Land Warrior System, as currently conceived, is but one version of a potential family of advanced systems, each of which may generate its own combination of stresses. In this chapter we examine these stress-inducing factors, to identify sources of potential problems and to recommend avenues to solve such problems, either through existing capabilities or by proposing additional research.
A UNIFYING FRAMEWORK
One of the reasons it is difficult to predict accurately the performance of soldiers under stress is that the scientific foundation of this area is inadequate. For many decades, we have relied on the untestable, inverted-U theory of physiological arousal, with both high and low levels of arousal inhibiting attention (see Yerkes and Dodson, 1908). This theory has been used to account for a wide pattern of performance, after it has happened; unfortunately, it has little if anything to say about what will happen in the future under specific circumstances (Hancock, 1987). Consequently, although the theory is very useful to the scientist trying to explain a confusing pattern of results, it is of little use to the designer of equipment or the leader of a platoon trying to predict how individuals will react to specific conditions. More recently, researchers have looked at effects on either a task-by-task basis (e.g., memory versus decision making) or on a stress-by-stress basis (e.g., heat versus noise) (Hockey and Hamilton, 1983). However, these approaches provide no unified basis for understanding stress and subsequent effects on workload.
A unified approach to the prediction of stress effects has been presented recently by Hancock and Warm (1989). The basis of this model is shown in Figure 6-1. Briefly, the model establishes ranges of adaptability. Violation of these ranges of adaptability causes progressive failure in performance. Because the model uses both a physiological and a behavioral index, it can deal with the question of combined physical and mental demands and is recommended for use in association with the testing of advanced technologies. In this work, Hancock and Warm (1989) have combined existing physiological and psychological theories of stress effects into a unified view.
In the center of the figure is a region in which stress has little impact on performance. In the center of this comfort zone, individuals can be expected to produce their best performance. As stress increases, either through a systematic increase or a systematic decrease of appropriate stimulation, the individual is required to use greater resources to combat the influence of such demanding conditions. This reduces the resources available for explicit task performance, and so efficiency is reduced accordingly. It must be emphasized that the demands of the primary task itself may well act in this depleting fashion.
There are some strategies by which the individual can sustain satisfactory performance, even as stress increases. One common tactic is to select only those cues relevant to task completion and to reject or filter out cues that are irrelevant. This allows some degree of protection from stress effects. However, if the stress should continue to grow or should be perpetuated over a longer time, then some degree of performance failure is observed.
Individuals can tolerate a certain level of stress with no disturbance to their capability. This is the flat portion of the extended-U of the model. However, if the stress persists for a sufficient period or the severity is increased, the individual begins to lose capability. Since the functions stay the same, the way in which task performance falls off mirrors the way in which physiological capacity is impaired. The Army is accustomed to providing physiological support for its personnel, in terms of energy, resources, and specific equipment to avoid threat. The Army is accustomed to providing supplementary support when such systems fail or induce stress. The proposed model argues that new equipment ensembles and associated mission stresses can be ameliorated in the same fashion, that is, by providing training and supplementary support.
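The shape described above can be illustrated with a minimal sketch. It is not the Hancock and Warm (1989) model itself: the linear fall-off, the symmetric treatment of underload and overload, and every numeric threshold are assumptions chosen only to show a flat plateau, a progressive decline, and a performance curve that fails earlier than, but has the same form as, the physiological curve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """Hypothetical zone of adaptability, in arbitrary stress units."""
    plateau: float  # no measurable loss inside this distance from the optimum
    limit: float    # capability is fully lost at this distance


# Illustrative values only; the source gives no numbers.
PERFORMANCE = Zone(plateau=1.0, limit=2.0)
PHYSIOLOGY = Zone(plateau=1.5, limit=3.0)


def capability(stress: float, zone: Zone) -> float:
    """Return capability in [0, 1]: flat on the plateau, then a linear drop."""
    distance = abs(stress)  # negative values stand for underload, positive for overload
    if distance <= zone.plateau:
        return 1.0
    if distance >= zone.limit:
        return 0.0
    return (zone.limit - distance) / (zone.limit - zone.plateau)


def extended_u(stress: float) -> tuple[float, float]:
    """Return (task performance, physiological capacity) at a given stress level."""
    return capability(stress, PERFORMANCE), capability(stress, PHYSIOLOGY)
```

With these assumed values, a stress level of 1.5 in either direction halves task performance while physiological capacity is still intact, which is the ordering of failures the text describes.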
The model does not indicate where any individual will fail in relation to a specific form or level of stress. Furthermore, the model does not stipulate what stress alleviation or stress generation is inherent in the employment of the Land Warrior System. There is much to be learned about individual reactions under such circumstances, and this consequently represents one area of needed research. One critical aspect of this research must be the identification of incipient failure and the reduction of stress under the specific conditions associated with using the Land Warrior equipment. As is evident from the model, failure in performance, such as errors, will be evident before failure in physiological functioning. However, since the soldier is exposed to stress in order to accomplish a task, it is essential for his squad or platoon leader to be aware of these incipient failures. This implies the need for on-line feedback of individual soldier performance to be provided throughout the information network; it would then become possible for the leader to know when specific members of their command are becoming overloaded and subject to performance error and potential failure. Such information is critical to ensure mission success, and its transmission could be a central feature of the proposed helmet-mounted display.
Much more research effort needs to be directed to the critical question of performance prediction under stress. It is a major requirement for the success of the Land Warrior System on the battlefield of the 21st century. The Land Warrior System will add new tasks for the soldier to perform. These tasks are expected to be more information-intensive: they will require more reading and more cognitive processing than is currently necessary. The following section describes the existing sources of battlefield stress that provide the psychological and physical context for the introduction of the proposed system. These context factors must be considered in examining the implications of the helmet-mounted display for increased workload and, in turn, for soldier performance.
SOURCES OF OPERATIONAL STRESS
The Task as a Source of Stress
The traditional approach to understanding stress has been to focus on a specific source (for example, fatigue) and to try to understand how variation in that source affects capability. This view places primacy on the external source of stress. More recent views (Hancock and Warm, 1989) have emphasized the primacy of the performance task itself as the major source of stress. For the infantry soldier, mission requirements themselves have to be considered the proximal source of stress, and other outside influences, such as heat, noise, fatigue, etc., are seen as forms of stress that interact with the primary demands of the task. This new perspective is particularly important when we consider the cognitive demands associated with advanced technologies. New tasks initiated by the use of technologies, such as the helmet-mounted display, represent a major challenge and therefore a source of stress on the infantry soldier. In sum, the interaction among stressors may be crucial.
Acclimating to a source of stress provides resistance to its deleterious effects (Hancock, 1986c). For example, heat acclimatization may take a matter of some weeks but, having passed through such acclimatization, the individual is much better able to withstand extended exposure. This process of acclimatization is well known in physiological research. However, there is a comparable effect for cognitive systems. That is, to train high-performance skills (Schneider, 1985), it is necessary to engage in extended practice on the task, in essence acclimating the cognitive (central nervous) system. To accomplish this, the task has to be constructed in a specific fashion. In particular, it has to be consistently mapped between input stimulus and output response (see Schneider and Shiffrin, 1977). Consistently mapped, overlearned, or automated tasks are not only performed more quickly and with lower error rates, but also prove less vulnerable to the effects of external sources of stress (see Hancock, 1984, 1986a). As a consequence, how cognitive tasks are structured has a direct influence on how well they will be performed on the stress-replete battlefields of the future in which the Land Warrior or a comparable system might be employed (see Schneider, 1985, for further details on structuring complex performance tasks for optimal training). In the past, the military has led the way in terms of advanced training, and it is all but certain that training will play a significant role in operating new technical support systems such as the Land Warrior System.
Information Overload and Underload
Many pursuits can be characterized as "hours of boredom and moments of terror," including aviation, law enforcement, firefighting, and even less life-threatening ones such as acting. Definitely included with these are the activities of infantry soldiers, who are often held at some location for many hours waiting, only to find themselves in the midst of a confusing engagement a short time later. In these circumstances, the individual goes from the potential extremes of underload to a massive overload in moments. Little is known about the effects of such transitions (but see Huey and Wickens, 1993).
What is clear is that, as a design principle, technology should provide soldiers with a compatible information load at all times. In essence, tasks should be structured as much as possible to transfer the information overload of engagement conditions to either the pre- or postengagement period. As much as possible, technical support should anticipate the demands of engagement and permit the preprocessing of pertinent information. Similarly, during the engagement, technology should off-load the information demands on the soldier, especially if noncritical tasks can be postponed for the postengagement period. In related work, Hancock and Chignell (1987, 1988) have suggested an adaptive human-machine system, in which the task of load distribution is subsumed by an intelligent interface (Chignell and Hancock, 1989), which uses signals from both the technology and the operator to decide which task components need to be given to which member of the human-machine partnership and at what time. Such a system is under exploration for use in advanced single-seat tactical aircraft; in principle, it should be equally appropriate for the Land Warrior System.
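The load-distribution idea can be made concrete with a small sketch. This is a toy rule, not the interface proposed by Hancock and Chignell: the `Task` fields, the `engaged` flag (standing in for a signal from the technology), the `capacity` value (standing in for a signal from the operator), and the three destinations are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    critical: bool  # must be handled during the engagement
    load: float     # assumed estimate of operator workload, arbitrary units


def allocate(tasks: list[Task], engaged: bool, capacity: float) -> dict[str, list[str]]:
    """Split tasks among the soldier, the automation, and a deferred queue."""
    plan: dict[str, list[str]] = {"soldier": [], "automation": [], "deferred": []}
    remaining = capacity
    # Critical tasks are considered first so that they claim operator capacity first.
    for task in sorted(tasks, key=lambda t: not t.critical):
        if engaged and not task.critical:
            plan["deferred"].append(task.name)    # postpone to the postengagement period
        elif task.load <= remaining:
            plan["soldier"].append(task.name)
            remaining -= task.load
        else:
            plan["automation"].append(task.name)  # off-load what exceeds current capacity
    return plan
```

Under this rule, noncritical work never competes for attention during an engagement, and anything the operator cannot absorb at that moment goes to the machine partner. A fielded system would need validated workload estimates in place of the assumed `load` and `capacity` numbers.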
Accounts of recent engagements have referred to the strategy, "Blind it, cut it off, and kill it." This refers to the process of destroying the enemy's information collection systems (sensors, etc.), the subsequent destruction of communication channels, and finally the destruction of an impotent enemy force. However, it is important to be aware that an information overload for our forces is just as dangerous as an information underload for the enemy. In depriving the enemy of their sources of information, our forces must avoid doing the same to themselves by producing an incomprehensible avalanche of information. This is especially true for the dismounted infantry soldier, who needs only a few sources of vital information during engagement. If this filtering process is not successful, there are liable to be additional cases of soldiers immobilized by stress: not the stress of battle but the stress of information overload. The use of electronic displays in the Land Warrior System opens the door wide to such a condition.
Information and Disinformation
Orthogonal to the problem of overload and underload is that of information versus disinformation. Intimately linked to the question of security, the problem of trust in the technical support system is an extremely important one. Lee and Moray (1992) found that the level of trust exhibited by the operator is a critical component in a system's utility. Riley (1994) found that operators are very slow to recover belief in system information, given a single failure event in an automated system. Given military experiences with friendly fire incidents, it is even more critical that the information supplied by an information system does not introduce distortions and inaccuracies. Of course, access to the force's information systems by the enemy forces represents a devastating event which could have negative influences even beyond a specific incident. Ideally, critical information would always be confirmed by a secondary source, preferably one not relying on technological support.
A great many occupations make taxing mental demands and a large (but diminishing) group of occupations impose physical demands. The job of the infantry soldier requires considerable physical work at the same time as critical decision making. As technical innovations percolate more into the soldier's everyday activities, these combined physical and mental demands are becoming a critical concern. In the panel's view a considerable research effort needs to be directed to an understanding of these combined influences.
In addition to the combined stresses associated with physical and mental workload, there is the stress generated by the equipment itself. It is important to keep in mind that the weight of the Land Warrior System is only one component of the already large load that the soldier carries. Extreme stresses may become associated with operation of the equipment itself, especially helmet-mounted displays, if they are not comfortable and clear, and if the display resolution is poor while soldiers are trying to operate vital controls. How the proposed equipment is integrated with nuclear, biological, and chemical (NBC) protection is a vital question when considering how to reduce sources of soldier stress. Experience indicates that equipment that is not considered vital or easy to operate during a firefight will be quickly jettisoned. Consequently, the predicted advantages of having well-equipped soldiers on the electronic battlefield may well be negated by the problem of discomfort. Considerable research is therefore needed into the questions of comfort, prolonged use, systems integration, ease of ingress and egress, and fundamental ergonomic questions whose solution can reduce the stress associated with suboptimal design.
The Operational Environment
There is no greater form of stress than an immediate threat to life. Consequently, the proposed Land Warrior System will have to operate under the most extreme conditions of stress. Also, because there is no certain way to predict in what theater of operations soldiers will be required to perform, we consider specific forms of stress as well as the general, pervasive forms.
Stress accompanies military operations. Typically, stress is seen as a threat or an overload of demand conditions that seem likely to overwhelm the individual. However, stress is also destructive in the case of chronic underload, especially when accompanied by the uncertainty that marks contemporary operations. Provision of on-line, momentary information can therefore have a profound influence on experienced stress, especially when some form of control or autonomy is provided to the individual soldier.
Perhaps the most influential effect of stress in the present context is the restriction in the range of cue utilization. Attentional narrowing (which is highly related to the concept of situation awareness) occurs under extreme conditions. Easterbrook (1959) proposed, and others have in general confirmed, that the nature of the cues recognized in a display varies as a function of stress. This phenomenon is familiar to pilots, whose scan pattern alters directly with the threat of environmental conditions. Originally, this was theorized to be a visual phenomenon, something akin to G-induced loss of consciousness, in which the pilot proceeds from tunnel vision through gray out to black out and finally unconsciousness. However, more recent findings suggest that it is an attentional phenomenon, since by manipulation of cue salience "narrowing" can be directed to the visual periphery. Given the reality of this phenomenon, the design of portable equipment and the way in which information is provided must vary according to context (Wickens et al., 1993).
Physical Sources of Stress
Heat is one of the more ubiquitous and disruptive forms of environmental stress that is liable to be encountered. Recent operations of U.S. forces, including the Desert Storm, Grenada, Haiti, and Arabian Gulf missions, have involved the problem of heat stress. Not only is heat stress a consequence of the geographical location of operations, but also, in any encounter in which the use of chemical or biological weapons is suspected, heat stress can affect military personnel wearing exclusion garments for protection. The heat load on the operator generated by these exclusion garments is considerable, and on occasion has been sufficient to cause missions to be canceled or mission goals to be changed. Heat stress from such use does not have to occur in the hot regions of the world.
Researchers typically focus on the complex effects of heat on central nervous system functioning. This orientation is understandable, given that heat stress investigations are often part of a more general search for the effects of stress on human performance. However, in the present context, there are certain coarse-grained, physical disruptions from heat that have to be considered first. People under heat stress sweat, and sweat is a problem for any visual display, since profuse sweating can impair visual capabilities. Also, individuals brushing sweat from their eyes may miss information solely from this physical gesture and not from some more complex effect on cognitive activity. In fact, to make accurate statements about heat effects on cognitive function, one must be careful to exclude these peripheral effects.
Wearing a helmet-mounted display exacerbates these problems. A large percentage of heat load from the body is lost from the head, particularly from the front of the face. Covering this area is liable to generate profound heat-related problems. Heat loss also comes from vapor exchange in breathing. Displays that cover the whole face can lead to extensive fogging problems, as experienced for example in space suit operations. Although leaving half the face uncovered might seem like a solution, it ignores mandates about full exclusion garments for toxicological warfare. Sweat also presents a problem with respect to fit. Cooling off by injecting water into the interface between the display unit and the soldier's face can cause significant short-term and long-term problems, those of slippage and maintenance of complex electrical equipment exposed to water on a regular basis. Independent of all the concerns of geographical heat stress and garment heat stress, prolonged covering of the eyes causes thermal balance problems. Of necessity, any proposed displays need extensive field testing and also systems integration with existing equipment ensembles.
Predictions of human tolerance to heat stress when physical load is expected can be derived from Figure 6-2. Predictions of the breakdown in behavioral decision making from heat stress can be derived from Figure 6-3; this figure also includes information about body temperature, so it can be used to evaluate the combined effects of physical and mental demand.
Whereas heat presents a number of unique problems, cold also provides considerable contextual challenge for advanced systems. Paradoxically, many of the heat-related effects, such as sweating and fogging, apply in the cold as well, since individuals are covered in heavy clothing to resist cold effects and are usually overclothed to provide protection against extended exposure. The primary, unique factor in cold exposure is shivering. Shivering is a physiological process in which the energy expended goes almost entirely to heating the body. The oscillating motion experienced is not directed to behavioral goals and in fact directly interferes with them. Hence, the central problem of shivering is mechanical interference. Traditionally, the research literature has looked at how this interference debilitates psychomotor performance such as pursuit tracking; however, there are obvious effects on perception as well. Although fewer studies have examined this facet of performance, it is clear that some degradation is to be expected.
Unfortunately, by the time shivering becomes a mechanical problem, it has already begun to affect central functions such as speed of response and decision making. As a consequence, cold effects become as much of a contextual challenge as heat effects, although somewhat less likely to be encountered.
Noise is another form of disruptive stress, and its concurrent and aftereffects are less well understood than those of temperature. Noise effects are much more diverse and difficult to localize than other stresses. Unless noise drops below approximately 30 Hz, the effects are mainly aural, although very low-frequency noise and vibration effects are directly linked.
Noise is an intermittent stress. In World War I, noise was one of the major forms of annoyance for the infantry soldier; partial deafness from shelling was not uncommon. Modern warfare relies less on such artillery tactics, since the theater of war itself is more mobile, so the individual soldier is more likely to experience brief bouts of intensive noise punctuated by bursts from ordnance impact. These may be interspersed with periods of relative quiet.
One obvious concern in physical ergonomics is the provision of auditory messages against a noisy background. In a literal sense, this is a signal-to-noise problem. When soldiers are exposed to extremely noisy conditions, multimodal forms of information transmission may make sense, although that strategy may risk visual overload.
Noise is defined as unwanted sound, such as loud sounds that are either momentarily disturbing or more prolonged and annoying. Unwanted sounds could also include an excess of verbal communications and messages, especially if a party-line system is in use. The soldier may not want to listen to extraneous information that distracts him from the task at hand.
In any battle, noise and its persistent aftereffects will be present, and these factors must be considered when integrating developed technologies. It may be possible to develop a helmet-mounted system specifically designed to insulate the soldier from excessive ambient noise; however, sometimes vital auditory cues are needed to survive. Flexibility and choice of configuration (that is, an adaptive system) is the best recommendation.
Whereas noise is airborne, vibration can be thought of as material borne. Vibration results from the operation of vehicles such as helicopters and armored personnel vehicles, and also from locomotion. Vibration effects are very similar to shivering, with one critical exception. The frequency of material-borne vibration can vary over a considerable range, including the resonance frequency (3-6 Hz) of the major organs of the body. Vibration in this range is exceptionally disruptive to the point of nausea and beyond. Vibration effects, like shivering, are liable to affect both perception and action. Usually, vibration is masked by internal compensatory systems; we do not see the world bounce up and down as we walk. However, with helmet-mounted displays, the suppression of this intrinsic form of vibration is not clear. Much empirical research still needs to be done on work performance while walking (see Sampson, 1993). The panel urges that such research continue, since it lies at the heart of the interface between the physical characteristics of the environment and the design of the technical support systems themselves. Figure 6-4 illustrates the quantitative limits on vibration for visual capability. Boff and Lincoln (1988) discuss a wide range of vibration effects across differing sources of stress.
Extended Operations/Time of Day
As mentioned earlier, infantry operations are conducted 24 hours per day. An important factor that remains uncertain is the length of an individual engagement and consequently the duration of mission that the infantry soldier is expected to sustain. In ground-based warfare of the immediate past decades, emphasis has been given to the first 72 hours of engagement. In consequence, the U.S. Army became a world leader in understanding the effects of sustained operations. Given the depth and clarity of this work, it would be redundant to repeat it here. However, it is important for any designer and user of the Land Warrior System to consider the stresses associated with circadian phases of operation, duration of operation, and associated fatigue. Whereas the Army has an outstanding record in the area of extended operations research, specific studies using the Land Warrior System are needed to establish the tolerance and reaction of soldiers wearing such equipment under operational conditions.
The Problem of Interaction
In approaches to understanding stress effects, one great limitation is the failure to investigate and capture the effects of multiple interactive sources of stress. Most detailed investigations have examined the influence of a single stress on a single type of performance task (e.g., Hancock, 1986b). Reliance on the arousal explanation enabled the postulation of a common pathway for all forms of bodily stress, as mediated through the ascending reticular activating system. This assumption has proved unreliable. Multiple stress effects cannot be inferred from the singular action of each one. A number of alternatives to the unitary arousal theory have been proposed; the taxonomies offered by Hockey and Hamilton (1983) and Sanders (1983) can serve as a basis for ordering results, but they do not represent a causal account of the phenomenon at hand.
The major stumbling block in predicting stress effects concerns the interactive nature of stresses in the real world. The model by Hancock and Warm (1989) sees attention as the final arbiter of performance efficiency and seeks to specify how multiple sources of stress influence this capacity. They distinguish two basic components of any source of stress: its spatial and temporal components. The temporal component involves the rate at which stimulation occurs; the spatial component involves the structure of the environmental source with respect to the perceiver. Since each form of stress must have these intrinsic characteristics, there is an opportunity to express different sources of stress on the same scale. This decomposition is easier for task stressors than it is for environmental stressors. However, since environmental stressors act through sensory systems in the same fundamental way as other stimulation to the central nervous system, the comparison is valid.
Extended research is clearly needed on specific systems in the operational context. Until the theoretical basis for predicting stress effects matures, there is simply no substitute for full-scale testing. It may be possible to forecast the effects of some specific forms of stress and to use these predictions as a basis for individual design. That strategy is probably more relevant to traditional ergonomic forms of stress such as heat, noise, vibration, etc., and their effects on performance and comfort than to more complex issues of software presentation and task design. Independent of original specifications, it is absolutely vital that user acceptance is evaluated at all stages of development of the Land Warrior System.
EFFECTS OF STRESS ON PERFORMANCE
Fatigue, a subset of general stress effects, is a concatenation of a variety of physiological and psychological factors. Among the precursors are extensive hours of work; among the physiological factors are circadian rhythms; and among the behavioral factors is vigilance or sustained attention. An individual experiences fatigue when the interaction of these multiple factors propels him or her across a "fatigue barrier."
Fatigue is a critical factor in military performance, especially for prolonged operations. Large and consistent efforts have been directed by the military and others to understand fatigue, yet it has resisted their best efforts. Attempts to define fatigue (for example, "subjectively experienced disinclination to continue performing the task at hand") have to fight hard against the absurdity of tautology. In a classic work, Muscio (1921) even suggested abandoning the term fatigue, in part because of definition failure and in part because of its multidimensional nature.
Fatigue clearly has an impact on cognitive functioning. However, in the context of the infantry soldier, physical fatigue is equally significant. Perhaps the most important part of physical fatigue in terms of the helmet-mounted display is visual fatigue. Like any other component of the body, the eye is moved by muscles and these muscles are vulnerable to fatigue. Repetition of use usually brings on fatigue, although there is much evidence that even this putative muscular fatigue is actually mediated by central control. That is, a muscle that an individual judges to be fatigued will respond at almost 100 percent when stimulated externally. Because much of the information provided on the helmet-mounted display is expected to be visual, the question of visual fatigue during operations merits focused empirical effort. The physical fatigue the soldier experiences will not result solely from operating the equipment but is intrinsic to battle operations. The design of the display must therefore facilitate cognitive functioning under this form of stress.
The human as an adaptive organism seeks to combat the adverse effects of stress for as long as possible. When such efforts are exhausted, the individual quickly begins to fail, both in performance and in physiological response. (A similar drop is evident in some aging individuals, who appear fit and active for many years, only to be taken extremely ill and die within a remarkably short time.) The question of relevance here is how to distinguish the onset of the drop. In complex task performance, the effects of stress are evident before they are measurable in terms of physiological response. It therefore makes sense to measure incipient individual failure during performance. Both subjective and physiological measures of cognitive workload (discussed in a later section) can be used to measure the onset of failure.
The speed and accuracy with which an individual accomplishes a task are the primary indicators of performance. In the process of failure, stimulus detection remains fairly rugged, whereas stimulus selection is more vulnerable to stress effects (Hancock, 1996b). One indication of incipient failure is that task-related cues are selected, but they are inappropriate for the action of the moment. For example, an infantry soldier may correctly detect and identify an enemy soldier at a distance, while a much closer enemy goes undetected. Evidence of a performance pattern such as this could signal to a platoon leader or other individual at a more remote site that performance is likely to degrade. Of course, such events need to occur in a pattern, since a single failure could occur for a variety of reasons. It is the job of the platoon leader to distinguish ongoing failure from momentary fluctuation in performance efficiency. This distinction is a difficult one to make and would depend directly on the current status of the engagement.
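One simple way to separate a pattern of failures from a single lapse is to count errors over a sliding window of recent observations. The sketch below is illustrative only: the class name, the window length, and the threshold are assumptions, and the source does not specify how selection errors would be sensed or scored.

```python
from collections import deque


class FailureMonitor:
    """Flag sustained degradation when errors cluster in a recent window."""

    def __init__(self, window: int = 10, threshold: int = 3) -> None:
        # Only the most recent `window` observations are kept; True marks a selection error.
        self.events: deque[bool] = deque(maxlen=window)
        self.threshold = threshold

    def record(self, selection_error: bool) -> bool:
        """Store one observation and report whether a pattern is present."""
        self.events.append(selection_error)
        return sum(self.events) >= self.threshold
```

An isolated error ages out of the window without raising a flag, whereas several errors close together do raise one. Because the appropriate window and threshold would depend on the status of the engagement, a monitor of this kind could at most support the platoon leader's judgment, not replace it.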
ADAPTIVE RESPONSE TO STRESS
The performance challenge for the soldier has traditionally been a physical one. Selection criteria reflect the capabilities needed to respond to physical demands, and training prepares the soldier to meet this challenge. A career in the military, ascending through the chain of command, has been traditionally accompanied by progressively greater responsibility and the burden of increasingly important decision making. Contemporary and proposed technologies are making changes in the challenge to the dismounted infantry soldier, in which significant cognitive demands are added to physical demands. The critical questions are whether soldiers will be able to meet this challenge and how to structure support systems that ensure success rather than impose failure.
A vital step in the sequence is the assessment of cognitive demands. We note that the Army has experience in this area and has produced important information on workload assessment, in work led by Christ and his colleagues. However, it is important to recognize that the majority of information on cognitive workload has been derived from either laboratory experiments or field evaluations in which physical demand has played a nonsignificant or at most minor role. Much remains unknown about the effects on cognitive demand of physical activity and, conversely, the effects on physical activity of cognitive workload. Although comparable activities have been studied, such as firefighting and emergency mine egress, relatively little experimental information is available (but see Vercruyssen et al., 1988). Also, further research is needed on the interaction effects of trying to do several cognitive tasks at once. Although there is some evidence concerning dual-task performance and time-sharing capabilities, they have not been examined in association with physical effort. Much remains to be done in these areas.
The research literature includes almost limitless arguments on how to define cognitive workload.
For our purposes, we define workload as:
time required/time available
If the time required by a task is more than the time available, there is cognitive overload. If the time required is very much less than the time available and the soldier has no other tasks to do, there is cognitive underload. Some tasks, such as extended surveillance, require the soldier to pay attention for long periods of time without overtly doing anything. This underload situation can be stressful and so result, almost paradoxically, in cognitive overload. Cognitive workload is best seen as a continuum on which both too much and too little are liable to result in problems.
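The ratio definition above can be illustrated with a minimal sketch. The function names and the underload threshold of 0.2 are hypothetical choices made for illustration only; the chapter gives no numeric cutoff for underload, and the sketch does not model the paradoxical case in which prolonged underload itself becomes stressful.

```python
def workload_ratio(time_required_s, time_available_s):
    """Workload as defined in the text: time required / time available."""
    if time_available_s <= 0:
        raise ValueError("time available must be positive")
    return time_required_s / time_available_s


def classify_workload(ratio, underload_below=0.2):
    """Place a ratio on the workload continuum.

    A ratio above 1.0 means the task needs more time than is available
    (overload). The underload threshold is an assumed, illustrative value.
    """
    if ratio > 1.0:
        return "overload"
    if ratio < underload_below:
        return "underload"
    return "manageable"


# A task needing 12 s when only 8 s are available is overloaded.
print(classify_workload(workload_ratio(12, 8)))
# A 1 s task in a 60 s surveillance window falls in the underload region.
print(classify_workload(workload_ratio(1, 60)))
```

Treating workload as a single ratio makes the two failure regions explicit: values above 1 and values far below 1 both call for attention from the designer.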
The number and kinds of mental processes that can be carried out in a given amount of time are limited. For example, we can only listen to and comprehend one conversation at a time. Even well-practiced and automatic tasks, such as driving, can become difficult and attention-demanding. Encountering heavy traffic in an unfamiliar city may interrupt an ongoing conversation with a passenger or cause us to abandon various internal mental processes, such as imagery and planning. For the designer, understanding and measuring the attentional demands of various activities are important prerequisites for minimizing attentional overload.
Models of Attention
Broadbent (1958) presented one of the first attempts to characterize capacity limits in human information processing. He proposed that comprehending multiple, unrelated spoken passages was impossible because of limited short-term memory capacity. Therefore, some messages had to be selected for admission to short-term memory, while irrelevant information was essentially blocked by a filter mechanism. The filter could be set to admit information on the basis of physical characteristics such as location, pitch, loudness, etc. As Gopher and Donchin (1986) point out, Broadbent's model of attention does have implications for systems design. If it is important for an operator to focus attention on one source of information, it should be clearly segregated from other sources by simple physical features, to allow efficient selection by the filter. This same prescription applies to information conveyed by visual displays. Simple physical features, such as color, line orientation, and motion, can be picked up by preattentive processes (Treisman and Gelade, 1980; Wolfe, 1994) and can rapidly guide attention to relevant display areas. The filter model, however, has relatively little to say about how and when divided attention is possible. Nor does it account for the idea that tasks vary in difficulty, which can at least partially be overcome by increased effort.
Kahneman's (1973) capacity theory of attention suggests that people have a limited supply of mental resources that can be divided among simultaneous mental activities. Difficult tasks require larger shares of resources and so cannot be carried out together without exceeding the supply. Easy tasks, such as well-practiced automatic ones, can be effectively "time-shared." The total pool of capacity is characterized as variable, depending on the subject's state as well as the task demands. When faced with a difficult task, arousal level may increase, providing additional resources to meet the increased demand. The increase in capacity can be monitored by measuring pupil diameter, which is related to arousal.
This framework led to a series of experiments by Beatty and colleagues (reviewed in Beatty, 1982). Their effort theory suggests that one can objectively measure task difficulty using arousal-related measures such as pupil diameter (see also Just and Carpenter, 1993). This model also points the way to new methods of measuring task difficulty based on the idea of spare capacity. A given task receives the resources necessary for performance; spare capacity can be measured by giving subjects a secondary task, providing an index of the capacity required to perform the primary task. In this way, the capacity demands of a variety of tasks can be measured and used to predict dual-task performance. Unfortunately, this endeavor did not meet with much success. Interference was often observed with even simple tasks, and the amount of interference seemed to be related to task similarity. For example, two visual tasks would interfere more than a visual and auditory task even though the "capacity demands" of the tasks were comparable. Kahneman had anticipated something like this, pointing out that "structural interference" between simultaneous tasks could occur when they both required a "non-shareable" mechanism. An example would be two tasks that simultaneously require eye movements to the left and right sides of a display screen. According to Kahneman, this interference is clearly different than that which arises from competition for a limited attentional capacity. As mentioned above, interaction effects can be overwhelming.
Multiple Resource Models
Wickens (1980) tried to account for the effects of task similarity on dual-task performance in terms of a multiple resource model. He suggested that resources correspond to a combination of three dichotomous dimensions consisting of: (1) stages of processing (perceptual/central versus response), (2) modalities of perception (visual versus auditory), and (3) codes used to represent and respond to information (verbal/vocal versus spatial/manual). Efficient time-sharing performance should occur when two tasks use different values on each of the three dimensions, as depicted in Figure 6-5. For example, retaining visually presented words calls on the visual input modality, central stages of processing (memory rehearsal), and verbal coding. Moving a joystick to track the location of a sound involves the auditory input modality, response level stages, and spatial codes. These two tasks should pose little interference when performed together.
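The three dichotomous dimensions can be expressed as a small data structure to make the time-sharing prediction concrete. This is a rough sketch and not Wickens' own formulation: the class name, field names, and the idea of simply listing shared dimension values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    stage: str      # "perceptual/central" or "response"
    modality: str   # "visual" or "auditory"
    code: str       # "verbal" or "spatial"


def shared_dimensions(a, b):
    """Return the dimensions on which two tasks call on the same resource."""
    return [name for name in ("stage", "modality", "code")
            if getattr(a, name) == getattr(b, name)]


# The two example tasks from the text.
word_retention = TaskProfile("perceptual/central", "visual", "verbal")
sound_tracking = TaskProfile("response", "auditory", "spatial")

# A hypothetical third task: reading a map while holding it in memory.
map_reading = TaskProfile("perceptual/central", "visual", "spatial")

print(shared_dimensions(word_retention, sound_tracking))  # no overlap
print(shared_dimensions(word_retention, map_reading))     # two shared values
```

Under the model, the more dimensions two tasks share, the more interference should be expected; an empty list corresponds to the efficient time-sharing case described above.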
Although debate continues on the number and nature of mental resources, and indeed whether resources are a useful concept at all (Navon, 1984; Allport, 1993), the multiple resource model still offers a useful perspective on cognitive workload. First, it is clear that, if resources are of more than one kind, then workload assessment must measure each individually rather than offering a single, overall index. Second, the model offers a rough heuristic for designing task environments to minimize interference.
Recent research by Pashler (1989, 1994) has uncovered a surprisingly robust pattern of dual-task interference that occurs between even simple tasks. Suppose that subjects are required to make a speeded response to a tone by pressing a button. They are also required to make a rapid foot-press response to a visual signal that occurs shortly after the tone. When the interval between the tone and the light is small, the second response is delayed, in some cases by several hundred milliseconds. This period of time, during which a second task is interfered with by a prior task, is known as the psychological refractory period. The magnitude of interference gets smaller and eventually disappears as the interval between the tasks increases. There appears to be a bottleneck in processing such that two tasks occurring in close temporal proximity are competing for access to a limited-capacity mechanism; the task arriving second must wait until this mechanism is no longer needed by the first task.
Pashler proposed that the particular pattern of interference he observed indicated that both tasks needed access to a single-channel mechanism, that is, a mechanism that can process only one input at a time. He further suggested that this single-channel bottleneck occurred fairly late in processing, at the point of selecting responses. The selection of the appropriate response for the second task had to wait until the response selection for the first task was completed. Similar
effects have now been demonstrated for a wide variety of task pairings and response modalities, including hand and foot movements, vocalization, and eye movements (Pashler, 1994; Pashler et al., 1993). Carrier and Pashler (1995) have recently shown that retrieval of information from long-term memory, rather than response selection, is more generally responsible for the observed effects. Similar logic was used by Ruthruff et al. (1995) to show that mental rotation processes, which are used, for example, to imagine what a shape might look like in a different orientation, also use a bottleneck mechanism.
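The logic of a single-channel bottleneck can be sketched numerically. The following is a simplified illustration, not Pashler's own implementation: each task is assumed to consist of a perceptual stage, a central stage that only one task can occupy at a time, and a motor stage, and all stage durations are hypothetical values.

```python
def second_task_rt(soa_ms, task1, task2):
    """Predicted response time to the second stimulus, measured from its onset.

    Each task is a tuple (perceptual_ms, central_ms, motor_ms). The central
    stage is the bottleneck: task 2 cannot enter it until task 1 has left it.
    soa_ms is the interval between the first and second stimulus onsets.
    """
    p1, c1, _m1 = task1
    p2, c2, m2 = task2
    central_free_at = p1 + c1        # when task 1 releases the central stage
    ready_at = soa_ms + p2           # when task 2 is ready to enter it
    central_start = max(central_free_at, ready_at)
    return central_start + c2 + m2 - soa_ms


tone_task = (100, 150, 100)    # hypothetical stage durations in ms
light_task = (100, 150, 100)

for soa in (50, 150, 250, 400):
    print(soa, second_task_rt(soa, tone_task, light_task))
```

With these assumed durations the second response is slowed at short intervals and returns to its single-task value of 350 ms once the interval is long enough for the central stage to be free, which is the qualitative pattern described in the text.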
Thus there appears to be a mechanism involved with some aspects of memory retrieval and possibly memory transformation (as in mental rotation), which is strictly serial and constitutes a significant bottleneck in dual-task performance. It may be the same mechanism implicated in other theories that postulate a working-memory bottleneck. In addition, Pashler (1991) has shown that there is a second limited capacity attention system involved in perceptual processing. This system, unlike the central limited-capacity system, may be shareable by several perceptual inputs at a time.
There are obvious similarities between Pashler's model of dual-task interference and Wickens' multiple resource model. Both postulate separate limited capacity systems responsible for perceptual analysis and more central memory processes. Both systems allow for interference at stages involved with responding. According to Pashler, however, response selection presents a bottleneck regardless of the modality of the response system; verbal responses should interfere with verbal responses as much as with manual responses (Pashler, 1994). Pashler's proposals also bear some similarities to recent theories that point to working memory as the principal bottleneck in human information processing.
Working Memory Models
Knowledge in long-term memory is thought to be in an inactive state relative to the small subset of knowledge that is the focus of thought at any given moment. This activated knowledge is said to constitute working memory. A variety of cognitive activities, such as language, reasoning, and planning, require some knowledge to be maintained in an active state to be examined, transformed, and used for retrieval of additional information. According to a model proposed by Just and Carpenter (1992), maintenance of information in working memory requires attentional activation, as do other activities such as transformation and storage of working memory contents. This activation is drawn from a finite pool (similar to Kahneman's theory) and therefore, when comprehension becomes difficult, performance may slow down and become more error prone.
Just and Carpenter (1992) also propose that there are individual differences in working memory capacity, as measured by the working memory span test, that have implications for how well an individual comprehends a difficult instruction. For example, they found that low-span and high-span subjects read simple sentences at about the same rate, but that low-span subjects had particular difficulty reading and comprehending syntactically difficult sentences. Similarly, St. George et al. (1995) found that high-span subjects were more likely to make optional inferences in reading a text.
Interestingly, pupil diameter has been shown to be sensitive to working memory load in a variety of tasks, including reading (Just and Carpenter, 1993) and problem solving (Just et al., in press). This is consistent with Kahneman's original claim that pupil diameter may serve as a general measure of processing difficulty. It remains to be seen whether working memory will prove to be fractionated into separate limited-capacity systems, as occurred with Kahneman's model. For example, Shah and Miyake (in press) have recently provided evidence that there may be separate working memories for language and visual/spatial tasks, which is consistent with recent brain imaging work showing that these tasks activate separate brain areas.
Measures of Cognitive Workload
As the previous sections suggest, there are severe limits in our ability to process multiple sources of information. Understanding these limits is essential in designing and evaluating components of the Land Warrior System. Current technology allows many options in terms of information display, graphical interface, input devices, etc. The availability of these options has the potential to produce severe information overload for an infantry soldier who is hot or cold, tired, and stressed. Comprehensive assessment of workload is therefore a critical part of the design process.
The goal of cognitive workload measurement is to characterize the attentional demands that a task places on an operator. O'Donnell and Eggemeier (1986) suggest several criteria for evaluating measures of cognitive workload. Measures can be evaluated according to their: sensitivity (does the measure respond to variations in task difficulty or load?), diagnosticity (does it indicate what kind of attentional resource is being used?), intrusiveness (does the instrument interfere with performance of the task?), implementation requirements (does it require costly equipment or large amounts of time to complete?), and operator acceptance.
Table 6-1 lists four classes of workload measurement with their pros and cons. Primary task performance and subjective measures can be used for the initial screening of designs. These are easy to administer, sensitive to task difficulty, and have good user acceptance. Subjective measures include structured interviews, rating scales focused on particular tasks, etc. They can be supplemented by looking at how easy it is for operators to combine various tasks together, using the embedded secondary task technique. Dual-task performance, using pairs of tasks that are likely to occur together in the course of a mission, should provide valuable information on potential bottlenecks.
TABLE 6-1 Workload Measurement Methods
[Table layout not preserved; method names and column headings are missing. Surviving cell entries, in their extracted order: High face validity; Measures assessed anyway; Workload or performance?; Poor user acceptance; Loading task in high situations; Theory bound interpretation; High face validity; Easy to obtain; Dissociation with primary; Largely post-hoc measures; Subject to artifacts; Poor user acceptance; Difficult to administer.]
Several useful reviews of these techniques are available. Gopher and Donchin (1986) provide an overview of the concept of workload, O'Donnell and Eggemeier (1986) give a good description of various subjective scales of workload measurement, and Kramer (1991) has provided a recent analysis of physiological workload measures. In the sections that follow we summarize the important points and consider their relevance to the Land Warrior System.
There is a subjective aspect to attention. We are often aware of increases in the difficulty of tasks as well as our own efforts to compensate for that difficulty. This effort or intensity dimension of attention is the basis of Kahneman's model and its many successors. Subjective measures are an attempt to systematically query subjects about their own awareness of task difficulty. The Cooper-Harper scale is one of the earliest efforts to assess workload. Pilots are given a questionnaire that requires them to rate the handling characteristics of aircraft, as well as how much compensation is required to make up for handling deficiencies. Later versions were modified to specifically measure workload and were found to be highly correlated with a variety of task difficulty manipulations (North et al., 1979). Clearly, this scale attempts to provide a general or global measure of workload without being diagnostic as to what factors are driving the workload.
Task difficulty is perhaps the most obvious contributor to subjective workload, but it is certainly not the only one. A task that appears easy in one case can
be made difficult by applying time pressure. So part of workload involves a sense of time pressure. In addition, physical demands may contribute to our sense of mental workload. Two recently developed instruments provide subscales to measure some of these different aspects of workload. The NASA task load index or TLX (Hart and Staveland, 1988) asks operators to make ratings on six dimensions: mental demands, physical demands, temporal demands, performance, effort, and frustration. A weighted average of scale values on these six dimensions is computed to provide an overall measure of workload. Like the Cooper-Harper scale, scores on the TLX show high correlations with various criterion variables, such as performance in simulators as well as in field tests (Hart, 1986). The subjective workload assessment technique (SWAT) uses three scales designed to measure time load, mental effort load, and psychological stress load. For example, operators are asked to estimate mental effort in terms of whether the task required little conscious mental effort, moderate effort, or extensive mental effort and concentration (O'Donnell and Eggemeier, 1986). Weights are then assigned to the three scale values to derive a composite measure of workload.
The multidimensional nature of SWAT and TLX appears to offer an increase in diagnosticity relative to single-scale approaches such as the Cooper-Harper scale. For example, one can look at the scale weights for a given subject to determine whether that subject perceived the task to be time stressed or high in frustration. On the negative side, these scales may take a long time to administer, and, in a complex system like the Land Warrior, with many components, this time could be prohibitive. In addition, the meaning of the weights assigned to the dimensions is not perfectly clear. In comparing TLX and SWAT, Nygren (1991) points out that equal weighting of the scales in TLX ought to be about as good a predictor of the criterion variable as the differential weights derived from the data, and in fact that has been found empirically (Byers et al., 1989).
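The difference between a weighted TLX composite and an equally weighted one can be shown with a short sketch. The ratings and weights below are invented for illustration. The sketch assumes ratings on a 0 to 100 scale and integer weights of the kind obtained from pairwise comparisons of the six dimensions (summing to 15); neither detail is stated in this chapter.

```python
TLX_DIMENSIONS = (
    "mental", "physical", "temporal", "performance", "effort", "frustration",
)


def weighted_tlx(ratings, weights):
    """Weighted average of the six subscale ratings."""
    total_weight = sum(weights[d] for d in TLX_DIMENSIONS)
    if total_weight == 0:
        raise ValueError("weights must not all be zero")
    return sum(ratings[d] * weights[d] for d in TLX_DIMENSIONS) / total_weight


def unweighted_tlx(ratings):
    """Equal-weight composite: the plain mean of the six ratings."""
    return sum(ratings[d] for d in TLX_DIMENSIONS) / len(TLX_DIMENSIONS)


# Hypothetical ratings (0-100) and weights for one operator.
ratings = {"mental": 70, "physical": 40, "temporal": 80,
           "performance": 30, "effort": 60, "frustration": 50}
weights = {"mental": 5, "physical": 1, "temporal": 4,
           "performance": 0, "effort": 3, "frustration": 2}

print(weighted_tlx(ratings, weights))   # 66.0
print(unweighted_tlx(ratings))          # 55.0
```

Inspecting the weights shows which dimensions the operator considered most relevant, which is the diagnostic use described above; the equal-weight version drops that step and, per Nygren's argument, may predict the criterion about as well.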
It also appears that subjective measures may sometimes dissociate from other measures of workload. For example, Sirevaag et al. (1993) collected TLX ratings from helicopter pilots in a high-fidelity simulator. They varied communication demands and found that pilots had trouble adhering to nap-of-the-earth altitude criteria under high communication demands. The greater load imposed by the communication task was reflected in several workload measures but not in the subjective ratings. When questioned about this puzzling outcome, pilots indicated that they were aware of the greater difficulty involved in the demanding communication condition, but their ratings didn't reflect this because they felt that none of the conditions exceeded their capacity to perform successfully, and so they rated them as equivalent. Subjective measures are just that, and subjects can adopt their own criteria for the ratings.
Hendy et al. (1993) point out that both the TLX and SWAT go to great lengths to provide a composite measure of workload. They suggest that, if one is interested in a global measure of workload, one might do just as well by asking
subjects to simply estimate the workload on a univariate scale. The procedure used by SWAT and TLX seems to assume that, although subjects may be able to give accurate estimates on specific components, they are not able to report on overall or global workload. In four studies, Hendy et al. compared the composite score on a modified TLX test with a univariate measure of global workload, asking subjects to use a magnitude estimation procedure to estimate the difficulty of various segments of a flight relative to the difficulty of the takeoff and departure segments. They found that the univariate measure was more sensitive to variations in task difficulty than the TLX composite measure or any of its subscale scores, suggesting that, if a global measure of workload is what is desired, a simple univariate scale works pretty well. They also raise an important issue about the goal of workload measurement: How useful is it for the system designer to know that a workload problem stems from excessive mental effort, time pressure, or frustration? They also point out that "excessive time pressure could come from the number of things to attend to, the requirement for high precision, lack of feedback, or use of inefficient strategies attributable to insufficient training" (p. 599). In other words, it might be difficult for the system designer to use the kind of knowledge provided by workload measures to diagnose what aspect of the design should be changed.
Hendy et al. (1993) recommend presenting scales that focus on various aspects of the design in terms of impact on attentional demands. For example, knowing that memory retrieval of unfamiliar information is taxing or that visual clutter requires perceptual resources, one can examine various aspects of the Land Warrior System that appear to have these characteristics and evaluate different designs in terms of perceived workload. Subjective methods that are attuned to aspects of the particular design (e.g., evaluating the cursor control, using the helmet-mounted display to view maps in a navigation exercise) and are motivated by a theoretical understanding of attentional systems (such as the multiple resource model) may provide a good first-pass estimate of which aspects of the design present likely workload bottlenecks.
Primary and Secondary Task Measures
Primary task measures simply try to infer workload from performance on the task of interest. Primary task performance is obviously the critical variable. However, it isn't clear that primary task measures have much of a direct association with workload. Errors in performance do not necessarily indicate high workload imposed by the primary task. They can arise from a variety of sources, including workload levels that are too low, as in vigilance tasks in which the operator may miss signals because they are so infrequent.
A better measure of workload is provided by secondary task performance, in which spare capacity is assessed by presenting operators with occasional probe signals that require them to press a key. Probe response time should be related to
the difficulty of the primary task. For example, the difficulty of choosing various options of map presentation on the helmet-mounted display could be evaluated in terms of speed of response to occasional auditory tones.
We pointed out earlier that there are a number of difficulties associated with using secondary task performance as a measure of spare capacity. Subjects may use various strategies for dealing with what is basically a dual-task situation; some may actively prepare for the probe task despite instructions to treat it as secondary. In addition, the existence of multiple resources means that a given probe task, such as auditory detection, will vary in difficulty depending on its similarity to the primary task (due to variation in overlap of the particular resources used by each task). Finally, the secondary task can be intrusive and therefore disruptive of primary task performance. As we noted in connection with the model of dual-task interference (Pashler, 1994), even very simple and dissimilar tasks can produce interference when both of them require memory retrieval at the same time. These considerations suggest that secondary task performance should be viewed with caution as an index of workload.
Of course, soldiers using Land Warrior equipment will often be in dual-task situations. For example, a soldier may be navigating terrain with the aid of the map display and GPS when an auditory message comes in. The message has to be checked for its importance relative to the navigation task; the speed and accuracy of response to such messages would therefore be expected to be a function of the ease of use of the map system. This is an example of an embedded secondary task method, in which the tasks are presented within the context of meaningful scenarios that are motivated by the kinds of dual-task combinations that are likely to occur in operational use. The panel recommends that this sort of dual-task analysis be carried out using a variety of realistic task combinations. At the very least, this approach would provide valuable information on which kinds of dual-task situations pose difficulty. It could prove valuable in design as well as test and evaluation.
Measures of brain activity, such as the electroencephalogram (EEG), have the potential to reflect cognitive workload. Averaging procedures can be used to derive the event-related potential (ERP), which reflects electrical activity associated with a particular signal. For example, subjects can be required to count "target" tones of a particular pitch embedded in a series of nontarget tones having a different pitch. The targets will be associated with a particular component of the ERP known as the P300. The amplitude of the P300 is associated with how much attention the subject allocates to the signal (Israel et al., 1980). As the primary task becomes more difficult and requires more attention, the P300 associated with the target tones is reduced. This kind of trade-off between tasks in
terms of P300 amplitude has been demonstrated in several studies (e.g., Hoffman et al., 1985; Israel et al., 1980).
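The averaging procedure that yields an ERP can be sketched in a few lines. This is a toy illustration with made-up numbers: real recordings involve many more trials and samples, and the function names and the choice of a simple maximum as the amplitude measure are assumptions, not a description of any particular laboratory's method.

```python
def average_epochs(epochs):
    """Average time-locked EEG epochs sample by sample to estimate the ERP.

    Each epoch is a list of voltages aligned to stimulus onset. Activity not
    time-locked to the stimulus tends to cancel in the average.
    """
    if not epochs:
        raise ValueError("at least one epoch is required")
    length = len(epochs[0])
    if any(len(e) != length for e in epochs):
        raise ValueError("all epochs must have the same length")
    n = len(epochs)
    return [sum(e[i] for e in epochs) / n for i in range(length)]


def peak_amplitude(erp, start, stop):
    """Largest value of the averaged waveform in a window of sample indices."""
    return max(erp[start:stop])


# Three hypothetical epochs following target tones (arbitrary units).
target_epochs = [
    [0, 1, 5, 1],
    [0, -1, 7, -1],
    [0, 0, 6, 0],
]
erp = average_epochs(target_epochs)
print(erp)                        # [0.0, 0.0, 6.0, 0.0]
print(peak_amplitude(erp, 1, 4))  # 6.0
```

Comparing the peak obtained under an easy primary task with that obtained under a difficult one would give the kind of amplitude trade-off described above.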
Measuring the ERPs of probes is clearly a variation on the secondary task method, and it is fair to ask whether the information provided by this technique is worth the additional expense and complexity associated with EEG recording. One advantage of the P300 relative to reaction time measures is that it appears to be sensitive to perceptual/central processes and is not affected by the response system. Reaction time, in contrast, is sensitive to limited capacity processes from input to response. P300 therefore offers an increase in diagnosticity over behavioral measures.
As we have pointed out, the secondary task method can be modified by using an embedded task. This is also true of the P300. A good example is provided by Humphrey and Kramer (1994), who presented a dual-task situation in which operators monitored a series of gauges for critical readings. In addition, arithmetic problems could be periodically presented on the same screen used to present the gauges. The difficulty of each task was manipulated and P300s collected in response to the presentation of information in each task. The goal was to see whether the difficulty level (and presumably workload) that subjects were experiencing could be determined by examining P300 amplitude. They found that 90 percent discrimination accuracy could be obtained by using 1 to 11 sec of ERP data. These results suggest that the P300 has the potential to serve as a real-time, diagnostic index of momentary fluctuations in workload.
An additional potential application of EEG methodology is relevant to the Land Warrior System. Work at the Naval Health Research Center by Makeig and Jung (in press) has shown that EEG can be used to monitor alertness. They found that they could accurately predict, in real time, the likelihood that operators would miss occasional sonar signals by monitoring EEG power in particular frequency bands. When operators enter states of low alertness, they could be warned by a monitoring system. Given advances in the development of dry electrodes and miniature amplifiers, this development raises the possibility that real-time alertness monitoring could be included in the Land Warrior System for soldiers in vigilance situations.
There are other physiological measures available as well. For example, the electro-oculogram provides a measure of eyeblink frequency and duration over a given period of time. Blink rate and blink duration both tend to decrease with higher levels of workload (Stern and Skelly, 1984). Electrocardiogram measurement quantifies a subject's heart rate and heart rate variability. Although the average heart rate seems to be generally insensitive to small or moderate fluctuations in workload (it may rise with stress, however), the variability of the heart beat (sinus arrhythmia) can be a good indication of mental workload. In general, the heart functions with less variability between beats at higher levels of workload. Various filtering techniques have been developed to
increase the sensitivity of this measure by reducing noise due to other bodily functions (Mulder, 1980; Porges, 1985; Moray et al., 1986).
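Beat-to-beat variability can be summarized with simple time-domain statistics. The sketch below uses two common summaries, the standard deviation of inter-beat intervals and the root mean square of successive differences; these are swapped in for illustration and are not the filtering techniques cited above. All interval values are invented.

```python
import math
import statistics


def sdnn(intervals_ms):
    """Sample standard deviation of the inter-beat intervals."""
    return statistics.stdev(intervals_ms)


def rmssd(intervals_ms):
    """Root mean square of successive differences between adjacent intervals."""
    diffs = [b - a for a, b in zip(intervals_ms, intervals_ms[1:])]
    if not diffs:
        raise ValueError("at least two intervals are required")
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical inter-beat intervals (ms) under low and high workload.
low_load = [800, 860, 780, 840, 790]
high_load = [700, 705, 698, 702, 699]

print(sdnn(low_load) > sdnn(high_load))    # True
print(rmssd(low_load) > rmssd(high_load))  # True
```

In this invented example the high-workload series has both a shorter mean interval and much lower variability, the second of which is the pattern the text identifies as the more useful indicator.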
In this section, we review recent attempts to develop detailed models of task performance that can provide insights into limitations of dual-task performance that would be difficult to obtain using the measures reviewed above. A thorough understanding of task interference and competition for limited capacity mechanisms will probably require a detailed simulation of how humans perform a given task. Such a simulation should provide a detailed account of the order, duration, and capacity limits of the elementary cognitive processes assembled for a given task. The model human processor (MHP) proposed by Card et al. (1983) is a good example of this approach. They specified a set of elementary information processing mechanisms (for example, making an eye movement), along with associated execution times. A detailed task analysis then made it possible to specify the particular arrangement of subprocesses involved and arrive at an estimate of the total time required to perform a given task. The MHP does a fairly good job of predicting these times, but mainly for simple tasks (Eberts, 1994). With increasing complexity, the job of specifying the path of elementary processes becomes quite difficult.
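The MHP style of estimate amounts to summing the execution times of the elementary operations in a task path. The sketch below assumes the nominal cycle times commonly quoted for the model human processor (about 100 ms perceptual, 70 ms cognitive, 70 ms motor); these figures and the operator names are not given in this chapter and should be read as illustrative.

```python
# Assumed nominal cycle times in milliseconds; treat as illustrative values.
OPERATOR_TIMES_MS = {
    "perceive": 100,   # one perceptual processor cycle
    "decide": 70,      # one cognitive processor cycle
    "move": 70,        # one motor processor cycle
}


def estimate_task_time(operator_sequence):
    """Total predicted time for a serial chain of elementary operations."""
    return sum(OPERATOR_TIMES_MS[op] for op in operator_sequence)


# Simple reaction: see a signal, decide to respond, press a key.
simple_reaction = ["perceive", "decide", "move"]
# A choice that needs one extra cognitive cycle to match the signal to a rule.
choice_reaction = ["perceive", "decide", "decide", "move"]

print(estimate_task_time(simple_reaction))  # 240
print(estimate_task_time(choice_reaction))  # 310
```

The sketch also shows why the approach scales poorly: the analyst must write out the full operator sequence by hand, and for complex tasks that path is hard to specify.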
A modern predictive modeling approach, similar in spirit to the MHP, is the EPIC model (executive process-interactive control) of Kieras and Meyer (1995). Their simulation system specifies a number of input systems (eye, ear, etc.) and output systems (manual, vocal, eye movement, etc.). In general, EPIC contains a much richer set of input and output devices than MHP and can therefore be applied to a wider range of problems. In addition, decisions, transformations, and such are carried out by a set of production rules operating in working memory. A simulation is performed by first specifying the algorithm by which the task is accomplished in terms of the set of production rules required. Exposure to the task domain initiates a cycle of "firing" of the production rules and changes in input and output devices that generates the desired behavior. We note that, although the experimenter has to specify the algorithm by which the task is accomplished, this ought to be easier than specifying the detailed chain of processes required by MHP.
EPIC has been applied to the simulated task of a telephone operator interacting with a workstation to help a customer complete a call. The model did a good job of predicting the total time on task as well as times for individual keystrokes by the operator. In addition, the model showed that the major limitation on operator speed was not typing time but the rate at which the customer spoke the telephone number. These instances of model-based insights into the nature of bottlenecks illustrate that this approach can be quite useful in redesigning tasks and interfaces in an attempt to minimize interference. Creating detailed models is a time-consuming process and requires a thorough task analysis. This probably would not be feasible for every soldier task, but it could be worthwhile for particular subsets of tasks that are critical to a mission. In the case of the telephone company, saving a few seconds on each call can result in savings measured in the millions. Savings of this magnitude on the battlefield could be the difference between mission success and failure.
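The bottleneck finding in the telephone operator example can be illustrated with a small sketch: when activities overlap in time, the slowest one sets the duration of the segment. The durations and channel names below are hypothetical and are not taken from the EPIC study; `limiting_channel` is an illustrative helper.

```python
def limiting_channel(channel_times):
    """Return (channel, seconds) for the slowest of several parallel channels."""
    channel = max(channel_times, key=channel_times.get)
    return channel, channel_times[channel]


# Hypothetical durations, in seconds, for activities that overlap in time.
call_segment = {
    "customer speaks number": 6.0,
    "operator types number": 3.5,
    "system looks up record": 1.2,
}

channel, seconds = limiting_channel(call_segment)
print(channel, seconds)
```

Under these assumed numbers, speeding up typing would not shorten the segment at all, which is the kind of result that can redirect a redesign effort.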
As noted throughout this chapter, there are many unresolved problems concerning the assessment and interpretation of cognitive workload and its demands. Already mentioned are the unknown effects of combining physical and mental demands. In addition, many other questions have to be faced in using this form of assessment. For example, there is very little information on the differences among individuals (Damos, 1988). Furthermore, the effects of training on mental workload are still to be clarified, especially when training takes place under conditions that only simulate actual operational environments (see Hancock et al., 1994). The effects of task failure on workload are uncertain (Hancock, 1989), and strategies to improve performance through adaptive or compensatory systems have only recently begun to provide potential answers (Chignell and Hancock, 1985; Hancock and Chignell, 1987; Hancock et al., 1994). Also unknown are the relationships among stress, workload, and other aspects of performance, such as situation awareness. In particular, there is concern that, under certain conditions, workload and performance dissociate; that is, workload increases but performance gets better or, similarly, workload decreases but performance worsens (Derrick, 1988; Yeh and Wickens, 1988). This is especially disturbing for those who want to use cognitive assessment of workload as a basis for design decisions and the definition of operational procedures. As technical support systems are being developed, these become vital experimental questions to be addressed.
What makes the job of the soldier different from that of, say, a pilot is the extreme physical demands placed on the individual. Indeed, the infantry soldier has to carry his technology with him and, unlike others, is not carried by it. The soldier has to meet the demands imposed by operational conditions, be it the climate or the theater of operations or simply the need to maintain mobility over difficult terrain.
We cannot overemphasize the significance of the fact that the lessons learned so far about human interaction with complex technologies have been garnered under quite sedate conditions. Typically, the individual is well-rested and seated in a light, well-ventilated, and comfortable situation and then asked to perform the necessary tasks. We know little about perception, action, and complex decision making during or immediately following strong physical exertion. It is essential that such testing be initiated prior to system design. One recommendation that can be made now is for tasks to be presented, not in the typical alphanumeric form but rather in graphic form. The latter are more easily and reliably solved and are better able to resist the deleterious effects of stress (Hancock, 1996b).
As we have noted in the discussion of stress, one critical characteristic is the reduction of the range of cue utilization (Easterbrook, 1959). This phenomenon has come to be known as narrowing. Although difficult to show in the laboratory, it is a common experience in the real world. Most people can recount an incident, such as driving in bad weather, when their whole attention was focused on a single object, such as the road ahead. In experimental work, Hancock and Dirkin (1983) have shown that this is an attentional strategy, rather than a visual phenomenon (such as occurs in the restricted vision of pilots under large G loads). As a consequence, extreme stress can restrict attentional scanning of other environmental sources.
If the source of information to which the soldier narrows is located on the helmet-mounted display, he may become "blind" to other threats in the environment. Consequently, what is displayed to the soldier during action is a critical decision; it may be advisable to provide some cue to change attention from the information displayed in the helmet to that available externally. Ideally, these two sources would be fused so that conflicts do not occur. It is also clear that narrowing relates in some degree to the notion of situation awareness. A strong research effort is needed on the phenomenon of narrowing, especially as it relates to soldier activity involving helmet-mounted displays for advanced tactical engagements.
Improving Adaptive Response
In general, the adverse effects of stress are very difficult to combat. Repeated practice and training in appropriate conditions do provide some protection from the worst aspects of stress. However, they fall far short of producing a situation in which stress does not influence performance, let alone one in which it improves performance skills.
Accounts of battles in the eighteenth and early nineteenth centuries emphasize the confusion of the battlefield and the reliance on hearing rather than vision, which was totally obscured by the smoke of artillery. Confusion is an ally to the enemy. Confusion has often lost battles, and certainly it is a source of extreme stress to the infantry soldier. Therefore, any technology that serves to reduce confusion is at one and the same time helping mission performance and reducing the incidence of stress.
One factor that has been observed to affect the response to stress is the perceived degree of control over the situation. Besides reducing confusion, control dissipates stress. An individual who feels in control of a situation is less likely to experience stress in that situation. It is anticipated that new helmet-mounted display technologies will facilitate the distribution of control from a single operational headquarters to individual squads and even members of those squads. On the modern battlefield, decisions are directed to those with the most relevant information; on occasion, that will be the front-line soldier. This added sense of control will be a critical influence in reducing stress.
Information About Battlefield Context
There is a tendency in many technological systems today toward an ever greater degree of complexity. By complexity, we specifically mean the number of degrees of freedom in the system, which translates to the number of potential states of the system that can be communicated to the operator. With pull-down menus, window systems, and direct database accessing, it will be technically feasible in the near future to give the infantry soldier almost unlimited on-line access to all aspects of human knowledge.
Such a prospect is the paramount example of a plethora of data but no information. Information in this context is data that are directly applicable to the immediate conditions in which the soldier finds himself. On a battlefield of uncertainty, the aim of such support technology should be to reduce uncertainty, not increase it. Thus the challenge to design is to provide the appropriate information in its simplest form, just as it is needed.
One potential solution to this question lies in the framing of mission objects as perceptual-motor demands, not esoteric cognitive operations. In a recent article, Hancock (1996a) has indicated how such a metaphoric representation could be achieved. This approach would essentially superimpose a picture of the demands on the real-world view. This would minimize alphanumeric presentation and emphasize data fusion between the real and task-represented worlds. By simplifying the data presentation format, the probability that the system will be successfully deployed and operated in the stress of battle would be significantly increased.
These observations relate to the actions of the individual soldier in the operational environment. However, given the structure of the forces as currently configured, the additive contribution of information from each and every soldier to the next higher level of command promises to increase workload geometrically at each succeeding level. Consequently, if information is not filtered, made context-contingent, and fused in some fashion, higher levels of command will be blinded not by the paucity of data but by an overwhelming overload of it. As this program moves forward, it is therefore important to consider workload issues beyond the individual soldier and to address specifically how the evolving architecture of the Land Warrior technology can manipulate information at even higher levels of command to achieve mission goals.
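The geometric growth described above can be shown with a short piece of arithmetic. The sketch below counts the reports arriving at one node at each level of command, first when every node forwards everything it receives and then when each node forwards only a fixed number of fused reports. The command structure, the report rate, and the `reports_per_node` function are hypothetical and chosen only to show the shape of the growth.

```python
def reports_per_node(levels, fan_in, reports_per_soldier, fused_size=None):
    """Reports arriving at one node at each level above the individual soldier.

    With fused_size=None every node forwards everything it receives.
    Otherwise each node forwards at most fused_size fused reports upward.
    """
    loads = []
    outgoing = reports_per_soldier  # what each lowest-level sender passes up
    for _ in range(levels):
        incoming = fan_in * outgoing
        loads.append(incoming)
        outgoing = incoming if fused_size is None else min(incoming, fused_size)
    return loads


# Hypothetical structure: 3 levels, 4 subordinates per node,
# 10 reports per soldier per hour.
print(reports_per_node(3, 4, 10))                 # [40, 160, 640]
print(reports_per_node(3, 4, 10, fused_size=10))  # [40, 40, 40]
```

Without fusion the load multiplies by the fan-in at every level; with fusion to a fixed size it stays constant, at the cost of whatever detail the fusion discards.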
Training, Expertise, and Individual Differences
However intuitively logical the display may first appear, training will be needed to maximize the use of this new form of soldier support. In accordance with the observations of Schneider (1985), the training should emphasize the consistency of representation and the linkage between representation and real-world actions, for example, always designating the enemy by the color red. With such training, it is possible that low-level attributes of stimuli used in the display can be processed in a speedy and error-free manner. Furthermore, the extensive use of such consistency would render task performance less vulnerable to the stress of uncertain conditions (see Hancock, 1986a).
Soldiers' facility with the use of helmet-mounted displays and other advanced technologies can be expected to vary greatly among individuals. For example, some recruits from rural areas may not have much experience with comparable technologies, such as video games. Others will have spent many hours on technologies that, although designed as games, may have a strong transfer to the new operational equipment. Consequently, software configurations for initial training need to be designed with this baseline performance difference in mind.
Although individuals can be selected for specific operations, what remains to be clarified are the intrinsic abilities (e.g., superior spatial orientation) that would make some soldiers immediate experts at the task presented via a helmet-mounted display. There is a body of knowledge concerning soldier skills and initial screening batteries for performance capability; however, the proposed changes in soldier function are so great as to require further research evaluations of these capability screening issues. An initial software consideration is how to structure the displays so that soldiers find a challenge in their repeated performance and thus, like video game players, become highly proficient at their task. This requires some knowledge of task motivation and especially how to train for performance skills under extremes of stress. In essence, the problem of individual differences in response to cognitive workload, one of the most underresearched areas in all human factors, is critical to performance success. More knowledge is needed in this area to ensure the success of the new technologies.
IMPLICATIONS FOR DESIGN
We can never lose sight of the infantry soldier's primary goal of mission accomplishment. In designing a technical support system to be most conducive to effective performance, at no point should such a system hinder this primary objective. The infantry soldier must recognize the helmet-mounted display as a vital piece of his equipment that is critical to survival. Equipment that is cumbersome, unreliable, or ineffective can and will be discarded in the extremes of battle. Consequently, the thought at the forefront of design should not be the feasibility of implementation but everyday utility to the individual soldier. If the helmet-mounted display does not work well in the stress of battle, burdening the soldier with advanced technology systems is a clear disservice.
In our consideration of the Land Warrior System's helmet-mounted display, which will be used to display maps, symbology, sensor images, and other information, we have reviewed recent work in psychology directed toward reducing the cognitive workload associated with visual displays of information. Based on visual and cognitive limitations that can make information hard to see and interpret, Wickens (1992) offers a useful list of principles of display design, along with examples. We duplicate that list here and examine a few of these principles in more detail.
- Absolute judgment. Don't require observers to make judgments on a variable such as color or brightness using more than 5-7 levels.
- Top-down processing. Don't violate expectations based on past experience.
- Redundancy gain. Presenting the same message more than once, particularly in different formats (e.g., voice and visual display) increases comprehension.
- Similarity. Similarity can cause confusion, both in perception and in memory. A useful measure of similarity is the ratio of similar to dissimilar features.
Mental Model Principles
- Pictorial realism. A display should look like the variable it represents.
- Moving parts. Dynamic aspects of displays should move in accordance with the user's mental model of what is being represented. For example, an altimeter indicator should move up with increases in altitude.
- Ecological interface design. Displays should bear a close correspondence to the environment being represented. Adhering to the pictorial realism and moving part principles should help one achieve this correspondence.
- Minimize information access cost. Disengaging attention and the eye from one display location and moving to another require time. Information should be located in such a way that access time for frequently used information is minimized. This principle has obvious applications in design of menus for computer interfaces.
- Proximity compatibility. Sources of information that must be integrated should be close together in the display. In some cases they can be integrated into a single "object." Conversely, sources that don't require integration should be spaced to enhance focused attention on each source.
- Multiple resources. Dividing one's attention is easier when information uses separate resource pools, e.g., vision and audition.
- Predictive aiding. When possible, have the system aid in predicting future values. Limits in working memory mean that operators will often be busy processing current information and will fail to project what will happen in the future.
- Knowledge in the world. One way to reduce memory load is to place required information in the environment. A pilot's checklist is an example.
- Consistency. As much as possible, displays should be consistent with recently viewed displays or habits that the observer brings to the situation.
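The similarity measure mentioned under the similarity principle, the ratio of similar to dissimilar features, can be sketched for short alphanumeric codes. Treating each character position as one feature is an assumption made here for illustration, and the codes and the `similarity_ratio` function are hypothetical.

```python
def similarity_ratio(a, b):
    """Ratio of matching to non-matching positions in two equal-length codes."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    shared = sum(x == y for x, y in zip(a, b))
    distinct = len(a) - shared
    # Identical codes have no distinct features; report that as infinity.
    return float("inf") if distinct == 0 else shared / distinct


print(similarity_ratio("AJB648", "AJB658"))  # 5.0
print(similarity_ratio("48", "58"))          # 1.0
```

Both pairs differ by a single character, but the longer pair scores as far more confusable because the shared context adds similar features without adding any distinguishing ones. On this measure, dropping redundant shared characters from labels would reduce confusion.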
As Wickens points out, some of these principles conflict with each other. The proximity compatibility principle is a good example. Wickens and Carswell (1995) distinguish between display proximity and processing proximity. Display proximity refers to the distance between two sources of information in the user's perceptual space. Proximity here refers to more than simple physical distance. Various grouping principles such as common color, motion, and shape can unite display elements that are separated spatially. Proximity can be enhanced by combining sources into a single "object." For example, two variables could be represented as the height and width of a rectangle. Their product could then be directly perceived as the area of the rectangle.
Processing proximity refers to whether or not the operator needs to combine information from different sources. An example of high processing proximity would be a process-monitoring task requiring the operator to integrate two variables, such as rate and duration. In other cases, the operator might have to monitor two sources that are independent and therefore have low processing proximity. Compatibility between these two ways of defining proximity is clearly desirable. Close processing proximity can be enhanced by close display proximity and similarly for distant processing and display proximity. Wickens and Carswell describe several display techniques that are designed to manipulate display proximity and achieve the desired goals in terms of the operator's task.
The principle of display proximity is particularly relevant to the helmet-mounted display. Wickens and Carswell (1995) point out that head-up displays in aircraft usually lead to better performance than the older head-down format. One clear advantage is display proximity. The symbology presented on the head-up display (near domain) is closer to information in the outside world (far domain) than is the case with the head-down format, producing an advantage in terms of access costs. This advantage is even greater for near and far domain information that is "conformal." This follows from the definition of display proximity. Making information conformal helps integrate information across the two domains. In contrast, information in the two domains that is nonconformal will tend to be segregated, enhancing focused attention. This may be one reason that attention to head-up display symbology may sometimes result in missing rare events in the far domain.
Many of these same principles can be used to guide the creation of maps and graphs that are easy to comprehend. Detailed recommendations for these domains can be found in Kosslyn (1994). A recent book by Travis (1991) provides guidelines for the effective use of color in displays. Shah and Carpenter (1995) present several studies aimed at determining how subjects represent graphical information in working memory.
This review makes clear the insufficient state of knowledge about the stress and cognitive workload of individuals performing physical and cognitive tasks at the same time. Although there is extensive information about physical effort and physical workload, as well as some insight into cognitive workload, we do not yet know enough about their combined effects to make definitive pronouncements relevant to future infantry operations. One reason is that current models of stress and performance are insufficient to determine how combined physical and mental effort may affect performance. Clearly, a strong effort is needed in basic stress and performance research.
Another critical question in need of more research is how to provide the soldier with task-related information and how to suppress extraneous information so as to avoid information overload. Even with the best designed equipment, excessive information presented via a helmet-mounted display threatens to induce "cognitive capture," in which the individual becomes oblivious to the threats of the external environment. Information must be presented in a way that does not dominate the individual's attention. This clearly relates to research on situation awareness, as discussed in Chapter 3.
Finally, better understanding is needed of the individual differences in responses to cognitive workload and the relation of those responses to standardized training strategies. One strategy for dealing with such differences mentioned in this chapter is the use of customized adaptive interfaces that can be tuned to the particular user. However, we still need to know more of what drives individual perceptions of load, especially under operational conditions.
- Whenever possible, provide graphic representations of the problem, since graphics are processed more efficiently and accurately under stress.
- Whenever possible, simplify the graphic representation of the task domain, since the task itself is a source of stress in addition to the threat conditions. Simplification of decisions will reduce workload in high-load situations.
- Present salient information in the center of the visual display. Information on the periphery of the display will be lost in high-stress conditions.
- Unless absolutely necessary, reserve the presentation of complex alphanumeric information for pre-mission phases and post-mission debriefings.
- Whenever possible, use redundant auditory and visual warnings for threat location.
- Whenever possible, use visual presentations for detailed communication of information.
- Provide global help functions (e.g., location of nearest friendly force) at all times.
- Reduce menu and data entry requirements to a minimum, especially during engagements.
- Provide a restricted, understandable icon set to reduce stress in processing.