Evaluation of Architectures
Keeping bombs off of aircraft is the primary measure of the performance of a TAAS (total architecture for aviation security). Improved security components recently deployed to minimize the probability that a bomb can be placed on an aircraft by a terrorist include CAPS (computer-assisted passenger screening), PPBM (positive passenger-bag matching), HULDs (hardened unit-loading devices), TEDDs (trace explosives-detection devices), noncertified bulk explosives-detection equipment, and EDSs (FAA-certified bulk explosives-detection systems). Blind operational testing using realistic simulated bombs will be necessary to evaluate the effectiveness of the overall aviation security system. At present the FAA has only limited blind-test data on EDSs and even less data on other components of the TAAS. Test results on all TAAS components will be necessary for a systems analysis that can estimate the results of full-scale field testing of the entire TAAS. A sensitive measure of effectiveness, such as the SEF (security enhancement factor), could be used to assess improvements to TAAS components and reduce the complexity of evaluating the whole system.
A good measure of the improvement from implementation of new security measures is a reduction in the number of simulated explosives brought onboard aircraft compared to the number brought onboard under a previous (or baseline) TAAS. Realistic operational testing can be used to assess how well TAAS components prevent simulated bombs from getting through security and onto aircraft. The current MBTS has sufficient dynamic range to test the bulk explosives-detection equipment components of the TAAS, but similar test articles are not available for testing TEDDs. Because TEDDs are used to resolve alarms by bulk explosives-detection equipment, evaluating TEDD performance in an airport setting is crucial. Because it may not be practical to contaminate the test sets with trace amounts of explosives, another test device must be developed.
Security Enhancement Factor
The purpose of the SEF is to develop a measure of security enhancement based on changes in the number of bombs that defeat the TAAS and are brought aboard an aircraft. The examples that follow are not representative of the actual performance of security equipment but suggest methods of analyzing performance. The SEF is defined in terms of the ratio of the number of bombs getting through a defined baseline TAAS to the number of bombs getting through a modified or upgraded TAAS. If there is no improvement in the performance of the TAAS, the SEF = 1. If half as many bombs get through the new TAAS, SEF = 2. If the upgraded TAAS prevents all bombs from getting through, the SEF would be infinite. For example, assume that the FAA tests a baseline TAAS with 400 simulated bombs and 100 of them defeat the TAAS and make it onboard an aircraft. If the 400 MBTSs are then put through the improved TAAS and only 50 get through, the SEF would be 2 (because half as many bombs defeated the improved TAAS). Thus, the SEF is a system-level measurement of the performance of the total TAAS.
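The definition above can be written as a short function. The following is a minimal sketch in Python; the function name and the handling of the edge cases are assumptions made for illustration, not part of the FAA's or the panel's method. It assumes both counts come from tests with the same number of simulated bombs, as in the 400-bomb example.

```python
import math


def security_enhancement_factor(baseline_through: int, upgraded_through: int) -> float:
    """Ratio of bombs that defeat the baseline TAAS to bombs that defeat the upgraded TAAS.

    Both counts are assumed to come from tests that used the same number
    of simulated bombs.
    """
    if baseline_through < 0 or upgraded_through < 0:
        raise ValueError("counts of bombs getting through must be non-negative")
    if upgraded_through == 0:
        # Assumption: if nothing got through either system, report "no change";
        # otherwise the text defines the SEF as infinite.
        return 1.0 if baseline_through == 0 else math.inf
    return baseline_through / upgraded_through


print(security_enhancement_factor(100, 100))  # no improvement -> 1.0
print(security_enhancement_factor(100, 50))   # half as many get through -> 2.0
print(security_enhancement_factor(100, 0))    # none get through -> inf
```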
As was discussed in Chapter 2, the baseline system in an SEF measurement will have to be redefined as new and improved equipment becomes available and as the threat evolves with time. When new equipment is deployed (perhaps to address a new threat), measuring its impact on the TAAS may involve more than measuring the SEF with this equipment "plugged in" to the existing baseline security architecture. The sequencing of TAAS components could affect the number or configuration of threats (explosives) that are screened by the new component, thus influencing the impact of the new component on the SEF. Several combinations of potential TAAS elements may have to be assessed.
Eliminating False Negatives
The critical factor for improving the SEF (and therefore aviation security) is reducing the number of false negatives (i.e., missed detections). Because real events (i.e., bombing attempts) are currently rare, the only way to ensure that the security system works is through realistic training and blind testing. If the TAAS is credible, deterrence may reduce the likelihood of a real attack. Thus, in principle, the more competent the TAAS is perceived to be in defeating bombing attempts through testing and training, the less likely a terrorist attack is to occur (see Figure 10-1).
When the Pfa is relatively high (e.g., higher than the airlines are willing to accept in daily operations), the large number of alarm resolutions will make it more difficult for operators to distinguish a real bomb from a false alarm. Operators currently must make decisions frequently and relatively quickly, and those decisions are biased toward clearing the alarm because test events are infrequent and actual bomb threats are rarer still. Thus, the probability of an operator missing a bomb during the alarm-resolution process is not insignificant. Improving operator performance will require substantial regular training under airport operational conditions.
Relationship between the Certification of Security Equipment and TAAS Performance
The FAA certifies only EDSs that meet its requirements for Pd, Pfa, and throughput rate, which are determined during certification testing with the candidate EDS in an automatic mode (i.e., without operators) at the FAA Technical Center. Under operational conditions at an airport, however, operator intervention has lowered both the Pfa and Pd of the FAA-certified InVision CTX-5000 SP (see Table 6-3) (FAA, 1997a, 1997b). The panel observed that operators resolve more than 30 times as many alarms for the CTX-5000 SP by using the display as they do by actually opening bags. Although the capability of the operator to identify bombs on the EDS display is not part of the current certification process, it is one of the most important aspects of reducing false alarms and false negatives. According to current certification standards, a future system with a lower spatial resolution (i.e., less capacity to resolve individual objects) could conceivably be certified in an automatic mode and yet be incapable of providing an image that could be used by an operator for resolving alarms. Therefore, the panel believes that the combined performance of the operator and equipment should also be qualified or certified in an airport environment.
Architectures for Aviation Security
There are two basic philosophies for aviation security architectures. The first, the detection-first (DF) philosophy, mandates the detection of explosives at specified levels at the expense of other factors. The second, the throughput-first (TPF) philosophy, emphasizes the efficient throughput of bags through the baggage-handling system and considers detection rates secondary. In DF security architectures, the EDS with the highest Pd is usually the first piece of equipment to screen checked baggage. In TPF architectures, the explosives-detection device with the highest throughput is placed first, sometimes at the expense of Pd. No analyses were presented to the panel of the conditions under which DF systems are more effective than TPF systems. Security systems in Europe are generally TPF systems; security systems in the United States and Israel are DF systems. The panel investigated deployed DF systems at airports in San Francisco, Los Angeles, and New York City. At JFK Terminal One, a TPF system has recently begun operations serving European air carriers. The threat vectors addressed in the two aviation security systems are shown in Figure 10-2.
The primary difference between a DF and a TPF system is the way bags are selected for more thorough scrutiny. Figure 10-3, which represents a generic checked-baggage system, shows the similarities and differences between DF and TPF systems. In the United States, the FAA has established a system for screening passengers (CAPS) that separates airline passengers into high-risk and low-risk groups. High-risk passengers (selectees) are subjected to more thorough screening. Most European airports use a TPF approach because of the difficulties of interviewing (for CAPS) the high volume of international passengers. Instead, all bags are screened, resulting in increased volume of baggage passing through security equipment (e.g., EDS) and placing a premium on throughput. Thus, as Figure 10-3 shows, in a TPF system the first level of security is the screening of all checked bags by a high-throughput explosives-detection device (mostly high-throughput, noncertified, x-ray explosives-detection devices) with a lower Pd than a certified EDS; the first step in the DF approach is CAPS. For both DF and TPF systems, the second step is the screening by a certified EDS (e.g., CTX-5000 SP) of bags not cleared in step one. The alarm-resolution procedures are similar for both approaches, and passenger screening for weapons and the inspection of hand-carried baggage are also similar.
Improvements in aviation security from the deployment of new security components (e.g., TEDDs or EDSs) are not directly related to their performance. All of the components of the TAAS must be considered together, either in a total-system blind test or by systems analysis to determine the SEF. The utilization rate of TAAS components, their reliability, the regulatory guidance provided by the FAA for their use, and human factors also figure into the SEF. A terrorist's perception of the effectiveness of the TAAS must also be considered because, ultimately, the terrorist's perception of the TAAS, rather than the actual performance, determines the level of deterrence. See Box 10-1 for a notional example of how the SEF can be used to estimate the performance of deployed equipment.
A HULD for wide-body jets (e.g., the Boeing 747) is currently being operationally tested by three airlines. One HULD is probably large enough to hold all of the selectee bags for a given flight (note that it is unlikely that more than one selectee bag would contain a bomb, or bomb test set, on a given flight). A bomb at the minimum threat weight would be contained in the HULD, which has a high probability of saving the aircraft; thus, the HULD would improve the SEF. In the notional example in Box 10-1, 46 out of 100 passenger bags that contain MBTSs would be missed by the baseline DF security system. Five of the 46 bags were missed by CAPS (i.e., five simulated terrorists were not identified as selectees) and, therefore, the use of HULDs to hold selectee bags would not address these five bags. However, all 41 selectee bags not detected or improperly cleared by explosives-detection equipment would be contained in a HULD (although probably not all in the same HULD). The five bags containing MBTS devices that were not selected would not be detected or put into the HULD. Thus, the SEF in Box 10-1 would be 100/5 = 20.
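The HULD arithmetic above can be checked with a few lines of Python. This is only a sketch of the notional example: the counts are the ones quoted in the text from Box 10-1, the variable names are invented here, and the code follows the text in dividing the 100 test bags by the number that still threaten the aircraft (which appears to treat a system that stops none of the test bombs as the reference).

```python
TEST_BAGS = 100            # bags containing an MBTS in the notional example
missed_by_caps = 5         # simulated terrorists not identified as selectees
missed_by_detection = 41   # selectee bags not detected or improperly cleared

# Baseline DF system: every missed bag threatens the aircraft.
baseline_missed = missed_by_caps + missed_by_detection

# With all selectee bags in a HULD, only the non-selectee bags remain a threat
# (assuming, as the text does, that the HULD contains the blast).
huld_missed = missed_by_caps

sef_baseline = TEST_BAGS / baseline_missed
sef_with_huld = TEST_BAGS / huld_missed

print(baseline_missed)                          # 46
print(round(sef_baseline, 2))                   # 2.17
print(sef_with_huld)                            # 20.0
print(round(sef_with_huld / sef_baseline, 1))   # 9.2
```

The last figure is the ratio of the two SEF values, that is, how much the HULD improves on the baseline DF system in this notional example.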
The proper combination of technologies in the right order can dramatically improve the SEF. EDSs can more easily detect larger bombs (i.e., the larger the bomb the higher the Pd), whereas the HULD becomes less effective as the size of the bomb increases. A combination of EDSs and HULDs can, therefore, reduce the overall risk of aircraft loss. With an EDS in place, a terrorist would be forced to use smaller bombs; without an EDS, the terrorist might use a larger bomb than the HULD could contain. Thus, EDSs and HULDs are mutually dependent for improving aviation security.
Substantial improvement of the TAAS might involve trade-offs between some parameters of EDS performance under operational conditions if the TAAS includes HULDs. The optimum security system might trade off Pd, Pfa, and throughput rate with the capability of the HULD to contain explosives of a particular size.
Most passengers in the western hemisphere fly on narrow-body aircraft (e.g., Boeing 717, 727, 737, 757). At present, no HULDs are available for use on narrow-body aircraft. Therefore, improving the SEF for flights on narrow-body aircraft would require a substantial improvement in the ability of EDS operators to detect explosives in alarmed bags. The FAA has deployed TEDDs with some EDSs to assist operators in resolving alarms. The TEDD can be used to sample electronic devices (e.g., laptop computers, radios) that are hard to resolve when the bag is opened and manually inspected. If it is assumed that 25 percent of the bags mistakenly cleared by an operator contain bombs concealed in electronic devices, a TEDD should reduce the number of bags mistakenly cleared. In the example in Box 10-1, 36 of the 100 bags that contain a test bomb (simulated explosive) are mistakenly cleared by the operator. Therefore, if the 25 percent of these 36 bags that contain electronic devices were opened and tested with a TEDD, nine bags would be sampled. In this example, the electronic device is assumed to be contaminated with a detectable level of explosive material, and the Pd of the TEDD is about 80 percent. Thus, about eight of the nine simulated explosives would be identified, and using a TEDD with each EDS would improve the SEF by about 20 percent (SEF = 100/(5 [missed by CAPS] + 5 [missed by EDS] + 36 [mistakenly cleared by EDS operator] – 8 [detected by TEDDs]) = 100/38 = 2.63). If the protocol were altered so that every bag alarmed by the EDS were screened with a TEDD, each of the 90 bags containing test bombs that would normally be alarmed by the EDS (in the example in Box 10-1) would be subsequently subjected to a TEDD. For this example, it is assumed that the operator still detects 60 percent of the MBTSs alarmed by the EDS (i.e., 54 would be detected by the operator, and 36 would be missed). The TEDD would detect about 80 percent of the 36 missed by the EDS operator (i.e., 28 would be detected and 8 would be missed). Thus, the SEF = 100/(5 [missed by CAPS] + 5 [missed by EDS] + 8 [missed by TEDD and EDS operator]) = 100/18 = 5.6.
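The two TEDD protocols can be compared side by side with a short calculation. The sketch below is illustrative only: the constant and variable names are invented, and the whole-bag counts are taken directly from the text rather than recomputed, because 80 percent of 9 is 7.2 and 80 percent of 36 is 28.8, so the text's figures of 8 detected and 8 missed already reflect rounding.

```python
TEST_BAGS = 100
MISSED_BY_CAPS = 5
MISSED_BY_EDS = 5
CLEARED_BY_OPERATOR = 36   # bombs alarmed by the EDS but cleared by the operator


def sef(missed: int) -> float:
    """SEF as computed in the text: test bags divided by bags that get through."""
    return TEST_BAGS / missed


# Protocol 1: the TEDD samples only electronic devices in opened bags.
caught_by_tedd = 8   # text's rounded figure for the 9 bags sampled
missed_1 = MISSED_BY_CAPS + MISSED_BY_EDS + CLEARED_BY_OPERATOR - caught_by_tedd

# Protocol 2: every bag alarmed by the EDS is also screened with a TEDD.
missed_by_tedd_and_operator = 8   # text's rounded figure for the 36 bags
missed_2 = MISSED_BY_CAPS + MISSED_BY_EDS + missed_by_tedd_and_operator

print(missed_1, round(sef(missed_1), 2))   # 38 2.63
print(missed_2, round(sef(missed_2), 2))   # 18 5.56
```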
In the notional example in Box 10-1, 46 out of 100 passenger bags that contain test bombs would be missed by the DF security system. Five of the 46 bags were missed by CAPS (i.e., five simulated terrorists were not identified as selectees), an additional five were missed by the EDS, and 36 were mistakenly cleared by the EDS operator. If the FAA required operators to achieve an 80 percent detection level for MBTSs alarmed by an EDS, only 28 MBTSs in the example in Box 10-1 would be missed (i.e., 18, or 20 percent of the 90 bags containing MBTSs alarmed by an EDS, + 5 missed by CAPS + 5 missed by EDS). The SEF would be 100/28 = 3.6.
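The effect of the operator detection level can be expressed as a function of that level alone, holding the other counts fixed. The following is a minimal sketch under the assumptions of the notional example; the function and constant names are hypothetical, and fractional bags are allowed for rates other than those used in the text.

```python
TEST_BAGS = 100
MISSED_BY_CAPS = 5
MISSED_BY_EDS = 5
ALARMED_BY_EDS = 90   # bags containing MBTSs that the EDS alarms on


def sef_for_operator_rate(operator_detection_rate: float) -> float:
    """SEF when operators detect the given fraction of alarmed MBTSs."""
    if not 0.0 <= operator_detection_rate <= 1.0:
        raise ValueError("operator_detection_rate must be between 0 and 1")
    cleared_by_operator = ALARMED_BY_EDS * (1.0 - operator_detection_rate)
    missed = MISSED_BY_CAPS + MISSED_BY_EDS + cleared_by_operator
    return TEST_BAGS / missed


print(round(sef_for_operator_rate(0.6), 2))   # 2.17 (baseline: 46 missed)
print(round(sef_for_operator_rate(0.8), 2))   # 3.57 (28 missed)
```

Even a perfect operator leaves the 10 bags missed by CAPS and the EDS, so in this model the SEF cannot exceed 10 through operator improvement alone.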
If potential terrorists cannot be identified by passenger screening,1 it may be necessary to screen all bags. TPF systems rapidly screen all bags. Typically, a noncertified explosives-detection device is used for the initial screening; the Pd is as high as possible while maintaining a high throughput rate. The Pd of these devices is lower than the Pd of certified EDSs, but their throughput rate is much higher. If a bag cannot be cleared, it is sent to a more sensitive system, such as the FAA-certified CTX-5000 SP, where an operator can take more time to resolve the alarm. An example of the impact of a TPF system on SEF is shown in Box 10-2. The effects on the TAAS of DF and TPF systems are shown in Table 10-1 (based on the same assumptions used in Boxes 10-1 and 10-2).
Comparison of Detection-First and Throughput-First Approaches
The analysis presented in this chapter is based on the assumption that the rate of initial detection using CAPS in a DF system will be higher than with a bulk explosives-detection device in a TPF system. Thus, the key parameter in comparing DF and TPF approaches is how well the first level (i.e., CAPS vs. a high-throughput explosives-detection device) detects a threat. As long as CAPS is more effective, the DF approach will have a higher SEF than the TPF approach (see Figure 10-1). Note that no quantitative evidence is available to indicate whether CAPS is more or less effective than bulk explosives-detection equipment.
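The dependence on the first level can be illustrated with a simple staged model. This is a sketch under stated assumptions, not the panel's analysis: it assumes DF and TPF systems share the same downstream stages (certified EDS and operator alarm resolution), derives the downstream rates from the counts quoted from Box 10-1, and uses a purely hypothetical first-stage rate for the TPF device, since the Box 10-2 figures are not reproduced here. All names are invented for illustration.

```python
# Downstream rates derived from the counts quoted from Box 10-1:
# 95 threat bags reach the EDS, 90 of them alarm, and operators
# correctly resolve 60 percent of those alarms.
EDS_ALARM_RATE = 90 / 95
OPERATOR_DETECTION_RATE = 0.60


def sef_for_first_stage(first_stage_detection_rate: float) -> float:
    """SEF relative to no screening when only the first stage varies."""
    if not 0.0 <= first_stage_detection_rate <= 1.0:
        raise ValueError("first_stage_detection_rate must be between 0 and 1")
    # A bomb is stopped only if every stage in sequence catches it.
    stopped_fraction = (
        first_stage_detection_rate * EDS_ALARM_RATE * OPERATOR_DETECTION_RATE
    )
    return 1.0 / (1.0 - stopped_fraction)


caps_rate = 0.95   # from the text: CAPS misses 5 of 100 simulated terrorists
tpf_rate = 0.85    # hypothetical value, chosen only for illustration

print(round(sef_for_first_stage(caps_rate), 2))   # 2.17
print(sef_for_first_stage(caps_rate) > sef_for_first_stage(tpf_rate))   # True
```

In this model the SEF rises monotonically with the first-stage detection rate, so whichever approach has the more effective first level has the higher SEF; the model says nothing about which approach that is in practice.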
CAPS is less costly than screening all bags and reduces the complexity of the TAAS, but the capability of screening passengers and not missing potential terrorists has not been tested. The U.S. air transport system has a very large domestic component and many ways to identify nonthreatening passengers (e.g., airline frequent-flyer programs). The impact on the SEF of CAPS in combination with various other security measures is shown in Figure 10-4.
Conclusions and Recommendations
Preliminary analyses suggest that the deployment and utilization of new security equipment could substantially improve aviation security as measured by the SEF. Security would be enhanced more where the threat is highest. The performance of the TAAS can only be assessed with systems analysis techniques that account for the contribution and sensitivities of each component. The certification and optimization of a subsystem (e.g., EDS) do not necessarily optimize the TAAS. The overall system must be tested frequently in airport operations by blind testing procedures to provide data for the system analyses and to train operators.
1 The airlines with the highest potential terrorist threat believe that interviewing is the best way to identify passengers with baggage that may contain a bomb.
Analysis of the Total Architecture for Aviation Security
Most data on deployed aviation security systems are incomplete. The TAAS performance used for the preliminary systems analysis described in this chapter was based on very limited quantitative data and some anecdotal information. However, the examples indicate the types of data necessary for a more thorough TAAS-level analysis of aviation security. The panel concluded that data should be collected over a range of conditions to evaluate the impact of component performance (e.g., EDSs, TEDDs) on TAAS performance. Data collection should be continuous, professionally staffed, and well funded.
The FAA should design and implement a mechanism for collecting data to ensure that sufficient data are available for substantive analyses of the total architecture for aviation security.
The FAA should collect comprehensive performance data on the total architecture for aviation security, based on operational blind testing, as a basis for measuring improvements in its performance under airport operational conditions.
The discussion of TPF and DF approaches in this chapter was based on the assumption that CAPS is a more effective way to identify threat passengers (i.e., passengers with explosives concealed in their baggage) than a high-throughput explosives-detection device is at finding explosives concealed in baggage. However, a thorough comparison of the TPF and DF approaches has not been completed. The panel concluded that an analysis that includes measures of the effectiveness of CAPS and high-throughput explosives-detection equipment should be done to determine the relative impact of the two approaches on overall aviation security.
The FAA should conduct a thorough analysis of the effectiveness of throughput-first and detection-first approaches. The security enhancement factor (or a similar measure) should be used as a basis for comparison.
To increase the deterrent effect of the overall security system, analysis, testing, and training should be visible to the public. The detection of simulated terrorist threats should be publicly acknowledged but with only enough detail to create uncertainty about the performance of the overall security system in the terrorist's mind.
Although the FAA has not defined how a HULD should be used to increase security, the simple analysis in this chapter suggests that using one HULD on each wide-body aircraft would increase the SEF almost tenfold. The panel concluded that the greatest benefit would be derived if the bags of all CAPS selectees were placed in a HULD, even if their bags were subsequently cleared by other security procedures.
When hardened unit-loading devices (HULDs) become more widely available and can be integrated into air carrier systems without unreasonably disrupting operations, the FAA should require that all selectee bags for wide-body aircraft be placed in HULDs. A similar requirement should be established for narrow-body aircraft when HULDs for them become available.
The operational performance data available to the panel for TEDDs were anecdotal and, therefore, not sufficient to establish the optimum use of TEDDs in the context of the TAAS. However, even anecdotal and superficial data suggest that TEDDs do improve the SEF.
The FAA should develop procedures and test capabilities to assess the operational performance of trace explosives-detection devices in an airport environment.
Security Enhancement Factor
The SEF described in this report is based on minimizing false negatives for the TAAS as a whole. The SEF (or a similar measure) can also be used as a tool to assess the effectiveness of various components of the TAAS.
The FAA should develop a systems analysis to evaluate the total architecture for aviation security in terms of a security enhancement factor (SEF) or a similar measure. As a first step, the FAA should use the data from blind testing. The SEF estimated by these results could be used to assess the benefits of newly deployed systems.
Decisions to deploy new equipment and implement new procedures should be based on a full analysis of the total architecture of aviation security, including the security enhancement factor or a similar measure.