Summary
The National Research Council (NRC) of the National Academies was asked by the National Aeronautics and Space Administration (NASA) to perform an independent assessment of NASA’s National Aviation Operations Monitoring Service (NAOMS) project, which was a survey administered to pilots from April 2001 through December 2004. To conduct this review, the NRC established the Committee on NASA’s National Aviation Operations Monitoring Service (NAOMS) Project: An Independent Assessment, consisting of experts from the fields of aviation safety, aviation operations (including several pilots), survey methodology, and statistics. The committee reviewed various aspects of the NAOMS project, including the survey methodology, and conducted a limited analysis of the publicly available survey data.
Sample surveys have been used routinely by federal agencies to collect and analyze data in order to inform policy decisions and assess national needs. They can also be used effectively to provide statistically valid information on rates and trends of events (such as bird strikes or rejected takeoffs) that are potentially related to aviation safety. In this context, surveys have several advantages over other sources of data: for example, they can provide reliable information about all segments of civilian aviation and characterize the safety of general aviation (GA) flights and of other segments of aviation for which data are limited. Further, government-sponsored surveys can produce data that are accessible to the public and can be analyzed regularly and independently. However, past experience in the government sector indicates that successful large-scale surveys typically require a substantial commitment of time and resources to develop, refine, and improve the survey methodology and to ensure that the survey provides useful and high-quality data.
Several aspects of the NAOMS survey design were consistent with generally accepted survey practices and principles, and the committee finds these aspects to be reasonable and appropriate. These include the choice of a cross-sectional design, the computer-assisted telephone interview (CATI) method, and the use of professionally trained interviewers. A CATI system has the potential to incorporate checks for unlikely or implausible values during the interview process. However, the committee found that substantial fractions of the reported non-zero counts of events and reported flight legs and hours flown had implausibly large values, suggesting that the NAOMS survey did not take full advantage of this feature of CATI. The NAOMS team also faced challenges in the choice of the sampling frame and had to make compromises at several stages. Unfortunately, the use of the publicly available Airmen Certification Database for the sampling frame and the criteria used to draw the sample of pilots in the air carrier (AC) survey led to biases in the sample, with an over-representation of wide-body aircraft and an under-representation of small aircraft. While the NAOMS team’s choices and compromises may have been made for good reasons, the team should have investigated their potential impact, as well as the magnitude of the biases resulting from failure to locate sampled pilots and from other forms of nonresponse. In particular, the collection and analysis of supplemental data during the early phase of the survey would have enabled a reliable assessment of the various biases and might have led, if necessary, to the development of alternative design strategies.
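To make the CATI feature noted above concrete, the following sketch illustrates the kind of in-interview plausibility check such a system can apply. All field names and numeric bounds here are hypothetical illustrations, not drawn from the actual NAOMS instrument:

```python
# Illustrative sketch of an in-interview plausibility check of the kind a
# CATI system can apply. The bounds below are invented for illustration;
# a real instrument would set them from subject-matter knowledge.
MAX_MONTHLY_HOURS = 400   # assumed upper bound on hours flown in a recall period
MAX_LEGS_PER_HOUR = 2     # assumed sanity bound relating flight legs to hours

def flag_response(hours_flown, flight_legs):
    """Return soft warnings an interviewer could use to re-ask or
    confirm an answer before it is recorded."""
    warnings = []
    if hours_flown < 0 or flight_legs < 0:
        warnings.append("negative value")
    if hours_flown > MAX_MONTHLY_HOURS:
        warnings.append("implausibly many hours for the recall period")
    if hours_flown > 0 and flight_legs / hours_flown > MAX_LEGS_PER_HOUR:
        warnings.append("more legs than the reported hours support")
    return warnings
```

Checks of this kind do not reject answers outright; they prompt the interviewer to confirm an outlying value while the respondent is still on the line, which is where much of the data-quality benefit of CATI lies.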
The committee identified deficiencies in the structure and wording of the questions used in the survey. Some of the questions asked pilots for information that they would not have had without a post-flight analysis. Other questions had complex structure or multiple parts or used vague phrases to describe the events that the survey was attempting to measure. In addition, both the AC and the GA questionnaires asked respondents to include events, flight hours, and flight legs in segments of aviation that went beyond even the broadest definition of AC operations and beyond the conventional definition of GA operations. As a result, highly disparate segments of the aviation industry were aggregated into the safety-related event rates that were calculated from the AC and GA surveys. Finally, the inability to link safety-related events to the aircraft type or to the type of operating environment in which the event occurred severely hinders any meaningful analysis of event rates or trends in event rates by aircraft type or by segment of aviation.
The committee did not have access to the original survey data, only to redacted data sets, which have several limitations that further constrain the ability to analyze the data to meet the committee’s objectives. For example, the time of survey response was grouped into years, so estimates of event rates could be computed only by year. This limits the ability to track changes in event rates over shorter timescales, to determine the effects of changes in the aviation system on event rates, and to assess seasonal and similar types of effects. In addition, grouping the exposure data (number of hours and flight legs flown) into categories increases the uncertainty in estimates of event rates broken down by key characteristics, such as pilot experience and aircraft type. Issues associated with preserving respondents’ anonymity and confidentiality and with the public release of data have been known in the survey community for some time, and these issues should have been anticipated and addressed at the design stage of the NAOMS project.
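A toy example (all numbers invented) illustrates why grouping exposure data into categories inflates uncertainty in an event-rate estimate: with exact hours the rate is a point value, whereas a binned release bounds it only within an interval.

```python
# Toy illustration (all numbers invented): an event rate is events per
# flight hour. With exact exposure the rate is a point estimate; with a
# binned release it can only be bounded.
events = 3
exact_hours = 520.0
exact_rate = events / exact_hours    # point estimate from exact exposure

bin_low, bin_high = 400.0, 600.0     # released bin for hours flown
rate_high = events / bin_low         # largest rate consistent with the bin
rate_low = events / bin_high         # smallest rate consistent with the bin
```

The width of the interval (rate_low, rate_high) is pure information loss from binning, and it compounds when rates are further broken down by characteristics such as pilot experience or aircraft type, where cell counts are small.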
The committee’s limited analysis of the redacted data revealed serious problems with data quality: substantial fractions of the reported non-zero counts of events had implausibly large values, and respondents often rounded their answers to convenient numbers. The extent and magnitude of these problems raise serious concerns about the accuracy and reliability of the data. In the committee’s view, some of these problems could have been reduced substantially if more effort had been spent on ensuring data accuracy during the interview and data-entry stages and if respondents had been asked to refer to their logbooks when possible. This would have been especially useful in providing reliable information on the number of hours flown and the number of flights (takeoffs/landings) and in helping to confine the answers to the recall period. The committee does note that many of the biases that are relevant for estimating event rates would be mitigated for trend analysis to the extent that the biases remain relatively constant over time. However, the degree of mitigation might vary substantially across event types.
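The rounding behavior described above ("heaping" on convenient numbers) can be diagnosed with a simple check of how many reports fall on round values. The sketch below uses invented responses; it is not an analysis the committee performed on the NAOMS data:

```python
# Sketch of a simple "heaping" diagnostic on invented data: the share of
# positive reports that are exact multiples of a round base.
def heaping_share(values, base=10):
    """Fraction of positive reports that are exact multiples of `base`."""
    positive = [v for v in values if v > 0]
    if not positive:
        return 0.0
    return sum(v % base == 0 for v in positive) / len(positive)

reports = [10, 50, 7, 100, 20, 13, 50, 30]   # invented survey responses
share = heaping_share(reports)
# If last digits were uniform, about 10% of reports would be multiples of
# 10; a much larger share suggests respondents rounded their answers.
```

A large excess over the roughly 10 percent expected by chance is the signature of rounding, which is one reason the committee suggests that asking respondents to consult their logbooks would have improved the accuracy of reported hours and flight legs.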
The committee did not find any evidence that the NAOMS team had developed or documented data analysis plans or conducted preliminary analyses as initial data became available in order to identify early problems and refine the survey methodology. These activities should be part of a well-designed survey, especially a research study to assess the feasibility of survey methodology in aviation safety.
Given the deficiencies identified, and despite some methodological strengths of the NAOMS project, the committee recommends that the publicly available NAOMS data should not be used for generating rates or trends in rates of safety-related events in the National Airspace System. The data could, however, be useful in developing a set of lessons learned from the project.