1. Psychological Testing and the Challenge of the Criterion
Pages 15-30

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 15...
... It is common for major transition points in schooling and in working life to be marked by some sort of standardized test or assessment procedure. Beginning with the assessment of school readiness among preschoolers, the individual faces a long succession of tests designed to do such things as establish grade-level progress; diagnose learning difficulties; track pupils; allocate places in special programs or magnet schools; provide a measure of institutional accountability; ensure that those receiving a high school diploma have achieved minimum levels of competency; screen applicants for admission to college, training programs, the military, or entry-level jobs; and, in many lines of work, certify the achievement of mastery levels for advanced or specialized positions.
From page 16...
... Testing is the primary means of competitive selection in federal, state, and municipal merit systems, an outgrowth of the reformist beliefs of the Progressive Era that ability and not political cronyism should be the grounds for selection into the civil service. Many private-sector employers look to tests of general or specialized abilities as an important part of human resource management; widespread concern in the business community about the shortcomings of American education has in recent years increased the attraction of testing among employers who hope to maintain their competitive advantage through more effective personnel selection.
From page 17...
... It is essential, however, to appreciate that psychological testing in its American manifestation is a combination of high science and practical purpose, of experimentalism and the correlation coefficient on one hand and human resource management on the other.
The Science of Testing
The psychometric approach to human abilities owes as much to physics as to philosophy.
From page 18...
... Moreover, Pearson's product-moment correlation, which is the heart of test validation, is an average value expressing the amount of consistent deviation from the means that the same group of people shows on two measures. As Irvine and Berry conclude, psychological testing in America "takes physics and moments around a point not just as an analog, but as a model for an exact science of the mind" (1988:11).
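The description of the product-moment correlation as an average of consistent deviations from the means can be made concrete with a short sketch. The function name `pearson_r` and the sample scores are hypothetical illustrations, not taken from the chapter; the code simply averages the products of each person's standardized deviations on the two measures.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length score lists.

    Computed as the mean of the products of standardized deviations
    (z-scores) for the same people on two measures.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2)")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Each person's deviation from the group mean on each measure.
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]

    # Population standard deviations (divide by n, matching the final average).
    sd_x = sqrt(sum(d * d for d in dev_x) / n)
    sd_y = sqrt(sum(d * d for d in dev_y) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when a measure has no variance")

    # Average product of z-scores across the group.
    return sum((a / sd_x) * (b / sd_y) for a, b in zip(dev_x, dev_y)) / n


# Hypothetical data: test scores and a criterion measure for five people.
test_scores = [52, 61, 70, 48, 66]
criterion = [3.1, 3.8, 4.4, 2.9, 3.6]
print(round(pearson_r(test_scores, criterion), 3))
```

A value near +1 means people who score above the mean on the test also tend to fall above the mean on the criterion; a value near 0 means the deviations on the two measures are unrelated.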
From page 19...
... As early as 1908, Alfred Binet had urged the testing of conscripts to eliminate defectives; careful scientist that he was, he cautioned that preliminary trials would be necessary to see if the test eliminated those individuals actually found to be inefficient in the Army (Peterson, 1925:292). But Yerkes and his colleagues on the American Psychological Association Committee on Examination of Recruits had a more ambitious vision of the contribution of testing to the war effort.
From page 20...
... In 1921, the developers of the Army Alpha published a revised version called the National Intelligence Test. Within two years almost 1,400,000 copies were sold, and by 1923 a total of 40 different intelligence tests were available nationally (Freeman, 1926; Fass, 1980).
From page 21...
... But for all that aptitude or ability testing has been an American preoccupation since the 1920s, providing "a powerful organizing principle, a way of ordering perceptions, and a means for solving pressing institutional and social problems" (Fass, 1980), comparatively little systematic attention has been devoted over the years to understanding and measuring the kinds of human performance that tests are commonly used to predict.
From page 22...
... put it, "on criteria that are predictable rather than appropriate." The dangers inherent in using inadequate performance measures were brought into focus in Captain John Jenkins' report (1946) on the tests used to select and classify Navy and Army air crews during World War II.
From page 23...
... But success in training may well have no bearing on combat proficiency. Psychologists in the Navy aviation program, under the leadership of Captain John Jenkins, spent a good deal of time thinking about the problem of developing better performance criteria, criteria that would show whether the selection and classification tests were identifying the good combat pilots or just pilots with the characteristics needed to succeed in training (Jenkins, 1946).
From page 24...
... A total of 2,872 experienced combat pilots in the Pacific theater was involved in the final study, and the names of 4,325 nominees were obtained. Of these, 40 percent were nominated more than once.
From page 25...
... , and the 1970s saw a number of large-scale efforts to develop taxonomies of human performance (see Fleishman and Quaintance, 1984). Captain Jenkins remarked in 1946 that a review of the literature published between 1920 and 1940 would turn up hundreds of articles on the ...
3 The report drew the same conclusion about the Army Air Force combat criterion program, which investigated the relationship between test variables and a number of combat criteria in the categories of strike photo studies, administrative action studies, and ratings of combat effectiveness.
From page 26...
... Until 1980, the primary focus of research on performance ratings was the rating instrument, its measurement properties, and standardization of raters to reduce error. An enormous amount of professional energy was expended on the quantitative expression of rating error and the control of error variance through improvements in rating technology.
From page 27...
... Hence scale development is based on a careful job analysis and the identification by job experts of examples of effective and ineffective performance. An even more elaborate attempt to control rater error is seen in the mixed standard scale developed by Blanz and Ghiselli (1972).
From page 28...
... By understanding how evaluators process performance information and by what heuristic their judgments are stored in memory, the researchers with a cognitive bent hoped to come up with devices for improving those judgments. One of the early findings of the work on the cognitive processes of raters is that, whether the rating instrument is cast in terms of behaviors or personal traits, evaluators appear to draw on trait-based cognitive models of an employee's performance, and that these general impressions substantially affect the evaluator's memory and judgment of actual work behaviors (Landy and Farr, 1983; Murphy et al., 1982; Ilgen and Feldman, 1983; Murphy and Jako, 1989; Murphy and Cleveland, 1991).
From page 29...
... In part, it is because most people think that test scores have some inherent meaning. The nomenclature surrounding testing (ability tests, vocational aptitude tests, intelligence tests) has masked the degree to which meaning is derived, not from a deep understanding of what ability or intelligence is, but from the calculation of external relationships among variables of interest, e.g., the consistency with which the distribution of individuals' test scores around the mean is replicated on a criterion measure.
From page 30...
... If the policy maker reaches decisions about the health of the education enterprise on the basis of data from tests designed to evaluate basic skills mastery, no matter how suitable the tests for the latter purpose, those decisions are likely to be misguided. The search for new and better measures of job performance has not, of course, been limited to the performance evaluation strategies described above.

