5 Setting Reasonable and Useful Performance Standards
Pages 162-184


From page 162...
... Since that time, results from most of the main NAEP assessments have been reported not only in descriptive terms (summary scores that reflect what students know and can do in NAEP's subject areas) but in evaluative terms (percentages of students that reach specific levels of performance defined by what students should know and be able to do). In keeping with its historic commitment to reporting results on metrics understandable to policy makers and the public, NAGB has used these performance standards, more commonly known as NAEP ...
From page 163...
... NAEP PERFORMANCE STANDARDS AND THE ACHIEVEMENT-LEVEL-SETTING PROCESS
Goals of Standards-Based Reporting
As described earlier, in the 1970s and early 1980s, NAEP reports were built around the assessment materials themselves; by displaying individual assessment items and associated student performance data, initial reports allowed NAEP users to review the types of tasks students could and could not do. Since the implementation of the first redesign of NAEP in 1984, data on item responses have been summarized across items to provide a picture of overall performance for the nation and for key demographic subgroups.
From page 164...
... Many, including NAGB and Congress, contend that standards-based reporting metrics hold more meaning for policy makers and other NAEP users than do reports on the current, arbitrary 300-, 400-, or 500-point reporting scales. Proponents believe that standards-based reporting facilitates communication and understanding of achievement results, stimulates public discourse, and serves to generate support for education.
From page 165...
... The Achievement-Level-Setting Process
During the development of frameworks for each of the main NAEP subject areas, NAGB's policy definitions of achievement levels are applied, resulting in more detailed subject-specific descriptions of performance expectations for each of the three achievement levels; these are known as the "preliminary achievement-level descriptions." As an integral part of the framework, these descriptions are intended to provide guidance for the development of items and tasks for the assessment. After the administration of the assessment, these performance standards are applied to the assessment results in a process known as achievement-level setting, the outcome of which is the reporting of NAEP results in terms of percentages of students performing at basic, proficient, and advanced achievement levels.
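The reporting step described above, turning scale scores and cutscores into percentages of students reaching each level, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cutscore values, the sample scores, and the function name `percent_at_or_above` are hypothetical, and the sketch ignores the sampling weights and estimation procedures an operational assessment would use.

```python
from bisect import bisect_left

# Hypothetical cutscores on a hypothetical reporting scale (not NAEP's actual values).
CUTSCORES = {"basic": 140, "proficient": 170, "advanced": 200}


def percent_at_or_above(scores, cutscores):
    """Return the percentage of scores at or above each level's cutscore."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    n = len(ordered)
    result = {}
    for level, cut in cutscores.items():
        # bisect_left gives the number of scores strictly below the cutscore.
        below = bisect_left(ordered, cut)
        result[level] = 100.0 * (n - below) / n
    return result


# Hypothetical unweighted sample of ten scale scores.
sample_scores = [120, 135, 140, 155, 170, 171, 185, 200, 210, 130]
percentages = percent_at_or_above(sample_scores, CUTSCORES)
print(percentages)
```

Because each percentage depends directly on where the cutscore falls, moving a cutscore by even a few scale points changes the reported share of students at or above that level.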
From page 166...
... On past NAEP assessments, notable differences were observed in the cutscores set for each achievement level depending on item difficulty, number of score levels specified in the item scoring rubrics, and response format (e.g., multiple choice versus constructed response). Method variance of this kind is problematic because it renders cutscore locations dependent on the mix of item types in the assessment, in addition to rendering questionable the verbal description of the meaning of achievement at each level (National Academy of Education, 1993a, 1996; Linn, 1998).
From page 167...
... Neither the descriptions of expected student competencies nor the exemplar items are appropriate for describing actual student performance at the achievement levels defined by the cutscores.
From page 168...
... 1996 SCIENCE ACHIEVEMENT-LEVEL SETTING
The NAEP achievement-level-setting process has evolved over time, in part in response to the evaluations summarized above, although variants of the modified Angoff procedure have remained in place. In accordance with the committee's charge from Congress, we reviewed the processes used to develop achievement-level descriptions and set achievement levels for the 1996 main NAEP science assessment, the most recent achievement-level setting to be completed.
From page 169...
... and reiterated the findings of previous evaluation panels that cutscores derived through the current process lead to unreasonable results. Thus, in the case of science, NAGB's own examination of the external comparative data led it to the same conclusion that multiple evaluation panels had reached: that the results of the achievement-level-setting process were not believable.
From page 170...
... Given this science panel's conclusions, NAGB's executive committee decided later in April to defer issuing the interim science achievement levels, which had been scheduled to be issued with the NAEP 1996 Science Report Card in early May (O'Sullivan et al., 1997). In June 1997, NAGB impaneled another group of science educators to examine the full range of items that mapped to each new achievement level, determine the knowledge and skills assessed by these items, and author descriptions based on their observations of the items and student response data.
From page 171...
... In a study of press reports from the 1994 main NAEP reading assessment and the 1996 main NAEP mathematics assessment, Barron and Koretz (in press) found that achievement levels were the most popular reporting metric, with the most commonly reported statistic being the percentage of students reaching the proficient level.
From page 172...
... reported NAEP state-level mathematics and science results entirely in terms of the percentage of students at or above the proficient level of achievement, even though the initial NAEP science Report Card had presented results only on the numeric proficiency scale (O'Sullivan et al., 1997) and even though the report of science achievement results provided achievement-level descriptions of student performance based on what students "can do" rather than what students "should be able to do." State assessment programs also increasingly are taking NAEP's lead in reporting by performance standards.
From page 173...
... In summary, despite the continuing serious failings of the current standard-setting process, NAEP should continue its commitment to finding valid and useful ways of reporting standards-based achievement results. The ability to evaluate whether achievement results meet well-defined expectations as embodied in achievement standards is likely to enhance the usefulness of NAEP results for policy makers, and, in addition, the detailed descriptions of student performance that accompany the achievement levels are potentially useful to educators.
From page 174...
... We are concerned that the process by which the science achievement levels were set is not readily replicable, primarily because the criterion used to judge reasonableness and the rules or process used to make adjustments when initial results failed the reasonableness criterion are not well documented. The report mentions TIMSS, advanced placement information, and NAEP results from other disciplines as points of comparison in judging the reasonableness of the proposed science achievement levels.
From page 175...
... NAGB's own rejection of the results of the science achievement-level setting and the imposition of its own judgment to set final levels demonstrate the critical need for an alternative paradigm and methods. We recommend that the current model for setting achievement levels be abandoned.
From page 176...
... Role of the Preliminary Achievement-Level Descriptions
The function of preliminary achievement-level descriptions in assessment development for the main NAEP assessments has not been well specified or well documented. The current science assessment was the first NAEP subject-area assessment for which preliminary achievement-level descriptions were developed along with the frameworks and, even so, they were somewhat of an afterthought.
From page 177...
... of the preliminary and final achievement-level descriptions for eighth-grade science at the proficient level.1 The preliminary description on the left is quite general and ...
Footnote 1: As noted previously in this chapter, the final science achievement-level descriptions were unusual in that they were developed inductively from item-level data using behavioral anchoring methods after NAGB had reset the achievement levels. NAGB warns against comparing these descriptions to the preliminary descriptions or descriptions for achievement levels in other subject areas because of this difference in how they were developed.
From page 178...
... GRADING THE NATION'S REPORT CARD
TABLE 5-1 Analysis of Preliminary and Final Achievement-Level Descriptions for the Grade 8 Proficient Level
Columns: Preliminary, Final
Experiments and Data
1. Collect basic information and apply it to the physical, living, and social environments
4. Design experiments to answer simple questions involving two variables
5.
From page 179...
... Panelists include teachers and curriculum specialists in the subject area for which achievement levels are being set, as well as members of the public, many of whom apply knowledge of the subject area in their work. The composition of the achievement-level-setting panels has been specified in detail and the process for identifying participants has been carefully planned, carried out, and documented.
From page 180...
... All of these groups, through NAGB, should have a role in establishing and reviewing the process and the resulting achievement levels.
Use of Normative and External Comparative Data
Many experts argue that the data-based and policy consequences of the results of standard setting should be known to the achievement-level-setting raters early in their deliberations; thus normative student performance data and external comparative data should be considered by raters in setting NAEP achievement levels, primarily for use in evaluating the reasonableness of level-setting decisions.
From page 181...
... Data obtained through these multiple methods would undoubtedly provide a rich source of information to aid in setting achievement levels but would also add to the complexity of the process; however, in the short term, it is judicious to focus on achievement-level setting using data from the
From page 182...
... It seems more important to "get it right" in these subject areas, using data from one assessment methodology, than to devote resources to setting achievement levels based on multiple methods or in all of NAEP's subject areas. Eventually, however, we envision an achievement-level-setting process in which all available information that describes student achievement in a subject area, gleaned from across multiple assessment methods, would be used to inform the setting of NAEP's achievement levels.
From page 183...
... Items and tasks should be written and rubrics defined to address the intended achievement levels. Preliminary achievement levels for advanced performance in the content domains need to be clarified.
From page 184...
... Recommendation 5H. In order to accomplish the committee's recommendations, NAEP's research and development agenda should emphasize the following:
· documentation and analysis of the impacts of standards-based reporting in NAEP on understanding and use of the results,
· development and implementation of alternate achievement-level-setting models,
· investigation and implementation of the use of normative and comparative data in determining achievement levels and evaluating their reasonableness, and
· analysis of similarities and differences between results of NAEP achievement-level-setting efforts and those associated with state and other testing programs.

