4 An External Evaluation of the 1996 Grade 8 NAEP Science Framework
Pages 74-100



From page 74...
... To accomplish the goals of this study, 10 carefully selected science teachers were recruited to review items from the 1996 grade 8 NAEP science assessment and provide judgments regarding the knowledge and skills measured by these items. [Footnote 1:] Some measurement specialists (e.g., Messick, 1989)
From page 75...
... Thus, for all 1996 NAEP science assessments, the results were to be reported along four separate scales: one for each of the three fields of science and a composite score scale summarizing science proficiency across the three fields. The second dimension of the science framework is a cognitive dimension described as "ways of knowing and doing science." There are also three components to this dimension: conceptual understanding, practical reasoning, and scientific investigation.
From page 76...
... METHOD Ten science teachers were recruited to scrutinize a carefully selected sample of items from the 1996 grade 8 science assessment and provide judgments regarding the content characteristics of the items. As described below, these teachers provided both ratings of the content similarities among the items and ratings linking each item to the content, cognitive, nature, and theme dimensions defined in the frameworks.
From page 77...
... Items Selected for Analysis As noted above, 189 items comprised the grade 8 science assessment. Sixty items were selected for the purposes of this study to represent the test specifications in terms of the content and cognitive dimensions as well as item format (multiple choice, short constructed response, extended constructed response)
From page 78...
... TABLE 4-3 Cross-Tabulation of Specifications for 45-Item Subset Used in Item Similarity Rating Study

                        Ways of Knowing and Doing Science
Field of Science    Conceptual Understanding    Practical Reasoning    Scientific Investigation    Total
Earth science       9                           3                      3                           15 (33%)
From page 79...
... Therefore, the content specifications for these items, and the content frameworks for the test, were not described to the SMEs. To facilitate understanding of the item similarity rating task, three "practice" item pairs were distributed to the judges.
From page 80...
... FIGURE 4-1 Sample item similarity rating sheet: a sample item pair (including item HE001703, about the earth and moon) rated on an eight-point scale from 1 (Very Similar) to 8 (Very Different).
From page 81...
... Space on the questionnaire was also provided for the SMEs to add any additional criteria they used that were not included on the list. Item-Objective Congruence Ratings The purpose of the item similarity rating task was to obtain the SMEs' "independent" appraisal of the knowledge and skills measured by the items (i.e., independent of knowledge of the content, cognitive, nature, and theme dimensions that governed item development)
From page 83...
... Data Analysis. The item similarity ratings were analyzed using multidimensional scaling (MDS)
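The chapter's analysis used an individual-differences MDS model (it is what yields the per-SME dimension weights discussed below). scikit-learn does not provide that model, so the following minimal sketch shows the simpler pooled variant of the idea, with stand-in data and illustrative variable names: average the SMEs' ratings into one dissimilarity matrix and fit a five-dimensional solution.

```python
# A minimal sketch (not the chapter's individual-differences analysis):
# pool the SMEs' 8-point similarity ratings into one dissimilarity
# matrix and fit a five-dimensional MDS solution.
import numpy as np
from sklearn.manifold import MDS

n_smes, n_items = 10, 45
rng = np.random.default_rng(0)

# ratings[s, i, j]: SME s's rating of item pair (i, j) on the study's
# 8-point scale (1 = very similar, 8 = very different). Stand-in random
# data; the real inputs are the SMEs' rating sheets.
ratings = rng.integers(1, 9, size=(n_smes, n_items, n_items)).astype(float)
ratings = (ratings + ratings.transpose(0, 2, 1)) / 2.0   # symmetrize
dissim = ratings.mean(axis=0)                            # pool across the 10 SMEs
np.fill_diagonal(dissim, 0.0)                            # self-dissimilarity is zero

# Five dimensions, matching the solution discussed in the chapter.
mds = MDS(n_components=5, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissim)   # 45 x 5 stimulus coordinates
print(coords.shape, round(mds.stress_, 2))
```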
From page 84...
... The two remaining misclassified items were life science items: one was classified as earth science by nine SMEs, and the other was classified as life science by only six SMEs. The percentages of correct classifications for the earth, life, and physical science fields were 86, 90, and 76 percent, respectively.
From page 85...
... Field of Science    Items (n)    Classified Correctly by All 10 SMEs (%)    Classified Correctly by at Least Seven SMEs (%)
Earth                   22           45                                         86
Life                    21           71                                         90
Physical                17           41                                         76
Average                              53                                         85

Note: "Leaves" represent the number of SMEs correctly classifying each item, with 0 indicating all 10 SMEs correctly classified the item. The percentages of correct classifications for the conceptual understanding, practical reasoning, and scientific investigation cognitive areas were 70, 53, and 50 percent, respectively.
From page 86...
... Cognitive Area              Items (n)    Classified Correctly by All 10 SMEs (%)    Classified Correctly by at Least Seven SMEs (%)
Conceptual understanding        27           30                                         70
Practical reasoning             17           0                                          53
Scientific investigation        16           0                                          50
Average                                      13                                         60

Notes: "Leaves" represent the number of SMEs correctly classifying each item, with 0 indicating all 10 SMEs correctly classified the item.
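As an illustration of how agreement percentages like those in the tables above can be computed (everything below is stand-in data with assumed names, not the study's records): count, for each item, how many of the 10 SMEs matched the framework's classification, then take the proportion of items clearing each threshold.

```python
# Illustrative computation of the "classified correctly by at least
# seven SMEs" summaries; data and variable names are stand-ins.
import numpy as np

n_smes, n_items = 10, 60
rng = np.random.default_rng(1)

# true_area[i]: the framework's cognitive classification of item i
# (0 = conceptual, 1 = practical, 2 = investigation); sme_area[s, i] is
# the area SME s assigned to item i.
true_area = rng.integers(0, 3, size=n_items)
agree = rng.random((n_smes, n_items)) < 0.7
sme_area = np.where(agree, true_area,
                    rng.integers(0, 3, size=(n_smes, n_items)))

correct = (sme_area == true_area).sum(axis=0)   # per item, 0..10 SMEs
print("classified correctly by all 10 SMEs:", 100 * (correct == 10).mean(), "%")
print("classified correctly by >= 7 SMEs:  ", 100 * (correct >= 7).mean(), "%")
```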
From page 87...
... Analysis of the exit survey data revealed that the SMEs were fairly confident in the validity of their item-objective congruence ratings. When asked how confident they were regarding how well their ratings reflected the way the items "should truly be classified," the median confidence rating on an eight-point scale (where 8 = very confident)
From page 88...
... Correct classifications are indicated in boldface. MDS Results. All SMEs completed the item similarity ratings within six hours.
From page 89...
... These results suggest that, in general, the similarity ratings can be considered reliable, although some specific item pairings for some SMEs are probably unreliable, which is not surprising given the large number of ratings completed. Moreover, because the replicated ratings were made toward the end of the rating task and the average discrepancies for these pairs were small, it does not appear that the SMEs' similarity ratings are undermined by low reliability.
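A minimal sketch of the replication check described above, assuming a set of item pairs was rated twice by each SME (the pair count and all data below are stand-ins): compute the average absolute discrepancy between first and second ratings.

```python
# Sketch of the replication check: for item pairs rated twice, compare
# each SME's original and repeated ratings (stand-in data throughout).
import numpy as np

rng = np.random.default_rng(2)
n_smes, n_pairs = 10, 12     # the number of replicated pairs is assumed

# original[s, p] / repeated[s, p]: SME s's first and second ratings of
# replicated pair p on the 8-point scale.
original = rng.integers(1, 9, size=(n_smes, n_pairs)).astype(float)
repeated = np.clip(original + rng.integers(-1, 2, size=(n_smes, n_pairs)), 1, 8)

discrepancy = np.abs(original - repeated)
print("mean |first - second| per SME:", discrepancy.mean(axis=1).round(2))
print("overall mean discrepancy:     ", discrepancy.mean().round(2))
```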
From page 90...
... The congruence among the SMEs was evaluated by inspecting the individual subject weights and the subject weirdness indexes. [Footnote 3] Although differences were observed in the weighting of the dimensions across SMEs, all SMEs appeared to be using all five dimensions in making their similarity ratings. Figure 4-3 presents separate two-dimensional subspaces from the five-dimensional SME weight space.
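The weirdness index is a diagnostic from the individual-differences scaling (ALSCAL) literature; the sketch below is not that formula but a loose proxy for the same idea, flagging SMEs whose normalized weight vectors point away from the group's average direction (all data are stand-ins).

```python
# Rough proxy for screening atypical SMEs in an individual-differences
# weight space (not the exact ALSCAL weirdness index): the angle between
# each SME's weight vector and the group's average weight direction.
import numpy as np

rng = np.random.default_rng(3)
weights = rng.random((10, 5))    # stand-in: 10 SMEs x 5 dimension weights

unit = weights / np.linalg.norm(weights, axis=1, keepdims=True)
mean_dir = unit.mean(axis=0)
mean_dir /= np.linalg.norm(mean_dir)

angles = np.degrees(np.arccos(np.clip(unit @ mean_dir, -1.0, 1.0)))
print("angle from average weight direction (degrees):", angles.round(1))
```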
From page 91...
... FIGURE 4-3 (panel): weights for the 10 SMEs plotted along Dimension 1 (Conceptual Understanding).
From page 92...
... ; dimension 2 is an item format dimension that separates the multiple-choice items from the constructed-response items; dimension 3 is a "practical/applied reasoning" cognitive dimension that separates the practical reasoning items from the scientific investigation items; dimension 4 is a content dimension that separates the life science items from the earth science items; and dimension 5 is a content dimension that separates the physical science items from the life science items. Thus, the first three dimensions are related to cognitive item attributes, and the fourth and fifth dimensions are related to content item attributes.
From page 93...
... Both of these items exhibited low item-objective congruence for scientific investigation. Similarly, all but two of the practical reasoning items had positive coordinates on dimension 3; the two exceptions also had low item-objective congruence ratings for practical reasoning.
From page 94...
... FIGURE 4-6 Two-dimensional MDS stimulus subspace: items plotted along dimensions 1 and 3 using cognitive classification symbols. C, conceptual understanding; P
From page 95...
... FIGURE 4-7 Two-dimensional MDS stimulus space illustrating cognitive groupings among grade 8 NAEP science items. C, conceptual understanding; P
From page 96...
... The content, cognitive, nature, and theme designations were "dummy" coded for this analysis. For example, an earth science dummy variable was created by coding all earth science items "1" and all other items "0." The cognitive, theme, and nature areas were also dummy coded, as was an item format variable (multiple-choice/constructed-response)
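A minimal sketch of this dummy-coding step, plus the follow-up analysis of correlating each coded variable with the items' MDS coordinates, using stand-in labels and coordinates (pandas' get_dummies stands in here for whatever coding routine the authors actually used).

```python
# Dummy coding as described above, then correlating each dummy variable
# with each MDS dimension's coordinates (all data are stand-ins).
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n_items = 45

# Stand-in item designations; the study's labels come from the framework.
items = pd.DataFrame({
    "field": rng.choice(["earth", "life", "physical"], size=n_items),
    "format": rng.choice(["multiple_choice", "constructed_response"], size=n_items),
})

# e.g., field_earth is 1 for earth science items and 0 for all others.
dummies = pd.get_dummies(items, columns=["field", "format"], dtype=float)

# Stand-in MDS stimulus coordinates (items x 5 dimensions).
coords = pd.DataFrame(rng.normal(size=(n_items, 5)),
                      columns=[f"dim{k}" for k in range(1, 6)])

# Correlate every dummy variable with every MDS dimension.
corr = pd.concat([dummies, coords], axis=1).corr()
print(corr.loc[dummies.columns, coords.columns].round(2))
```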
From page 97...
... The nature of science dummy variable also exhibited a large correlation with this dimension, but the nature of science item-objective congruence ratings did not. This finding probably stems from the fact that 5 of the 10 nature of science items were also scientific investigation items.
From page 98...
... The first three dimensions correspond to cognitive and item format attributes, and the fourth and fifth dimensions correspond to fields of science attributes. In summary, analysis of the item similarities data using MDS uncovered cognitive- and content-related dimensions that were congruent with those dimensions specified in the National Assessment Governing Board frameworks.
From page 99...
... The item-objective congruence ratings, and the dimensions observed in the SME-derived MDS solution, did not strongly support the themes of science or nature of science dimensions of the framework. However, like the ways of knowing and doing science dimension, separate scores are not reported for these dimensions, and including them in the frameworks probably enhanced item development and contributed to the overall quality of the item pool.
From page 100...
... National Assessment Governing Board (NAGB). 1996. Science Framework for the 1996 National Assessment of Educational Progress.

