5 Appraising the Dimensionality of the 1996 NAEP Science Assessment Data
Pages 101-122

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 101...
... To document the science knowledge and skills of our nation's students, great care was taken in operationally defining the science domains to be measured on the assessment. For the 1996 grade 8 science assessment, which is the focus of this paper, three separate score scales were derived for three separate fields of science: earth science, life science, and physical science.
From page 102...
... The results for the 1996 grade 8 NAEP science assessment were reported on a composite score scale, which was a weighted composite of the three fields of science scales. Thus, there are four score scales of interest in evaluating the assessment: the composite score scale and the earth, physical, and life sciences scales.
From page 103...
... N
S3 6 1 5 6 Yes 2,961
S4 5 1 3 3 1 5 3 6 Yes 2,739
S5 7 4 3 7 Yes 2,711
S6 2 4 4 2 6 Yes 2,861
S7 12 7 2 3 2 10 No 2,401
S8 10 9 1 5 5 No 2,424
S9 13 10 3 3 10 No 2,401
S10 6 6 4 10 3 3 8 8 No 1,784
S11 7 2 7 8 6 2 8 8 No 1,797
S12 3 7 6 11 2 3 8 8 No 1,806
S13 6 4 5 7 4 4 8 7 No 1,947
S14 3 5 8 13 3 7 9 No 2,412
S15 4 5 6 8 3 4 6 9 No 1,836
S20 6 4 6 8 7 1 8 8 No 1,939
S21 3 6 7 10 5 1 7 9 No 1,797
Total: 62 65 62 108 43 36 73 116

Correlational Analyses

Raw Score Correlations

For each test booklet, correlations were computed among the raw scores for the field of science content areas. These raw scores were computed by summing the item scores for those items in a booklet that corresponded to the same content area.
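The raw-score step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the toy response matrix, and the field labels are hypothetical and do not come from the NAEP data or from the authors' software.

```python
import numpy as np

def content_area_raw_scores(item_scores, item_fields):
    """Sum item scores within each field of science, per examinee.

    item_scores: (examinees x items) array of item scores (hypothetical data).
    item_fields: one field label per item, e.g. "earth", "life", "physical".
    """
    item_scores = np.asarray(item_scores, dtype=float)
    item_fields = np.asarray(item_fields)
    fields = sorted(set(item_fields.tolist()))
    return {f: item_scores[:, item_fields == f].sum(axis=1) for f in fields}

def raw_score_correlations(raw_scores):
    """Pearson correlation for every pair of content-area raw scores."""
    fields = list(raw_scores)
    result = {}
    for i, a in enumerate(fields):
        for b in fields[i + 1:]:
            result[(a, b)] = float(np.corrcoef(raw_scores[a], raw_scores[b])[0, 1])
    return result

# Toy booklet: 4 examinees, 4 items (2 earth, 1 life, 1 physical). Values are invented.
scores = np.array([[1, 0, 2, 1],
                   [0, 0, 1, 0],
                   [1, 1, 3, 1],
                   [1, 1, 2, 0]])
fields = ["earth", "earth", "life", "physical"]

raw = content_area_raw_scores(scores, fields)
corrs = raw_score_correlations(raw)
```

In practice this would be run once per booklet, since each booklet contains a different set of blocks and therefore a different set of items per content area.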
From page 104...
... Nevertheless, inspecting these correlations with the expectations described above provided a different lens through which to view the idea of composite and separate science proficiency scales.

Principal Components Analysis

As a preliminary check on dimensionality, data from four test booklets were analyzed using principal components analysis (PCA)
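A minimal sketch of this kind of PCA check follows. It assumes the analysis is run on the inter-item correlation matrix, which the excerpt does not confirm, and it uses synthetic continuous scores as a stand-in for real item responses. The function name and simulated data are hypothetical.

```python
import numpy as np

def scree_eigenvalues(responses):
    """Eigenvalues of the inter-item correlation matrix, largest first.

    Plotting these against component number gives a scree plot; one dominant
    eigenvalue followed by a flat tail is the usual sign of a single strong
    component.
    """
    responses = np.asarray(responses, dtype=float)
    corr = np.corrcoef(responses, rowvar=False)   # items are columns
    return np.linalg.eigvalsh(corr)[::-1]         # eigvalsh returns ascending order

# Synthetic example: 500 simulated examinees, 10 items driven by one common factor.
rng = np.random.default_rng(0)
ability = rng.normal(size=500)
items = 0.7 * ability[:, None] + rng.normal(size=(500, 10))

eigvals = scree_eigenvalues(items)
share = eigvals / eigvals.sum()   # proportion of variance per component
```

Because the eigenvalues of a correlation matrix sum to the number of items, `share` can be read directly as the proportion of total variance attributed to each component.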
From page 105...
... SWAMINATHAN, K MEAN, AND F ROBIN 105

IRT Residual Analyses

The fit of IRT models to the data was evaluated directly by calibrating each block using a unidimensional IRT model.
From page 106...
... are defined above. The standardized residuals computed at the score level are analogous to those routinely computed for dichotomous IRT models by comparing observed and expected proportions correct.
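For the dichotomous case mentioned above, a common textbook form of the standardized residual compares the observed and model-expected proportions correct within a group of examinees of similar proficiency. The sketch below uses that textbook form; the excerpt does not show the chapter's own formula, so the function name, arguments, and example numbers are all assumptions.

```python
import math

def standardized_residual(observed_prop, expected_prop, n_examinees):
    """Standardized residual for one item in one proficiency group.

    observed_prop: observed proportion correct in the group.
    expected_prop: proportion correct predicted by the fitted IRT model.
    n_examinees:   number of examinees in the group.
    """
    if not 0.0 < expected_prop < 1.0:
        raise ValueError("expected_prop must lie strictly between 0 and 1")
    # Binomial standard error of the expected proportion.
    standard_error = math.sqrt(expected_prop * (1.0 - expected_prop) / n_examinees)
    return (observed_prop - expected_prop) / standard_error

# Hypothetical group: 100 examinees, model predicts .50 correct, .60 observed.
sr = standardized_residual(0.60, 0.50, 100)
```

With these invented numbers the residual is about 2, meaning the observed proportion sits roughly two standard errors above the model's prediction.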
From page 107...
... Item response theory is an example of a nonlinear factor analysis procedure and is the procedure of choice for evaluating the dimensionality of nonlinear data. The problem is that currently only unidimensional IRT models (for dichotomous and polytomous responses)
From page 108...
... If the blocks contained items from all three fields, a constrained three-factor model was fitted to the data. Because of constraints, the two- and three-factor models are analyzed using the confirmatory, rather than exploratory, factor analysis procedure; there is no distinction between confirmatory and exploratory procedures with the one-factor model.
From page 109...
... The sample sizes for each booklet were approximately the same, ranging from 274 to 284. Booklet 209 comprised 38 items from blocks S3, S11, and S12: 10 earth, 9 life, and 19 physical science items (16 multiple-choice and 22 constructed-response items)
From page 110...
... FIGURE 5-1 Scree plot from PCA for booklet 209 (horizontal axis: component number).
From page 111...
... This process resulted in 21 correlations among earth and physical science raw scores, 17 correlations among life and physical science raw scores, and 15 correlations among earth and life science raw scores. The 21 earth-physical correlations ranged from .61 to .79, and the median corre
From page 112...
... FIGURE 5-3 Scree plot from PCA for booklet 231 (horizontal axis: component number).
From page 113...
... There were no data available for computing correlations among an earth science block and a

2 The disattenuated correlations were computed by dividing the raw score correlation by the square root of the product of the reliability estimates for each content-area raw score. Because the alpha coefficient is known to be an underestimate of reliability (Novick, 1966)
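The correction described in the footnote can be written as a short function. The sketch below follows the footnote's definition (raw correlation divided by the square root of the product of the two reliability estimates). The function name and the example values are hypothetical, not figures from the chapter.

```python
import math

def disattenuated_correlation(r_xy, reliability_x, reliability_y):
    """Correct a raw score correlation for unreliability in both scores.

    r_xy:          observed correlation between two content-area raw scores.
    reliability_x: reliability estimate (e.g., coefficient alpha) for score x.
    reliability_y: reliability estimate for score y.
    """
    for rel in (reliability_x, reliability_y):
        if not 0.0 < rel <= 1.0:
            raise ValueError("reliability estimates must be in (0, 1]")
    return r_xy / math.sqrt(reliability_x * reliability_y)

# Invented illustration: raw correlation .60, reliabilities .80 and .75.
example = disattenuated_correlation(0.60, 0.80, 0.75)
```

Because lower reliability estimates inflate the corrected value, using an estimate that understates reliability pushes the disattenuated correlation upward, and values above 1.0 are possible.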
From page 114...
... The two life-physical disattenuated correlations were .72 and .85. These correlations are also relatively low, suggesting that the physical science items may also be measuring a somewhat unique domain of science proficiency.
From page 115...
... The magnitudes of the disattenuated correlations suggest that the life science items (blocks S8 and S9) may be more closely related to general science proficiency than the earth and physical science items.
From page 116...
... Blocks

TABLE 5-5 Summary of POLYFIT Results

Block  Content  Item Types    Expectation  Result        Problem Items
S3     P        6 CR          Good fit     Poor fit      1 2P item
S5     E        8 CR          Good fit     Poor fit      2 2P items
S6     L, P     6 CR          Poor fit     Poor fit      1 2P item
S8     L        5 MC, 5 CR    Good fit     Poor fit      5 MC, 1 2P item
S10    E, L, P  8 MC, 8 CR    Poor fit     Adequate fit
S11    E, L, P  8 MC, 8 CR    Poor fit     Adequate fit
S12    E, L, P  8 MC, 8 CR    Poor fit     Adequate fit
S13    E, L, P  8 MC, 8 CR    Poor fit     Adequate fit
S15    E, L, P  6 MC, 9 CR    Poor fit     Adequate fit
S20    E, L, P  8 MC, 8 CR    Poor fit     Adequate fit
S21    E, L, P  7 MC, 9 CR    Poor fit     Adequate fit

Notes: E = earth science, L = life science, P = physical science; CR = constructed-response item, MC = multiple-choice item; 2P = dichotomously scored constructed-response item.
From page 117...
... Given the high fit index values obtained with the one-factor model for all of the blocks, the acceptable fit values obtained with S3 and S14,

TABLE 5-6 Summary of Confirmatory Factor Analysis Results

Block  Content Areas  Item Types  GFI/AGFI
S3     P              CR          .95/.89
S4     E, L, P        CR, MC      .95/.92
S5     E              CR          .84/.72
S6     L, P           CR          .99/.99
S7     E              CR, MC      .99/.98
S8     L              CR, MC      .99/.99
S9     L              CR, MC      .99/.99
S10    E, L, P        CR, MC      .99/.98
S11    E, L, P        CR, MC      .98/.98
S12    E, L, P        CR, MC      .98/.98
S13    E, L, P        CR, MC      .99/.96
S14    E, L, P        CR, MC      .91/.89
S15    E, L, P        CR, MC      .99/.98
S20    E, L, P        CR, MC      .99/.98
S21    E, L, P        CR, MC      .99/.98

Notes: E = earth science, L = life science, P = physical science, CR = constructed-response, MC = multiple-choice.
From page 118...
... Although the results of this analysis cannot be generalized to the dimensionality of the complete dataset, which includes the polytomously scored items, it does evaluate whether the 91 dichotomous items can be considered unidimensional. The one-dimensional MDS solution did not display adequate fit to the data (STRESS = .20, R² = .88).
From page 119...
... However, there were only four blocks of items comprising items from a single field of science, and the residuals from IRT models fit to these blocks were larger than residuals from IRT models fit to mixed blocks (see Table 5-5). Therefore, it is difficult to conclude that these lower correlations are due to field of science content distinctions.
From page 120...
... The large disattenuated correlations observed among the field of science raw scores also argue against three separate scales. It is possible that there were too few items in each content area at the block or booklet level to uncover their uniqueness, but it is clear that these three fields of science are highly related.
From page 121...
... using "theoretical DETECT" and concluded that these mixed blocks were essentially unidimensional. In the current study, block 21 displayed adequate fit to a unidimensional IRT model and displayed adequate fit to the one-factor FA model.
From page 122...
... 1991 MULTILOG: Multiple Categorical Item Analysis and Test Scoring Using Item Response Theory, Version 6. Computer program.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.