6 Test Administration: Other Possible Innovations
Pages 57-66

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 57...
... . That is, the innovations discussed in this chapter potentially provide other ways to reduce administration costs, but a successful transition to local test administration would substantially reduce field costs, which would substantially reduce the possibility for these innovations to save money.
From page 58...
... . NCES estimates that two-subject administration would require an investment of $10 million to cover three studies: two in 2026 to examine a two-subject design (one with a traditional linear test design and one with an adaptive test design) and a bridge study in 2028. NCES estimates that two-subject administration would allow sample sizes to be reduced by one-third without changing NAEP's precision. The estimates include an expectation that multisubject testing will be coupled with adaptive testing, but the sample size reduction would largely be based on the extra time per student. NCES estimates that two-subject NAEP testing will save $17 million from 2028 through 2030.
From page 59...
... If the panel's estimate is correct that local administration has the potential to reduce administration costs by about 80 percent, then the remaining average annual administration costs in a decade should be roughly 20 percent of their current values: $7.2 million for the assessments and $1.0 million for the pilot administration. If 90-minute tests can reduce these administration costs by one-third, that might represent an annual average savings of $2.7 million by 2030, which is 1.6 percent of NAEP's overall budget. The potential savings in the next few years from using tests with 90 minutes for the cognitive items -- before local administration is implemented -- would be much larger because the overall administration costs are currently much larger.
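The arithmetic behind the $2.7 million figure can be checked with a short sketch. The dollar amounts and percentages are the ones quoted in the passage; the variable names are illustrative, and the "implied budget" is only a back-calculation from the quoted 1.6 percent share, not a figure stated in the text.

```python
# Figures quoted in the passage (millions of dollars per year).
remaining_assessment_cost = 7.2   # after the ~80 percent reduction from local administration
remaining_pilot_cost = 1.0
reduction_from_90_minute_tests = 1 / 3
quoted_budget_share = 0.016       # "1.6 percent of NAEP's overall budget"

remaining_total = remaining_assessment_cost + remaining_pilot_cost
annual_savings = remaining_total * reduction_from_90_minute_tests

# Back-calculated from the quoted share; an inference, not a source figure.
implied_budget = annual_savings / quoted_budget_share

print(f"Remaining administration cost: ${remaining_total:.1f} million")
print(f"Savings from 90-minute tests:  ${annual_savings:.1f} million")
print(f"Implied overall budget:        ${implied_budget:.0f} million")
```

The savings line rounds to the $2.7 million given in the passage, since one-third of $8.2 million is about $2.73 million.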
From page 60...
... RECONSIDERING THE SAMPLE SIZES NEEDED TO ACHIEVE NAEP'S PURPOSES
Test administration costs -- particularly in the current model -- are directly related to the size of NAEP's sample. If it is possible for NAEP to perform its mission with smaller sample sizes, there could be substantial cost reductions.
From page 61...
... Because this last question compares two mean differences, this is a "difference in differences." Then, the statement mentions differential progress "among jurisdictions and subpopulations." This parameter amounts to asking, for example, whether gender gaps are increasing more in one state than another. Because this compares a gap over time, in two different states, this is sometimes called a "triple difference." As an example, consider the 2019 NAEP estimate for the performance of English-language learners in Shelby County, Tennessee, which is one of the urban districts included in the Trial Urban District Assessment program.
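The single, double, and triple differences described above can be illustrated with a small sketch. All scores below are invented for illustration and are not NAEP results; the state names, years, and function names are hypothetical. The standard-error helper assumes the eight underlying means are statistically independent, which is a simplification because groups drawn from the same sample are not fully independent.

```python
import math

# Hypothetical mean scale scores, invented purely for illustration.
# Keys are (state, year, group).
means = {
    ("State A", 2017, "female"): 262.0,
    ("State A", 2017, "male"): 258.0,
    ("State A", 2019, "female"): 265.0,
    ("State A", 2019, "male"): 259.0,
    ("State B", 2017, "female"): 270.0,
    ("State B", 2017, "male"): 267.0,
    ("State B", 2019, "female"): 271.0,
    ("State B", 2019, "male"): 268.0,
}


def gap(state, year):
    """Single difference: female mean minus male mean."""
    return means[(state, year, "female")] - means[(state, year, "male")]


def gap_change(state):
    """Difference in differences: change in the gap from 2017 to 2019."""
    return gap(state, 2019) - gap(state, 2017)


def triple_difference(state_1, state_2):
    """Triple difference: how much more the gap grew in state_1 than in state_2."""
    return gap_change(state_1) - gap_change(state_2)


def se_of_contrast(standard_errors):
    """Standard error of a sum or difference of means, ASSUMING independence."""
    return math.sqrt(sum(se ** 2 for se in standard_errors))


print(gap_change("State A"), gap_change("State B"))
print(triple_difference("State A", "State B"))
# Eight means, each with a standard error of 1.0, give a much noisier contrast.
print(se_of_contrast([1.0] * 8))
```

Under the independence assumption, the triple difference combines eight means, so its standard error is the square root of eight times that of a single mean when all eight are equally precise. This is one way to see why such comparisons place heavier demands on sample size than simple averages do.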
From page 62...
... ADAPTIVE TESTING
Computerized adaptive testing has been effectively used in large-scale testing since the mid-1990s. A typical adaptive test sequentially administers test questions and uses students' responses to assign subsequent questions at appropriate levels of difficulty until scores reach prescribed levels of precision or decision accuracy. Computer-adaptive multistage testing is a variation in which the adaptation occurs for groups of items, rather than for individual items (Luecht, 2014).
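The item-by-item loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not NAEP's design: it assumes a one-parameter (Rasch) response model, a standard normal prior on ability, a grid-based posterior for scoring, and a stopping rule based on the posterior standard deviation. The item pool, the simulated student, and all function names are hypothetical.

```python
import math
import random

GRID = [i / 10 for i in range(-40, 41)]  # candidate ability values from -4 to 4


def p_correct(theta, b):
    """Rasch model: probability of a correct answer at ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def posterior(responses):
    """Posterior mean and sd of ability given (difficulty, correct) pairs, N(0, 1) prior."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized standard normal prior
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_adaptive_test(item_difficulties, answer, target_sd=0.4, max_items=30):
    """Give items one at a time until the ability estimate is precise enough."""
    remaining = list(item_difficulties)
    responses = []
    mean, sd = posterior(responses)
    while remaining and len(responses) < max_items and sd > target_sd:
        # Under the Rasch model, the most informative item is the one whose
        # difficulty is closest to the current ability estimate.
        b = min(remaining, key=lambda d: abs(d - mean))
        remaining.remove(b)
        responses.append((b, answer(b)))
        mean, sd = posterior(responses)
    return mean, sd, len(responses)


# Simulated student with a hypothetical true ability of -1.0.
rng = random.Random(0)
pool = [i / 10 for i in range(-30, 31)]
true_theta = -1.0
estimate, precision, n_used = run_adaptive_test(
    pool, lambda b: rng.random() < p_correct(true_theta, b)
)
print(f"estimate={estimate:.2f}, posterior sd={precision:.2f}, items used={n_used}")
```

A multistage variant would apply the same selection logic to pre-assembled blocks of items rather than to single items.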
From page 63...
... . For example, urban districts that are now estimated in the Trial Urban District Assessment program have smaller samples than states and tend to perform on average at lower levels on the scale; both differences result in higher standard errors. If NAEP starts to target its statistical precision more closely to a specific level necessary for policy decisions, with a goal to reduce sample sizes, adaptive testing could help ensure that statistical precision is adequate for lower-performing populations.
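A rough sketch can show how sample size and per-student measurement precision trade off against a target standard error. This is a deliberately simplified model and an assumption of this example, not the method used for NAEP: it treats the variance contributed by each student as the sum of population variance and measurement-error variance, and it folds clustering into a single design-effect multiplier. All numbers and function names are hypothetical.

```python
import math


def se_of_mean(n, sd_population, sd_measurement, design_effect=1.0):
    """Approximate standard error of a group mean under the simplified model."""
    variance_per_student = sd_population ** 2 + sd_measurement ** 2
    return math.sqrt(design_effect * variance_per_student / n)


def required_n(target_se, sd_population, sd_measurement, design_effect=1.0):
    """Smallest sample size that reaches target_se under the same model."""
    variance_per_student = sd_population ** 2 + sd_measurement ** 2
    return math.ceil(design_effect * variance_per_student / target_se ** 2)


# Hypothetical comparison: a state-sized sample versus a district-sized sample.
print(se_of_mean(n=3000, sd_population=35, sd_measurement=15, design_effect=2.0))
print(se_of_mean(n=1500, sd_population=35, sd_measurement=25, design_effect=2.0))

# Hypothetical effect of better-targeted items: less measurement error
# means fewer students are needed for the same target standard error.
print(required_n(target_se=1.5, sd_population=35, sd_measurement=25, design_effect=2.0))
print(required_n(target_se=1.5, sd_population=35, sd_measurement=15, design_effect=2.0))
```

In this model, the smaller, noisier sample has the larger standard error, and lowering measurement error reduces the sample needed to hit a fixed precision target. That is the mechanism by which better-targeted items could support smaller samples.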
From page 64...
... In addition to problems related to the frameworks, there are some costs related to a transition to adaptive testing. They include the investments needed to develop larger item pools for the low and high ends of the ability distribution that tend to be poorly covered in traditional tests, the costs of reassembling existing items into blocks at different levels of difficulty, and the cost of developing the technology for adaptive testing.
From page 65...
... to coordinate data collection and promote sharing of item pools and even assessment components. In principle, the integration of sampling and data collection activities across education surveys could result in substantial cost savings, as well as improve the quality of data collected and facilitate a range of special studies and linking activities. However, realizing these improvements would require a high level of coordination across separate programs. The easiest way to recognize efficiencies in NCES assessments would be to coordinate the data collections for the school-based assessments, while leaving the structure and content of the assessments intact.
From page 66...
... Even without reaching the level of coordinating the content of different assessments, the practical and political costs involved in achieving this level of coordination across separate assessment programs are likely to be overwhelming.
RECOMMENDATION 6-4: Efforts to coordinate NAEP test administration with the international assessment programs sponsored by the National Center for Education Statistics should not be used as a strategy to reduce costs.

