Currently Skimming:

8 Evaluation and Monitoring
Pages 146-160

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 146...
... Thus, in attending to the details of the legal requirements, a state may miss the broader question of whether and how well its policies and resources -- and specifically its assessment system -- are supporting progress in science achievement. NCLB makes clear that evaluation and monitoring are important.
From page 147...
... The same issues apply to the evaluation of assessment systems that produce a variety of information from multiple measures as apply to the use of multiple measures for assessing individuals, although the available methodologies have to be adapted for that purpose. As discussed earlier, available evidence suggests that the science standards in many states are vague and not sufficiently specific to represent a clear target for assessment development or for curriculum and instruction (Cross, Rebarber, and Torres, 2004).
From page 148...
... AERA, APA, and NCME Standards: The most recent edition of Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999) articulates professional standards regarding assessment validity and quality.
From page 149...
... Drawing on the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999) as well as their own knowledge and experience and ethical considerations, the developers of the CRESST standards stress that accountability systems should be evaluated on the basis of multiple forms of evidence.
From page 150...
... Assessment tasks and scoring rubrics that are designed to elicit the knowledge and cognitive processes that are consistent with the nature of learning provide a framework not only for the development of assessments but also for evaluation of the validity of interpretations based on the assessment's results. For science assessment to support learning, the ways in which learning develops must be considered as the assessment systems are evaluated and monitored.
From page 151...
... Such contrasts raise such questions as whether the state test results reflect real learning or just the effects of test preparation or teaching to the test, and whether the national tests were adequate measures of the Kentucky curriculum. In another example, California's strong accountability system in reading and mathematics resulted in impressive initial improvement in test scores, with the majority of elementary schools meeting their target goals.
From page 152...
... On the theoretical side, evaluating alignment entails establishing the equivalence of the cognitive demands of assessment tasks (often multiple-choice test items) and the cognitive demands of state standards (usually prose statements about student knowledge and skills).
From page 153...
... We note here that the creation of an assessment system may create additional challenges for alignment studies, although a systems approach could improve the overall alignment between standards and assessments. The designers of a science assessment system select the tests and tasks that constitute the system to align collectively with the breadth and depth of state science content standards, to address program monitoring and evaluation needs, and to provide evidence of stu
From page 154...
... By updating alignment studies whenever the standards or the tests change, states can monitor a contractor's efforts to ensure alignment. It is also important to note that improving alignment does not necessarily mean changing tests.
From page 155...
... Field testing of assessment tasks and tests provides the next step in evaluation, and a variety of types of evidence are needed to show the extent to which the assessments will provide reliable and accurate data and can support valid inferences for their intended purposes. Among the types of evidence that states should look for are the following (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999)
From page 156...
... Furthermore, because the committee advocates a system of assessments that supports student learning and development over time, alignment studies will need to address all of the assessments and sources of data that are intended to be part of the system, as well as addressing the alignment of assessments with learning expectations across grades. Methodologies will be needed to judge the alignment of a multilevel system.
From page 157...
... If the primary purpose of NCLB science assessments is to improve student achievement overall and to close the achievement gap between high- and low-achieving students, then studies should examine the extent to which the intended benefits are realized. The CRESST researchers suggest that among the intended benefits that should be investigated are the extent to which the system does the following:
· builds the capacity of staff to enable students to reach standards;
· builds teacher assessment capacity;
· influences the way resources are allocated to ensure that students will achieve standards;
· supports high-quality instruction aligned with standards; and
· supports equity in students' access to quality education.
From page 158...
... Assessment Use: An additional concern is the utility and use of assessment results. A primary purpose of state assessment systems is to provide evidence that will improve decision making and enable states, districts, and schools to better understand and improve science learning.
From page 159...
... ; evaluating instructional sensitivity; and identifying optimal ways to identify and address fluctuations in scores from year to year that are unrelated to student learning. QUESTIONS FOR STATES: States can use the following questions to consider whether their methods for evaluating and monitoring their assessment systems are sufficient, and to think about ways to move their assessment systems in the directions the committee has described.
From page 160...
... Do they include multiple indicators, such as technical quality, utility, and impact? Question 8-3: Does the state monitor and evaluate the interactions between its science assessment system and the assessment systems for other disciplines?


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.