
2 Improving Assessments - Questions and Possibilities
Pages 15-32



From page 15...
... They are intended to guide classroom instruction, the development of curricula and supporting materials, assessments, and professional development. Yet evaluations of many sets of standards have found them wanting, and they have rarely had the effects that were hoped and expected for them (National Research Council, 2008).
From page 16...
... Learning complex ideas takes time and often happens as students work on tasks that force them to synthesize new observations with what they already knew. Students draw on a foundation of existing understanding and experiences as they gradually assemble bodies of factual knowledge and organize it according to their growing conceptual understanding.
From page 17...
... Stevens also noted that standards need to address common misunderstandings and difficulties students have learning particular content so that instruction can explicitly target them; research has documented some of these. The approach of Stevens and her colleagues to designing rigorous science standards is based on previous work in the design of curricula and assessments, which they call construct-centered design (McTighe and Wiggins, 1998; Mislevy and Riconscente, 2005; Krajcik, McNeill, and Reiser, 2008; Shin, Stevens, and Krajcik, in press).
From page 18...
... BOX 2-1 Articulation of the Idea That Electrical Forces Govern Interactions Between Atoms and Molecules • Electrical forces depend on charge. There are two types of charge -- positive and negative.
From page 19...
... Stevens said that pilot tests and field trials provide essential information, and review is critical to success. Stevens and her colleagues were also asked to examine the draft versions of the Common Core standards for 12th grade English and mathematics that were developed by the Council of Chief State School Officers and the National Governors Association to assess how closely they conform to the construct-centered design approach.
From page 20...
... – Students cite the opposite charges of the two surfaces as producing an attractive force that holds the two objects together. SOURCE: Krajcik, Stevens, and Shin (2009, p.
From page 21...
... First, the inclusion of particular content or skills signifies to teachers, parents, and policy makers what should be taught. Second, the content or structure of an item conveys information about the sort of learning that is valued in the system the test represents.
From page 22...
... Thus, they can either focus on the tested material despite possible misgivings about what they are neglecting, or they can view preparing for the state test and teaching as two separate endeavors. More broadly, Wilson said, systems that are driven by large-scale assessments risk overlooking important aspects of the curricula that cannot be adequately assessed using multiple-choice tests (just as some content cannot be easily assessed using projects or portfolios).
From page 23...
... The construct map defines what is to be assessed, and Wilson described it as a visual metaphor for the ways that students' understanding develops and, correspondingly, how their responses to items might change. Table 2-2 is an example of a construct map for an aspect of statistics, the capacity to consider certain statistics (such as a mean or a variance)
From page 24...
... Box 2-2 shows a sample item that assesses one of the statistical concepts in the construct map in Table 2-2.
From page 25...
... The more specific guidance developed for a particular item is used as the actual scoring guide, Wilson explained, which is designed to ensure that all of the information elicited by the task is easy for teachers to interpret. Figure 2-2 is the scoring guide for the "Kayla" item, with sample student work to illustrate the levels of performance.
From page 26...
... FIGURE 2-2 Scoring guide for sample item. SOURCE: Wilson (2009, slide #27).
From page 27...
... INNOVATIONS AND TECHNICAL CHALLENGES Stephen Lazer reflected on the technical and economic challenges of pursuing innovative assessments on a large scale from the point of view of test developers. He began with a summary of current goals for improving assessments:
• increase use of performance tasks to measure a growing array of skills and obtain a more nuanced picture of students;
• rely much less on multiple-choice formats because of limits on what they can measure and their perceived impact on instruction;
• use technology to measure content and skills not easily measured using paper-and-pencil formats and to tailor assessments to individuals; and
• incorporate assessment tasks that are authentic -- that is, that ask students to do tasks that might be done outside of testing and are worthwhile learning activities in themselves.
From page 28...
... Many of these issues need further research. Test Development Test developers know how to produce multiple-choice items with fairly consistent performance characteristics on a large scale, and there is a knowledge base to support some kinds of constructed-response items.
From page 29...
... The Role of a Theoretical Model The need goes deeper than operational skills and procedures, however, Lazer said. Multiple-choice assessments allow test developers to collect data that support inferences about specific correlations -- for example, between exposure to a particular curriculum and the capacity to answer a certain percentage of a fairly large number of items correctly -- without requiring the support of a strong theoretical model.
From page 30...
... Conflicting Goals For Lazer, the key concern with innovative assessment is the need to balance possibly conflicting goals. He repeated what others have noted -- that the list of goals for new approaches is long: assessments should be shorter and cheaper and provide results quickly; they should include performance assessment; they should be adaptive; and they should support teacher and principal evaluation.
From page 31...
... However, the BEAR example seems to marry the expertise of content learning, assessment design, and measurement in a way that offers the potential to be implemented in a relatively efficient way. The discussion of technical challenges illustrated the many good reasons that the current testing enterprise seems to be stuck in what test developers already know how to do well.

