Pages 177-220


From page 177...
... This chapter does not aim to describe the entire assessment design process. A number of existing documents, most notably Standards for Educational and Psychological Testing (American Educational Research Association [AERA]
From page 178...
... The model suggests the most important aspects of student achievement about which one would want to draw inferences and provides clues about the types of assessment tasks that will elicit evidence to support those inferences. For example, if the purpose of an assessment is to provide teachers with a tool for determining the most appropriate next steps for arithmetic instruction, the assessment designer should turn to the research on children's development of number sense (see also Chapter 3).
From page 179...
... These researchers have implemented their Rightstart program in different communities in Canada and the United States and have consistently found that children in the experimental program perform significantly better on a variety of measures of number sense than those in control groups (Griffin and Case, 1997; Griffin, Case, and Sandieson, 1992; Griffin, Case, and Siegler, 1994). Later in this chapter we present an assessment they have developed to assess student understanding relative to this theory.
From page 180...
... Differing theoretical descriptions of learning should not be viewed as competitive. Rather, aspects of existing theoretical descriptions can often be combined to create a more complete picture of student performance to better achieve the purposes of an assessment.
From page 181...
... 3. Double mental counting-line structure.
From page 182...
... Most useful for assessment design are descriptions of how characteristics of expertise are manifested in particular school subject domains. Studies of expert performance describe what the results of highly successful learning look like, suggesting targets for instruction and assessment.
From page 183...
... 1300 - 522 = 878: when borrowing from a column whose top digit is 0, the student writes 9 but does not continue borrowing from the column to the left of the 0. 140 - 21 = 121: whenever the top digit in a column is 0, the student writes the bottom digit in the answer; i.e., 0 - N = N.
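The 0 - N = N bug is mechanical enough to simulate. Below is a minimal sketch (mine, not from the book) that applies just that one bug during column-wise subtraction; every other column follows the ordinary borrowing procedure:

```python
def buggy_zero_subtract(top, bottom):
    """Column-wise subtraction with the 0 - N = N bug: whenever the top
    digit of a column is 0, the student writes the bottom digit instead
    of borrowing. All other columns are handled correctly."""
    t = [int(d) for d in str(top)][::-1]      # least-significant digit first
    b = [int(d) for d in str(bottom)][::-1]
    b += [0] * (len(t) - len(b))              # pad the shorter number
    out, borrow = [], 0
    for td, bd in zip(t, b):
        td -= borrow
        if td == 0 and bd > 0:                # the bug: 0 - N = N, no borrow
            out.append(bd)
            borrow = 0
        elif td < bd:                         # ordinary borrowing
            out.append(td + 10 - bd)
            borrow = 1
        else:
            out.append(td - bd)
            borrow = 0
    return int("".join(map(str, reversed(out))))

print(buggy_zero_subtract(140, 21))  # 121, although 140 - 21 = 119
print(buggy_zero_subtract(86, 42))   # 44: no zero digits, so the answer is correct
```

Because the bug only fires on columns containing a 0, such a simulated "student" answers some problems correctly and others not, which is exactly what makes the error pattern diagnostic.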
From page 184...
... . More typical classroom assessments, such as quizzes administered by teachers to a class several times each week or month, provide individual students with feedback about their learning and areas for improvement.
From page 185...
... Underlying Models of Cognition and Learning: Examples. PAT Algebra Tutor. John Anderson's ACT-R research group has developed intelligent tutoring systems for algebra and geometry that are being used successfully in a number of classrooms (Koedinger, Anderson, Hadley, and Mark, 1997). The cognitive models of learning at the core of their systems are based on the group's more general theory of human cognition, ACT-R, which has many
From page 186...
... Assessment. The Facets program provides an example of how student performance can be described at a medium level of detail that emphasizes the progression or development toward competence and is highly useful for classroom assessment (Hunt and Minstrell, 1994; Minstrell, 2000). Developed through collaboration between Jim Minstrell (an experienced high school science teacher)
From page 187...
... Starting with a model of learning expressed in terms of facets, Minstrell and Hunt have carefully crafted assessment tasks and scoring procedures to provide evidence of which facets a student is likely to be using (illustrated later in this chapter)
From page 188...
... The facets of student thinking are ordered along a continuum, from correct to problematic understandings. 310 Pushes from above and below by a surrounding fluid medium lend a slight support (net upward push due to differences in depth pressure gradient)
From page 189...
... . For example, to assess student learning as part of a seventh-grade unit called the Antarctica Project, students work in groups to design a research station for scientists.
From page 190...
... Progress maps provide a description of skills, understandings, and knowledge in the sequence in which they typically develop, providing a picture of what it means to improve over time in an area of learning. Australia's Developmental Assessment is used as an example throughout this report, not because the progress maps are particularly reflective of recent advances in cognitive research, but because the Developmental Assessment approach represents a notable attempt to measure growth in competence and to convey the nature of student achievement in ways that can benefit teaching and learning.
From page 191...
... An evaluation using tasks designed to tap specific performances on the map can provide a "snapshot" showing where a student is located on the map, and a series of such evaluations is useful for assessing a student's progress over the course of several years. · 5 Uses unitary ratios of the form 1 part to X parts (the ratio of cordial to water was 1 to 4)
From page 192...
... These processes involve both reflection and empirical observation, and require several iterations of the steps described below. In addition to starting with a model of learning for the subject domain, assessment design should be led by the interpretation element of the assessment triangle, which guides how information from the assessment tasks will be filtered and combined to produce results (that is, how observations will be transformed into measurements)
From page 193...
... , it frequently is not achieved in current practice. Task Design Guided by Cognitive and Measurement Principles. Many people consider the designing of assessment tasks to be an art.
From page 194...
... However, the relationship between item format and cognitive demands is not so straightforward. Although multiple-choice items are often used to measure low-level skills, a variety of item formats, including carefully constructed multiple-choice questions, can in fact tap complex cognitive processes (as illustrated later in Box 5-7).
From page 195...
... Ideally, task difficulty should be explained in terms of the underlying knowledge and cognitive processes required, rather than simply in terms of statistical item difficulty indices, such as the proportion of respondents answering the item correctly. Beyond knowing that 80 percent of students answered a particular item incorrectly, it would be educationally useful to know why so many did so, that is, to identify the sources of the difficulty so they could be remedied (assuming, of course, that they represented important construct-relevant variance)
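The statistical index mentioned above is trivial to compute but mute about causes. A small sketch with invented response data (1 = correct):

```python
# Rows are students, columns are items; the data are invented for illustration.
responses = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 1, 1],
]

# Classical item difficulty: the proportion of respondents answering each
# item correctly.
p_values = [sum(col) / len(responses) for col in zip(*responses)]
print(p_values)  # [0.8, 0.2, 0.6, 1.0]
```

The 0.2 flags the second item as hard, but nothing in the index itself says whether that difficulty reflects the knowledge being measured or an artifact such as confusing wording, which is the committee's point.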
From page 196...
... As described in Chapter 4, several existing measurement models are able to incorporate and analyze aspects of item difficulty. Yet while some of these models have been available for several years, their use in mainstream assessment has been infrequent (e.g., Wilson and Wang, 1995).
From page 197...
... Teachers report that the experience reveals differences in children's thinking that they had not previously noticed, prompts them to listen more actively to each child in their class, and gives them a sense of what developmental instruction entails (Griffin and Case, 1997). Evaluation of Student Responses. The observation corner of the assessment triangle includes tasks for eliciting student performance in combination with scoring criteria and procedures, or other methods for evaluating student responses to the tasks.
From page 198...
... 198 KNOWING WHAT STUDENTS KNOW BOX 5-5 Number Knowledge Test. Below are the 4-, 6-, 8-, and 10-year-old sections of the Number Knowledge Test. Preliminary: Let's see if you can count from 1 to 10.
From page 199...
... 5 IMPLICATIONS OF THE NEW FOUNDATIONS FOR ASSESSMENT DESIGN Level 2 (8-year-old level): Go to Level 3 if 5 or more correct. 1.
From page 200...
... The utility of assessment information can be enhanced by carefully selecting tasks and combining the information from those tasks to provide evidence about the nature of student understanding. Sets of tasks should be constructed and selected to discriminate among different levels and kinds of understanding that are identified in the model of learning.
From page 201...
... Thus, this particular pattern of correct and incorrect responses can be explained by positing a basically correct subtraction procedure with two particular bugs. Note that in both the traditional and cognitive research-based examples shown here, the student answered three problems incorrectly and two correctly.
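The inference described here, positing the procedure (correct rules plus specific bugs) that best reproduces the observed right/wrong pattern, can be sketched as a simple pattern match. The candidate patterns and observed data below are illustrative, not the book's actual data:

```python
# 1 = the student answered the problem correctly, 0 = incorrectly.
# Each candidate procedure predicts a right/wrong pattern over the same
# five problems.
predicted = {
    "correct procedure":      [1, 1, 1, 1, 1],
    "0 - N = N bug":          [1, 0, 1, 0, 1],  # wrong only where a 0 digit occurs
    "borrow-across-zero bug": [1, 1, 0, 0, 1],
}
observed = [1, 0, 1, 0, 1]

# Choose the candidate that agrees with the most observed outcomes.
best = max(predicted,
           key=lambda m: sum(p == o for p, o in zip(predicted[m], observed)))
print(best)  # 0 - N = N bug
```

Real diagnostic systems do something statistically richer than a raw match count, but the logic is the same: the error pattern, not the total score, identifies the underlying procedure.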
From page 202...
... Interpretation models tell the assessment designer how much and what types of tasks and evidence are needed to support the desired inferences and at what point additional assessment tasks will provide unnecessarily redundant information. The interpretation model also serves as the "glue" that holds together the information gleaned from the items and transforms it into interpretable results.
From page 203...
... , 1992; Minstrell, Stimpson, and Hunt, 1992) gave these questions to 60 students at the end of a high school introductory physics course and developed an informal interpretation model.
From page 204...
... BOX 5-7 Use of Multiple-Choice Questions to Test for Theoretical vs. Knowledge-in-Pieces Perspective. Problem: In the following situation, two identical steel marbles M1 and M2 are to be launched horizontally off their respective tracks.
From page 205...
... Combinations of answers are more consistent with a knowledge-in-pieces perspective than with a theoretical perspective. Problem Parts Answer Frequency Count Combination (a)
From page 206...
... On a single physics assessment, one could imagine having many such sets of items corresponding to different facet clusters. Interpretation issues could potentially be addressed with the sorts of measurement models presented in Chapter 4.
From page 207...
... used some of these techniques to examine how well test developers' intentions are realized in performance assessments that purport to measure complex cognitive processes. They developed
From page 208...
... Interviews illuminated unanticipated cognitive processes used by test takers. One finding was the importance of distinguishing between the demands of open-ended tasks and the opportunities such tasks provide students.
From page 209...
... Example: QUASAR Cognitive Assessment Instrument. QUASAR is an instructional program developed by Silver and colleagues to improve mathematics instruction for middle school students in economically disadvantaged communities (Silver, Smith, and Nelson, 1995; Silver and Stein, 1996). To evaluate the impact of this program, which emphasizes the abilities to solve problems, reason, and communicate mathematically, assessments were needed that would tap the complex cognitive processes targeted by instruction.
From page 210...
... . Concept maps were developed for scoring student explanations.
From page 211...
... The judgments of the internal reviewers, along with the pilot data, were used to answer a series of questions related to the quality of the tasks: Does the task assess the skill/content it was designed to assess? Does the task assess the high-level cognitive processes it was designed to assess?
From page 212...
... The task was therefore revised so it asked students to describe how they knew which figure comes next. This change increased ... The development process for the QUASAR Cognitive Assessment Instrument required continual interplay among the validation procedures of logical analysis, internal review, pilot testing, and external review.
From page 213...
... Traditionally, achievement tests have been designed to provide results that compare students' performance with that of other students. The results are usually norm-referenced since they compare student performance with that of a norm group (that is, a representative sample of students who took the same test)
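Norm-referencing can be made concrete with a percentile rank: a raw score is interpreted relative to the score distribution of the norm group. A minimal sketch with invented norm-group scores:

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring strictly below the given score."""
    return 100.0 * sum(s < score for s in norm_scores) / len(norm_scores)

norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 35]  # invented raw scores
print(percentile_rank(25, norm_group))  # 50.0: half the norm group scored lower
```

Note what such a report does and does not say: it locates a student relative to peers, but, unlike the criterion- or model-referenced reports discussed in this chapter, it says nothing about what the student actually knows or can do.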
From page 214...
... The ways people learn the subject matter and different states of competence should be displayed and made as recognizable as possible to educators, students, and the public to foster discussion and shared understanding of what constitutes academic achievement. Some examples of enhanced reporting afforded by models of learning (e.g., progress maps)
From page 215...
... Conditional Versus Unconditional Inferences. To some extent in any assessment, given students of similar ability, what is relatively difficult for some students may be relatively easy for others, depending on the degree to which the tasks relate to the knowledge structures students have, each in their own way, constructed (Mislevy, 1996). From the traditional perspective, this is "noise," or measurement error, and if
From page 216...
... This means that evaluation or interpretation of student responses does not depend on any other information the evaluator might have about the background of the examinee. This approach works reasonably well when there is little unique interaction between students and tasks (less likely for assessments connected with instruction than for those external to the classroom)
From page 217...
... (c) "Far transfer": Here, children were presented with a task that was amenable to the same control-of-variables strategy but had different surface features (e.g., paper-and-pencil assessments of good and bad experimental designs in domains outside physics)
From page 218...
... . A third method is to let students choose among assessment tasks in light of what they know about themselves: their interests, their strengths, and their backgrounds.
From page 219...
... Validation that tasks tap relevant knowledge and cognitive processes, often lacking in assessment development, is another essential aspect of the development effort. Starting with hypotheses about the cognitive demands of a task, a variety of research techniques, such as interviews, having students think aloud as they work problems, and analysis of errors, can be used to analyze the mental processes of examinees during task performance.

