The next major section is a rationale for the mathematics education community, which in many respects is the heart of Measuring Up. This is where comments on the content, style, or intent of the task appear (e.g., why the task was included), as well as more general messages about mathematics education that the task is intended to convey.
Following the main presentation of the rationale for the task, there are two subsections that provide further information. The first, task design considerations, discusses some of the details behind the task, such as why certain questions were phrased as they were or why particular numbers were chosen. The second, variants and extensions, hints at other directions in which the task could be taken, for purposes of either instruction or further assessment. These subsections are far from exhaustive, for often the tasks could be starting points for weeks of instruction. One important message conveyed by this section is that these particular prototypes are in no way unique.
The next section describes a rough scoring system — what is called a protorubric — for the task. It is now widely recognized that an assessment task by itself means little without an indication of how children's responses would be scored. In other words, an important component of an assessment task is a scoring rubric that describes and orders a variety of answers that a child might typically give. For reasons discussed in the next section, the rubrics given here are necessarily tentative and incomplete — whence "protorubrics."
Finally, in some of the tasks there is a section containing references to relevant sources.
The Protorubrics
How might fourth graders do?
Although each task in this volume contains commentary about scoring based on student work, for a number of reasons we have not developed fully detailed scoring rubrics:
- The intended audience for these tasks is students who have had a mathematical education that is different from what is commonly available in U.S. schools today.
- Ideally, a scoring rubric should be based on the responses of many hundreds of children who are properly prepared for the tasks. While all of these tasks have been pilot tested with children, in most cases the testing has not been sufficient to provide a solid base for a complete scoring rubric.
- There is no universal agreement on how to structure scoring rubrics. Various groups that are currently active in creating alternative assessments in mathematics have used different styles and different levels of specificity (for example, four vs. six levels of gradation) for scoring rubrics.
- A complete analysis of scoring rubrics would require a foray into the thorny problem of judging individual performance in group settings. Although we do intend that these prototypes will encourage teachers to use group work, we have deliberately set aside the daunting task of codifying rubrics for assigning individual grades when students work in groups.
- There is continuing debate between proponents of "holistic" and "analytic" approaches. Should one look at every isolated component of a complex response, or should one make a general, overall judgment of the child's response? While it is important to be fairly specific about what the task is intended to elicit and about what is to be valued in children's responses, there is no compelling evidence to favor one position over the other. The protorubrics given in this book can easily be adapted to different styles.
Moreover, protorubrics are in some ways analogous to standards: they express goals, ensure quality, and promote change in assessment. Hence, protorubrics by themselves may have a unique contribution to make to assessment reform, whether or not they ever are formalized into polished rubrics.
The protorubrics in Measuring Up are structured around three levels: high, medium, and low. Rather than try to define precisely what constitutes a "high" response, the protorubrics list only selected characteristics of a high response. We leave to others the