NOTICE: The project that is the subject of this report was approved by the Governing Board of the National Research Council, whose members are drawn from the councils of the National Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The members of the committee responsible for the report were chosen for their special competences and with regard for appropriate balance.
The study was supported by Contract/Grant No. RJ97184001 between the National Academy of Sciences and the U.S. Department of Education. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the view of the organizations or agencies that provided support for this project.
International Standard Book Number 0-309-06789-8
Additional copies of this report are available from:
National Academy Press
2101 Constitution Avenue NW, Washington, DC 20418. Call 800-624-6242 or 202-334-3313 (in the Washington Metropolitan Area). This report is also available online at http://www.nap.edu
Copyright 1999 by the National Academy of Sciences. All rights reserved.
Suggested citation: National Research Council (1999). Embedding Questions: The Pursuit of a Common Measure in Uncommon Tests. Committee on Embedding Common Test Items in State and District Assessments. D.M. Koretz, M.W. Bertenthal, and B.F. Green, eds. Board on Testing and Assessment, Commission on Behavioral and Social Sciences and Education. Washington, DC: National Academy Press.
THE NATIONAL ACADEMIES
National Academy of Sciences
National Academy of Engineering
Institute of Medicine
National Research Council
The National Academy of Sciences is a private, nonprofit, self-perpetuating society of distinguished scholars engaged in scientific and engineering research, dedicated to the furtherance of science and technology and to their use for the general welfare. Upon the authority of the charter granted to it by the Congress in 1863, the Academy has a mandate that requires it to advise the federal government on scientific and technical matters. Dr. Bruce M. Alberts is president of the National Academy of Sciences.
The National Academy of Engineering was established in 1964, under the charter of the National Academy of Sciences, as a parallel organization of outstanding engineers. It is autonomous in its administration and in the selection of its members, sharing with the National Academy of Sciences the responsibility for advising the federal government. The National Academy of Engineering also sponsors engineering programs aimed at meeting national needs, encourages education and research, and recognizes the superior achievements of engineers. Dr. William A. Wulf is president of the National Academy of Engineering.
The Institute of Medicine was established in 1970 by the National Academy of Sciences to secure the services of eminent members of appropriate professions in the examination of policy matters pertaining to the health of the public. The Institute acts under the responsibility given to the National Academy of Sciences by its congressional charter to be an adviser to the federal government and, upon its own initiative, to identify issues of medical care, research, and education. Dr. Kenneth I. Shine is president of the Institute of Medicine.
The National Research Council was organized by the National Academy of Sciences in 1916 to associate the broad community of science and technology with the Academy's purposes of furthering knowledge and advising the federal government. Functioning in accordance with general policies determined by the Academy, the Council has become the principal operating agency of both the National Academy of Sciences and the National Academy of Engineering in providing services to the government, the public, and the scientific and engineering communities. The Council is administered jointly by both Academies and the Institute of Medicine. Dr. Bruce M. Alberts and Dr. William A. Wulf are chairman and vice chairman, respectively, of the National Research Council.
COMMITTEE ON EMBEDDING COMMON TEST ITEMS IN STATE AND DISTRICT ASSESSMENTS
DANIEL M. KORETZ (Chair),
School of Education, Boston College; RAND Education, Washington, DC
SUSAN AGRUSO,
Office of Assessment, South Carolina Department of Education
RONALD K. HAMBLETON,
School of Education, University of Massachusetts, Amherst
H.D. HOOVER,
Iowa Testing Programs, University of Iowa
BRIAN W. JUNKER,
Department of Statistics, Carnegie Mellon University
JAMES A. WATTS,
Southern Regional Education Board, Atlanta, Georgia
KAREN K. WIXSON,
School of Education, University of Michigan
WENDY M. YEN,
CTB/McGraw-Hill, Monterey, California
REBECCA ZWICK,
Graduate School of Education, University of California, Santa Barbara
PAUL W. HOLLAND, Liaison,
Board on Testing and Assessment; Graduate School of Education, University of California, Berkeley
MERYL W. BERTENTHAL, Study Director
BERT F. GREEN, Senior Technical Advisor
JOHN J. SHEPHARD, Senior Project Assistant
BOARD ON TESTING AND ASSESSMENT
ROBERT L. LINN (Chair),
School of Education, University of Colorado
CARL F. KAESTLE (Vice Chair),
Department of Education, Brown University
RICHARD C. ATKINSON, President,
University of California
PAUL J. BLACK,
School of Education, King's College, London, England
RICHARD P. DURÁN,
Graduate School of Education, University of California, Santa Barbara
CHRISTOPHER F. EDLEY, JR.,
Harvard Law School
RONALD FERGUSON,
John F. Kennedy School of Government, Harvard University
PAUL W. HOLLAND,
Graduate School of Education, University of California, Berkeley
ROBERT M. HAUSER,
Department of Sociology, University of Wisconsin, Madison
RICHARD M. JAEGER,
School of Education, University of North Carolina, Greensboro
LORRAINE MCDONNELL,
Departments of Political Science and Education, University of California, Santa Barbara
BARBARA MEANS,
SRI International, Menlo Park, California
KENNETH PEARLMAN,
Lucent Technologies, Inc., Warren, New Jersey
ANDREW C. PORTER,
Wisconsin Center for Education Research, University of Wisconsin, Madison
CATHERINE E. SNOW,
Graduate School of Education, Harvard University
WILLIAM L. TAYLOR,
Attorney at Law, Washington, DC
WILLIAM T. TRENT, Associate Chancellor,
University of Illinois, Urbana-Champaign
VICKI VANDAVEER,
The Vandaveer Group, Inc., Houston, Texas
LAURESS L. WISE,
Human Resources Research Organization, Alexandria, Virginia
KENNETH I. WOLPIN,
Department of Economics, University of Pennsylvania
MICHAEL J. FEUER, Director
VIOLA C. HOREK, Administrative Associate
LISA D. ALSTON, Administrative Assistant
Acknowledgments
The Committee on Embedding Common Test Items in State and District Assessments wishes to thank the many people who helped to make possible the preparation of this report.
An important part of the committee's work was to gather data from research, policies, and practices on embedding. Many people gave generously of their time, at meetings and workshops of the committee and in interviews with committee staff.
The committee benefited tremendously from a presentation at its first meeting by Achieve, Inc. staff: Matthew Gandal, director of standards and assessment, Jennifer Vranek, senior policy analyst, and consultant David Wiley of Northwestern University. They provided the committee with a comprehensive overview of Achieve's efforts to develop a common national measure of student performance through embedding common items in state mathematics assessments.
At a committee workshop, Gordon M. Ambach, executive director of the Council of Chief State School Officers (CCSSO); Wayne H. Martin, director of the CCSSO State Education Assessment Center; and John R. Tanner, director of the Delaware Education Assessment and Analysis Office, offered local, state, and national perspectives on the purposes for which a common measure of student performance might be used. Don McLaughlin, chief scientist at the American Institutes for Research, and Michele Zimowski, senior survey methodologist at the
National Opinion Research Center of the University of Chicago, presented ongoing research related to linking state mathematics assessments to the National Assessment of Educational Progress (NAEP). Michael Kolen, professor of education at the University of Iowa, discussed the inferences that educators and policy makers want to support with tests that produce individual scores and are linked to NAEP. Patricia Ann Kenney, research associate at the University of Pittsburgh's Learning Research and Development Center, presented her work on the content analysis of NAEP and demonstrated how differences in state content standards and assessments will affect the feasibility of embedding common NAEP items in uncommon tests. John Poggio, director of the Center for Educational Testing and Evaluation and professor of educational psychology and research at the University of Kansas, discussed Kansas' 1992 plan to embed NAEP items in the state testing program and why the plan was subsequently abandoned. Finally, Richard Hill, founder of the National Center for the Improvement of Educational Assessment, Inc., presented his study on the use of embedded NAEP items to estimate the rigor of Louisiana's performance standards relative to NAEP's. The committee is extremely grateful to all of these individuals, who helped us clarify our thinking about many of the important issues surrounding our charge.
Other individuals provided information to the committee during small group discussions and telephone interviews. We are particularly grateful to Robert J. Mislevy, Educational Testing Service, and Eugene G. Johnson, American Institutes for Research, who gave us information about the NAEP marketbasket; Gage Kingsbury, research director of the Northwest Evaluation Association, who provided information about the NWEA item bank and locally developed tests; and Duncan MacQuarrie, Department of Curriculum and Assessment, Office of the Superintendent of Public Instruction, Washington State, who provided us with information from the CCSSO State Collaborative on Assessment and Student Standards.
We owe a debt of gratitude to John Olson and Carl Andrews of CCSSO for providing the committee with important data about state testing programs. Without their help, and the help of Wayne Martin, we would not have been able to include the 1997-1998 school year information that is presented throughout this report.
We are especially grateful to Bert F. Green, who served as a consultant to the committee and provided invaluable assistance during all phases
of the study. He worked tirelessly on our behalf, analyzing the issues, gathering data, and drafting chapters. The timely preparation of this report on an accelerated time schedule could not have happened without his dedication and contributions.
The Board on Testing and Assessment, under the leadership of Robert Linn, provided the committee with both guidance and support. We were particularly fortunate to have Paul W. Holland, professor of statistics at the University of California, Berkeley, and a member of the board, as a liaison member to this committee. As the chair of the Committee on Equivalency and Linkage of Educational Tests, Paul was well acquainted with the issues confronting us and proved to be a valuable guide and sounding board as we pondered the complexities of embedding.
We are very grateful to the professional staff of the Commission on Behavioral and Social Sciences and Education, without whose guidance, support, and hard work we could not have completed this report. Barbara B. Torrey, executive director of the commission, and Michael J. Feuer, director of the Board on Testing and Assessment (BOTA), secured staff support and resources whenever we needed them and provided guidance as we navigated through the various stages of completing a National Research Council study in a mere nine months. BOTA staff members Naomi Chudowsky and Karen Mitchell made major contributions to our work, attending committee meetings and discussing ideas with the committee and staff. Karen was particularly gracious in her willingness to read and comment on the many drafts of this report that we endlessly piled on her desk. BOTA staff members Alexandra Beatty and Robert Rothman also read and commented on early drafts of this report; the finished product is better for their efforts. We would be remiss if we didn't also thank two new members of the BOTA staff: Judith Koenig, study director of the Committee on NAEP Reporting Practices, for sharing her library of testing books and journals with us; and Richard Noeth, study director of the Committee on the Evaluation of the Voluntary National Tests, Year 2, for his guidance, support, and encouragement of our efforts.
John Shephard, although new to the Board, served unflappably and flawlessly as the committee's senior project assistant. He dealt smoothly with the logistics of our three committee meetings in four months, with our enormous collections and distributions of materials, and with a seemingly endless stream of text files, e-mail file attachments, and file revisions in incompatible word-processing formats. His assistance at critical junctures along the way made the creation of this report possible.
John received support when he needed it from other wonderful project assistants: Lisa Alston, Dorothy Majewski, Susan McCutchen, Kim Saldin, and Jane Phillips. Viola Horek, administrative associate to BOTA, was always there, instrumental in seeing that the entire project ran smoothly.
We are deeply grateful to Eugenia Grohman, associate director for reports of the Commission on Behavioral and Social Sciences and Education, for her advice on structuring the contents of the report and for her expert editing of the text. Genie knows better than anyone else how to put a report together, from beginning to end.
Above all, we thank the committee members for their outstanding contributions to the study. They drafted text, prepared background materials, and helped to organize workshops and committee discussions. Everyone contributed constructive, critical thinking, serious concern about the difficult and complex issues that we faced, and an open-mindedness that was essential to the success of the project.
This report has been reviewed in draft form by individuals chosen for their diverse perspectives and technical expertise, in accordance with procedures approved by the Report Review Committee of the National Research Council. The purpose of this independent review is to provide candid and critical comments that will assist the institution in making the published report as sound as possible and to ensure that the report meets institutional standards for objectivity, evidence, and responsiveness to the study charge. The review comments and draft manuscript remain confidential to protect the integrity of the deliberative process.
We thank the following individuals for their participation in the review of this report: Glenn Crosby, Department of Chemistry, Washington State University; John Guthrie, College of Education, University of Maryland; Lyle V. Jones, L.L. Thurstone Psychometric Laboratory, University of North Carolina, Chapel Hill; Stephen Raudenbush, School of Education, University of Michigan; Henry W. Riecken, Professor of Behavioral Sciences (emeritus), University of Pennsylvania School of Medicine; David Thissen, Graduate Program in Quantitative Psychology, University of North Carolina, Chapel Hill; Ewart A.C. Thomas, Department of Psychology, Stanford University; and Gary Williamson, Division of Accountability Services, North Carolina Department of Public Instruction, Raleigh.
Although the individuals listed above provided constructive comments and suggestions, it must be emphasized that responsibility for the final content of this report rests entirely with the authoring committee and the institution.
MERYL W. BERTENTHAL, STUDY DIRECTOR
DANIEL M. KORETZ, CHAIR
COMMITTEE ON EMBEDDING COMMON TEST ITEMS IN STATE AND DISTRICT ASSESSMENTS