9. Statistical Issues in Analysis of International Comparisons of Educational Achievement
Pages 267-294


From page 267...
... This chapter considers statistical issues that arise in drawing conclusions from international studies of achievement. It is organized around two distinct but related uses of the data these studies yield.
From page 268...
... This chapter will consider the statistical issues underlying the validity of such inferences and interpretations. The second use of international comparative data involves causal explanation.
From page 269...
... With these constraints in mind, this chapter begins with the problem of description and comparison of national differences. It then turns to the role that statistical analyses of comparative data might play in causal explanations.
From page 270...
... Outcomes such as achievement and literacy typically are measured on an arbitrary scale. Mean differences between nations might look big, but there is no way of assessing their magnitude without knowing how much variation lies within societies.
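One common way to express a mean difference relative to within-society variation is a standardized mean difference: the difference divided by the pooled within-group standard deviation. The sketch below is illustrative only. The function name and the scores are hypothetical, and it assumes simple random samples rather than the complex sample designs used in international studies.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(scores_a, scores_b):
    """Mean difference divided by the pooled within-group standard deviation."""
    n_a, n_b = len(scores_a), len(scores_b)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(scores_a) + (n_b - 1) * variance(scores_b)
    ) / (n_a + n_b - 2)
    return (mean(scores_a) - mean(scores_b)) / sqrt(pooled_var)


# Hypothetical scores for two nations (not real data).
nation_a = [520, 540, 500, 560, 530]
nation_b = [500, 520, 480, 540, 510]

d = standardized_difference(nation_a, nation_b)

# Multiplying every score by 10 changes the raw mean difference tenfold
# but leaves the standardized difference unchanged.
d_rescaled = standardized_difference(
    [10 * x for x in nation_a], [10 * x for x in nation_b]
)
```

Because the standardized value does not depend on the units of the scale, it gives the reader a sense of magnitude that the raw difference on an arbitrary scale cannot.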
From page 271...
... Displaying confidence intervals is more informative because it allows the reader to gauge how much weight to put on an observed mean difference. Mean differences will be misleading when statistical interactions are present.
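As a rough illustration of reporting an interval rather than a bare mean difference, the sketch below computes a large-sample confidence interval for the difference between two means. It is a minimal example under stated assumptions: independent simple random samples and a normal approximation. The function name and scores are hypothetical, and real international studies would need standard errors that reflect their complex sample designs.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def mean_difference_ci(scores_a, scores_b, level=0.95):
    """Normal-approximation confidence interval for mean(a) - mean(b)."""
    diff = mean(scores_a) - mean(scores_b)
    # Standard error of the difference between two independent sample means.
    se = sqrt(
        variance(scores_a) / len(scores_a) + variance(scores_b) / len(scores_b)
    )
    # Two-sided critical value, about 1.96 when level is 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se


# Hypothetical scores (not real data); samples this small are for illustration only.
nation_a = [520, 540, 500, 560, 530]
nation_b = [500, 520, 480, 540, 510]

lower, upper = mean_difference_ci(nation_a, nation_b)
```

In this made-up example the interval includes zero, so the observed 20-point difference would deserve little weight on its own.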
From page 272...
... to study confounding variables and take necessary precautions that readers not misinterpret mean differences. Just how much complexity must be reported must be decided on a case-by-case basis.
From page 273...
... Even more problematic, distinctions between secondary and postsecondary school historically have become blurred. In the United States, persons failing to obtain a high school diploma may show up later in community colleges, where many will obtain GEDs (high school equivalency diplomas)
From page 274...
... Clearly, to summarize the differences between the two countries with a single number such as a mean difference would be misleading in this case. Of course, these national differences may not achieve statistical significance at any percentile, so more investigation is required.
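The point that a single mean difference can hide differences elsewhere in the distribution can be sketched by comparing two samples percentile by percentile. The function name and both samples below are hypothetical. They are constructed to have equal means but unequal spread.

```python
from statistics import mean, quantiles


def percentile_differences(scores_a, scores_b, percentiles=(10, 25, 50, 75, 90)):
    """Difference (a minus b) between two samples at selected percentiles."""
    # quantiles(..., n=100) returns 99 cut points; index p - 1 is the p-th percentile.
    cuts_a = quantiles(scores_a, n=100, method="inclusive")
    cuts_b = quantiles(scores_b, n=100, method="inclusive")
    return {p: cuts_a[p - 1] - cuts_b[p - 1] for p in percentiles}


# Hypothetical samples: same mean (500), but sample_a is more spread out.
sample_a = list(range(400, 601, 10))
sample_b = list(range(450, 551, 5))

mean_gap = mean(sample_a) - mean(sample_b)
diffs = percentile_differences(sample_a, sample_b)
```

Here the mean difference is zero, yet the lower percentiles of `sample_a` fall below those of `sample_b` and the upper percentiles fall above them. As the text notes, whether such differences are statistically significant at any percentile is a separate question.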
From page 275...
... FIGURE 9-1b Comparison of distributions (mathematics outcome). Horizontal axis: International Mathematics Score (0 to 1,000); labeled series: Japan.
From page 276...
... Indeed, the presence of confounding variables would certainly challenge the validity of causal inferences regarding national education systems as causes of national achievement differences. But our concern here simply involves accurate interpretation of descriptive statistics, not causal inference.
From page 277...
... STEPHEN W RAUDENBUSH AND JI-SOO KIM in the Japanese Sample (Population 2)
From page 278...
... combine to raise questions about the conception of the target population. Students in the same grade in the two countries appear to vary in exposure to schooling as a function of grade retention, casting doubt on the meaning of country comparisons for Population 2, even after adjusting for age.3 One might argue that the mean differences between the United States and Japan in mathematics are large in any case for Population 2, but comparisons between the United States and other countries may be quite sensitive to the differential selection of students into seventh and eighth grades.
From page 279...
... If, for any cohort, nations vary in the mean age of their populations, differences in mean age between cohorts will vary as well, and these differences are likely related to mean differences in achievement. Thus, age differences can masquerade as differences in "gains." In particular, if nations vary in the fraction of students retained in grade between grades four and eight, such differences will bias estimates of the "gains" of interest.4 Again using the framework already described, a description of cohort differences requires a viable definition of the target population for each cohort.
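A small arithmetic sketch may help show how age differences can masquerade as differences in "gains." It rests on a deliberately simple assumption that is not taken from the chapter: achievement grows by a constant number of points per year of age. The function name, growth rate, and mean ages are all hypothetical.

```python
def observed_cohort_gain(growth_per_year, mean_age_lower, mean_age_upper):
    """Cross-sectional 'gain' between two grade cohorts under linear growth.

    Illustrative assumption: achievement rises by a constant
    growth_per_year points for each additional year of age.
    """
    return growth_per_year * (mean_age_upper - mean_age_lower)


# Two hypothetical nations with the SAME growth rate per year of age.
growth = 30.0

# Nation A retains few students, so its eighth graders are 4.0 years older
# on average than its fourth graders.
gain_a = observed_cohort_gain(growth, 9.5, 13.5)

# Nation B retains more students between grades four and eight, so its
# eighth graders are 4.5 years older on average than its fourth graders.
gain_b = observed_cohort_gain(growth, 9.5, 14.0)

# The apparent advantage of Nation B comes entirely from age composition.
spurious_gap = gain_b - gain_a
```

Under these assumptions the two systems are equally effective per year, yet Nation B shows the larger cohort difference simply because its upper cohort is older.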
From page 280...
... However, that nation's eighth graders are still comparatively low on Construct B given their low starting point. A distance between cohorts that describes change in relative standing will be, in fact, the distance between Construct A for grade four and Construct B for grade eight.
From page 281...
... We shall consider this issue later under "The Role of International Comparative Data in Causal Explanation." That section will reveal substantial threats to the inference that age-related differences reflect the differential effectiveness of educational systems.
Describing National Changes Over Time Based on Repeated Cross-Sections
An important goal of international studies of educational achievement is to describe the improvements in a nation's achievement over time and to compare nations with respect to their rates of improvement.
From page 282...
... In a repeated cross-sectional design, age reasonably might be viewed as fully controlled, provided the education system has not changed in its basic structure. If, for example, eighth graders in the United States in 1999 had the same age composition as eighth graders did in 1995, then the mean difference between time 1995 and 1999 in the United States will be unconfounded with age.
From page 283...
... Cross-national differences in immigration and grade retention would then bias inferences about national differences in age-related gain.
THE ROLE OF INTERNATIONAL COMPARATIVE DATA IN CAUSAL EXPLANATION
The foregoing discussion reveals that sound explanations of national achievement differences require sound descriptions and sound comparisons: descriptions of achievement within a nation at a given time and comparisons between nations at a given time; description of cohort differences at a given time and comparison between nations on cohort differences; description of achievement trends over historical time and comparison between countries in terms of their achievement trends.
From page 284...
... Can we conclude that differences between nations in their gains reflect differences in the effectiveness of the educational systems? To answer this question, we must reflect on a large literature that considers the adequacy of nonexperimental designs for drawing causal inferences.
From page 285...
... FIGURE 9-4 Causal inference based on two age groups. Recoverable labels: Country 2, Grade 4.
From page 286...
... Between fourth and eighth grades, their ...
FIGURE 9-6 Scenario 2: Unequal growth prior to grade 4. Recoverable labels: Country, Grade K, Grade 8.
From page 287...
... Other secular trends such as the increasing number of children in daycare, increasing nutrition, increasing survival rates of premature babies, changes in poverty rates, and changes in access to television and the Internet can contribute to achievement trends even if the schooling system remained invariant in its effectiveness. In sum, repeated cross-sections control age in allowing description of historical changes in achievement and comparison of countries in their achievement trends.
From page 288...
... Researchers using TIMSS data have sought not only to make causal inferences about the impact of educational systems on student learning; they have also sought to identify specific causal mechanisms that would explain these national effects. Not only are such explanations potentially important for policy and theory, they also may compensate for lack of methodological controls.
From page 289...
... Causal inferences based on comparing trend data between nations require similar cross-national extrapolations. Such extrapolations, although intriguing, should not be confused with reasonable causal inferences.
From page 290...
... We take the view that such data play a significant role in causal thinking by suggesting promising new causal explanations. For example, connections between the intended curriculum, the implemented curriculum, and the achieved curriculum in various countries participating in TIMSS have suggested a provocative explanation for shortcomings in U.S.
From page 291...
... Multilevel analyses of such data could test hypotheses suggested by between-country comparisons within a number of societies, a powerful design indeed. Difficulties with the third option include the high cost and managerial complexity of carrying on multiple longitudinal studies in varied nations.
From page 292...
... , in how cross-national differences are interpreted, and in how we think about curriculum and opportunity to learn as causal mechanisms. These deep issues of design and interpretation cannot be resolved by sophisticated statistical analytic methods.
From page 293...
... Thus, although language at home is likely a confounding variable (it is plausible that the U.S. sample has more nonnative English speakers than the Japanese sample has nonnative speakers of Japanese)
From page 294...
... (1999). Estimation of the causal effect of a time-varying exposure on the marginal mean of a repeated binary outcome.

