
Currently Skimming:

2 The Potential of Alternative Data Sources to Modernize Elementary Indexes
Pages 25-62

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 25...
... Current methods are briefly reviewed, then alternative data sources -- focusing on various types of scanner and web-scraped data -- are assessed for their potential to improve the accuracy, coverage, and timeliness of elementary indexes. Challenges to implementing new data and new methods, of which BLS staff are keenly aware, are also considered.
From page 26...
... htm. A00858 -- Consumer Price Index REV.indd 26 8/15/22 11:54 AM
From page 27...
... They conclude that it has the effect of raising the expected values of an index based on nonlinear formulas, especially the geometric mean formula, and that more extensive use of large-sample scanner data sources may mitigate the problems.
5 For an overview of potential biases, see the following BLS article: www.bls.gov/cpi/notices/2017/methodology-changes.htm.
From page 28...
... Alternative data sources that have been explored for price measurement purposes include point-of-sale (POS) data (obtained either directly from bricks-and-mortar or online retailers or from firms that aggregate the data)
From page 30...
... successfully constructed a basic item-level index for coffee using scanner data. A Conference on Research on Income and Wealth publication on scanner data and price indexes (Feenstra and Shapiro, 2003)
From page 31...
... Similarly, care must be taken to ensure that the product codes used in the scanner data internally track identical items over time and across retailers. When scanner data are obtained from aggregator firms, much of this processing is already done, reducing the production burden to a statistical agency.
From page 32...
... BLS Experience Using Scanner Data
BLS has historically investigated the role of scanner and web-scraped data mainly as a way of obtaining price quotes, perhaps more easily than in-store price checking by field staff, within the current measurement framework. BLS initiatives incorporating scanner data in the CPI program have focused on the food at home category.
From page 33...
... Use of Scanner Data by Other Statistical Agencies
Statistical agencies in other countries and academic researchers have led the efforts demonstrating the feasibility of using alternative data sources to replace aspects of the existing sample-based structure for price measurement.17 Across statistical agencies internationally, the motivations behind this work have been diverse, ranging from cost containment to the need to more quickly capture effects associated with the arrival of new goods and outlets, or the changing composition of spending patterns. For BLS, lessons learned from these efforts will have to be adapted to the unique legal and
17 This section references only a small sample of the national statistical offices advancing the use of alternative data sources in their CPI programs.
From page 34...
... , representing 20 percent of the CPI basket weight. The agency is aiming to collect 70–80 percent of its price quotes from alternative data sources, representing 55 percent of basket weight, by March 2023.19 ABS uses scanner data from retailers to obtain prices for about 16 percent of Australia's CPI by item weight.
From page 35...
... OpenDocument.
22 Details of the UK ONS experience experimenting with multilateral indexes for scanner data can be found in "Using alternative data sources in consumer price indices: May 2019," www.ons.gov.uk/economy/inflationandpriceindices/articles/usingalternativedatasourcesinconsumerpriceindices/may2019.
From page 36...
... . 26 See the appendix to this chapter on the use of multilateral methods for blending alternative data sources, including web-scraped data, to estimate price relatives.
From page 37...
... , investigating the similarity of online and offline prices using evidence from large multi-channel retailers, found that "price levels are identical about 72 percent of the time."
From page 38...
... On the quality side, price data from the web can be collected in a timelier manner than is possible when relying on surveys or waiting for third-party scanner data to be processed. On the cost side, web-scraping can automate price collection for some goods and services, which can potentially reduce costs and increase coverage.
From page 39...
... Not knowing what scanner data
30 www.abs.gov.au/articles/web-scraping-australian-cpi.
From page 40...
... in a study using scanner data from a Dutch supermarket chain on detergents. The international price statistics community appears to have reached a consensus that multilateral methods, such as those proposed by Ivancic, Diewert, and Fox (2011)
From page 41...
... In the future, it may be possible for agencies to set up their own scanner data and web-scraping operations, but such a system is some ways off. The more immediate tasks would be to set up contracting arrangements that make sense for both BLS and data providers, ensure confidentiality given the sensitivity of the data, set up arrangements that ensure reliability of sources, and create contingency plans in the case of disruptions in the supply of CPI input data.
From page 42...
... An example of a more expansive framework is the Total Error Framework (TEF), which broadens the nonsampling error component to include measures of error associated with commercial and other types of data and
34 The Census Bureau has performed similar exercises with NPD scanner data, comparing store-level revenue data to that reported in their trade surveys.
From page 43...
... Embracing alternative data sources now, and moving forward aggressively with research for their integration, will ensure that the accuracy and timeliness of the CPI will not be compromised in the future. The data modernization strategy will involve:
• Identifying promising alternative data sources and then prioritizing the work needed to evaluate and incorporate these data into the items/strata where they can be applied;
• Continuing development of a robust research agenda that supports incorporation of alternative data and associated new methodologies more broadly beyond just price quote replacement;
• Continuing research assessing the quality of new types of data;
35 For a description of this framework, see Total Error in a Big Data World: Adapting the TSE Framework to Big Data (academic.oup.com/jssam/article-abstract/8/1/89/5728725?
From page 44...
... While BLS has certainly made progress using transaction data to replace price quotes, the agency has the opportunity to go much further.
Recommendation 2.2: BLS should accelerate its research identifying alternative data sources that could potentially be integrated to replace price quotes collected within the current framework.
From page 45...
... This means that if scanner data cover about half of the CPI relative importance for goods, the total amounts to a bit less than one-fourth of the overall CPI. However, the missing goods are mainly vehicles, nonpackaged food, and energy, where other alternative data sources may be helpful.
From page 47...
... The panel recommends that BLS develop a robust research agenda that supports incorporation of alternative data more broadly beyond just price quote replacement. This will require accelerating research evaluating the role of the leading multilateral index approaches
From page 48...
... 41 As described in the appendix to this chapter, recent research has attempted to perform quality adjustment at scale, often with the use of scanner data.
From page 49...
... 43 For example, in work measuring price change for consumer electronics using scanner data, Statistics New Zealand has been employing time-dummy hedonic models. www.stats.govt.nz/methods/measuring-price-change-for-consumer-electronics-using-scanner-data.
From page 50...
... Some retail chains have both online and in-store sales, and comparisons can be made to test the nature of any systematic price differences.
From page 51...
... Because the authors had access to quantity data, they were able to examine the importance of several issues raised in this report. For example, using the high-frequency data, they were able to directly test for chain drift and to assess the magnitude of the new
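
The chain drift mentioned in this excerpt can be illustrated with a minimal sketch. It assumes a bilateral Törnqvist formula and made-up prices and quantities for two items (a temporary sale followed by a dip in purchases once the sale ends). The data and the function name `tornqvist` are illustrative assumptions, not taken from the report or from the study it describes.

```python
import math


def tornqvist(p0, q0, p1, q1):
    """Bilateral Törnqvist price index from period 0 to period 1 (matched items)."""
    e0 = [p * q for p, q in zip(p0, q0)]          # expenditures in period 0
    e1 = [p * q for p, q in zip(p1, q1)]          # expenditures in period 1
    s0 = [e / sum(e0) for e in e0]                # expenditure shares, period 0
    s1 = [e / sum(e1) for e in e1]                # expenditure shares, period 1
    log_index = sum(
        0.5 * (a + b) * math.log(y / x)           # average share times log price relative
        for a, b, x, y in zip(s0, s1, p0, p1)
    )
    return math.exp(log_index)


# Hypothetical data: item A goes on sale in period 1, then returns to its old price.
# Shoppers stock up during the sale and buy less of item A afterwards.
prices = [[1.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [30, 10], [4, 10]]

# Chained index: multiply the period-to-period links.
chained = 1.0
for t in range(1, len(prices)):
    chained *= tornqvist(prices[t - 1], quantities[t - 1], prices[t], quantities[t])

# Direct index: compare the last period with the first in one step.
direct = tornqvist(prices[0], quantities[0], prices[-1], quantities[-1])

print(f"chained: {chained:.4f}, direct: {direct:.4f}")
```

Prices in the last period equal those in the first, so the direct comparison gives exactly 1, while the chained index ends at roughly 0.93. That gap is the drift: the links into and out of the sale period use different expenditure weights, so they do not cancel.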
From page 52...
... The ability to integrate electronic transactions data -- ideally, data that are linked to households making purchases -- represents the ideal scenario for price measurement.
From page 53...
... These methods have emerged as "best practices" to exploit scanner data for price measurement. In contrast to bilateral index methods that compare prices across two time periods, multilateral index methods make price comparisons across three or more time periods (Chessa, 2016)
From page 54...
... ABS (2016) and the statistical agencies of Norway and Belgium have implemented matched-model GEKS-Törnqvist for the treatment of scanner data from supermarkets.
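
As a rough illustration of what a matched-model GEKS-Törnqvist calculation involves, the sketch below takes the geometric mean, over every possible link period in a window, of the bilateral Törnqvist comparisons that route through that link. The function names, the `base` parameter, and the two-item data are hypothetical; this is not the production code of any agency named above, and it assumes every item is sold in every period.

```python
import math


def tornqvist(p0, q0, p1, q1):
    """Bilateral Törnqvist price index from period 0 to period 1 (matched items)."""
    e0 = [p * q for p, q in zip(p0, q0)]
    e1 = [p * q for p, q in zip(p1, q1)]
    s0 = [e / sum(e0) for e in e0]
    s1 = [e / sum(e1) for e in e1]
    log_index = sum(
        0.5 * (a + b) * math.log(y / x)
        for a, b, x, y in zip(s0, s1, p0, p1)
    )
    return math.exp(log_index)


def geks_tornqvist(prices, quantities, base=0):
    """GEKS index level for each period in the window, relative to `base`."""
    n = len(prices)

    def bilateral(a, b):
        return tornqvist(prices[a], quantities[a], prices[b], quantities[b])

    levels = []
    for t in range(n):
        # Average (in logs) the indirect comparisons base -> link -> t over all links.
        mean_log = sum(
            math.log(bilateral(base, link) * bilateral(link, t)) for link in range(n)
        ) / n
        levels.append(math.exp(mean_log))
    return levels


# Hypothetical three-period window for two items (temporary sale on the first item).
prices = [[1.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
quantities = [[10, 10], [30, 10], [4, 10]]

levels = geks_tornqvist(prices, quantities)
print([round(x, 4) for x in levels])
```

Because every comparison is averaged over all link periods, the resulting series is transitive within the window: the change from period 1 to period 2 is the same whether it is read off the base-0 series or computed directly with `base=1`.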
From page 55...
... Using these weighted bilateral TDH indexes as inputs in GEKS thus gives rise to a hedonic imputation GEKS-Törnqvist index. Statistics New Zealand implemented this method for scanner data on consumer electronics goods purchased from market research company GfK.
From page 56...
... For supermarket scanner data, ABS implemented rolling window matched-model GEKS-Törnqvist with a mean splice (ABS, 2017)
From page 57...
... First, multilateral methods do not satisfy the multiperiod identity test: when prices return to their initial level, multilateral price indexes, including GEKS, are not necessarily equal to 1. At least from a theoretical perspective, violation of the
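
Why the identity test can fail is straightforward to show for one specific case, GEKS built from bilateral Törnqvist indexes. The derivation below is a sketch under that assumption. The notation (prices $p_i^t$, expenditure shares $s_i^t$, a window of periods $0,\dots,T$) is introduced here for illustration and does not come from the report.

```latex
% Bilateral Törnqvist index between periods a and b
\ln P_{T}(a,b) = \sum_i \tfrac{1}{2}\,\bigl(s_i^a + s_i^b\bigr)\,\ln\frac{p_i^b}{p_i^a}

% GEKS over the window 0,...,T, averaging over all link periods l
\ln P_{\mathrm{GEKS}}(0,T) = \frac{1}{T+1}\sum_{l=0}^{T}\bigl[\ln P_{T}(0,l) + \ln P_{T}(l,T)\bigr]

% Suppose prices return to their initial level: p_i^T = p_i^0 for all i. Then
\ln P_{T}(0,l) + \ln P_{T}(l,T)
  = \sum_i \tfrac{1}{2}\bigl(s_i^0 + s_i^l\bigr)\ln\frac{p_i^l}{p_i^0}
  - \sum_i \tfrac{1}{2}\bigl(s_i^l + s_i^T\bigr)\ln\frac{p_i^l}{p_i^0}
  = \tfrac{1}{2}\sum_i \bigl(s_i^0 - s_i^T\bigr)\ln\frac{p_i^l}{p_i^0}

% so that
\ln P_{\mathrm{GEKS}}(0,T) = \frac{1}{2(T+1)}\sum_{l=0}^{T}\sum_i \bigl(s_i^0 - s_i^T\bigr)\ln\frac{p_i^l}{p_i^0}
```

The right-hand side is zero when expenditure shares also return to their initial values, but in general it is not, so the index need not equal 1 even though every price is back where it started.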
From page 58...
... induced by the COVID-19 pandemic.
From page 60...
... argued that the infinitely high implicit "demand reservation prices" of the CES model can result in overadjustment for new and disappearing varieties.
From page 61...
... Ehrlich et al. (2021) folded this method into their strategy for constructing price indexes with scanner data.53 The other direction taken in recent work has been in leveraging new artificial intelligence (AI)


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.