
Appendix C

Summary, Third Meeting, September 21, 2018

The panel’s third meeting, held September 20-21, 2018, was intended to extend the information-gathering phase of the study to the broader research community that puts Economic Research Service (ERS) data to use. Colleen Heflin of Syracuse University, Justine Hastings of Brown University, and Chuck Courtemanche of Georgia State University presented ideas for improving food and nutrition data—including integration of commercial and administrative data—to inform key policy issues. Among the topics they discussed were the value (and limits) of linking Supplemental Nutrition Assistance Program (SNAP) administrative data with other types of administrative data, such as unemployment insurance (U/I), Medicaid, and K–12 education; the limits of existing survey data; use of retail panel loyalty card data and Rhode Island state administrative records (housed in a secure facility at Brown University) to analyze how SNAP benefits are spent; evidence needed to design a “smarter SNAP”; and food consumption data needs for obesity and other health research.

Amy O’Hara (panel member), Rachel Shattuck and John Eltinge of the Census Bureau, and Lisa Mirel of the National Center for Health Statistics (NCHS) gave presentations on the potential of data integration, linkages for policy research, and the use of administrative data. Practices being developed by the statistical agencies for combining data sources were also discussed, including the Next Generation Data Platform—a collaboration between Census, ERS, and the Food and Nutrition Service (FNS) that links SNAP data (in 19 states and in 39 counties in California) and the Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) data (in 11 states) to Census survey and administrative data.


Rob Santos of the Urban Institute, a member of the Feeding America Technical Advisory Group, discussed Feeding America’s collaboration with the Urban Institute on a research program that attempts to detail how frequently individuals visit food pantries, either as a temporary, emergency food source or as a regular supplemental food source. Alessandro Bonanno of Colorado State University discussed possible improvements to geospatial information in ERS’s food data system (e.g., for assessing the role of the accessibility of food outlets in SNAP participation and effectiveness).

A final open session was held on the use of proprietary data for food policy research. Mary Muth of RTI discussed types, sources, and considerations in using store scanner data, household scanner data, and nutrition data from labels for food policy research. Helen Jensen of Iowa State described the use of proprietary (scanner) data for understanding issues related to the WIC program. Carma Hogue of the U.S. Census Bureau described Census’s work on improving economic statistics through web scraping and machine learning to discover, collect, and process data from the web.

C.1. UPDATE ON RECENT DEVELOPMENTS AT ERS AND WITH FOODAPS-2

After a brief welcome from panel chair Marianne Bitler, Jay Variyam, division director at ERS, updated the panel on three significant developments at ERS:

  1. The USDA secretary has proposed realigning ERS with USDA’s Office of the Chief Economist—the ERS administrator would report directly to the chief economist instead of the undersecretary for research, education, and economics, as is the current practice.
  2. The USDA secretary also proposed relocating ERS functions, along with the National Institute of Food and Agriculture (NIFA), to a new, as-yet undisclosed or unselected location; the target date for relocation is the end of FY 2019. The relocation has implications for the operational side of the Consumer Food Data Program; for example, staffing will potentially be split between two locations, as a few dozen ERS staffers would remain in Washington, DC, while up to 300 others would move. Work with other federal partners, who will be based in Washington, DC, as well as the way ERS handles stakeholder interactions, would by necessity change.
  3. In light of new USDA program and policy priorities, ERS has paused FoodAPS-2 implementation. It is assessing the situation and, in the meantime, working with the contractor, Westat, to create a fully functional data collection app.

Mark Denbaly, deputy director for food economics data at ERS, noted that the above changes mean the panel’s role in helping ERS is even more important than before, because ERS needs a roadmap from experts in order to prioritize investments. Variyam pointed out that how ERS structures its staff after relocation will affect stakeholder interactions and other interagency activities, in particular the administrative data program, which requires close interaction with agencies within USDA and outside of it. Panel member Diane Schanzenbach stated her concern that critical administrative data products produced in conjunction with other federal agencies, specifically with the Census Bureau, would be affected if ERS staff relocate. Variyam and Denbaly did not speculate on what those impacts might be and stressed that the roadmap they seek from the panel will be key for the future of the Consumer Food Data Program.

C.2. IMPROVING DATA FOR POLICY RESEARCH

Colleen Heflin of Syracuse University began the session by talking about the role and value of administrative data as it relates to USDA data collection. In comparison to survey data, administrative data

  • help minimize the measurement error often found in the self-reporting of program participation;
  • can be used to observe monthly benefit receipts to learn about participation dynamics and intensity of participation; and
  • provide opportunities to learn about multiple-program participation (e.g., SNAP alone versus SNAP plus Medicaid or SNAP plus Temporary Assistance for Needy Families [TANF]).

Heflin offered examples of combining SNAP data with three different domains of administrative data: Medicaid, U/I, and education.

Medicaid claims data offer rich information about the diagnosis, the date a claim was made, the setting in which it was made (an emergency room, a hospitalization, a nursing home, a pharmacy), and the cost of a claim. Heflin noted a study (Basu et al., 2017) that looked at hospital admissions for hypoglycemia among low-income patients during the last week of each month, when SNAP benefits may have been exhausted. Linking health data to data from food and nutrition programs can inform researchers about the return on investment of these food and nutrition programs.

Linking SNAP data with U/I data allows researchers to understand the dynamics of the relationship between SNAP participation and work. Specifically, one can observe employment behavior before a household goes on SNAP, what that household earns while it participates in SNAP, and changes that occur in times of transition, that is, what happens to wages preceding SNAP participation and what occurs after participation is completed. Since this can be done by industry, one can get a sense of which industries have employees who participate in SNAP more than others. Looking closer, one can see which people exhaust U/I and then participate in SNAP, or whether they participate in both programs together.

Linking SNAP data with K–12 data—which include academic performance, attendance, disability services utilized, suspensions, retention, graduation, participation in school meals, etc.—can offer insights into how the timing of participation in SNAP affects educational achievement and health. Participation in school meals programs is supposed to improve academic achievement, but without detailed education data, outcomes cannot be observed. As education data include both SNAP and non-SNAP participants, one can observe differences in attainment among those groups.

Heflin noted that the limitations of administrative data relative to survey data decline when data are linked across programs. Administrative data from SNAP only include participants, but linking the types of datasets mentioned above to SNAP administrative records allows for more coverage of the total population, mitigating a key limitation of most administrative data. When SNAP participation dynamics and benefit amounts data are linked to health care claims, U/I, and education data, observations about other people in the household can be made. Thus, the limitation of a single administrative dataset is minimized by adding many more administrative datasets. Heflin stated that more of this should be done. Participation in SNAP and other food and nutrition programs may impact many other domains, such as interactions with the criminal justice system, wage records, health, and education. Survey data on these domains might not be trustworthy or representative, Heflin noted, which speaks to the value of more administrative record linking. Heflin emphasized that most of these data are housed at the state level, as is the case with SNAP. Getting a state to cooperate with research efforts is fraught with challenges. For example:
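As a concrete illustration of this kind of cross-program linkage, the sketch below joins hypothetical, already de-identified SNAP, Medicaid, and U/I extracts on a shared anonymized person identifier; the field names are invented for illustration and do not describe any particular state's system.

    import pandas as pd

    # Hypothetical, already de-identified extracts in which the state has
    # replaced PII with a shared anonymized person_id.
    snap = pd.DataFrame({
        "person_id": [1, 2, 3],
        "benefit_month": ["2018-01", "2018-01", "2018-02"],
        "snap_benefit": [192.0, 355.0, 192.0],
    })
    medicaid = pd.DataFrame({
        "person_id": [1, 3, 4],
        "benefit_month": ["2018-01", "2018-02", "2018-03"],  # month of the claim
        "er_visit": [1, 0, 1],
    })
    ui_wages = pd.DataFrame({
        "person_id": [2, 3],
        "quarter": ["2018Q1", "2018Q1"],
        "wages": [4100.0, 2650.0],
    })

    # Join Medicaid claims onto SNAP benefit months so that, for example,
    # end-of-month emergency visits can be compared with benefit timing.
    snap_medicaid = snap.merge(medicaid, on=["person_id", "benefit_month"], how="left")

    # Add quarterly U/I wages to observe earnings before, during, and after SNAP spells.
    linked = snap_medicaid.merge(ui_wages, on="person_id", how="left")
    print(linked)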

  • Creating data that are useful for researchers is costly for the state, both in the skill required to produce them and in associated opportunity costs.
  • Data for a single program are often in multiple files (i.e., there is a demographic file, an eligibility file, and a benefit file), and these files may not have the same timeframe.
  • To preserve confidentiality, unique IDs for participants must be created by the state that are not identifiable to researchers.
  • Research using state records may result in negative findings—an Urban Institute report (Mills et al., 2014) prepared for ERS found that in some states up to one in four SNAP clients experienced gaps in their food stamp benefits even though they were eligible. While they may be painful for some states to acknowledge, these types of findings can have positive effects as state administrators and legislators become aware of problems.

Approaching a state to cooperate in a research project can present more challenges, as there are no national standards. Heflin noted that occasionally a researcher may have to meet with an institutional review board before a project is approved, but this varies by state. Data agreements are legal arrangements, so a researcher working at a university will have to involve that university’s lawyers, who are often not experts in data agreements. This can cause delays as lawyers are brought up to speed; states may also push back against individual components of a project. When a project is completed, states often stipulate that the researcher destroy the data. While this is a reasonable request given data privacy concerns, it also means that researchers cannot later add to those data, precluding any longitudinal analysis. States may also require that the resulting analysis be reviewed or approved by the state prior to dissemination, although this has not been an issue for Heflin—she has received useful comments or added context from the state that improved the final product.

If a researcher is attempting to link data across multiple agencies, each agency will have its own process: sometimes its own set of lawyers, its own institutional review board, its own data agreement language, and its own linking and de-identification procedures, which then need to be harmonized. This process is multiplied with each additional dataset. Finally, in many states refreshing the data means starting the data agreement process again—and since state actors frequently change, this may mean there will be no institutional knowledge of the previous work. Some state officials may remember the researcher from work completed years earlier, but often researchers must start afresh in explaining how the process worked the last time. All of this results in high costs to researchers to use state administrative data. The costs may be summarized this way:

  • time to get access (which can take months to years);
  • the enormous size of files (requiring storage and computational resources);
  • the requirement to have special skill sets (not just standard survey analysis);
  • lack of available codebooks;
  • the need to correct a large amount of error in the data; and
  • the many differences among the states, as well as each agency within a state sometimes having its own process.

Heflin believes that, nevertheless, these costs of obtaining administrative data are worth the investment, especially as survey response rates continue to decline and costs associated with surveys rise. These administrative data can produce longitudinal datasets to answer policy questions, for example, tracking investments made in early childhood and their educational outcomes. Once agreements can be reached with states, data become available to researchers in a timely fashion—another advantage over traditional surveys.

Heflin ended by offering six suggestions for improving access to administrative data:

  1. Encourage states to make data available to researchers for evaluative purposes when proper data safeguards are in place.
  2. Create data agreement standards.
  3. Establish 5-year minimum agreements, preferably with clauses that do not require the destruction of data.
  4. Make money available to underwrite the state costs (including data analytics training for staff).
  5. Make money available to academic researchers to use administrative data for policy-relevant purposes.
  6. Formally encourage states to collaborate with researchers to use their data to evaluate state policies and practices.

Justine Hastings of Brown University and Research Improving People’s Lives (RIPL) talked about her work combining SNAP data with grocery store scanner data in Rhode Island. This work is underpinned by a customized database created by RIPL that combined all administrative records in the state of Rhode Island for 20 years with detailed information on program participation. Algorithms were developed for identifying individuals across these records, and the data were then anonymized. The records are updated quarterly to keep the database current, something made possible due to the buy-in of state officials to allow access to the records. RIPL gained their confidence by employing robust security procedures—most data they hold are encrypted. If a researcher needs to unencrypt a piece of personally identifiable information, doing so requires a two-party password that sends automated, tamper-proof logs so that every senior team member knows exactly what was done, and when, with that file.

Hastings and state officials sought to understand how SNAP benefits are spent and whether changes in how they are distributed might help the program better meet people’s needs; to accomplish this, Rhode Island allowed RIPL access to state SNAP data. Other data Hastings’ team utilized were scanner data from a major grocery retailer, USDA FoodAPS data, and Nielsen Homescan data—the last two were used to see whether the grocery panelists were substantially different from or similar to SNAP beneficiaries as a whole.

The store scanner data include loyalty card purchases from February 2006 to December 2012 made in five states by households that shop at the chain at least every other month; this resulted in identifying 486,570 households through 608 million purchase occasions. Each purchase included the following information: main payment method used, characteristics of each product purchased (including product size and weight, text description, and location within taxonomy), and coupon redemption and offers.

Using identification strategies afforded by Rhode Island state participation data, Hastings’ team sought to use changes in SNAP enrollment to measure the causal impact of SNAP on food expenditure, such as the marginal propensity to consume food (MPCF) out of a dollar of SNAP benefits versus a dollar of cash, and to understand how SNAP enrollment changes measures of shopping effort, which they obtain from their grocery retailer data. They define shopping effort as coupon redemption (when coupons are available) and whether the purchase was of a cheaper store brand rather than a more expensive national brand. To account for nutrition, the researchers have built a database to generate several measures of nutrition.

Hastings found (Hastings and Shapiro, 2018) that the MPCF out of SNAP benefits is 0.5 to 0.6 while the MPCF out of cash is much smaller; non-food purchases were not affected. Changes in gasoline prices affected disposable income but did not have an outsized impact on food spending. Hastings also found a small decrease in coupon redemption (shopping effort) and a decrease in the share of store brands purchased, but, again, not in non-food categories.
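To illustrate what an MPCF estimate means, the sketch below runs a simple regression of monthly food spending on SNAP benefits and cash income using simulated household data; it is illustrative only, and the Hastings and Shapiro analysis relies on richer identification strategies around enrollment changes rather than plain OLS.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    # Simulated household-month data: food spending responds more strongly to a
    # dollar of SNAP benefits (slope ~0.55) than to a dollar of cash income (~0.10).
    rng = np.random.default_rng(0)
    n = 5000
    snap_benefit = rng.uniform(0, 400, n)
    cash_income = rng.uniform(500, 3000, n)
    food_spending = 120 + 0.55 * snap_benefit + 0.10 * cash_income + rng.normal(0, 40, n)

    X = sm.add_constant(pd.DataFrame({"snap_benefit": snap_benefit,
                                      "cash_income": cash_income}))
    model = sm.OLS(food_spending, X).fit()
    # The fitted coefficients approximate the MPCF out of SNAP and out of cash.
    print(model.params)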

Hastings noted that these findings are consistent with a model of mental accounting where people feel food-wealthy when they receive a SNAP payment in one sum, and this was reinforced in responses from participants in interviews her team conducted.

Chuck Courtemanche of Georgia State began his remarks by echoing the challenges Colleen Heflin reported earlier: getting data from states can take a very long time. In reference to one of his studies, he noted that it took nearly 2 years for the state to start supplying data and that this occurred only after USDA officials interceded.

Courtemanche then discussed a recent paper (Courtemanche, Denteh, and Tchernis, 2019) about the impacts of SNAP participation on food insecurity, obesity, and food purchases. The motivation for the paper was to look at whether SNAP achieves its goal of improving food security, and whether that goal has unintended consequences. He noted that recent research on causal effects generally finds that SNAP participation reduces food insecurity (Hoynes and Schanzenbach, 2015), but evidence of the causal effect of SNAP participation on obesity is mixed (Gundersen, 2015).

Courtemanche looked at the less-studied phenomenon of measurement error in administrative data using data from FoodAPS. FoodAPS, he noted, offers a unique opportunity to examine misreporting and its consequences, since it contains both self-reported and administrative participation measures.

Suggested Citation:"Appendix C: Summary, Third Meeting, September 21, 2018." National Academies of Sciences, Engineering, and Medicine. 2020. A Consumer Food Data System for 2030 and Beyond. Washington, DC: The National Academies Press. doi: 10.17226/25657.
×

He and his team went into the study thinking the administrative data would be the accurate, “gold standard” benchmark by which they could examine the extent, causes, and consequences of errors in self-reported participation in SNAP. The availability of two different administrative measures (totals the state provides and totals that could be linked through EBT purchases, discussed below) that did not match one another led Courtemanche’s team to undertake a sensitivity analysis to get at the inconsistencies. Their research question changed to: How sensitive are misreporting rates and regression estimates to the use of different coding rules for each of the two administrative measures separately, and to different coding rules for combining the two administrative measures, in addition to the self-report, into a single “true” participation variable? This analysis did not meaningfully affect their initial conclusions. The characteristics of the FoodAPS self-reported data and of the two administrative data sources are noted below.

Data characteristics provided by FoodAPS are as follows:

  • a nationally representative survey of U.S. households to collect comprehensive data about household food purchases as well as health and nutrition outcomes;
  • 4,826 households (SNAP, nonparticipating low-income, and higher income);
  • Courtemanche’s sample included 2,108 households with income under 250 percent of the federal poverty level, with no missing data for outcomes and controls;
  • outcome variables—indicators for food insecurity, very low food security, Healthy Eating Index score, body mass index (BMI), and indicators for overweight/obesity, obesity (BMI ≥ 30), and severe obesity (BMI ≥ 35); and
  • covariates—self-reported SNAP participation and two administrative measures; gender, race, marital status, household size, income, education, age, work, rural tract, and WIC participation.

Data characteristics provided by state administrative data include

  • state caseload information from March to November 2012 (not quite a match to survey dates of April 2012 to January 2013);
  • variation in quality of data across states (e.g., monthly versus non-monthly data, disbursement date availability, period of caseload data); two states did not report disbursement dates, five did not provide caseload data at all; and
  • probabilistic matching of all respondents to SNAP caseload data—based on first name, last name, phone number, and house address, with “certain” matches identified by the matching score being above a predetermined level.

Data characteristics provided by Electronic Benefit Transfer (EBT) swipes, and the linkage techniques used, include

  • the state, store ID, EBT account number, date/time of event from April to December 2012;
  • deterministic matching—for households matched to caseload data using known case IDs; only possible in 13 states where ID numbers are the same (a schematic sketch of both matching approaches follows this list);
  • probabilistic matching—for other households, based on store ID, amount, and date; in order for matching to occur, the household had to have a purchase during the survey week; if participants stockpiled food the week before or had already run out of benefits, they would not be captured; and
  • no match attempted—if respondent did not self-report either SNAP receipt or any EBT-type payments; thus, they would miss true participants who misreported both of these activities.
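The following is a schematic sketch of the two linkage strategies just described, using hypothetical field names and a toy scoring rule; the actual FoodAPS matching procedures are more elaborate.

    import pandas as pd

    # Hypothetical FoodAPS-style households and EBT swipe records.
    households = pd.DataFrame({
        "hh_id": [101, 102, 103],
        "case_id": ["A-77", None, None],   # a known case ID allows a deterministic match
        "purchase_amount": [54.20, 31.75, 18.00],
        "purchase_date": ["2012-06-12", "2012-06-13", "2012-06-14"],
        "store_id": [9001, 9002, 9003],
    })
    ebt = pd.DataFrame({
        "case_id": ["A-77", "B-12"],
        "amount": [54.20, 31.70],
        "date": ["2012-06-12", "2012-06-13"],
        "store_id": [9001, 9002],
    })

    # 1. Deterministic: exact match on the shared case ID where one exists.
    deterministic = households.dropna(subset=["case_id"]).merge(ebt, on="case_id")

    # 2. Probabilistic (schematic): score remaining households against swipes on
    #    store, date, and amount, and accept matches above a threshold.
    remaining = households[households["case_id"].isna()]
    candidates = remaining.merge(ebt, on="store_id", suffixes=("_hh", "_ebt"))
    candidates["score"] = (
        (candidates["purchase_date"] == candidates["date"]).astype(int)
        + (abs(candidates["purchase_amount"] - candidates["amount"]) < 0.10).astype(int)
    )
    probabilistic = candidates[candidates["score"] >= 2]
    print(deterministic[["hh_id", "case_id"]])
    print(probabilistic[["hh_id", "score"]])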

Courtemanche concluded that while the FoodAPS administrative SNAP measures are not perfect, they are adequate, especially when compared to state data. Whatever error there might be does not seem to meaningfully affect conclusions. There is a low false-negative rate, which might be due to the presence of the administrative measures. Having three different measures, two of which are administrative, allows for a combined measure that is probably of high quality. But the biggest drawback, Courtemanche believes, is missing data. He conceded that it would be better to have administrative measures of participation for other programs like WIC or Medicaid to improve matching.

The major drawback of this analysis is the lack of “causal” research questions that can be answered with FoodAPS. With fewer than 5,000 households in FoodAPS, it is hard to use inherently inefficient estimators like instrumental variables or regression discontinuity. The lack of time-series variation prevents difference-in-differences or fixed-effects models as well. FoodAPS-2 could be of great value if it allowed for repeated cross-sections, for example, so one could study effects of state- or county-level variables. If there were a way to track even a subset of households over time, Courtemanche thought, that would be a useful improvement.

During open discussion, panel member Michael Link asked the three presenters to consider the quality of matching administrative records. He pointed out that there is reasonable agreement on what constitutes quality survey methodology, but the linkages as described by the three presenters are less well defined. Link also wanted to know whether states that hold these vast administrative datasets have realized their value and started creating their own linkages or have been more open to allowing access from researchers.

Colleen Heflin noted that it varies by state; some are more “enlightened” users than others and have begun their own linkages or allowed more access to researchers, but expertise and resource constraints often hold them back. Justine Hastings noted that a state or any other government entity would need the appropriate technical expertise not only to build the data infrastructure but also to use it. She thinks outside groups are better equipped to provide these services to states and to actually offer them the analyses they want and need. Courtemanche agreed that there must be incentives for both data providers and researchers to work together if we are to see real progress.

Panel member Diane Schanzenbach asked Hastings how the research community, including USDA, can obtain more and higher-quality data, whether these data are bought from private companies or obtained from government sources. Schanzenbach also asked for thoughts on how well these multiple data sources, separately or combined, represent actual spending patterns. Hastings pointed out that data from food pantries, soup kitchens, and credit card purchases for food could enrich what is known about consumption, thus informing spending. Credit card data would also be valuable in determining food-away-from-home purchases. Hastings thought survey data are useful, but the recall limitations of respondents as well as declining response rates are of concern to her.

C.3. DATA INTEGRATION AND LINKAGES FOR POLICY RESEARCH USING ADMINISTRATIVE DATA

Panel member Amy O’Hara began the session by describing the international set of best practices for the handling of sensitive data, especially the use of administrative data or health data. Key to this handling are the five “safes”: safe projects, safe people, safe settings, safe data, and safe outputs—the federal statistical system, as a whole, performs these functions well. The system has infrastructure in place so that the linkages described by earlier presenters can be done with the lowest risk possible.

Knowing why data are collected is another component. Are they being collected to answer questions pertinent to an agency or for Congressional oversight? Knowing who developed the collection, who approved it, and how much latitude the people conducting the collection and analysis have is also valuable in determining whether data are fit for research purposes. Researchers must also consider how the data are handled, particularly when attempting linkages. Using linked, harmonized data relies on the data providers curating their data for such purposes. Any breaks in either collection or treatment will affect linkages.

O’Hara pointed out that such curation has occurred at agencies such as NCHS, the Census Bureau, and some local governments, but access to these data can be limited to employees of the agency and certain academics who can navigate the process to make use of these often sensitive data. The Census Bureau has established policies for interested parties to gain access. The point is, O’Hara continued, that providers must have confidence that a data user would handle the data responsibly. Further, the location of any data analysis also affects access. Questions that must be answered include

  • Where will the work be done? Is it going to be at the Census Bureau? Is it going to be at the headquarters of the private company?
  • Will the researcher be furnished the data via a laptop?
  • Will the researcher have to go to a data enclave? This could be a federal research data center, an enclave administered by a third party, such as the National Opinion Research Center, or an enclave maintained by a state—Washington State and South Carolina have such enclaves.

Another best practice O’Hara mentioned for the handling of sensitive data concerns the output data and their quality. Research papers or dashboards may have to be reviewed prior to dissemination, at a minimum to ensure that the correct privacy protections are being applied, that individuals in the data cannot be re-identified, and that they have consented to the new analysis being conducted. O’Hara noted that the Census Bureau’s surveys no longer ask individuals for consent for linkage because the data will be used only for statistical purposes when they are linked. The cost of standing up and maintaining a linkage operation is usually substantial, especially if one is interested in doing time-series analyses. With respect to output quality, O’Hara said that one has to be particularly interested in coverage. For example, the Longitudinal Employer-Household Dynamics (LEHD) program at the Census Bureau has data from only 13 states. While this may be sufficient for Census’s purposes, it may not be sufficient to answer broader policy questions such as levels of food security.

Rachel Shattuck of the Census Bureau described work being done at the bureau in estimating SNAP and WIC eligibility and participation. Congress authorizes the bureau to collect administrative records to improve survey operations. Examples include

  • researching and developing applications of administrative records for use in Census and survey operations including imputation, evaluating coverage, and sampling frame improvement;
  • conducting innovative social scientific academic research using linked data to improve estimates about characteristics and behavior of the U.S. population; and
  • linking multiple data sources to create new statistical products, for example, SNAP and WIC program eligibility and participation estimates.

An aspect of the Census Bureau’s partnership with USDA that is of note for the panel is linking administrative and survey data to understand and improve models of SNAP eligibility and participation rates. These linked data can also benefit states, which gain information about participants and eligible nonparticipants as well as information useful for outreach to prospective participants—24 states have agreements with the Bureau to share their SNAP data, while 11 states have a similar agreement to share WIC data.

The Census Bureau acquires administrative records via legal agreements with states, and the data are encrypted when transmitted. When the files arrive at the Census Bureau they are placed on a secure, isolated server where a very small number of staff who have authorization to see these data create matching identifiers and remove all personally identifiable information (PII). The data then become available to researchers for use. Access to the data requires producing a proposal that describes the intended data use, research questions, and methods. In some cases the agency that owns the data may need to review the output before submission for publication can occur. The final step performed at the Bureau is linkage of the administrative records with existing survey data.

For SNAP and WIC linking, Shattuck continued, states are requested to provide participant PII such as name, date of birth, and Social Security number (SSN), as well as address history, eligibility certification and termination dates, and a monthly history of benefits received. To link data, the Census Bureau uses the Person Validation System, which uses the PII and a probabilistic matching technique to assign a unique identifier called a Protected Identification Key (PIK). Address information is also used to generate a unique address identifier. Shattuck reiterated that before researchers can use the data, PII is removed, and what remains on the file is a unique identifier that also appears in survey data. The Bureau can then match the same individuals who appear in the administrative records to respondents in the survey data.

Specifically, for SNAP and WIC data, the sources and estimation methods involve

  • Modeling for eligibility. Eligibility is modeled with data from the American Community Survey (ACS), which include annual individual-level microdata with a reference period of the 12 months prior to the survey month. Those who can be modeled as eligible for SNAP are individuals in families with annual income below the FNS eligibility threshold. For WIC, the modeled-eligible include children under age 4 years who are on Medicaid or receiving SNAP or TANF (this is self-reported) and who have an annual family income below the FNS eligibility threshold. The Bureau cannot measure pregnancy with ACS data, so pregnant women are excluded from eligibility estimates (a simplified sketch of the eligibility and coverage calculation follows this list).

  • Linkage of ACS sample records to administrative records to identify participation. For SNAP, this includes individuals of all ages, while for WIC it is children ages 0-4 years, and women ages 15 years and older.
  • Aggregation and calculation of coverage rates and distributions of characteristics at state and county level. This is done to create table packages and data visualizations, which are sent to states after being cleared for disclosure avoidance.
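A simplified sketch of the eligibility and coverage calculation described above follows, assuming hypothetical ACS-like records already linked to administrative participation flags through an anonymized identifier; the Bureau's actual models treat family structure, thresholds, and survey weights in far more detail.

    import pandas as pd

    # Hypothetical ACS-like person records with an anonymized identifier (pik)
    # and the family income / threshold needed to model SNAP eligibility.
    acs = pd.DataFrame({
        "pik": [1, 2, 3, 4, 5, 6],
        "county": ["A", "A", "A", "B", "B", "B"],
        "family_income": [14000, 52000, 9000, 23000, 8000, 61000],
        "eligibility_threshold": [27000, 27000, 20000, 27000, 20000, 27000],
    })
    # Administrative participation flags linked via the same identifier.
    snap_admin = pd.DataFrame({"pik": [1, 5], "participates": [True, True]})

    linked = acs.merge(snap_admin, on="pik", how="left")
    linked["participates"] = linked["participates"].fillna(False).astype(bool)
    linked["modeled_eligible"] = linked["family_income"] < linked["eligibility_threshold"]

    # Coverage rate = linked participants / modeled-eligible population, by county.
    coverage = (
        linked[linked["modeled_eligible"]]
        .groupby("county")["participates"]
        .mean()
        .rename("coverage_rate")
    )
    print(coverage)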

Estimates provided by the Bureau to states generally include more information about participants—such as sub-state eligibility and coverage rates that are often stratified by demographic and economic characteristics, and by county—than the states can collect on their own. The Bureau can also estimate the eligible nonparticipating population, including its characteristics; this can be helpful with outreach to eligible people who are not participating.

The Bureau faces challenges in producing these estimates. In particular, full state and territory participation is hindered by high rates of turnover in state agencies and limited resources. Some states might want to give their data to the Bureau but may have limited technical capacity to do so. Some states are reluctant to share their data because they have concerns about confidentiality, while some are concerned that their data will be made public and they may be compared unfavorably to other states.

Speaking about data quality, Shattuck said the Bureau tries to make it as easy as possible for states to share their data, emphasizing the basic information needed from states to create a unique identifier and the basic information needed to model participation. The Bureau tends to get what it needs to create table packages, but other data on the file may vary from state to state, which affects usability. While administrative records are not representative of the U.S. population, they can have information on hard-to-count populations, such as low-income children who do not appear in the decennial census. The Bureau has recently created a data quality branch that helps with technical issues, while program staff are tasked with verifying that files can be accessed, generating SAS datasets, producing documentation, and utilizing multiple analysts for quality assurance. Shattuck mentioned next steps in data quality at the Bureau, which include more research on where data quality issues typically occur and how they can be anticipated and addressed, more automation of the quality control process, and better standardization of variables and documentation across states.

John Eltinge, Census Bureau, spoke about data quality issues when integrating multiple data sources from the perspective of the Federal Committee on Statistical Methodology of the Washington Statistical Society, of which he is a member.

The first such issue, inferential quality, involves having a clear vision when communicating what an estimate encompasses in a given setting and the related inferential goals or questions one is trying to address with respect to those estimates. Transparency of methods and processes is required, especially regarding the level of aggregation (e.g., geography), the quality of the information at the specified level of aggregation, and the extent of stakeholder risk incurred through poor quality or a break in series, as well as conveying the value of transparency about these concepts. It may be challenging, Eltinge continued, to convey the importance of inferential quality to technical specialists, “power users,” the media, and the general public—the last two groups especially so—but it should be attempted nonetheless.

Discussing quality of data sources, Eltinge sees a need to allocate resources to ensure a satisfactory balance of multiple dimensions of quality, risk, and cost—all elements that affect the design of any data collection and analysis. Methodology will also have to be improved to extend standard total survey error models to the integration of multiple sources, especially in relation to population coverage and missing variables. Practically speaking, taking action on the above items would involve finding better data sources, such as more administrative records or bridge surveys, making inferences about current sources, and accounting for errors.

The risks to data quality involve the loss of, or major changes in, data sources—this is well known to any researcher or analyst. Changes in a production system and the related costs, as well as disclosure, are other factors. These issues also need to be addressed through tools designed to identify and manage risk. Below the federal level there are implications for management and integration of regional data sources, especially the costs incurred in linking datasets. These costs borne by agencies and researchers can be substantial, so they must be included in budgets.

Eltinge concluded that in the “old world” when sample surveys were a dominant mode of data collection, there was a high degree of control over nearly everything that took place in data collection, analysis, and inference, but this is not the case when linking multiple data sources that include administrative records.

In the discussion that followed this session, panel member Jim Ziliak asked whether states had asked explicitly for data products or other resources when sharing their data with the Census Bureau. Amy O’Hara said that, in her experience at the Bureau, if states asked for money to defray reasonable costs, the Bureau paid. John Eltinge pointed out that some states and smaller geographical units are keen to acquire economic measures about their locale, but these often very small units do not have enough observations in data to be released in full. The recent Bureau notion of a “privacy budget” in disclosure limitation influences how much local-area data can be released.
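As an illustration of the privacy-budget idea (and not of the Census Bureau's actual disclosure avoidance system), the sketch below uses the textbook Laplace mechanism: a total epsilon is split across released counts, and smaller per-release epsilons yield noisier local-area figures.

    import numpy as np

    rng = np.random.default_rng(42)

    def noisy_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
        """Return a count with Laplace noise calibrated to the per-release epsilon."""
        return true_count + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

    total_budget = 1.0
    releases = {"county_snap_households": 1842, "tract_snap_households": 96}
    per_release_epsilon = total_budget / len(releases)   # split the budget evenly

    for name, count in releases.items():
        print(name, round(noisy_count(count, per_release_epsilon), 1))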

Panel member Craig Gundersen asked about the varying level of competence in data science in states and whether they might want an expert to help them organize their data in more effective ways to produce more useful products for both the state and federal partners. He also asked whether states were imposing stricter confidentiality restrictions on federal partners than the states impose on themselves. Chuck Courtemanche thought aligning incentives and giving states something of value—a specific data product—is important because many states do not consider the output of researchers, by itself, to be a worthwhile investment. He noted that the disposition of an individual official at a state can greatly affect that state’s cooperation; showing that person how the research will benefit their office or agency can be helpful.

In terms of confidentiality, from a federal perspective, Eltinge pointed out that federal requirements vary by agency and type of data handled (health, tax, education, etc.). He thought that implementing a privacy budget approach (assessing the incremental risk of disclosure regardless of what other entities, including states, do) would be a fundamental change.

Cordell Golden, NCHS, described a data linkage project his unit is currently undertaking using information from Department of Housing and Urban Development (HUD) Rental Assistance programs.

The motivation for the linkage lies in the strategic goals of both NCHS and HUD, results of the Foundations for Evidence-Based Policymaking Act, and several directives on the use of administrative records issued by the U.S. Office of Management and Budget (OMB) of the Executive Office of the President.

The three rental assistance programs to which NCHS has linked data are (i) Public Housing (PH), which is federally funded and regulated but managed by local housing authorities; (ii) Housing Choice Voucher (HCV), HUD’s largest rental assistance program, which provides monthly rental assistance payments to very-low-income families; and (iii) Multifamily (MF), under which there is a contract between HUD and the owners of a development.

NCHS views its partnership with HUD to be mutually beneficial, where both agencies bring some level of expertise to the table. NCHS has experts in health and data linkages—the Special Projects Branch is the data linkage program at NCHS. HUD brings experts on housing dynamics.

A memorandum of understanding was signed in which NCHS would perform the linkage and would also waive the federal statistical research data center (FSRDC) fees for researchers from HUD that would use the data. HUD would be tasked with providing geocoding services for the NCHS surveys that were to be linked. Two NCHS surveys used in the linkage project were the National Health Interview Survey (NHIS) and the National Health and Nutrition Examination Survey (NHANES).

Augmenting the survey data with longitudinal administrative data facilitates richer analysis and allows NCHS to address questions that cannot be addressed with survey data alone. It also enhances the administrative data by adding socio-demographics, health behaviors, and other outcomes from the survey. The linkage criteria are as follows: the respondent must

  • provide sufficient personally identifying information (SSN, name, date of birth, sex);
  • not explicitly refuse linkage; and
  • not refuse to answer the question about public housing (NHIS only).

For child respondents, only information gathered prior to their 18th birthday may be linked due to consent rules. NCHS follows a deterministic approach and uses SSN, date of birth, sex, and name as identifiers.
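A minimal sketch of this deterministic step, with hypothetical survey and HUD extracts, is shown below; NCHS's production linkage applies the eligibility and consent rules described above along with additional processing.

    import pandas as pd

    # Hypothetical survey respondents (NHIS/NHANES-style) and HUD program records.
    survey = pd.DataFrame({
        "respondent_id": [1, 2, 3],
        "ssn": ["111-11-1111", "222-22-2222", None],
        "dob": ["1980-05-02", "1975-11-30", "1990-01-15"],
        "sex": ["F", "M", "F"],
        "refused_linkage": [False, False, True],
    })
    hud = pd.DataFrame({
        "ssn": ["111-11-1111", "333-33-3333"],
        "dob": ["1980-05-02", "1966-07-04"],
        "sex": ["F", "M"],
        "program": ["HCV", "PH"],
    })

    # Linkage eligibility: sufficient PII and no explicit refusal to be linked.
    eligible = survey[(~survey["refused_linkage"]) & survey["ssn"].notna()]

    # Deterministic match: exact agreement on all identifiers used.
    linked = eligible.merge(hud, on=["ssn", "dob", "sex"], how="inner")
    print(linked[["respondent_id", "program"]])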

NCHS has produced several reports based on these linkages. One such report describes the methodology for the linkage.1 Although this report was produced by NCHS, it was done in collaboration with HUD, particularly on issues related to the guidelines on how the data should be analyzed. Other examples of NCHS research include “Housing Assistance and Blood Lead Levels: Children in the United States, 2005–2012” and “HUD Housing Assistance Associated with Lower Uninsurance Rates and Unmet Medical Need,” which examines whether receiving HUD housing assistance is associated with improved access to health care. Many reports have also been produced by HUD that describe adults and children who receive HUD benefits.2

Access to the linked data is similar to accessing the Census Bureau data described by Rachel Shattuck. NCHS has research data centers in Atlanta and Washington, DC, and is affiliated with FSRDCs around the country as well. Research proposals are required, but NCHS has feasibility files on its website that indicate whether or not the survey participant was eligible to be included in the linkage, whether the participant provided consent, and whether NCHS found the respondent in a HUD program. These files are designed to help researchers estimate their maximum analytic sample as they prepare their FSRDC proposals.

___________________

1 See https://www.cdc.gov/nchs/data/series/sr_01/sr01_060.pdf.

2 See https://www.huduser.gov/portal/publications/Health-Picture-of-HUD.html and https://www.huduser.gov/portal/publications/Health-Picture-of-HUD-Assisted-Children.html.


Golden concluded by noting that the linkage project with HUD demonstrates an effective collaboration between two federal agencies, one that both agencies plan to continue in order to keep producing this rich data source.

C.4. ADDITIONAL NONGOVERNMENTAL SOURCES FOR FILLING DATA GAPS IN ERS’S CONSUMER FOOD DATA SYSTEM PROGRAM

Rob Santos, The Urban Institute, spoke about projects and reports coming out of Feeding America’s (FA) flagship survey, Hunger in America (HIA).3 As background, Santos noted that Feeding America is a network of 200 independent food banks throughout the country. These food banks partner with more than 60,000 agencies (pantries, meal programs, etc.). Annually, they provide service to more than 40 million individuals and give out over 3 million pounds of food nationally.

FA has a robust research group that attempts to answer three principal research questions: who are the clients, what are their needs, and how can they be served better. To help better identify its clients, FA has an ongoing effort to track client data, registering every distinct individual and then tracking them over time, including how often they go to the food bank, what they get, and so forth. The client portrait can identify specific vulnerable subgroups to determine what the group’s needs are, with a heavy emphasis on social demographic characteristics. This information is also used during FA fundraising activities to inform funders about the types of clients FA has and is helping. Any changes in the distinct client count can be used to assess the overall productiveness of FA programs. HIA includes rich data specific to clients, and FA gathers information on food insecurity, nutrition, and ancillary measures such as housing stability, health issues, employment, and basic household needs. Evaluation research assessing outcomes for FA’s clients results in pilot programs. This is done to ensure clients’ needs are met as circumstances change.

HIA is the largest study of charitable feeding in the country, with 63,000 interviewed participants using Audio-CASI (in the most recent version). It is conducted in all participating food banks on a quadrennial basis. The design involves probability sampling, sample surveys, multistage sampling, and a clustered design—about 16,000 agencies are sampled to get the 60,000 completed interviews. The size and scope allow FA to have a micro-database with different types of characteristics, allowing deep dives into small subgroups such as seniors or veterans. It provides valid statistical estimates nationally and at the food bank level.

___________________

3 See https://www.feedingamerica.org/hunger-in-america.


Disadvantages of HIA include the cost of the survey—about $10 million, which includes food banks providing staff to help coordinate the sampling and the interviewing operation in the field—and its periodicity of once every 4 years. Santos noted that results that are 4 years old may not represent the population, especially during a time of economic upheaval. There is a desire for more contemporaneous data to enable rapid interventions and to collect information about hot topics of the day. This desire to be more nimble and contemporary has led to redesign attempts to lower the cost of the survey. An ACS-style rolling survey of 5,000 respondents a year was discussed, but operation costs decreased only slightly while burden on food bank staff remained high.

Another attempt to redesign HIA is under way. Since the gold-standard, 60,000-respondent survey is untenable from a budget perspective, FA is reducing the scope. National statistical estimates will not be available, but suggestive insights—Santos’s terminology—would still be useful. FA would not be able to say nationally that X percent of clients belong to this ethnic or age group, with a margin of error, but it might acquire enough information to make decisions. Making the collection a data analytics operation is the first step. This involves creating a taxonomy of food banks that sorts them, using analytic approaches and the information FA has on food banks and clients, into 12 to 20 groups, and then selecting about 10 percent of them for further analysis. The result might look like a surveillance-type operation. Data could be analyzed and combined for, say, 20 sites to look at the different subgroups. The insights gleaned from the analysis would be suggestive of the population, as opposed to point estimates. Santos thinks this might be good enough to create strategies and prioritize programs.
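A rough sketch of the taxonomy idea is shown below, clustering food banks on a few hypothetical characteristics and then sampling about 10 percent of each group; the feature names and the clustering method are illustrative, not FA's actual procedure.

    import numpy as np
    import pandas as pd
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(7)
    n_banks = 200
    # Hypothetical characteristics for each food bank in the network.
    food_banks = pd.DataFrame({
        "bank_id": range(n_banks),
        "clients_served": rng.lognormal(10, 1, n_banks),
        "rural_share": rng.uniform(0, 1, n_banks),
        "partner_agencies": rng.integers(50, 800, n_banks),
    })

    # Standardize the features and sort the banks into 15 groups (the taxonomy).
    features = food_banks[["clients_served", "rural_share", "partner_agencies"]]
    standardized = (features - features.mean()) / features.std()
    food_banks["group"] = KMeans(n_clusters=15, n_init=10, random_state=0).fit_predict(standardized)

    # Select roughly 10 percent of the banks in each group for deeper analysis.
    selected = food_banks.groupby("group").sample(frac=0.10, random_state=0)
    print(selected.groupby("group").size())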

Santos concluded by noting that the panel, in its recommendations, has the challenge of operating within the current policy environment, which means smaller budgets and pressure to do more. The new process Santos outlined may be beneficial for the panel in thinking through ways of getting the types of data that ERS needs to make decisions without necessarily making it a point estimate with a margin of error.

Alessandro Bonanno, Colorado State University, provided some thoughts and insights on improving geospatial information in ERS’s food data system. He started by describing the common metrics that have been used in the analysis of food access: store location, distance traveled, store availability, and pricing. He said that these are related to metrics listed in a systematic review by Caspi et al. (2012): availability, accessibility, affordability, acceptability, and accommodation. Availability and accessibility are covered in the common metrics. Affordability deals with prices. He noted that few studies have looked at either the combined cost of food and time to get to the store or food price differentials. Acceptability is typically measured in consumer surveys of consumer perceptions about a store. Accommodation has not really been a focus of research.

Bonanno’s first research question was whether access to food stores affects a household’s decision to participate in SNAP. Bonanno said that he does not think this question has been addressed as yet. Another research question, he said, is whether the effectiveness of SNAP benefits is affected by access to food stores. A number of researchers have used ERS products to answer this question. For example, researchers have used FoodAPS to look at the effects of the SNAP benefit cycle, whether participants “stretch” benefits by shopping at cheaper stores, how SNAP benefits affect healthfulness of diet, and the relationship between food security, SNAP participation, and the food environment (a new project under way at Colorado State University).

He described the ERS Food Access Research Atlas (FARA) as a product that is very useful in assessing some issues of food access at the census tract and county levels. It does not provide geocoded data, but instead provides aggregate data at the census tract level. Tracts are marked as being low-income or not, and low-food-access or not. Indicators include vehicle access, households with limited access (by number of children and mile radius). Previous versions also included a food desert indicator.

FARA is currently available for 2015 (the previous version was 2010). However, the methods have changed across the years, and though they are well documented, the differences between the 2010 and 2015 versions make analysis over time difficult. Having regularly updated versions of FARA would benefit researchers.

Bonanno said that he thinks that FoodAPS is the one dataset that allows researchers to best understand the environment that low-income households are exposed to. It has the largest amount of information to help answer questions about food insecurity, SNAP participation, and how SNAP benefits are used, along with information on store location and distance traveled to stores. FoodAPS includes the geocoded location where the food acquisition event took place, including whether it was a SNAP-authorized store, and the geocodes (latitude and longitude) of the household, so that distances between the store of purchase and home can be calculated; these distances are also part of the data record. FoodAPS includes an indicator as to whether the tract is a low-access area and whether the household has vehicle access.
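As a simple illustration of what those geocodes make possible, the sketch below computes the great-circle (haversine) distance in miles between a hypothetical household and store; actual driving distance would require a road network, so a straight-line measure like this understates travel burden.

    from math import asin, cos, radians, sin, sqrt

    def haversine_miles(lat1, lon1, lat2, lon2):
        """Great-circle distance between two (latitude, longitude) points, in miles."""
        earth_radius_miles = 3958.8
        dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
        a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
        return 2 * earth_radius_miles * asin(sqrt(a))

    # Hypothetical coordinates for a household and the store where it shopped.
    print(round(haversine_miles(41.8240, -71.4128, 41.7001, -71.4162), 2))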

Bonanno described research questions as a way to motivate a discussion of data needs. The simplistic question was, “Does where you shop affect how SNAP benefits are used?”

He noted that the first thing to determine is why a low-income/SNAP household decided to shop (or use their SNAP benefits) at a given food outlet. To properly address this question the following information is needed: geocodes of household and store locations—both where they shopped and locations of alternative stores; number of SNAP-approved stores within driving distance; store characteristics, locations, and prices (to model the household decision); and where shoppers work, commuting routes, and changes in routes across seasons. Observations over time are important to address changes in preference.

Bonanno said that FoodAPS has much of this information, but not all. Data needed for a good analysis of this simple question include a time series of FoodAPS with a detailed geocoded place component; information about shopping habits and commuting patterns; geocoded information on store location and type provided as a time series (including a record of store openings and closings over time); and summaries of driving distance from home to different stores.

Bonanno stated that enabling researchers to match the many existing restricted-use, administrative, and proprietary datasets produced by different agencies and companies might benefit research more than collecting more data. He cited Courtemanche, Denteh, and Tchernis (2019), who linked the Consumer Expenditure Survey, information from the food security supplement (with respondents’ locations), and Walmart data at a Federal Statistical Research Data Center (FSRDC).

C.5. USING PROPRIETARY DATA FOR FOOD POLICY RESEARCH

Mary Muth, Research Triangle Institute, described proprietary data: the types, sources, and considerations in using store scanner data, household scanner data, and nutrition data from labels for food policy research. Muth said she has worked with scanner data since about 2000. Originally, such work was done for the Food and Drug Administration (FDA), and more recently with ERS, the USDA’s Food Safety and Inspection Service (FSIS), FNS, and the Robert Wood Johnson Foundation. To start, she defined the terminology used for different types of scanner data:

Store scanner data—weekly transactions data provided by retailers

  • Includes products with barcodes and random-weight products
  • Data obtained by ERS comprise sales data from individual stores or retailer marketing areas and represent an unprojected (unweighted) subset of the total IRI store data

Household scanner data—purchases recorded by a panel of households using an in-home scanner or mobile app

  • Includes products with barcodes and, for a portion of the panel, random-weight products
  • Data obtained by ERS represent the entire panel, both static households (with weights) and non-static households (without weights)

Nutrition label data—information from labels including calories, nutrient quantities, daily values, serving size, product claims, and (sometimes) ingredient lists

These data are collected for commercial purposes and are not necessarily designed for research, and data vendors must protect their competitive information and the confidentiality of participating households.

Types of Data and Suppliers

Muth first noted common terminology about scanner data, saying that one hears about Universal Product Codes (UPCs), Global Trade Item Numbers (GTINs), and European Article Numbers (EANs). She noted that the official term is GTIN, but the codes are commonly called UPCs. She likes to use the term barcode, because everybody understands what that means. These terms are all roughly interchangeable.

Muth then summarized the data suppliers for household and store data and their products. She said that there are three different suppliers of household and store data within the United States and four suppliers worldwide.

  • IRI provides household data in its Consumer Network and store data in InfoScan. It also collects data in 10 other countries, as well as auxiliary data, including the label data.
  • Nielsen provides household data in Homescan, along with household data in 25 other countries. Nielsen provides store data in Scantrack and also provides store data in 100 other countries. Homescan data are from scanning panels; Nielsen also has household data that are collected through nonscanning panels.
  • SPINS has no household data and includes only natural and specialty gourmet stores in the United States.
  • Kantar is a large supplier outside the United States, and its data are used by many researchers outside the United States.

Muth noted that there is only one consumer panel in the United States, the National Consumer Panel. It is a joint venture of Nielsen and IRI created to avoid the duplication inherent in having two different panels operating in the United States. The National Consumer Panel includes about 120,000 households. Both companies use the National Consumer Panel to prepare their household panel products, but they process the data in different ways. She summarized her analysis of the methodologies used by Nielsen and IRI, explaining differences between the two in how they determine which households to include in the static panel, in price assignment methods, and in procedures for weighting the data to produce national totals.

The household data that ERS obtains from IRI represent the entire panel, both static households (those with weights for producing national estimates) and non-static households (those without weights). This has some advantages, such as being able to look at the differences between the entire panel and those households that are not considered reliable enough reporters to be included in the static panel.

The store data that ERS obtains from IRI are a portion of all the data that IRI collects from stores. They comprise sales data from individual stores or from what IRI calls retailer marketing areas, and they make up an unweighted subset of the total IRI dataset. The data that IRI does not provide to ERS are data from smaller stores. IRI considers small-store data extremely proprietary because they are used to produce other IRI data products for its main clients, retailers and food manufacturers.

Muth cautioned that it is important to recognize that companies like IRI and Nielsen put these databases together for their own commercial purposes. They use these data for analysis products to provide to retailers and to manufacturers. The data are not necessarily designed for research purposes. That does not mean the data should not be used, but rather that the user needs to understand the data to best interpret and use them.

Muth described the nutrition label data and their suppliers. Food label data link the barcode with label information such as calories, nutrient quantities, daily values, serving sizes, and product claims. Sometimes ingredient lists are also available. She identified eight suppliers of label data in the United States, some of whom also collect the data in other countries: FoodSwitch, Gladstone and Nutritionix, IRI, Kantar, Label Insight, Mintel, Nielsen Brandbank, and the USDA Branded Food Products Database (described in Appendix B). Some of the suppliers may collect information through apps in which consumers scan the barcodes of things they are consuming, particularly fitness apps.

Muth said that she recently learned that Gladstone, Label Insight, and Nielsen are offering these data products to retailers to help them optimize location of products on shelves.

Analyzing Scanner Data

Muth described her own experiences in assessing and analyzing scanner data. She said that many years ago, when ERS started working with the Nielsen data, the agency had the foresight to try to understand more about how the data are collected and what their statistical properties are. As a result, Muth, Siegel, and Zhen (2007) were able to assess and document the properties of the Nielsen Homescan data. This was important because commercial companies, such as Nielsen, do not necessarily prepare the kind of documentation a researcher needs. Also as part of that effort, Zhen and colleagues (2009) compared weighted expenditures from the Nielsen Homescan data with those from the Consumer Expenditure Survey, documenting possible underreporting in the scanner data. Sweitzer and colleagues (2018) found that the IRI Consumer Network also showed underreporting of food expenses when compared to the Consumer Expenditure Survey and FoodAPS.

Muth said that these studies document the underreporting of purchases by panelists who are asked to scan all purchases. It is clear that not all participating households scan everything they buy. This may be because of the burden of participating in the panel, or something else may be causing the difference.

Muth and colleagues (2013) describe a project in which questions were taken from the Health and Diet Survey and the Flexible Consumer Behavior Survey, and a sample selected from the Consumer Network Panel was asked the same questions. A comparison of responses revealed that the household panelists were more price conscious, more concerned with taste, less concerned with ease of food preparation, and prepared and ate fewer meals at home. Muth et al. (2017) used information from Homescan and from NHANES to better estimate food losses from purchase to consumption for the ERS Loss-Adjusted Food Availability data series.

For the FDA, Muth and her colleagues developed the models that were used to estimate the costs of the new nutrition facts panel. Underlying the two models for estimating the cost of labeling and the cost of reformulation were the Nielsen Scantrack data. Those models have also been used by other agencies to look at different types of labeling regulations. See Muth et al. (2015a, 2015b).

Muth pointed to her ERS research (Muth et al., 2016) that assessed and documented the IRI Consumer Network, Infoscan, and IRI label data. The information from that report was discussed in the second workshop (see Appendix B). Giombi, Muth, and Levin (2018) compared hedonic models using the nutrition data from the IRI food label database versus Gladstone label data.

Muth said that in collaboration with ERS, she analyzed whether the differences in reported expenditures between commercial household scanner data and the Consumer Expenditure Survey matter in a food demand system. She said a study for FDA that is in review used scanner data to model the impact of health communications on market outcomes. Current work for ERS will use IRI consumer data to update estimates of consumer-level food loss. Under a grant from the Robert Wood Johnson Foundation, she and ERS colleagues are using the IRI data to track the reformulation of foods over time and to simulate the effects of improving the nutritional quality of foods commonly purchased by households with children. She is also working on a project for FSIS, in collaboration with ERS, to estimate the cost of updating safe handling instructions on all meat and poultry products.

Muth described the considerations for researchers who use proprietary household, store, and label data. For household data, researchers should remember that

  • households that participate are likely different from the general population
    • the intensive data collection process is somewhat burdensome
    • participants are possibly more aware or more price-conscious consumers
  • some types of households are less likely to meet criteria for inclusion in a static panel
    • for example, in IRI Consumer Network data, younger (under age 35) households, lower-income households, Black and Hispanic households, and households with children are less likely to meet static panel criteria
  • prices are typically not exact prices paid by the household
    • prices are assigned using store scanner data based on where the household shopped (see the sketch after this list)
  • data are weighted based on demographics, not shipment or expenditure totals
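
A minimal sketch of the kind of price assignment noted above is shown below: average unit prices computed from store scanner data are merged onto household purchase records by store, week, and product. The file and column names are hypothetical, and the vendors' actual assignment rules are more involved.

```python
import pandas as pd

# Hypothetical inputs: store-level scanner data and household purchase records.
store = pd.read_csv("store_scanner.csv")             # store_id, week, upc, units, dollars
household = pd.read_csv("household_purchases.csv")   # household_id, store_id, week, upc, quantity

# Average unit price by store, week, and product from the store scanner data.
store["unit_price"] = store["dollars"] / store["units"]
price_lookup = (
    store.groupby(["store_id", "week", "upc"], as_index=False)["unit_price"].mean()
)

# Assign that price to each household purchase based on where the household shopped.
assigned = household.merge(price_lookup, on=["store_id", "week", "upc"], how="left")
assigned["expenditure"] = assigned["quantity"] * assigned["unit_price"]
print(assigned.head())
```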

For store data, researchers should remember that

  • Not all stores are represented in the data
    • Data collection process is not designed to capture sales at smaller, independent stores (data may be collected but not available for research)
  • Private-label product data (about 18 percent of all food)
    • Not provided by all retailers
    • Aggregation of data by some retailers prevents calculation of unit prices
  • Random-weight data (e.g., produce, meat, deli, bakery)
    • Not provided by all stores
    • Product information is fairly limited
  • Projection factors (or weights)
    • Not provided to ERS with the data they purchase; therefore unable to calculate representative estimates (possible to obtain weighted totals but not by store)

RTI has a contract to develop weights for use by ERS in which control totals are being calculated using restricted Census Bureau data.

For food label data, researchers should remember that

  • Tracking products over time is challenging
    • Manufacturers assign new barcodes to existing products when substantial changes occur; therefore, difficult to distinguish new product entrants from existing products with new barcodes
  • Label databases are not necessarily updated for all products every year
    • Need to match label data with sales data to ensure active products
  • Not all vendors include the ingredient list, or they may include it only as one long concatenated field
    • Can require substantial effort to parse ingredient lists
  • May require multiple data sources to cover all products of interest
  • Data fields will be changing with the roll-out of the new Nutrition Facts Label (e.g., added sugars, vitamin D, potassium)

Muth provided her thoughts on needed future research concerning the various proprietary data sources. The first need is to better understand the differences between households in the static panel and the entire panel in terms of demographics and differences in knowledge, attitudes, and behaviors; because IRI provides information on both the static and the non-static panel, that analysis can be done now. Second, she said, is the need to better understand the implications of the price assignment methods used by IRI and Nielsen, and particularly how much variation there is across stores and locations in a chain. Third is to understand more about food manufacturer and retailer practices regarding barcode assignment; the assignment of barcodes affects the ability to track changes in the healthiness of the food supply over time. Fourth is to consider improving the coverage of label data by using multiple vendors. Fifth, and finally, is the need to consider how best to use loyalty card data, should they become available; however, loyalty card data should not be considered a replacement for panel data, because a household may shop at multiple outlets.

Muth noted that as part of her research she and her colleagues have identified about 150 peer-reviewed publications on food policy research projects using some form of scanner or label data. She thinks that proprietary data will continue to be important for analyzing the effects of changes in federal nutrition programs and of changes in regulations on the healthiness of the food supply, as well as for cost-benefit analyses of new labeling regulations. The data are already being used in local jurisdictions for analyzing the effects of local regulations, particularly beverage tax initiatives. They could also be useful for analyzing the effects of voluntary industry initiatives, such as the convenience store initiative. Two other important applications include looking at the effects of food safety outbreaks on sales and calculating price indices that can be used as a basis for other research studies.

In conclusion, Muth said, despite all the challenges they present, there is really no data source comparable to store-based and household-based scanner data in terms of granularity, detail, and frequency; and for much of the food policy research that needs to be done, there is no alternative data source at all.

Helen Jensen, from Iowa State University, was a member of the panel that authored the 2017 report, Review of WIC Food Packages: Improving Balance and Choice: Final Report. The panel used Homescan household panel data as part of this effort. She has also worked on other studies that use scanner data in collaboration with ERS.

Jensen said that for purposes of the WIC analysis, household scanner data were most important. The three main goals of the WIC analysis were, first, to study household purchase behavior; second, to determine prices and price indices for WIC food items; and third, to evaluate the cost of alternative WIC food packages, assess package design, and conduct a regulatory impact analysis. Because of the flexibility of the Homescan scanner data, Jensen and colleagues were able to identify food package contents meeting various food item specifications (types of milk, types of yogurt, etc.), evaluate those items, and discuss the implications for the cost of the program. These analyses contributed to the development of costing for the regulatory impact analysis.

Jensen’s research focused on the purchase behavior of WIC participants and WIC-eligible low-income nonparticipant households, examining food selection and choices for at-home use. For these population groups, the researchers wanted to determine the share of expenses associated with different types of approved WIC products, such as whole-fat milk versus other milk products. These shares may be affected by WIC program participation, and the information can be used to evaluate program changes and regional differences over time.

Jensen said that in 2009, states moved to expand the list of whole grain foods that were included in the WIC food packages. The panel used the Homescan household panels that bridged this period, incorporating detailed data by state on the timing of the implementation of the regulations, and looking at the purchase behavior for whole grain products before and after the switch.

First, Jensen commented on the representation of the low-income population in Homescan. She showed estimated percentages of the self-reported WIC population in Homescan for several years, both weighted and unweighted. In all years, the unweighted percentages were much lower, indicating that the low-income population is underrepresented in the Homescan panel.

To verify self-reported WIC status, the panel analyzed the household composition, income reporting, and age over time (for those in multiple panel years). They found that among those reporting that they were WIC participants, nearly 40 percent were ineligible according to their analysis.

After the categories of WIC participant and eligible WIC nonparticipant were refined, the panel used pooled 3-year data to estimate expenditures and quantities before and after the package change. Jensen said that both descriptive statistics and a subsequent analysis using propensity score matching and a difference-in-differences approach supported the conclusion that the package change increased the amount of whole grain cereal products purchased by households.
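
The comparison described here can be illustrated with a minimal difference-in-differences sketch on hypothetical household data; it omits the propensity score matching step and is not the panel's actual code. The file and column names (wic, post, wholegrain_qty, household_id) are assumptions.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical household-year data: 'wic' flags WIC participants vs. eligible
# nonparticipants, 'post' flags years after the package change, and
# 'wholegrain_qty' is the purchased quantity of whole grain cereal products.
df = pd.read_csv("household_purchases_pooled.csv")

# Difference-in-differences: the coefficient on wic:post estimates the change
# for WIC households relative to eligible nonparticipants after the revision.
model = smf.ols("wholegrain_qty ~ wic * post", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["household_id"]}
)
print(model.summary())
```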

Jensen said that one of the advantages of the scanner data for the purpose of her study was the ability to construct detailed prices for food items with the characteristics that the WIC program dictated for those items. This was based on searching food label databases and identifying keywords associated with approved products. Store brands were a challenge because they did not have product descriptors, and, depending on the state, not all store brands are accepted as WIC food options. With the estimated prices, the panel was able to evaluate the cost of various food package options as part of a regulatory impact analysis.
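
A minimal sketch of keyword-based identification of approved products is shown below. The keyword patterns, records, and helper function are hypothetical; actual WIC approval rules are state specific and far more detailed.

```python
import re
import pandas as pd

# Hypothetical keyword rules for flagging candidate whole grain breads.
APPROVED_PATTERNS = [r"\bwhole wheat\b", r"\bwhole grain\b", r"\b100% whole\b"]
EXCLUDE_PATTERNS = [r"\benriched\b", r"\bwhite bread\b"]

def looks_wic_approved(description: str) -> bool:
    """Return True if a product description matches the approval keywords."""
    text = description.lower()
    if any(re.search(p, text) for p in EXCLUDE_PATTERNS):
        return False
    return any(re.search(p, text) for p in APPROVED_PATTERNS)

# Hypothetical label records keyed by barcode
labels = pd.DataFrame(
    {
        "barcode": ["0001", "0002"],
        "description": ["100% Whole Wheat Bread 16 oz", "Enriched White Bread 20 oz"],
    }
)
labels["wic_candidate"] = labels["description"].apply(looks_wic_approved)
print(labels)
```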

Jensen summarized the advantages and the concerns or challenges encountered during the study. Detailed product descriptions are a key advantage; the timeliness of the data and the ability to match purchases to household data for analysis and evaluation were also benefits of the scanner data.

Jensen noted that the first challenge was whether the panel population is representative. Homescan updates household data once a year; as a result, there is only a once-a-year report of whether or not a household was participating in WIC. The assumption is that WIC expenditures are captured through purchase transactions, but these may not be complete.

Jensen said that there is evidence that most WIC participants use larger retail stores. However, the researchers were uncertain about what was happening with WIC infant formula purchases. A participant with a car might go to Walmart and buy most of their formula there, because it is heavy; they might also go to a store that was not participating in IRI InfoScan. She noted that non-UPC-coded products were not well captured. The data are also not complete for purchases of fruit and vegetable components with the WIC cash value voucher (CVV): the data would have included a 3-pound bag of apples with a UPC code, but not a purchase of loose apples.

Jensen continued that there are opportunities to link datasets now that did not exist before. Extending those possibilities will improve what can be done with the data. Using the barcode may allow better linkages to some of the state-maintained databases on state-approved products for WIC.

Jensen noted that another panel she worked on (NASEM, 2017) was charged with considering science breakthroughs in food and nutrition for 2030. Much of the projected advancement is in the food production area. However, it is clear that there are breakthroughs in the types of data being generated for the food supply, including improvements due to blockchain and other data technologies in the ability to track and identify the flow of products. These will not necessarily affect WIC, but they do offer opportunities for improvements in data and analysis in the long term.

She noted that capturing data in the future may be even more challenging, for example if food is purchased through Amazon and delivered by drone. Finally, Jensen noted that states are maintaining much more information in electronic form on approved foods and on redemptions. Developing or maintaining the ability to link to these data through product codes may offer future benefits in tracking capabilities.

Carma Hogue, U.S. Census Bureau, described the Census Bureau’s work on using web scraping and machine learning to discover, collect, and process data from the web, with the goal of improving economic statistics. First, Hogue provided the big data context and some web-scraping background. She then talked about SABLE, a web-scraping software product, and some of the Census Bureau’s experiences with web scraping.

For the big data context, she said that the economic directorate has been researching alternative data sources and big data methodologies for 4 or 5 years. It is considering quality, costs, and skill sets (whether the Bureau has the needed skills and what it would take to get them). As background for web scraping, she noted that the Census Bureau has many surveys, including surveys of federal, state, and local governments. For some of these entities, much of the data to be collected on surveys is available online. Currently, analysts manually retrieve the data from websites. If Census could develop an automated way to scrape those data, it could reduce respondent and analyst burden.

Hogue said that their definition of web scraping is an automated process of collecting data from an online source. Web crawling is an automated process of systematically visiting and reading web pages.

She also described policy issues, the first being informed consent. Census is currently evaluating a new notice about web-scraping activities that Statistics Canada has just posted. It is also considering Federal Register notices; however, staff were informed that web scraping is not a passive collection but an active one, so informed consent is needed. Hogue also reported that many private companies have terms of use on their websites that say no scraping, no crawling, no bots, and so on. Government websites do not tend to have such restrictions. Since it would be burdensome to have researchers read all of the terms of use, Census is considering what it can do. She said that for now, Census is limiting the crawling and scraping to federal, state, and local government websites.

Hogue said that the second policy issue is personally identifiable information (PII). She noted that the policy and legal staff are very concerned about the unintentional scraping of PII. This raises questions about whether such a record would be a Title 13 record or a federal record, and if so, what the disposal rules are.

Hogue went on to describe the software product Scraping Assisted by Learning (SABLE), a collection of tools for crawling websites, scraping data from documents, and classifying text. The models, which are based on text analysis and machine learning, are implemented using free, open-source software: Apache Nutch and Python.

SABLE performs three main tasks: it crawls and scans a website, finds documents, and extracts their text; it applies a model to determine whether each document is useful; and, for useful documents, it applies a scraping model to locate the useful data and extract the numerical values and the corresponding text.
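
The useful/not-useful classification step can be sketched with a generic supervised text classifier over word sequences, as below. This is not SABLE's actual implementation, and the labeled training file is hypothetical.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical training data: text extracted from scraped PDFs plus a manual
# useful/not-useful label, as in the CAFR example described later in the text.
docs = pd.read_csv("labeled_documents.csv")  # columns: text, useful (0/1)

train_text, test_text, train_y, test_y = train_test_split(
    docs["text"], docs["useful"], test_size=0.2, random_state=0
)

# Word-sequence features (unigrams and bigrams) feeding a simple classifier.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=2),
    LogisticRegression(max_iter=1000),
)
clf.fit(train_text, train_y)
print(classification_report(test_y, clf.predict(test_text)))
```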

In order to move into a production environment, Census must have an authority to operate, which requires a risk profile, a security assessment, documentation, audit trails, and Subversion for code management. On August 22, 2019, SABLE was approved and ready for production. It is available on the Census Bureau’s GitHub account.4

Census has used SABLE to seek out and collect information from state Comprehensive Annual Financial Reports (CAFRs) and other online publications that contain tax revenue data. These CAFRs are the source of much of the data that Census collects from state and local governments. Census used SABLE to crawl through state government sites and found about 60,000 PDFs. Census staff manually evaluated a random sample of about 6,000 of them, classifying each as useful or not useful. They then used this information to build text classification models, based on word sequences, with machine learning. There is no product based on this example as yet.

Hogue said that they did the same thing on pension statistics data and are trying to release this as a product for the Bureau of Economic Analysis (BEA). BEA asked Census to scrape service costs and interest statistics found on the CAFRs. A two-stage approach of first finding the tables using word sequences and then applying a scraping algorithm was used to accomplish the task.

___________________

4 See https://www.github.com/uscensusbureau/SABLE for programs, supplementary files, examples, and documentation.

Hogue said that Census analysts also rely on Securities and Exchange Commission (SEC) filing data, online databases of financial reports for publicly traded companies. At present, analysts do not know when new reports are posted. Census has a Really Simple Syndication (RSS) feed that provides information on recent SEC filings. There is a current project to query this RSS feed to determine filing dates for various types of reports and to package it into a useful product for Census analysis.
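
Querying an RSS feed for filing dates might look like the following sketch, which uses the feedparser library with a placeholder feed URL; the actual feed address, field names, and report types used by Census are assumptions here.

```python
import feedparser

# Placeholder URL: the actual filings feed used by Census is not specified.
FEED_URL = "https://example.com/sec-filings.rss"

feed = feedparser.parse(FEED_URL)
for entry in feed.entries:
    # Field names vary by feed; title and updated/published are common RSS fields.
    title = entry.get("title", "")
    posted = entry.get("updated", entry.get("published", ""))
    if "10-K" in title or "10-Q" in title:  # example report types
        print(posted, title)
```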

Another scraping project targets data in online building permit jurisdiction databases. Census releases construction indicators, such as housing starts, based on its Building Permits Survey (BPS), Survey of Construction (SOC), and Nonresidential Coverage Evaluation (NCE). Information on new privately owned construction is often available in building permit databases. A few years ago, Census investigated the feasibility of using publicly available building permit data to supplement these surveys, starting in Chicago and Seattle, two cities with permit data available through APIs. The initial research showed that the data were very timely and of high quality; the problem was that there were many differences in the definitions used by different jurisdictions and not enough detail to actually use what would be scraped. Census waited a few years and then looked at seven more jurisdictions. The data are available in many different formats, but the classifications are becoming more standard. There is still a lack of information on housing units.

Hogue summarized by saying the challenges to using the building permits data are their representativeness and their inconsistency in terminology and formats. Census continues to explore the quality of scraped data by comparing them to survey data, and it is also looking at third-party data sources, such as Zillow and Construction Monitor.

Hogue said that the next step is to use SABLE in production, and to release a data product based in part on scraped data. Census would like to develop the SEC filing product, discussed above. After that, next steps will be guided by a new working group to address policy issues regarding web scraping and web crawling.

Hogue briefly summarized the third-party data that Census has used. It has looked at retailer data from NPD, both aggregate and individual company data. NPD includes more than 1,300 retailers, both brick-and-mortar and e-commerce, and collects point-of-sale data. NPD captures some retailers that do not report to Census. The aggregate NPD data did not track well with Census estimates; however, the individual company data looked quite good. Census is beginning to examine these data for use in imputing for survey nonresponse.

Hogue went on to describe Census’s use of raw credit card data. She said those data were perfect for Census’s purposes; the problem was that the company had a change in leadership and decided that it no longer wanted to share its data with Census.

The final third-party data source she described is credit/debit/gift card processor data. Census is trying to use this information to provide more geographic granularity for the monthly retail trade survey. It has purchased raw data, but there was a lot of suppression. Census expects to be able to use the data in a product soon.

Hogue concluded by summarizing some of the issues and challenges with using third-party data: acquiring the data can take a long time; costs are not fixed and can increase or decrease with changes in management or other company practices; the methods used to collect and clean the data are not transparent; the quality of the data can be difficult to judge; and disclosure avoidance policies can be difficult to discern.

C.6. MEETING AGENDA

Panel on Improving USDA’s Consumer Data for Food and Nutrition Policy Research
Third Meeting, September 21, 2018

The National Academy of Sciences Building, Room 120
2101 Constitution Ave NW, Washington, DC

Open Session

9:00 Plan for the day, goals for the meeting and of the panel more broadly
  • Marianne Bitler, Chair
9:10 Update on recent developments at ERS and with FoodAPS-2
  • Jay Variyam, Mark Denbaly, ERS
9:30 Improving data for policy research. For this session, researchers are asked to discuss their ideas for improving food and nutrition data—including integration of commercial and administrative data—to inform key policy issues.
  • The value (and limits) of linking SNAP administrative data with other types of administrative data (unemployment insurance, Medicaid, K-12 education) as well as the limits of existing survey data.
    • Colleen Heflin, Syracuse University
  • Use of retail panel loyalty card data and Rhode Island state administrative records (housed in a secure facility at Brown University) to analyze how SNAP benefits are spent. Evidence needed to design a “smarter SNAP”
    • Justine Hastings, Brown University
  • Understanding/assessing the quality of administrative SNAP data used in FoodAPS. Broader thoughts on food consumption data needs for obesity and other health research.
    • Chuck Courtemanche, Georgia State University
  • Open discussion
11:15 Data Integration and linkages for policy research, use of administrative data
There is high value to ERS’s Consumer Food Data System of linkages to external data sets—e.g., to NHANES, Nielsen datasets, IRI datasets, SNAP administrative data, CPS, SIPP, ACS, BRFSS, CEX, Nationwide Food Consumption Survey, PSID, state and local-level datasets with information on low-income households, etc. During this session, we discuss practices being developed by the statistical agencies for combining data sources.
  • Administrative data linkages in the federal statistical system; the Next-Generation Data Platform—a collaboration between Census, ERS and FNS that links SNAP (19 States and 39 counties in CA) and WIC data (11 states) to Census survey data and administrative data: 17 State TANF agencies, VA, HUD and HHS data (Medicare and Medicaid) in a secure environment (described in the White paper provided to the panel).
    • Amy O’Hara, panel member and Georgetown University, will introduce the topic, highlighting strengths and weakness, potential and limitations, of statistical agency data linking approaches.
    • Rachel Shattuck, Census Bureau, to present views on the quality and utility of what ERS calls the Next-Generation Data Platform, challenges you see for the project going forward, and how the partners ERS/FNS and Census could improve it in the future.
    • John Eltinge, Census Bureau, to discuss interagency workshops and explorations related to quality issues associated with using multiple data sources (including proprietary data).
  • Record linkage programs at NCHS. NCHS has developed a record linkage program designed to maximize the scientific value of the Center’s population-based surveys. Linked data files create new data resources that can support research to inform the development and evaluation of public health programs and policies. The focus of this presentation will be on existing linkages between NCHS national population health surveys, such as the National Health Interview Survey (NHIS) and the National Health and Nutrition Examination Survey (NHANES), and administrative data collected from the Department of Housing and Urban Development’s (HUD) largest housing assistance programs, including the Housing Choice Voucher program, federally supported public housing, and privately owned, subsidized multifamily housing. The presentation will include an assessment of the concordance between survey and administrative data sources and present results from studies comparing the health characteristics of persons who receive housing assistance and those who do not.

  • Cordell Golden (for Lisa Mirel) NCHS
1:30 Additional non-government sources for filling data gaps in ERS’s Consumer Food Data System program
  • Feeding America—Collaborates with Urban Institute on a research program that attempts to detail the frequency of visits to food pantries by individuals, either as a temporary, emergency food source or as a regular supplemental food source. Feeding America also has a program to study the effectiveness and efficiency of a range of program interventions. What data do they use; what data do they produce; what are the unmet data needs.
    • Rob Santos, Urban Institute. Member of Feeding America Technical Advisory Group
  • Improving geospatial information in ERS’s food data system (for example, for assessing the role of accessibility of food outlets in SNAP participation and effectiveness)
    • Alessandro Bonanno, Colorado State University
  • Open discussion
2:30 Using proprietary data for food policy research
  • Types, sources, and considerations in using store scanner data, household scanner data, and nutrition data from labels for food policy research. Presentation is based on researching statistical properties of the data; investigating sources, coverage, and uses of the data; and conducting analyses.
    • Mary Muth, RTI
  • Understanding WIC issues with proprietary (scanner) data. Comments as a long-time user of ERS Consumer Food Data; ideas about how the data system should evolve over the next decade.
    • Helen Jensen, Iowa State
  • The Census Bureau’s work on improving economic statistics through web scraping and machine learning to discover, collect, and process data from the web.
    • Carma Hogue, U.S. Census Bureau
  • Open discussion
4:00 Adjourn