
Currently Skimming:

4 Challenges Associated with Data Collection, Aggregation, and Sharing
Pages 47-64

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 47...
... (Butte, Omberg) • The use of interoperable standards for electronic health records can facilitate data sharing and contribute to developing a lingua franca across industry, academia, and regulators.
From page 48...
... This session's objectives were to discuss how big data can be used to identify which patients will respond best to a particular regenerative medicine and to highlight challenges in data collection and data sharing such as small sample sizes in clinical trials, proprietary issues, and patient privacy.
TOWARD OPEN SCIENCE IN OMICS ANALYSIS AND DISEASE MODELING
Larsson Omberg, the vice president of systems biology at Sage Bionetworks, spoke about the opportunities and challenges involved in using open science approaches in omics analyses and disease modeling.
From page 49...
... Sage Bionetworks has taken on a coordinating role to facilitate numerous additional related efforts, including ...
Footnote 1: For more information on the NIH genomics data sharing policy, see https://osp.od.nih.gov/scientific-sharing/genomic-data-sharing (accessed January 19, 2021).
From page 50...
... Initially, the consortium was working with highly heterogeneous data from postmortem brain samples collected using varying technologies and analytical approaches, which made it impossible to make direct comparisons. To homogenize these data, several groups within the consortium worked together to create a canonical dataset that could be used in downstream analysis to derive insights.
From page 51...
... in individuals, Omberg said.[10] In one of those studies, the researchers recruited individuals with PD to ...
Footnote 9: More information about the DREAM Challenges program is available at http://dreamchallenges.org (accessed December 11, 2020).
From page 52...
... He described a challenge that Sage Bionetworks held to help develop impartial benchmarks from mPower data. Sage Bionetworks asked researchers to build diagnostic digital biomarkers using the mPower accelerometer data to determine if an individual has PD and, if so, the severity of the individual's disease.
From page 53...
... USING BIG DATA FOR CLINICAL STRATIFICATION OF PATIENTS
Atul Butte, the Priscilla Chan and Mark Zuckerberg Distinguished Professor and the director of the Bakar Computational Health Sciences ...
Footnote 13: More information about the Digital Mammography DREAM Challenge is available at https://sagebionetworks.org/research-projects/digital-mammography-dream-challenge (accessed December 11, 2020).
From page 54...
... Thus, Butte said, the operational need to harmonize practice data across the entire system motivated the decision to aggregate all of their health data in a single place.
Centralizing Health Care Data Across the University of California Health System
Today, health care data from across the six UC medical schools[15] are stored both locally and in the centralized UCH data warehouse.[16] These data include basic data from more than 15 million patients treated since 2005 and detailed electronic health record (EHR)
From page 55...
... Each campus moves its data to OMOP to facilitate centralization using commonly shared and governed tools. As of this writing, the database contains structured data from 2012 to the present, including data for 7.3 million patients, 192 million encounters, 553 million procedures, 739 million medical orders, 661 million diagnosis codes, and 2.1 billion laboratory tests and vital signs.
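As a rough illustration of what moving campus data into a shared model involves, the sketch below maps two invented local export formats onto a few columns of an OMOP-style person table. The campus field names, the sample records, and the helper functions are hypothetical and are not taken from the UC Health tooling; only the column names and the standard gender concept IDs follow the OMOP Common Data Model.

```python
# Hypothetical local exports from two campuses; field names and values are invented.
CAMPUS_A = [{"mrn": "A-001", "sex": "F", "dob": "1980-05-17"}]
CAMPUS_B = [{"patient_key": 42, "gender": "male", "birth_year": 1975}]

# OMOP standard gender concepts: 8507 = male, 8532 = female; 0 = no matching concept.
GENDER_CONCEPTS = {"m": 8507, "male": 8507, "f": 8532, "female": 8532}


def to_person_row(person_id, gender, year_of_birth):
    """Build one row using a small subset of OMOP person-table columns."""
    return {
        "person_id": person_id,
        "gender_concept_id": GENDER_CONCEPTS.get(str(gender).lower(), 0),
        "year_of_birth": int(year_of_birth),
    }


def harmonize(campus_a, campus_b):
    """Map both (assumed) local formats onto the same target columns."""
    rows = []
    next_id = 1  # new surrogate key, so local identifiers are not carried over
    for rec in campus_a:
        # Campus A stores a full date of birth; keep only the year.
        rows.append(to_person_row(next_id, rec["sex"], rec["dob"][:4]))
        next_id += 1
    for rec in campus_b:
        rows.append(to_person_row(next_id, rec["gender"], rec["birth_year"]))
        next_id += 1
    return rows


if __name__ == "__main__":
    for row in harmonize(CAMPUS_A, CAMPUS_B):
        print(row)
```

Once every site emits the same columns and concept codes, counts such as patients or encounters can be computed with a single query instead of one query per local format.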
From page 56...
... The consortium benefits from common contracting and the institutional review board reliance process, allowing the group to scale large trials for cancer and cancer therapies quickly, Butte said. For example, Foundation Medicine performs cancer genomics testing for cancer patients across UC and other institutions.
From page 57...
... Both randomized controlled trials and analysis of real-world patients are valuable, Butte said, but this combination of data will be especially useful in studying regenerative medicines. Wang also prepared data, largely using automated tools provided by the consortium, which compared progression-free survival and overall survival at 12 months for patients treated with axicabtagene ciloleucel.
From page 58...
... , a federal standard for data formats and an application programming interface for exchanging electronic health records. The UCH system uses this standard primarily to export data to patients via the FHIR feed.
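To make the data-format side of FHIR concrete, here is a minimal sketch that reads a small FHIR-style JSON Bundle and pulls out a few fields. The sample bundle is hand-written and hypothetical (it is not output from the UCH FHIR feed), and the `summarize_bundle` helper is an illustrative name, not part of any FHIR library.

```python
import json

# Hypothetical, hand-written FHIR Bundle; not real patient data.
SAMPLE_BUNDLE = """
{
  "resourceType": "Bundle",
  "type": "collection",
  "entry": [
    {"resource": {"resourceType": "Patient", "id": "example-1",
                  "name": [{"family": "Doe", "given": ["Jane"]}],
                  "birthDate": "1970-01-01"}},
    {"resource": {"resourceType": "Observation", "id": "obs-1",
                  "status": "final",
                  "code": {"text": "Heart rate"},
                  "valueQuantity": {"value": 72, "unit": "beats/minute"}}}
  ]
}
"""


def summarize_bundle(bundle_json):
    """Return (resource type, label, value) tuples for the resources handled here."""
    bundle = json.loads(bundle_json)
    if bundle.get("resourceType") != "Bundle":
        raise ValueError("expected a FHIR Bundle")
    summary = []
    for entry in bundle.get("entry", []):
        resource = entry.get("resource", {})
        kind = resource.get("resourceType")
        if kind == "Patient":
            # A Patient may carry several names; use the first if present.
            name = (resource.get("name") or [{}])[0]
            parts = name.get("given", []) + [name.get("family", "")]
            summary.append(("Patient", " ".join(parts).strip(),
                            resource.get("birthDate")))
        elif kind == "Observation":
            quantity = resource.get("valueQuantity", {})
            summary.append(("Observation",
                            resource.get("code", {}).get("text"),
                            quantity.get("value")))
    return summary


if __name__ == "__main__":
    for item in summarize_bundle(SAMPLE_BUNDLE):
        print(item)
```

Because each resource declares its own `resourceType`, a consumer can handle the types it understands and skip the rest, which is part of what makes a shared format useful for exchange between systems.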
From page 59...
... TABLE 4-1 Use Cases for Real-World Data

Category of Use: Use Cases for Real-World Data

Post-approval safety
• Updating side effect rates
• Discovering novel side effects

Supporting regulatory approval
• Conducting single-arm experimental trials
• Supporting "digital approvals"
• Evaluating biosimilar development

Informing clinical trials design
• Improving patient selection
• Increasing efficiency of data collection ("trimming the trials")

Continually establishing efficacy
• Assessing the efficacy–effectiveness gap
• Searching for efficacy in specific populations
• Informing effect modifiers and precision medicine
• Evaluating long-term, post-trial outcomes

Comparative effectiveness
• Integrating costs with comparative effectiveness
• Understanding effects of pharmacy practices on health care use
• Studying novel on-label pharmaceuticals versus older off-label drugs

Studying the practice of medicine
• Improving quality of practice and reducing medical errors
• Standardizing care and care delivery
• Studying the effect of payors on medical care
• Evaluating impact of new-generation diagnostics on outcomes

Data-driven decision support
• Improving clinical decision support: the provider perspective
• Improving clinical decision support: the patient perspective
• Improving clinical decision support: the community perspective

SOURCES: Atul Butte workshop presentation, October 22, 2020.
From page 60...
... Single-Investigator Versus Consortia-Driven Research
One audience member asked about Omberg's earlier comments on insufficient data for finding "druggable" targets, and about the challenges surrounding the use of single-investigator-initiated, hypothesis-driven research, which is often aimed at understanding the fundamental processes of disease rather than just hunting for targets. The audience member questioned whether the best use of a systems approach is to find "druggable" targets or, alternatively, whether it can involve higher priorities such as finding variables that control disease pathways and other preferences.
From page 61...
... Differences were also found between age groups; for example, the response rates to levodopa varied, with some patients responding in the way they performed a finger tapping test (a measure of bradykinesia), while others had less gait freezing while walking.
From page 62...
... Researchers and health systems are also subject to the policies of data governance before any patient data, even de-identified patient data, may be exported for any purpose. In cases where there is clearly mutual benefit in sharing de-identified patient data, the contracts used will require that the recipient of the data not re-identify any patients, further protecting patients' data once they have been exported.[21] In his work related to the DREAM Challenge, Omberg said, he used particular contract language in dealings with the Kaiser Permanente health system, but the contracts were not created with the individuals representing various organizations that participated in the challenge because they did not have access to the data.
From page 63...
... It includes a cloud-based, secure database where de-identified clinical, imaging, and genomics data can be viewed within the health system by researchers who are in compliance with the requisite data governance processes.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.