3 Cost and the Value of Data
Pages 33-43



From page 33...
... The text delineates the principal economic issues in creating cost forecasts and the significant variables (i.e., "cost drivers") affecting the forecast in the biomedical research data life cycle.
From page 34...
... For example, existing open-source software might be incorporated as a component of a new data information resource, so the development costs already incurred for that software would not be included in the present forecast. However, there will still be marginal costs for adapting, maintaining, and integrating that existing software that would need to be incorporated into the cost forecast.
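To make the sunk-cost distinction concrete, here is a minimal sketch, with entirely hypothetical figures, of a forecast that counts only the marginal costs of reusing an existing open-source component while leaving its original development cost out of the total.

```python
# Minimal sketch, with hypothetical figures, of a forecast that excludes the
# sunk development cost of an existing open-source component but includes
# the marginal costs of adapting, maintaining, and integrating it.

SUNK_OSS_DEVELOPMENT = 2_000_000   # already spent elsewhere; deliberately excluded

forecast_items = {
    "one_time_adaptation": 150_000,        # adapting the software to the new resource
    "annual_maintenance": 40_000,          # patching, upgrades, security fixes
    "annual_integration_testing": 25_000,  # keeping it working with the rest of the stack
}

YEARS = 5
total = (
    forecast_items["one_time_adaptation"]
    + YEARS * (forecast_items["annual_maintenance"]
               + forecast_items["annual_integration_testing"])
)
print(f"Five-year marginal cost of the reused component: ${total:,}")
# SUNK_OSS_DEVELOPMENT never enters the total: it is a sunk cost borne outside the project.
```

The point of the sketch is structural rather than numerical: the sunk development figure appears only as a labeled constant and never enters the arithmetic, while the adaptation, maintenance, and integration items do.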
From page 35...
... Service providers may benefit from much greater economies of scale and thus lower cost than an individual institution or researcher, but their lower costs will not necessarily translate into lower prices for the science community. Even if prices accurately reflect past (marginal)
From page 36...
... Such inaccurate cost forecasts may reflect excessive optimism about what can be achieved, a lack of clarity or precision regarding what is to be accomplished, or deliberate "lowballing" on the part of a proposer seeking to win approval for an initiative.

Principal Elements of Operating Costs

For most public and private enterprises, the principal elements of operating costs are consumable inputs (e.g., power, vendor services)
From page 37...
... Relative Costs of Storage Media and Hardware

A difficult issue in forecasting costs for a data-intensive enterprise is how to deal with the information infrastructure (i.e., the storage media and hardware)
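One hedged way to illustrate the difficulty is to project a storage budget under two assumed trends, a declining price per terabyte and a growing data volume; the starting price, decline rate, volume, and growth rate in the sketch below are placeholders, not figures from the report.

```python
# Illustrative sketch only: projecting a storage budget under assumed trends.
# The starting price, decline rate, data volume, and growth rate are
# placeholders for illustration, not figures from the report.

price_per_tb = 20.0    # $/TB-year in year 1 (assumed)
decline = 0.15         # assumed annual decline in storage price
volume_tb = 500.0      # data volume in year 1, in terabytes (assumed)
growth = 0.30          # assumed annual growth in data volume

total = 0.0
for year in range(1, 6):
    cost = price_per_tb * volume_tb
    total += cost
    print(f"Year {year}: {volume_tb:8,.0f} TB x ${price_per_tb:5.2f}/TB-year = ${cost:10,.0f}")
    price_per_tb *= 1 - decline    # media gets cheaper each year...
    volume_tb *= 1 + growth        # ...while holdings keep growing

print(f"Five-year storage total: ${total:,.0f}")
```

With these assumed rates, data growth outpaces the price decline, so the annual storage bill still rises even as the per-terabyte cost falls, which is one reason the two trends have to be forecast together rather than separately.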
From page 38...
... But at a minimum, the cost forecaster owes decision makers a warning and discussion about the existence of those uncertainties -- even if they cannot be precisely characterized.

BOX 3.3 Changing Behaviors Given Changing Storage and Compute Scenarios

The committee heard from researchers about their ability to "experiment" with data-intensive computations, at no additional cost to them, when data resources were hosted by their research institutions.
From page 39...
... It is possible and perhaps even advantageous at times to aggregate data from smaller studies to increase statistical power and to train new machine learning algorithms to take advantage of heterogeneous data. Aggregating heterogeneous data allows a more complete and robust model of preclinical research to emerge, as each individual laboratory samples a small slice of a larger, multidimensional picture (Ferguson, 2019; Williams, 2019)
From page 40...
... , in which investigations using data from multiple laboratories or multiple genetic strains lead to more robust clinical predictions than investigations using more limited data. These results suggest that while a small individual data set on its own may be of limited value, when aggregated with other data, it can potentially increase the value of the pool of data.
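A rough way to see the statistical-power side of this argument is the normal-approximation sketch below; the effect size, per-laboratory sample size, and number of pooled laboratories are assumptions chosen only for illustration.

```python
# Hedged sketch: approximate power of a two-group comparison as small data
# sets are pooled. The effect size, per-laboratory sample size, and number
# of pooled laboratories are assumptions chosen only for illustration.
from statistics import NormalDist

def approx_power(n_per_group: int, effect_size: float, alpha: float = 0.05) -> float:
    """Normal-approximation power for a two-sample comparison of means."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    noncentrality = effect_size * (n_per_group / 2) ** 0.5
    return 1 - z.cdf(z_crit - noncentrality)

effect = 0.4                   # assumed standardized effect size
single_lab_n = 15              # one laboratory's samples per group (assumed)
pooled_n = 6 * single_lab_n    # six comparable laboratories pooled (assumed)

print(f"Power, single lab  (n={single_lab_n:3d}/group): {approx_power(single_lab_n, effect):.2f}")
print(f"Power, pooled data (n={pooled_n:3d}/group): {approx_power(pooled_n, effect):.2f}")
```

Under these assumptions, pooling six comparable laboratories raises approximate power from roughly 0.2 to roughly 0.8, which is the sense in which a small data set of limited value on its own can add value to the aggregated pool.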
From page 41...
... Services around the data may be required for data to be usable, and significant labor will likely be necessary to implement and provide those services, including maintaining data standards.
From page 42...
... Asset value concerns direct or indirect monetization. Direct monetization includes buying, selling, or trading data.
From page 43...
... 2019. Presentation to the National Academies Workshop on Forecasting Costs for Preserving and Promoting Access to Biomedical Data, July 11.

