Appendix L: The Science and Technology of Privacy Protection
Pages 263-280

The Chapter Skim interface presents what we've algorithmically identified as the most significant single chunk of text within every page in the chapter.


From page 263...
... L.1 THE CYBERSECURITY DIMENSION OF PRIVACY Respecting privacy interests necessarily means that parties that should not have access to personal information do not have such access. Security breaches are incompatible with protecting the privacy of personal information, and good cybersecurity for electronically stored personal information is a necessary (but not sufficient)
From page 264...
... A data-breach chronology reports losses of 104 million records (for example, in lost laptop computers) containing personally identifiable information from January 2005 to February 2007.4 The Department of Homeland Security National Cyber Security Division reports that over 25 new vulnerabilities were discovered each day in 2006.5 The state of government information security is unnecessarily weak.
From page 265...
... Such risks are illustrated, in part, by an increasing number of security incidents experienced by federal agencies.6 Such performance is reflected in the public's lack of trust in government agencies' ability to protect personal information.7 Security of government information systems is poor despite many relevant regulations and guidelines.8 Most communication and information systems are unnecessarily vulnerable to attack because of poor security practices, and 6 Statement of Gregory C. Wilshusen, GAO Director for Information Security Issues, "Information Security: Progress Reported, but Weaknesses at Federal Agencies Persist," Testimony Before the Subcommittee on Federal Financial Management, Government Information, Federal Services, and International Security, Committee on Homeland Security and Governmental Affairs, U.S.
From page 266...
... is viewed as being incapable of protecting privacy, and public confidence is undermined when it asserts that it will be a responsible steward of the personal information it collects in its counterterrorism mission. L.2 PRIVACY-PRESERVING DATA ANALYSIS L.2.1 Basic Concepts It is intuitive that the goal of privacy-preserving data analysis is to allow the learning of particular facts or kinds of facts about individuals (units)
From page 267...
... However, it might be possible to limit the amount of information revealed about those who do not satisfy the profile, perhaps by controlling the information and sources used or by editing them after they are acquired. That would require major efforts and attention to the quality and utility of information in integrated databases.
From page 268...
... Another suggestion is to monitor query sequences to rule out attacks of the nature just described. Such a suggestion is problematic for two reasons: it may be computationally infeasible to determine whether a query sequence compromises privacy,10 and, more surprising, the refusal to answer a query may itself reveal information.11 A different approach to preventing the set differencing attack is to add random noise to the true answer to a query; for example, the response to a query about the average income of a set of individuals is the sum of the true answer and some random noise.
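The set differencing attack and the noise-addition defense described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any cited paper: the names and incomes are invented, the functions `exact_average` and `noisy_average` are hypothetical, and Laplace noise is one possible choice of noise distribution that the text itself does not specify.

```python
import random

# Hypothetical toy data; names and incomes are invented for illustration.
INCOMES = {"alice": 52_000, "bob": 61_000, "carol": 58_000, "dave": 250_000}


def exact_average(names):
    """Answer an average-income query with the true value."""
    return sum(INCOMES[n] for n in names) / len(names)


def noisy_average(names, scale, rng):
    """Answer the same query as the true average plus random noise.

    Assumption: the noise is Laplace-distributed. The difference of two
    independent exponential draws with the same mean is Laplace with that
    scale, so no external library is needed.
    """
    noise = rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
    return exact_average(names) + noise


everyone = list(INCOMES)
all_but_dave = [n for n in everyone if n != "dave"]

# Set differencing with exact answers: two aggregate queries isolate one
# person's income, even though neither query names that person alone.
leaked = (exact_average(everyone) * len(everyone)
          - exact_average(all_but_dave) * len(all_but_dave))

# The same two queries against noisy answers give only a blurred estimate.
rng = random.Random(0)
estimate = (noisy_average(everyone, 10_000, rng) * len(everyone)
            - noisy_average(all_but_dave, 10_000, rng) * len(all_but_dave))
```

With exact answers, `leaked` equals the targeted person's income. With noisy answers, `estimate` differs from it by a random amount that grows with `scale`, at the cost of less accurate answers to legitimate queries.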
From page 269...
... Talwar, "The price of privacy and the limits of LP decoding," pp. 85-94 in Proceedings of the 39th Annual ACM SIGACT Symposium on Theory of Computing, Association for Computing Machinery, New York, N.Y., 2007.
From page 270...
... Smith, "Calibrating noise to sensitivity of functions in private data analysis," pp. 265-284 in Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, Association for Computing Machinery, New York, N.Y., 2006, and references therein.
From page 271...
... individuals in the system.17 A more traditional example of the difficulties posed by context begins with the publication of redacted confidential data. The Census Bureau receives confidential information from enterprises as part of the economic census and publishes a redacted version in which identifying information on companies is suppressed.
From page 272...
... The linkage software may use any collection of data fields, or variables, to determine that records in two distinct data sets correspond to the same person. And if the "privacy-protected" or deidentified records include values for additional variables that are not yet public, simple record-linkage tools might let an intruder identify a person (that is, match files)
From page 273...
... almost no difference between the behavior of a sys 19 D.B. Rubin, "Discussion: Statistical disclosure limitation," Journal of Official Statistics 9(2)
From page 274...
... Moreover, if we believe that data are of higher quality and that profiles are more accurate than they actually are, the rate of false negatives -- people who are potential terrorists but go undetected -- will also grow, and this endangers all of us. Record linkage also lies at the heart of data-fusion methods and has major implications for privacy protection and harm to people.
From page 275...
... Similarly, as the size of blocks used for sorting data for matching purposes grows, so too do both the computational demands for comparing records in pairs and the probabilities of correct matches. Low-quality record-linkage results will almost certainly increase the rates of both false positives and false negatives when merged databases are used to attempt to identify terrorists or potential terrorists.
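The trade-off between block size and computational cost can be made concrete by counting the record pairs that must be compared under different blocking keys. This is a rough sketch under stated assumptions: the two files are invented, and `candidate_pairs` is a hypothetical helper rather than part of any particular linkage package.

```python
from collections import Counter

# Hypothetical (surname, ZIP code) records from two files to be linked.
file_a = [("smith", "20001"), ("smyth", "20001"),
          ("jones", "20002"), ("brown", "20003")]
file_b = [("smith", "20001"), ("jones", "20002"),
          ("johns", "20002"), ("braun", "20003")]


def candidate_pairs(a, b, block_key):
    """Count cross-file record pairs that fall in the same block."""
    blocks_a = Counter(block_key(r) for r in a)
    blocks_b = Counter(block_key(r) for r in b)
    return sum(n * blocks_b[key] for key, n in blocks_a.items())


# One block holding everything: every record is compared with every other.
no_blocking = candidate_pairs(file_a, file_b, lambda r: None)

# Blocks by ZIP code: fewer comparisons, and variant spellings such as
# "smyth"/"smith" still share a block and can be compared.
by_zip = candidate_pairs(file_a, file_b, lambda r: r[1])

# Blocks by exact surname and ZIP code: fewest comparisons, but variant
# spellings land in different blocks and are never compared.
by_surname_and_zip = candidate_pairs(file_a, file_b, lambda r: r)
```

Here the counts are 16, 5, and 2 pairs respectively. The smallest blocks are cheapest, but they can no longer pair "brown" with "braun", so a true match behind a spelling variant would be missed.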
From page 276...
... The perception of privacy violations depends heavily on the trust of the subject that the government and everyone who has access to the data will abide by the stated policy on data collection and use. • Analytical methods involved.
From page 277...
... For example, it is possible in most cases to infer the names of people associated with individual medical records that contain only birthdates and ZIP codes if that data set is merged with a census database that contains names, ZIP codes, and birthdates. L.4 STATISTICAL AGENCY DATA AND APPROACHES Government statistical agencies have been concerned with confidentiality protection since early in the 20th century and work very hard to
From page 278...
... That is, the nature of redaction of individually identifiable information seems to yield redacted data that are of little value for this purpose. L.4.1 Confidentiality Protection and Public Data Release Statistical agencies often promise confidentiality to their respondents regarding all data provided in connection with surveys and censuses, and, as noted above, these promises are often linked to legal statutes and provisions.
From page 279...
... For some other approaches to agency confidentiality and data release in the European context, see Willenborg and de Waal.28 L.4.2 Record Linkage and Public Use Files One activity that is highly developed in the context of statistical-agency data is record linkage.
From page 280...
... Those who prepared the PUMS file have done sufficient testing to offer specific guarantees regarding the protection of individuals whose data went into the preparation of the file. This example illustrates not only the complexity of data protection associated with record linkage but the likely lack of utility of statistical-agency data for terrorism prevention, because linked files cannot be matched to individuals.


This material may be derived from roughly machine-read images, and so is provided only to facilitate research.