When excessive perturbation goes wrong and why IPUMS-International relies instead on sampling, suppression, swapping, and other minimally harmful methods to protect privacy of census microdata

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

IPUMS-International disseminates population census microdata at no cost for 69 countries. Currently, a series of 212 samples totaling almost half a billion person records is available to researchers. Researchers must register to gain access to the microdata. Statistics from Google Analytics show that IPUMS-International's lengthy, probing registration form is an effective deterrent for unqualified applicants. To protect data privacy, we rely principally on sampling, suppression of geographic detail, swapping of records across geographic boundaries, and other minimally harmful methods such as top and bottom coding. We do not use excessively perturbative methods. Three lines of evidence confirm the wisdom of IPUMS microdata management protocols and statistical disclosure controls: a recent case of perturbation gone wrong (the household samples of the 2000 census of the USA (PUMS), the 2003-2006 American Community Survey, and the 2004-2009 Current Population Survey); an empirical study of the impact of perturbation on the usability of UK census microdata (the Individual SARs of the 1991 census of the UK); and a mathematical demonstration in a timely compendium of statistical confidentiality practices.
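
Top and bottom coding, one of the minimally harmful methods named in the abstract, can be illustrated with a minimal Python sketch. The function name `top_bottom_code`, the income values, and the thresholds are hypothetical and chosen only for illustration; they are not taken from IPUMS-International's actual protocols.

```python
def top_bottom_code(values, lower, upper):
    """Replace values below `lower` with `lower` and above `upper` with `upper`.

    Illustrative only: the thresholds are assumptions, not IPUMS settings.
    """
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    # Clamp each value into the closed interval [lower, upper].
    return [min(max(v, lower), upper) for v in values]


# Hypothetical reported incomes; extreme values are the most identifying.
incomes = [-500, 1200, 30000, 250000, 1000000]
coded = top_bottom_code(incomes, lower=0, upper=200000)
print(coded)
```

Only the extreme values change, so records in the middle of the distribution keep their exact values; this is one sense in which such a method perturbs the data less than adding noise to every record.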

Original language: English (US)
Title of host publication: Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2012, Proceedings
Pages: 179-187
Number of pages: 9
Volume: 7556 LNCS
DOIs: https://doi.org/10.1007/978-3-642-33627-0-14
State: Published - Oct 22 2012
Event: International Conference on Privacy in Statistical Databases, PSD 2012 - Palermo, Italy
Duration: Sep 26 2012 to Sep 28 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7556 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: International Conference on Privacy in Statistical Databases, PSD 2012
Country: Italy
City: Palermo
Period: 9/26/12 to 9/28/12

Fingerprint

Census
Privacy
Sampling
Perturbation
Data privacy
Registration
Demonstrations
Statistical Disclosure Control
Statistics
Confidentiality
Empirical Study
Usability
Costs
Person
Coding
Series

Keywords

  • data dissemination
  • data privacy
  • IPUMS-International
  • microdata samples
  • population census
  • statistical disclosure controls

Cite this

Cleveland, L. L., McCaa, R., Ruggles, S., & Sobek, M. (2012). When excessive perturbation goes wrong and why IPUMS-International relies instead on sampling, suppression, swapping, and other minimally harmful methods to protect privacy of census microdata. In Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2012, Proceedings (Vol. 7556 LNCS, pp. 179-187). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7556 LNCS). https://doi.org/10.1007/978-3-642-33627-0-14

@inproceedings{685fa0633fc249f5aee2ad29ab009eab,
title = "When excessive perturbation goes wrong and why IPUMS-International relies instead on sampling, suppression, swapping, and other minimally harmful methods to protect privacy of census microdata",
abstract = "IPUMS-International disseminates population census microdata at no cost for 69 countries. Currently, a series of 212 samples totaling almost half a billion person records is available to researchers. Researchers must register to gain access to the microdata. Statistics from Google Analytics show that IPUMS-International's lengthy, probing registration form is an effective deterrent for unqualified applicants. To protect data privacy, we rely principally on sampling, suppression of geographic detail, swapping of records across geographic boundaries, and other minimally harmful methods such as top and bottom coding. We do not use excessively perturbative methods. Three lines of evidence confirm the wisdom of IPUMS microdata management protocols and statistical disclosure controls: a recent case of perturbation gone wrong (the household samples of the 2000 census of the USA (PUMS), the 2003-2006 American Community Survey, and the 2004-2009 Current Population Survey); an empirical study of the impact of perturbation on the usability of UK census microdata (the Individual SARs of the 1991 census of the UK); and a mathematical demonstration in a timely compendium of statistical confidentiality practices.",
keywords = "data dissemination, data privacy, IPUMS-International, microdata samples, population census, statistical disclosure controls",
author = "Cleveland, {Lara L} and Robert McCaa and Steven Ruggles and Matthew Sobek",
year = "2012",
month = "10",
day = "22",
doi = "10.1007/978-3-642-33627-0-14",
language = "English (US)",
isbn = "9783642336263",
volume = "7556 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "179--187",
booktitle = "Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2012, Proceedings",

}

TY - GEN

T1 - When excessive perturbation goes wrong and why IPUMS-International relies instead on sampling, suppression, swapping, and other minimally harmful methods to protect privacy of census microdata

AU - Cleveland, Lara L

AU - McCaa, Robert

AU - Ruggles, Steven

AU - Sobek, Matthew

PY - 2012/10/22

Y1 - 2012/10/22

AB - IPUMS-International disseminates population census microdata at no cost for 69 countries. Currently, a series of 212 samples totaling almost half a billion person records is available to researchers. Researchers must register to gain access to the microdata. Statistics from Google Analytics show that IPUMS-International's lengthy, probing registration form is an effective deterrent for unqualified applicants. To protect data privacy, we rely principally on sampling, suppression of geographic detail, swapping of records across geographic boundaries, and other minimally harmful methods such as top and bottom coding. We do not use excessively perturbative methods. Three lines of evidence confirm the wisdom of IPUMS microdata management protocols and statistical disclosure controls: a recent case of perturbation gone wrong (the household samples of the 2000 census of the USA (PUMS), the 2003-2006 American Community Survey, and the 2004-2009 Current Population Survey); an empirical study of the impact of perturbation on the usability of UK census microdata (the Individual SARs of the 1991 census of the UK); and a mathematical demonstration in a timely compendium of statistical confidentiality practices.

KW - data dissemination

KW - data privacy

KW - IPUMS-International

KW - microdata samples

KW - population census

KW - statistical disclosure controls

UR - http://www.scopus.com/inward/record.url?scp=84867536500&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867536500&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-33627-0-14

DO - 10.1007/978-3-642-33627-0-14

M3 - Conference contribution

SN - 9783642336263

VL - 7556 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 179

EP - 187

BT - Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2012, Proceedings

ER -