TY - GEN
T1 - When excessive perturbation goes wrong and why IPUMS-International relies instead on sampling, suppression, swapping, and other minimally harmful methods to protect privacy of census microdata
AU - Cleveland, Lara
AU - McCaa, Robert
AU - Ruggles, Steven
AU - Sobek, Matthew
PY - 2012
Y1 - 2012
N2 - IPUMS-International disseminates population census microdata at no cost for 69 countries. Currently, a series of 212 samples totaling almost a half billion person records are available to researchers. Registration is required for researchers to gain access to the microdata. Statistics from Google Analytics show that IPUMS-International's lengthy, probing registration form is an effective deterrent for unqualified applicants. To protect data privacy, we rely principally on sampling, suppression of geographic detail, swapping of records across geographic boundaries, and other minimally harmful methods such as top and bottom coding. We do not use excessively perturbative methods. A recent case of perturbation gone wrong (the household samples of the 2000 census of the USA (PUMS), the 2003-2006 American Community Survey, and the 2004-2009 Current Population Survey), an empirical study of the impact of perturbation on the usability of UK census microdata (the Individual SARs of the 1991 census of the UK), and a mathematical demonstration in a timely compendium of statistical confidentiality practices confirm the wisdom of IPUMS microdata management protocols and statistical disclosure controls.
AB - IPUMS-International disseminates population census microdata at no cost for 69 countries. Currently, a series of 212 samples totaling almost a half billion person records are available to researchers. Registration is required for researchers to gain access to the microdata. Statistics from Google Analytics show that IPUMS-International's lengthy, probing registration form is an effective deterrent for unqualified applicants. To protect data privacy, we rely principally on sampling, suppression of geographic detail, swapping of records across geographic boundaries, and other minimally harmful methods such as top and bottom coding. We do not use excessively perturbative methods. A recent case of perturbation gone wrong (the household samples of the 2000 census of the USA (PUMS), the 2003-2006 American Community Survey, and the 2004-2009 Current Population Survey), an empirical study of the impact of perturbation on the usability of UK census microdata (the Individual SARs of the 1991 census of the UK), and a mathematical demonstration in a timely compendium of statistical confidentiality practices confirm the wisdom of IPUMS microdata management protocols and statistical disclosure controls.
KW - IPUMS-International
KW - data dissemination
KW - data privacy
KW - microdata samples
KW - population census
KW - statistical disclosure controls
UR - http://www.scopus.com/inward/record.url?scp=84867536500&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867536500&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-33627-0_14
DO - 10.1007/978-3-642-33627-0_14
M3 - Conference contribution
AN - SCOPUS:84867536500
SN - 9783642336263
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 179
EP - 187
BT - Privacy in Statistical Databases - UNESCO Chair in Data Privacy, International Conference, PSD 2012, Proceedings
PB - Springer Verlag
T2 - International Conference on Privacy in Statistical Databases, PSD 2012
Y2 - 26 September 2012 through 28 September 2012
ER -