The Census Bureau plans a new approach to disclosure control for the 2020 census that will add noise to every statistic the agency produces for places below the state level. The Bureau argues the new approach is needed because the confidentiality of census responses is threatened by “database reconstruction,” a technique for inferring individual-level responses from tabular data. The Census Bureau constructed hypothetical individual-level census responses from public 2010 tabular data and matched them to internal census records and to outside sources. It did not, however, compare these results to a null model to demonstrate that its success in matching would not be expected by chance. This is analogous to conducting a clinical trial without a control group. We implement a simple simulation to assess how many matches would be expected by chance. We demonstrate that most matches reported in the Census Bureau experiment would be expected by chance alone. To extend the metaphor of the clinical trial, the treatment and the placebo produced similar outcomes. The database reconstruction experiment therefore fails to demonstrate a credible threat to confidentiality.
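The idea of a chance baseline can be illustrated with a minimal sketch. This is not the authors' simulation: the function name `chance_match_rate`, the uniform distribution over age and sex, and the fixed block size are all assumptions made for illustration. The sketch draws a "true" population and an independent random "guess" for each block, then counts how many guessed age-sex records coincide with a real record in the same block.

```python
import random
from collections import Counter


def chance_match_rate(n_blocks=1000, block_size=30, n_ages=100, n_sexes=2, seed=0):
    """Share of purely random (age, sex) guesses that match a real record
    in the same block. All parameters are hypothetical, for illustration."""
    rng = random.Random(seed)

    def draw_block():
        # Assumption: ages and sexes are uniform and independent,
        # which real populations are not.
        return Counter(
            (rng.randrange(n_ages), rng.randrange(n_sexes))
            for _ in range(block_size)
        )

    matched = 0
    total = 0
    for _ in range(n_blocks):
        true_records = draw_block()
        guessed_records = draw_block()  # drawn independently of the truth
        # Multiset intersection: each real record can be matched at most once.
        matched += sum((true_records & guessed_records).values())
        total += block_size
    return matched / total


if __name__ == "__main__":
    for size in (5, 30, 100):
        rate = chance_match_rate(block_size=size)
        print(f"block size {size}: chance match rate {rate:.3f}")
```

Under these assumptions the match rate rises with block size, since a larger block is more likely to contain any given age-sex combination. A reported match rate is informative only to the extent that it exceeds a baseline of this kind.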
Bibliographical note
Funding Information:
Alfred P. Sloan Foundation G-2019-12589, “Implications of Differential Privacy on Decennial Census Data Accuracy and Utility.” Eunice Kennedy Shriver National Institute of Child Health and Human Development P2C HD041023, “Minnesota Population Center.”
© 2021, The Author(s).
- Database reconstruction
- Differential privacy
- Disclosure control