Harmonization of census data: IPUMS - International

Research output: Chapter in Book/Report/Conference proceeding › Chapter

1 Scopus citation

Abstract

IPUMS International harmonizes and disseminates census microdata collected over multiple decades by roughly 100 countries. It is the world's largest publicly available population microdata collection. Each census is unique, and IPUMS has developed extensive data infrastructure to manage this heterogeneity. The infrastructure consists of multiple metadata components that describe the source data and the transformations needed to produce harmonized output. Custom-designed software interprets these metadata to execute data transformations and populate the web dissemination system. IPUMS research staff manipulate correspondence tables that translate disparate input values into a single global classification scheme developed for each categorical variable. To avoid losing detail, variables use composite coding structures in which the leading digits identify broadly available categories and the trailing digits provide additional detail available in only a subset of the censuses. The metadata-based approach to variable harmonization is largely self-documenting and easily modified to incorporate new data. The IPUMS infrastructure, dissemination system, and internal processes have evolved over the last 20 years to address new challenges and some deficiencies in early methods. An additional processing stage was introduced to standardize the data prior to harmonization and to develop robust metadata describing the source material. Offering unharmonized source variables to users provides access to the substantively unaltered original data in parallel with the internationally harmonized variables. A tagging system associates questionnaire text with variables; it aids harmonization internally and is also offered to users within the web dissemination system. IPUMS continues to evolve to maximize processing efficiency and empower users of this massive data collection.
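
The correspondence-table and composite-coding ideas in the abstract can be illustrated with a minimal Python sketch. It is not the IPUMS software or its metadata format. The sample names, source values, harmonized codes, and the function names `harmonize` and `general_code` are all hypothetical, chosen only to show how leading digits can carry a category available everywhere while trailing digits carry detail that only some censuses provide.

```python
# Hypothetical correspondence table: (sample, source value) -> harmonized code.
# Every name and code here is invented for illustration; none are IPUMS codes.
# Assumed composite scheme: the first digit is the general category,
# the last two digits are optional detail (00 = no further detail available).
CORRESPONDENCE = {
    ("sample_a", "1"): 100,   # general category 1, no detail in this source
    ("sample_a", "2"): 200,   # general category 2, no detail in this source
    ("sample_b", "M1"): 110,  # general category 1, detail 10
    ("sample_b", "M2"): 120,  # general category 1, detail 20
    ("sample_b", "S"): 200,   # general category 2, no detail
}

UNKNOWN = 999  # assumed code for source values missing from the table


def harmonize(sample: str, source_value: str) -> int:
    """Translate one source value into the hypothetical harmonized code."""
    return CORRESPONDENCE.get((sample, source_value), UNKNOWN)


def general_code(harmonized: int) -> int:
    """Return the leading digit, which is comparable across all samples."""
    return harmonized // 100


if __name__ == "__main__":
    for sample, value in [("sample_a", "1"), ("sample_b", "M2"), ("sample_b", "S")]:
        code = harmonize(sample, value)
        print(sample, value, code, general_code(code))
```

Because the mapping lives in a table rather than in program logic, adding a new census in this sketch means adding rows, which mirrors the abstract's point that a metadata-based approach is easily modified to incorporate new data.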

Original language: English (US)
Title of host publication: Survey Data Harmonization in the Social Sciences
Publisher: Wiley-Blackwell
Pages: 207-226
Number of pages: 20
ISBN (Electronic): 9781119712206
ISBN (Print): 9781119712176
State: Published - Jul 31 2023

Bibliographical note

Publisher Copyright:
© 2024 John Wiley & Sons Inc. Published 2024 by John Wiley & Sons Inc. All rights reserved.

Keywords

  • Census
  • Demography
  • Harmonization
  • Metadata
  • Population
