The IPUMS project: An update

Research output: Contribution to journal › Article › peer-review

14 Scopus citations
Original language: English (US)
Pages (from-to): 102-110
Number of pages: 9
Journal: Historical Methods
Issue number: 3
State: Published - Jun 1999

Bibliographical note

Funding Information:
In 1997, the NIH and NSF jointly awarded a four-year grant for “Electronic Dissemination” of the IPUMS (Grants HD34714 and SBR-9617820, respectively).

Funding Information:
The Integrated Public Use Microdata Series (IPUMS) is a coherent national database describing the characteristics of 55 million Americans in thirteen census years spanning the period from 1850 through 1990. It combines census microdata files produced by the U.S. Census Bureau since 1960 with new historical census files produced at the University of Minnesota and elsewhere. By putting the samples in the same format, imposing consistent variable coding, and carefully documenting changes in variables over time, the IPUMS is designed to facilitate the use of the census samples as a time series. The database includes comprehensive and comprehensible documentation amounting to some three thousand pages of text, including detailed analyses of the comparability of every variable across every census year. Both the database and the documentation are distributed through an on-line data access system at http://www.ipums.umn.edu, which provides powerful extraction and search capabilities for easy access to both metadata and microdata. The project is funded by the National Science Foundation (NSF) and the National Institutes of Health (NIH), so all data and documentation are available without cost.

The characteristics of the IPUMS samples are detailed in table 1. There are only two census years missing from the series: 1890 (data destroyed by fire) and 1930 (data still subject to seventy-two-year census confidentiality rules). In 2002, it will be possible to add 1930 to the series. For most years before 1970, we have a 1 percent random sample of the population. In several cases (1860, 1870, 1900, and 1910), the preliminary versions of the samples now available are smaller, but each will eventually contain 1 percent of the population. We have just begun a project to expand the sample. Thus, by 2007, we expect to have collected, in a consistent manner, samples of at least 1-in-100 density for every possible census year since 1850.

The early samples are much smaller than recent ones, partly because the population was smaller. Moreover, the sample density available for the period since 1970 is much higher than in earlier census years: For each of the past three censuses, we have microdata on at least 6 percent of the population, which allows analysis of very small population subgroups. The census also tended to ask more questions over time (see table 1 for the number of variables). Although the earlier samples are smaller and less detailed than their modern counterparts, they are still the largest and richest sources available for quantitative historical research in that period and are capable of supporting research on topics ranging from marriage and fertility to social stratification and household structure.

Funding Information:
The IPUMS project began in 1992 with a three-year grant from the NSF (Grant SBR-9118299). Some panel members were skeptical about whether the census samples were sufficiently comparable to be integrated, and some felt that if we succeeded in what we proposed, we might lower the bar sufficiently for the less sophisticated to gain access to the data and make mistakes. Fortunately, such opinions did not prevail, and the project was funded.
