Reflections on Coding 90 Million Historical Occupations

Research output: Contribution to conference (Paper)


The North Atlantic Population Project, which brings together late nineteenth-century census records of 90 million individuals from North America, Britain, Norway and Iceland, faced the daunting task of consistently classifying more than 2 million distinct occupations in four different languages, with regional variation in occupational terminology in English, the most common language of respondents. This paper provides a retrospective on the task and offers some generalizations from our experience that other researchers working with historical occupations may find useful. We coded our occupations into a modified version of HISCO. Our modifications to HISCO were primarily designed to reduce the number of codes in use, simplify the classification of laborers and other semi- and unskilled workers who did not provide much information on the specific tasks they were engaged in, and provide more definite headings for commonly used occupational responses for which HISCO provided more flexible codes. Our coding project was successful. We achieved consistent classification across all five countries, and coded all occupations in the dataset into the modified HISCO scheme. HISCO's addition of subsidiary variables (STATUS, RELATE, PRODUCT) was a useful method for retaining information in occupational responses that would otherwise have been lost. The hierarchical structure of the codes simplified coding and consistency checking. HISCO would be improved by adding codes for industry that would allow separation of occupations that occur in different industries but are otherwise reported as the same job. Our use of the contemporary UN product classification for 19th century occupations was surprisingly successful.
Original language: English (US)
State: Published - 2012

