Reflections on Coding 90 Million Historical Occupations

Research output: Contribution to conference (Paper)

Abstract

The North Atlantic Population Project, which brings together late nineteenth-century census records of 90 million individuals from North America, Britain, Norway and Iceland, faced the daunting task of consistently classifying more than 2 million distinct occupations in four different languages, with regional variation in occupational terminology in English, the most common language of respondents. This paper provides a retrospective on the task and offers some generalizations from our experience that other researchers working with historical occupations may find useful. We coded our occupations into a modified version of HISCO. Our modifications to HISCO were primarily designed to reduce the number of codes in use, simplify the classification of laborers and other semi-skilled and unskilled workers who did not provide much information on the specific tasks they were engaged in, and provide more definite headings for commonly used occupational responses for which HISCO provided more flexible codes. Our coding project was successful. We achieved consistent classification across all five countries, and coded all occupations in the dataset into the modified HISCO scheme. HISCO's addition of subsidiary variables (STATUS, RELATE, PRODUCT) was a useful method for retaining information in occupational responses that would otherwise have been lost. The hierarchical structure of the codes simplified coding and consistency checking. HISCO would be improved by adding codes for industry that would allow separation of occupations that occur in different industries but are otherwise reported as the same job. Our use of the contemporary UN product classification for 19th-century occupations was surprisingly successful.
Original language: English (US)
State: Published - 2012

Fingerprint

coding
occupation
skilled worker
industry
regional difference
language
technical language
Norway
census
experience

Cite this

Reflections on Coding 90 Million Historical Occupations. / Roberts, Evan W.

2012.

@conference{958be92dd2fa4e37b0860798b777cf9d,
title = "Reflections on Coding 90 Million Historical Occupations",
author = "Roberts, {Evan W}",
year = "2012",
language = "English (US)",

}

TY - CONF

T1 - Reflections on Coding 90 Million Historical Occupations

AU - Roberts, Evan W

PY - 2012

Y1 - 2012

M3 - Paper

ER -