A computational pipeline for crowdsourced transcriptions of Ancient Greek papyrus fragments

Alex C. Williams, John F. Wallin, Haoyu Yu, Marco Perale, Hyrum D. Carroll, Anne Francoise Lamblin, Lucy F. Fortson, Dirk Obbink, Chris J. Lintott, James H. Brusuelas

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

6 Citations (Scopus)

Abstract

In the late nineteenth century, two excavators from the University of Oxford uncovered a vast trove of naturally deteriorated papyri, numbering over 500,000 fragments, from the city of Oxyrhynchus. With varying levels and forms of deterioration, the identification of a papyrus fragment can become a repetitive, long, and exhausting process for a professional papyrologist. The University of Oxford's Ancient Lives project aims to accelerate the identification process through citizen science (or crowdsourcing). In the Ancient Lives interface, volunteer users identify letters by clicking on a location in the image to designate the presence of a letter. To date, over 7 million letter identifications from users across the world have been recorded in the Ancient Lives database. In this paper, we present a computational pipeline for converting crowdsourced letter identifications made through the Ancient Lives interface into digital consensus transcriptions of papyrus fragments. We conclude by explaining the usefulness of the pipeline output in the context of additional computational projects that aim to further accelerate the identification process.
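
The abstract does not spell out how the pipeline works, so the following is only a minimal illustrative sketch of one way to turn click-based letter identifications into a consensus string. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the (x, y, letter) input format, the greedy distance-based grouping, the `radius` and `min_votes` parameters, and the left-to-right ordering on a single line of text.

```python
from collections import Counter
from math import hypot


def consensus_letters(clicks, radius=10.0, min_votes=2):
    """Group nearby clicks and keep the most common letter per group.

    clicks: iterable of (x, y, letter) tuples from different volunteers.
    radius and min_votes are hypothetical tuning parameters.
    """
    clusters = []  # each cluster tracks a running mean position and its votes
    for x, y, letter in clicks:
        for c in clusters:
            if hypot(x - c["x"], y - c["y"]) <= radius:
                n = len(c["letters"])
                # Update the running mean position with the new click.
                c["x"] = (c["x"] * n + x) / (n + 1)
                c["y"] = (c["y"] * n + y) / (n + 1)
                c["letters"].append(letter)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append({"x": x, "y": y, "letters": [letter]})

    result = []
    for c in clusters:
        # Discard positions that too few volunteers agreed on.
        if len(c["letters"]) >= min_votes:
            letter, _ = Counter(c["letters"]).most_common(1)[0]
            result.append((c["x"], c["y"], letter))
    # Assumes a single line of text read left to right.
    return sorted(result, key=lambda r: r[0])


def transcribe(clicks, **kwargs):
    """Join the consensus letters into a string."""
    return "".join(letter for _, _, letter in consensus_letters(clicks, **kwargs))


sample_clicks = [
    (10, 10, "α"), (12, 11, "α"), (11, 9, "δ"),  # three volunteers, same spot
    (40, 10, "β"), (41, 12, "β"),                # two volunteers agree
    (90, 50, "γ"),                               # a lone click, dropped
]
print(transcribe(sample_clicks))
```

A production pipeline would likely need a more robust clustering step and explicit handling of multiple text lines; this sketch only shows the general idea of aggregating many volunteers' clicks into one reading.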

Original language: English (US)
Title of host publication: Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014
Editors: Wo Chang, Jun Huan, Nick Cercone, Saumyadipta Pyne, Vasant Honavar, Jimmy Lin, Xiaohua Tony Hu, Charu Aggarwal, Bamshad Mobasher, Jian Pei, Raghunath Nambiar
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 100-105
Number of pages: 6
ISBN (Electronic): 9781479956654
DOI: https://doi.org/10.1109/BigData.2014.7004460
State: Published - Jan 7 2015
Event: 2nd IEEE International Conference on Big Data, IEEE Big Data 2014 - Washington, United States
Duration: Oct 27 2014 to Oct 30 2014

Publication series

Name: Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014

Other

Other: 2nd IEEE International Conference on Big Data, IEEE Big Data 2014
Country: United States
City: Washington
Period: 10/27/14 to 10/30/14

Keywords

  • Big data
  • Crowdsourcing
  • Human computation
  • Papyrus transcription

Cite this

Williams, A. C., Wallin, J. F., Yu, H., Perale, M., Carroll, H. D., Lamblin, A. F., ... Brusuelas, J. H. (2015). A computational pipeline for crowdsourced transcriptions of Ancient Greek papyrus fragments. In W. Chang, J. Huan, N. Cercone, S. Pyne, V. Honavar, J. Lin, X. T. Hu, C. Aggarwal, B. Mobasher, J. Pei, ... R. Nambiar (Eds.), Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014 (pp. 100-105). [7004460] (Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/BigData.2014.7004460
