Identification of Ancient Greek Papyrus Fragments Using Genetic Sequence Alignment Algorithms

Alex C. Williams, Hyrum D. Carroll, John F. Wallin, James Brusuelas, Lucy F. Fortson, Anne Francoise Lamblin, Haoyu Yu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Papyrologists analyze, transcribe, and edit papyrus fragments in order to enrich modern lives by better understanding the linguistics, culture, and literature of the ancient world. One of their common tasks is to match an unknown fragment to a known manuscript. This is especially challenging when the fragments are damaged and contain only limited information (e.g., due to deterioration). In the last 100 years, only about 10% of the more than 500,000 fragments recovered from the Egyptian village of Oxyrhynchus have been edited. We do not know what new ancient texts might be found and what can be learned from them, but using current methods of identification this process will take in excess of 1000 years. The identification of an anonymous string of characters with a collection of known text sequences is ubiquitous in computational biology. Genes are often represented by a sequence of continuous characters, each of which denotes an amino acid. Relationships are inferred by finding multi-letter patterns shared between the anonymous sequence and a known sequence. This process is commonly referred to as genetic sequence alignment. In this paper, we introduce a novel methodology that uses modern genetic sequence alignment algorithms as a method for identifying Ancient Greek text fragments. This application will offer papyrologists and other professionals in the humanities the ability to rapidly identify severely damaged texts. This approach leverages a new form of non-contextual, multi-line text identification for the Greek language that can greatly accelerate the tedious task of transcription and identification.
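The matching idea described in the abstract, finding multi-letter patterns shared between an anonymous sequence and known sequences, can be illustrated with a small local-alignment scorer. The following is a minimal sketch in the style of Smith-Waterman local alignment, not the authors' implementation: the abstract does not name the algorithm or scoring scheme the paper uses, so the scoring values, the `normalize` step, the function names, and the two-line toy corpus are all assumptions made for illustration.

```python
import unicodedata


def normalize(text: str) -> str:
    """Reduce Greek text to bare lowercase letters (assumed preprocessing).

    Accents and breathings are dropped by decomposing to NFD and keeping
    only alphabetic characters; spaces and punctuation are dropped too,
    and final sigma is folded into medial sigma.
    """
    decomposed = unicodedata.normalize("NFD", text)
    letters = "".join(c for c in decomposed if c.isalpha())
    return letters.lower().replace("ς", "σ")


def local_alignment_score(query: str, reference: str,
                          match: int = 2, mismatch: int = -1,
                          gap: int = -2) -> int:
    """Best local alignment score between two strings (linear gap penalty).

    The scoring values are illustrative defaults, not tuned for Greek.
    """
    prev = [0] * (len(reference) + 1)
    best = 0
    for q in query:
        curr = [0]
        for j, r in enumerate(reference, start=1):
            diag = prev[j - 1] + (match if q == r else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def identify(fragment: str, corpus: dict) -> tuple:
    """Return the best-scoring title and the score for every known text."""
    query = normalize(fragment)
    scores = {title: local_alignment_score(query, normalize(text))
              for title, text in corpus.items()}
    return max(scores, key=scores.get), scores


# Toy corpus: the opening lines of the Iliad and the Odyssey.
CORPUS = {
    "Iliad 1.1": "μῆνιν ἄειδε θεὰ Πηληϊάδεω Ἀχιλῆος",
    "Odyssey 1.1": "ἄνδρα μοι ἔννεπε, μοῦσα, πολύτροπον, ὃς μάλα πολλὰ",
}

if __name__ == "__main__":
    # Hypothetical damaged fragment: one letter misread (ξ in place of ε).
    title, scores = identify("ειδεθξαπηλ", CORPUS)
    print(title, scores)
```

Because the score is local, the fragment needs no surrounding context and tolerates a misread letter: the damaged fragment above still aligns best with the Iliad line. A real system would search a far larger corpus, where heuristic tools from bioinformatics become relevant for speed.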

Original language: English (US)
Title of host publication: Proceedings - 2014 IEEE 10th International Conference on eScience, eScience 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5-10
Number of pages: 6
ISBN (Electronic): 9781479942886
DOI: https://doi.org/10.1109/eScience.2014.14
State: Published - Dec 2 2014
Event: 10th IEEE International Conference on eScience, eScience 2014 - Guaruja, Brazil
Duration: Oct 20 2014 to Oct 24 2014

Publication series

Name: Proceedings - 2014 IEEE 10th International Conference on eScience, eScience 2014
Volume: 2

Other

Other: 10th IEEE International Conference on eScience, eScience 2014
Country: Brazil
City: Guaruja
Period: 10/20/14 to 10/24/14

Keywords

  • Ancient Greek
  • genetic sequence alignment
  • identification
  • papyrus

Cite this

Williams, A. C., Carroll, H. D., Wallin, J. F., Brusuelas, J., Fortson, L. F., Lamblin, A. F., & Yu, H. (2014). Identification of Ancient Greek Papyrus Fragments Using Genetic Sequence Alignment Algorithms. In Proceedings - 2014 IEEE 10th International Conference on eScience, eScience 2014 (pp. 5-10). [6972089] (Proceedings - 2014 IEEE 10th International Conference on eScience, eScience 2014; Vol. 2). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/eScience.2014.14

@inproceedings{0ceb91f3401b4c2895d007121da25f12,
title = "Identification of Ancient Greek Papyrus Fragments Using Genetic Sequence Alignment Algorithms",
keywords = "Ancient Greek, genetic sequence alignment, identification, papyrus",
author = "Williams, {Alex C.} and Carroll, {Hyrum D.} and Wallin, {John F.} and Brusuelas, James and Fortson, {Lucy F} and Lamblin, {Anne Francoise} and Yu, Haoyu",
year = "2014",
month = "12",
day = "2",
doi = "10.1109/eScience.2014.14",
language = "English (US)",
series = "Proceedings - 2014 IEEE 10th International Conference on eScience, eScience 2014",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "5--10",
booktitle = "Proceedings - 2014 IEEE 10th International Conference on eScience, eScience 2014",
}

TY - GEN

T1 - Identification of Ancient Greek Papyrus Fragments Using Genetic Sequence Alignment Algorithms

AU - Williams, Alex C.

AU - Carroll, Hyrum D.

AU - Wallin, John F.

AU - Brusuelas, James

AU - Fortson, Lucy F

AU - Lamblin, Anne Francoise

AU - Yu, Haoyu

PY - 2014/12/2

Y1 - 2014/12/2

KW - Ancient Greek

KW - genetic sequence alignment

KW - identification

KW - papyrus

UR - http://www.scopus.com/inward/record.url?scp=84919657825&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84919657825&partnerID=8YFLogxK

U2 - 10.1109/eScience.2014.14

DO - 10.1109/eScience.2014.14

M3 - Conference contribution

AN - SCOPUS:84919657825

T3 - Proceedings - 2014 IEEE 10th International Conference on eScience, eScience 2014

SP - 5

EP - 10

BT - Proceedings - 2014 IEEE 10th International Conference on eScience, eScience 2014

PB - Institute of Electrical and Electronics Engineers Inc.

ER -