Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data

Getiria Onsongo, Hongwei Xie, Timothy J. Griffin, John V. Carlis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Recent developments in high-throughput proteomics technologies have made it possible to detect and identify low-abundance proteins. These technologies provide a new window through which proteomes can be analyzed. Despite this promise, the contribution of mass spectrometry-based proteomics to identifying novel diagnostic biomarkers has been disappointing. This failure has, in part, been attributed to the lack of effective strategies for determining which candidate biomarkers justify more expensive and time-consuming validation studies. An approach that bridges the gap between the unbiased experimental paradigm, which emphasizes comprehensive characterization of proteins, and the candidate-driven paradigm would overcome this limitation [38]. To this end, we have developed database operators that extend database management systems to analyze high-throughput proteomics and genomics data. By analyzing differentially expressed genes and proteins using pathway databases, these operators take advantage of established expert domain knowledge in pathway annotation to prioritize candidate biomarkers. They provide a systematic way of bridging the gap between the unbiased experimental paradigm and the candidate-driven paradigm. To test the operators, we analyzed a dataset of salivary proteins differentially expressed between pre-malignant and malignant oral lesions. Six proteins were identified as candidate biomarkers worthy of validation studies. A literature search reveals that these high-priority candidate biomarkers interact with proteins implicated in cancer development, which highlights their potential utility as biomarkers and demonstrates the effectiveness of our operators.
The developed operators will help overcome one of the main challenges of high-throughput computational techniques: providing a systematic way of bridging the gap between the unbiased, data-driven approach and the hypothesis-driven approach in order to prioritize candidate biomarkers worthy of more expensive and time-consuming validation studies.
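
The abstract does not specify how the operators are defined, so the following is only a minimal sketch of the general idea it describes: joining differentially expressed proteins against pathway annotations and ranking the matches. Everything here is an assumption made for illustration, not the authors' operators: the table names `expressed` and `pathway_member`, the function `prioritize`, the log2 fold-change threshold, the ranking by number of annotated pathways, and the placeholder identifiers `P1` to `P4` and `pw_A` to `pw_C`.

```python
import sqlite3


def prioritize(expressed, pathway_members, min_abs_log2fc=1.0):
    """Rank differentially expressed proteins by pathway membership.

    Hypothetical illustration only: the scoring rule (count of distinct
    pathways a protein is annotated to) is an assumption, not the
    method of the paper.
    """
    con = sqlite3.connect(":memory:")
    try:
        # Two toy relations: expression measurements and pathway annotation.
        con.execute("CREATE TABLE expressed (protein TEXT, log2fc REAL)")
        con.execute("CREATE TABLE pathway_member (pathway TEXT, protein TEXT)")
        con.executemany("INSERT INTO expressed VALUES (?, ?)", expressed)
        con.executemany("INSERT INTO pathway_member VALUES (?, ?)", pathway_members)
        # Keep proteins whose change exceeds the threshold, join them to
        # pathway annotation, and order by how many pathways they appear in.
        # The inner join drops proteins with no pathway annotation.
        return con.execute(
            """
            SELECT e.protein, COUNT(DISTINCT m.pathway) AS n_pathways
            FROM expressed AS e
            JOIN pathway_member AS m ON m.protein = e.protein
            WHERE ABS(e.log2fc) >= ?
            GROUP BY e.protein
            ORDER BY n_pathways DESC, e.protein ASC
            """,
            (min_abs_log2fc,),
        ).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    # Placeholder data, not taken from the salivary protein dataset.
    toy_expressed = [("P1", 2.5), ("P2", -1.8), ("P3", 0.2), ("P4", 3.0)]
    toy_pathways = [
        ("pw_A", "P1"), ("pw_B", "P1"),
        ("pw_A", "P2"),
        ("pw_A", "P3"), ("pw_B", "P3"), ("pw_C", "P3"),
    ]
    for protein, n_pathways in prioritize(toy_expressed, toy_pathways):
        print(protein, n_pathways)
```

Expressing the ranking as a single relational query keeps the filtering, joining, and ordering inside the database engine, which is the kind of extension of a database management system the abstract refers to. A real operator would presumably use a richer, domain-informed scoring rule.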

Original language: English (US)
Title of host publication: 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010
Pages: 25-34
Number of pages: 10
DOIs: https://doi.org/10.1145/1854776.1854786
State: Published - Oct 25 2010
Event: 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010 - Niagara Falls, NY, United States
Duration: Aug 2 2010 to Aug 4 2010

Publication series

Name: 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010

Other

Other: 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010
Country: United States
City: Niagara Falls, NY
Period: 8/2/10 to 8/4/10

Fingerprint

Biomarkers
Throughput
Proteins
Validation Studies
Proteomics
Database Management Systems
Bridge approaches
Databases
Salivary Proteins and Peptides
Technology
Proteome
Genomics
Mass spectrometry
Genes
Neoplasms

Cite this

Onsongo, G., Xie, H., Griffin, T. J., & Carlis, J. V. (2010). Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data. In 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010 (pp. 25-34). (2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010). https://doi.org/10.1145/1854776.1854786

Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data. / Onsongo, Getiria; Xie, Hongwei; Griffin, Timothy J.; Carlis, John V.

2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010. 2010. p. 25-34 (2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010).

Onsongo, G, Xie, H, Griffin, TJ & Carlis, JV 2010, Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data. in 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010. 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010, pp. 25-34, 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010, Niagara Falls, NY, United States, 8/2/10. https://doi.org/10.1145/1854776.1854786
Onsongo G, Xie H, Griffin TJ, Carlis JV. Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data. In 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010. 2010. p. 25-34. (2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010). https://doi.org/10.1145/1854776.1854786
Onsongo, Getiria ; Xie, Hongwei ; Griffin, Timothy J. ; Carlis, John V. / Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data. 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010. 2010. pp. 25-34 (2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010).
@inproceedings{41ee8b11886e4d07a1e4c1dcf2b3df92,
title = "Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data",
author = "Getiria Onsongo and Hongwei Xie and Griffin, {Timothy J.} and Carlis, {John V.}",
year = "2010",
month = "10",
day = "25",
doi = "10.1145/1854776.1854786",
language = "English (US)",
isbn = "9781450304382",
series = "2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010",
pages = "25--34",
booktitle = "2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010",

}

TY - GEN

T1 - Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data

AU - Onsongo, Getiria

AU - Xie, Hongwei

AU - Griffin, Timothy J.

AU - Carlis, John V.

PY - 2010/10/25

Y1 - 2010/10/25

UR - http://www.scopus.com/inward/record.url?scp=77958052636&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77958052636&partnerID=8YFLogxK

U2 - 10.1145/1854776.1854786

DO - 10.1145/1854776.1854786

M3 - Conference contribution

SN - 9781450304382

T3 - 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010

SP - 25

EP - 34

BT - 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010

ER -