TY - GEN
T1 - Relational operators for prioritizing candidate biomarkers in high-throughput differential expression data
AU - Onsongo, Getiria
AU - Xie, Hongwei
AU - Griffin, Timothy J.
AU - Carlis, John V.
PY - 2010
Y1 - 2010
N2 - Recent developments in high-throughput proteomics technologies have made it possible to detect and identify low-abundance proteins. These technologies provide a new window through which proteomes can be analyzed. Despite holding great promise, the contribution of mass spectrometry-based proteomics to the identification of novel diagnostic biomarkers has been disappointing. This failure has, in part, been attributed to the lack of effective strategies for determining candidate biomarkers that justify more expensive and time-consuming validation studies. An approach that bridges the gap between an unbiased experimental paradigm emphasizing comprehensive characterization of proteins and a candidate-driven paradigm would overcome this limitation [38]. To this end, we have developed database operators that extend database management systems to analyze high-throughput proteomics and genomics data. By analyzing differentially expressed genes and proteins using pathway databases, these operators take advantage of established expert domain knowledge in pathway annotation to prioritize candidate biomarkers. They provide a systematic way of bridging the gap between the unbiased experimental paradigm and the candidate-driven paradigm. To test the operators, we analyzed a dataset of salivary proteins differentially expressed between pre-malignant and malignant oral lesions. Six proteins are identified as candidate biomarkers worthy of validation studies. A literature search reveals that these high-priority candidate biomarkers interact with proteins implicated in cancer development, highlighting their potential utility as biomarkers and demonstrating the effectiveness of our operators.
The developed operators will help overcome one of the main challenges of high-throughput computational techniques: providing a systematic way of bridging the gap between the unbiased data-driven approach and the hypothesis-driven approach to prioritize candidate biomarkers worthy of more expensive and time-consuming validation studies.
AB - Recent developments in high-throughput proteomics technologies have made it possible to detect and identify low-abundance proteins. These technologies provide a new window through which proteomes can be analyzed. Despite holding great promise, the contribution of mass spectrometry-based proteomics to the identification of novel diagnostic biomarkers has been disappointing. This failure has, in part, been attributed to the lack of effective strategies for determining candidate biomarkers that justify more expensive and time-consuming validation studies. An approach that bridges the gap between an unbiased experimental paradigm emphasizing comprehensive characterization of proteins and a candidate-driven paradigm would overcome this limitation [38]. To this end, we have developed database operators that extend database management systems to analyze high-throughput proteomics and genomics data. By analyzing differentially expressed genes and proteins using pathway databases, these operators take advantage of established expert domain knowledge in pathway annotation to prioritize candidate biomarkers. They provide a systematic way of bridging the gap between the unbiased experimental paradigm and the candidate-driven paradigm. To test the operators, we analyzed a dataset of salivary proteins differentially expressed between pre-malignant and malignant oral lesions. Six proteins are identified as candidate biomarkers worthy of validation studies. A literature search reveals that these high-priority candidate biomarkers interact with proteins implicated in cancer development, highlighting their potential utility as biomarkers and demonstrating the effectiveness of our operators.
The developed operators will help overcome one of the main challenges of high-throughput computational techniques: providing a systematic way of bridging the gap between the unbiased data-driven approach and the hypothesis-driven approach to prioritize candidate biomarkers worthy of more expensive and time-consuming validation studies.
UR - http://www.scopus.com/inward/record.url?scp=77958052636&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77958052636&partnerID=8YFLogxK
U2 - 10.1145/1854776.1854786
DO - 10.1145/1854776.1854786
M3 - Conference contribution
AN - SCOPUS:77958052636
SN - 9781450304382
T3 - 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010
SP - 25
EP - 34
BT - 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010
T2 - 2010 ACM International Conference on Bioinformatics and Computational Biology, ACM-BCB 2010
Y2 - 2 August 2010 through 4 August 2010
ER -