TY - GEN
T1 - Integrating multi-source biological data for transcriptional regulatory module discovery
AU - Ressom, Habtom W.
AU - Zhang, Yuji
AU - Xuan, Jianhua
AU - Wang, Yue
AU - Clarke, Robert
PY - 2007
Y1 - 2007
N2 - The design principles of gene transcriptional regulation networks in cells have been puzzles due to their unknown dynamic and nonlinear mechanisms. Although high-throughput biotechnologies have generated unprecedented amounts of data, the integration of multi-source data to better understand the process of gene regulation has been a challenge in the post-genomics era. Gene expression data are limited in providing information about the underlying causal relationships among genes. Prior biological knowledge such as protein binding data and gene ontology annotation, albeit limited in quantity, reflects physical processes of gene regulation. In this paper, we introduce a computational framework for utilizing time-course gene expression patterns, protein binding data, and gene ontology information to infer transcriptional regulatory modules. The proposed method mainly consists of three parts: (1) a fuzzy c-means clustering approach that exploits gene functional category information to define gene clusters; (2) a network motif detection tool that classifies the transcription factors into different kinds of regulatory modules based on protein binding data; and (3) a recurrent neural network model for each transcription factor that mimics the architecture of the predicted regulatory module. A hybrid of genetic algorithm and particle swarm optimization methods is applied to search for gene clusters that may be regulated by the transcription factor and to determine the parameters of the recurrent neural network. The proposed method is tested on the yeast cell cycle process. The inferred gene transcriptional regulatory networks are compared with previously reported results in the literature.
AB - The design principles of gene transcriptional regulation networks in cells have been puzzles due to their unknown dynamic and nonlinear mechanisms. Although high-throughput biotechnologies have generated unprecedented amounts of data, the integration of multi-source data to better understand the process of gene regulation has been a challenge in the post-genomics era. Gene expression data are limited in providing information about the underlying causal relationships among genes. Prior biological knowledge such as protein binding data and gene ontology annotation, albeit limited in quantity, reflects physical processes of gene regulation. In this paper, we introduce a computational framework for utilizing time-course gene expression patterns, protein binding data, and gene ontology information to infer transcriptional regulatory modules. The proposed method mainly consists of three parts: (1) a fuzzy c-means clustering approach that exploits gene functional category information to define gene clusters; (2) a network motif detection tool that classifies the transcription factors into different kinds of regulatory modules based on protein binding data; and (3) a recurrent neural network model for each transcription factor that mimics the architecture of the predicted regulatory module. A hybrid of genetic algorithm and particle swarm optimization methods is applied to search for gene clusters that may be regulated by the transcription factor and to determine the parameters of the recurrent neural network. The proposed method is tested on the yeast cell cycle process. The inferred gene transcriptional regulatory networks are compared with previously reported results in the literature.
UR - http://www.scopus.com/inward/record.url?scp=50849091581&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=50849091581&partnerID=8YFLogxK
U2 - 10.1109/LSSA.2007.4400915
DO - 10.1109/LSSA.2007.4400915
M3 - Conference contribution
AN - SCOPUS:50849091581
SN - 9781424418138
T3 - 2007 IEEE/NIH Life Science Systems and Applications Workshop, LISA
SP - 184
EP - 187
BT - 2007 IEEE/NIH Life Science Systems and Applications Workshop, LISA
PB - IEEE Computer Society
T2 - 2007 IEEE/NIH Life Science Systems and Applications Workshop, LISA
Y2 - 8 November 2007 through 9 November 2007
ER -