TY - GEN
T1 - Integrative network component analysis for regulatory network reconstruction
AU - Wang, Chen
AU - Xuan, Jianhua
AU - Chen, Li
AU - Zhao, Po
AU - Wang, Yue
AU - Clarke, Robert
AU - Hoffman, Eric P.
PY - 2008
Y1 - 2008
AB - Network Component Analysis (NCA) has shown its effectiveness in regulator identification by inferring the transcription factor activity (TFA) when both microarray data and ChIP-on-chip data are available. However, the NCA scheme is not applicable to many biological studies due to the lack of complete ChIP-on-chip data. In this paper, we propose an integrative NCA (iNCA) approach to combine motif information, limited ChIP-on-chip data, and gene expression data for regulatory network inference. Specifically, a Bayesian framework is adopted to develop a novel strategy, namely stability analysis with topological sampling, to infer key TFAs and their downstream gene targets. The iNCA approach with stability analysis reduces the computational cost by avoiding a direct estimation of the high-dimensional distribution in a traditional Bayesian approach. Stability indices are designed to measure the goodness of the estimated TFAs and their connectivity strengths. The approach can also be used to evaluate the confidence level of different data sources, considering the inevitable inconsistency among the data sources. The iNCA approach has been applied to a time course microarray data set of muscle regeneration. The experimental results show that iNCA can effectively integrate motif information, ChIP-on-chip data and microarray data to identify key regulators and their gene targets in muscle regeneration. In particular, several identified TFAs like those of MyoD, myogenin and YY1 are well supported by biological experiments.
KW - ChIP-on-chip
KW - Gene regulatory networks
KW - Microarray data analysis
KW - Muscle regeneration
KW - Network component analysis
UR - http://www.scopus.com/inward/record.url?scp=49949094443&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=49949094443&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-79450-9_19
DO - 10.1007/978-3-540-79450-9_19
M3 - Conference contribution
AN - SCOPUS:49949094443
SN - 3540794492
SN - 9783540794493
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 196
EP - 207
BT - Bioinformatics Research and Applications - Fourth International Symposium, ISBRA 2008, Proceedings
T2 - 4th International Symposium on Bioinformatics Research and Applications, ISBRA 2008
Y2 - 6 May 2008 through 9 May 2008
ER -