TY - JOUR
T1 - Identification of hepatocellular carcinoma-related genes with a machine learning and network analysis
AU - Gui, Tuantuan
AU - Dong, Xiao
AU - Li, Rudong
AU - Li, Yixue
AU - Wang, Zhen
N1 - Publisher Copyright: © 2015 Mary Ann Liebert, Inc.
PY - 2015/1/1
Y1 - 2015/1/1
N2 - Liver cancer is one of the leading causes of cancer mortality worldwide. Hepatocellular carcinoma (HCC) is the main type of liver cancer. We applied a machine learning approach with maximum-relevance-minimum-redundancy (mRMR) algorithm followed by incremental feature selection (IFS) to a set of microarray data generated from 43 tumor and 52 nontumor samples. With the machine learning approach, we identified 117 gene probes that could optimally separate tumor and nontumor samples. These genes not only include known HCC-relevant genes such as MT1X, BMI1, and CAP2, but also include cancer genes that were not found previously to be closely related to HCC, such as TACSTD2. Then, we constructed a molecular interaction network based on the protein-protein interaction (PPI) data from the STRING database and identified 187 genes on the shortest paths among the genes identified with the machine learning approach. Network analysis reveals new potential roles of ubiquitin C in the pathogenesis of HCC. Based on gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, we showed that the identified subnetwork is significantly enriched in biological processes related to cell death. These results bring new insights of understanding the process of HCC.
AB - Liver cancer is one of the leading causes of cancer mortality worldwide. Hepatocellular carcinoma (HCC) is the main type of liver cancer. We applied a machine learning approach with maximum-relevance-minimum-redundancy (mRMR) algorithm followed by incremental feature selection (IFS) to a set of microarray data generated from 43 tumor and 52 nontumor samples. With the machine learning approach, we identified 117 gene probes that could optimally separate tumor and nontumor samples. These genes not only include known HCC-relevant genes such as MT1X, BMI1, and CAP2, but also include cancer genes that were not found previously to be closely related to HCC, such as TACSTD2. Then, we constructed a molecular interaction network based on the protein-protein interaction (PPI) data from the STRING database and identified 187 genes on the shortest paths among the genes identified with the machine learning approach. Network analysis reveals new potential roles of ubiquitin C in the pathogenesis of HCC. Based on gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, we showed that the identified subnetwork is significantly enriched in biological processes related to cell death. These results bring new insights of understanding the process of HCC.
KW - Hepatocellular carcinoma
KW - maximum relevance minimum redundancy
KW - protein-protein interaction
UR - http://www.scopus.com/inward/record.url?scp=84920276400&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84920276400&partnerID=8YFLogxK
U2 - 10.1089/cmb.2014.0122
DO - 10.1089/cmb.2014.0122
M3 - Article
C2 - 25247452
AN - SCOPUS:84920276400
SN - 1066-5277
VL - 22
SP - 63
EP - 71
JO - Journal of Computational Biology
JF - Journal of Computational Biology
IS - 1
ER -