TY - GEN
T1 - Multiple sources classification of gene position on chromosomes using statistical significance of individual classification results
AU - Alnemer, Loai M.
AU - Al-Azzam, Omar
AU - Chitraranjan, Charith
AU - Denton, Anne M.
AU - Bassi, Filippo M.
AU - Iqbal, Muhammad J.
AU - Kianian, Shahryar F.
PY - 2011/12/1
Y1 - 2011/12/1
N2 - In data mining applications it is common to have more than one data source available to describe the same record. For example, in the biological sciences, the same genes may be characterized through many types of experiments. Which of the data sources proves most reliable for prediction may depend on the record in question. For some records, pieces of information may be unavailable because an experiment has not yet been done, or certain types of inference may not be applicable, such as when a gene does not have a homologue in some species. We demonstrate how multi-classifier systems can allow classification in cases where any individual source is too scarce or unreliable to provide an accurate prediction model by itself. We propose a method to predict a class label using the statistical significance of individual classification results. We show that the proposed approach increases the accuracy of results compared with conventional techniques in a problem related to gene mapping in wheat.
AB - In data mining applications it is common to have more than one data source available to describe the same record. For example, in the biological sciences, the same genes may be characterized through many types of experiments. Which of the data sources proves most reliable for prediction may depend on the record in question. For some records, pieces of information may be unavailable because an experiment has not yet been done, or certain types of inference may not be applicable, such as when a gene does not have a homologue in some species. We demonstrate how multi-classifier systems can allow classification in cases where any individual source is too scarce or unreliable to provide an accurate prediction model by itself. We propose a method to predict a class label using the statistical significance of individual classification results. We show that the proposed approach increases the accuracy of results compared with conventional techniques in a problem related to gene mapping in wheat.
KW - Density-based algorithms
KW - classifier-fusion
KW - gene mapping
KW - synteny information
UR - http://www.scopus.com/inward/record.url?scp=84857863049&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84857863049&partnerID=8YFLogxK
U2 - 10.1109/ICMLA.2011.101
DO - 10.1109/ICMLA.2011.101
M3 - Conference contribution
AN - SCOPUS:84857863049
SN - 9780769546070
T3 - Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011
SP - 7
EP - 12
BT - Proceedings - 10th International Conference on Machine Learning and Applications, ICMLA 2011
T2 - 10th International Conference on Machine Learning and Applications, ICMLA 2011
Y2 - 18 December 2011 through 21 December 2011
ER -