TY - GEN
T1 - Feature selection via probabilistic outputs
AU - Arnosti, Nicholas A.
AU - Danyluk, Andrea Pohoreckyj
PY - 2012
Y1 - 2012
AB - This paper investigates two feature-scoring criteria that make use of estimated class probabilities: one method proposed by Shen et al. (2008) and a complementary approach proposed below. We develop a theoretical framework to analyze each criterion and show that both estimate the spread (across all values of a given feature) of the probability that an example belongs to the positive class. Based on our analysis, we predict when each scoring technique will be advantageous over the other and give empirical results validating our predictions.
UR - http://www.scopus.com/inward/record.url?scp=84867126717&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867126717&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84867126717
SN - 9781450312851
T3 - Proceedings of the 29th International Conference on Machine Learning, ICML 2012
SP - 1791
EP - 1798
BT - Proceedings of the 29th International Conference on Machine Learning, ICML 2012
T2 - 29th International Conference on Machine Learning, ICML 2012
Y2 - 26 June 2012 through 1 July 2012
ER -