M3U: Minimum Mean Minimum Uncertainty Feature Selection for Multiclass Classification

Zisheng Zhang, Keshab K Parhi

Research output: Contribution to journal › Article

Abstract

This paper presents a novel multiclass feature selection algorithm based on weighted conditional entropy, also referred to as uncertainty. The goal of the proposed algorithm is to select a feature subset such that, for each feature sample, there exists a feature in the selected subset that has a low uncertainty score. Features are first quantized into bins. The proposed method then computes an uncertainty vector from the weighted conditional entropy; the lower the uncertainty score for a class, the better the separability of the samples in that class. Next, an iterative feature selection method selects one feature per iteration by (1) computing the minimum uncertainty score of each feature sample for all candidate feature subsets, (2) averaging these minimum uncertainty scores across all feature samples, and (3) selecting the feature that minimizes this mean of minimum uncertainty scores. Experimental results show that the proposed algorithm outperforms mRMR, achieving lower misclassification rates on various publicly available datasets. In most cases, the number of features needed to reach a specified misclassification error is smaller than that required by traditional methods. Across all datasets, the misclassification error is reduced by 5-25% on average compared to a traditional method.
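The greedy loop described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact weighting in the paper's weighted conditional entropy is not specified here, so as a stand-in each sample's uncertainty under a feature is taken to be the entropy of the class distribution inside that sample's bin.

```python
import numpy as np

def per_sample_uncertainty(feature_bins, labels):
    """For one quantized feature, assign each sample the entropy of the
    class distribution within its bin (a hedged stand-in for the paper's
    weighted-conditional-entropy uncertainty score)."""
    scores = np.empty(len(labels), dtype=float)
    for b in np.unique(feature_bins):
        in_bin = feature_bins == b
        _, counts = np.unique(labels[in_bin], return_counts=True)
        p = counts / counts.sum()
        scores[in_bin] = -(p * np.log2(p)).sum()
    return scores

def m3u_select(X_bins, labels, k):
    """Greedy selection following the abstract's three steps: for each
    candidate feature, take the per-sample minimum uncertainty over the
    enlarged subset, average it over samples, and add the feature that
    minimizes that mean."""
    n_samples, n_features = X_bins.shape
    # Precompute per-sample uncertainty of every feature.
    sample_u = np.column_stack(
        [per_sample_uncertainty(X_bins[:, j], labels) for j in range(n_features)]
    )
    selected = []
    best_min = np.full(n_samples, np.inf)  # current per-sample minimum
    for _ in range(k):
        remaining = [j for j in range(n_features) if j not in selected]
        # Mean of per-sample minimum uncertainty if feature j were added.
        costs = [np.minimum(best_min, sample_u[:, j]).mean() for j in remaining]
        j_star = remaining[int(np.argmin(costs))]
        selected.append(j_star)
        best_min = np.minimum(best_min, sample_u[:, j_star])
    return selected
```

On a toy dataset where one quantized feature separates the classes perfectly (zero entropy in every bin) and another does not, the perfectly separating feature is selected first, matching the intuition that low uncertainty means good class separability.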

Original language: English (US)
Journal: Journal of Signal Processing Systems
DOI: 10.1007/s11265-019-1443-6
State: Published - Jan 1 2019

Keywords

  • Feature selection
  • Multi-class classification
  • Mutual information
  • Uncertainty score
  • Weighted conditional entropy

Cite this

@article{35674fa2754d453da6f82da5f99d2346,
title = "M3U: Minimum Mean Minimum Uncertainty Feature Selection for Multiclass Classification",
abstract = "This paper presents a novel multiclass feature selection algorithm based on weighted conditional entropy, also referred to as uncertainty. The goal of the proposed algorithm is to select a feature subset such that, for each feature sample, there exists a feature in the selected subset that has a low uncertainty score. Features are first quantized into bins. The proposed method then computes an uncertainty vector from the weighted conditional entropy; the lower the uncertainty score for a class, the better the separability of the samples in that class. Next, an iterative feature selection method selects one feature per iteration by (1) computing the minimum uncertainty score of each feature sample for all candidate feature subsets, (2) averaging these minimum uncertainty scores across all feature samples, and (3) selecting the feature that minimizes this mean of minimum uncertainty scores. Experimental results show that the proposed algorithm outperforms mRMR, achieving lower misclassification rates on various publicly available datasets. In most cases, the number of features needed to reach a specified misclassification error is smaller than that required by traditional methods. Across all datasets, the misclassification error is reduced by 5-25{\%} on average compared to a traditional method.",
keywords = "Feature selection, Multi-class classification, Mutual information, Uncertainty score, Weighted conditional entropy",
author = "Zhang, Zisheng and Parhi, {Keshab K.}",
year = "2019",
month = "1",
day = "1",
doi = "10.1007/s11265-019-1443-6",
language = "English (US)",
journal = "Journal of Signal Processing Systems",
issn = "1939-8018",
publisher = "Springer New York",

}

TY - JOUR

T1 - M3U

T2 - Minimum Mean Minimum Uncertainty Feature Selection for Multiclass Classification

AU - Zhang, Zisheng

AU - Parhi, Keshab K

PY - 2019/1/1

Y1 - 2019/1/1

N2 - This paper presents a novel multiclass feature selection algorithm based on weighted conditional entropy, also referred to as uncertainty. The goal of the proposed algorithm is to select a feature subset such that, for each feature sample, there exists a feature in the selected subset that has a low uncertainty score. Features are first quantized into bins. The proposed method then computes an uncertainty vector from the weighted conditional entropy; the lower the uncertainty score for a class, the better the separability of the samples in that class. Next, an iterative feature selection method selects one feature per iteration by (1) computing the minimum uncertainty score of each feature sample for all candidate feature subsets, (2) averaging these minimum uncertainty scores across all feature samples, and (3) selecting the feature that minimizes this mean of minimum uncertainty scores. Experimental results show that the proposed algorithm outperforms mRMR, achieving lower misclassification rates on various publicly available datasets. In most cases, the number of features needed to reach a specified misclassification error is smaller than that required by traditional methods. Across all datasets, the misclassification error is reduced by 5-25% on average compared to a traditional method.

AB - This paper presents a novel multiclass feature selection algorithm based on weighted conditional entropy, also referred to as uncertainty. The goal of the proposed algorithm is to select a feature subset such that, for each feature sample, there exists a feature in the selected subset that has a low uncertainty score. Features are first quantized into bins. The proposed method then computes an uncertainty vector from the weighted conditional entropy; the lower the uncertainty score for a class, the better the separability of the samples in that class. Next, an iterative feature selection method selects one feature per iteration by (1) computing the minimum uncertainty score of each feature sample for all candidate feature subsets, (2) averaging these minimum uncertainty scores across all feature samples, and (3) selecting the feature that minimizes this mean of minimum uncertainty scores. Experimental results show that the proposed algorithm outperforms mRMR, achieving lower misclassification rates on various publicly available datasets. In most cases, the number of features needed to reach a specified misclassification error is smaller than that required by traditional methods. Across all datasets, the misclassification error is reduced by 5-25% on average compared to a traditional method.

KW - Feature selection

KW - Multi-class classification

KW - Mutual information

KW - Uncertainty score

KW - Weighted conditional entropy

UR - http://www.scopus.com/inward/record.url?scp=85062040868&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062040868&partnerID=8YFLogxK

U2 - 10.1007/s11265-019-1443-6

DO - 10.1007/s11265-019-1443-6

M3 - Article

JO - Journal of Signal Processing Systems

JF - Journal of Signal Processing Systems

SN - 1939-8018

ER -