TY - GEN
T1 - Big data clustering via random sketching and validation
AU - Traganitis, Panagiotis A.
AU - Slavakis, Konstantinos
AU - Giannakis, Georgios B.
PY - 2015/4/24
Y1 - 2015/4/24
N2 - As the number and dimensionality of data increase, the development of new, efficient processing tools has become a necessity. The present paper introduces a novel dimensionality reduction approach for fast and efficient clustering of high-dimensional data. The new methods extend random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to the dimensionality reduction problem. The advocated random sketching and validation K-means (SkeVa K-means) and Divergence SkeVa algorithms can achieve high performance, with the latter affording a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
AB - As the number and dimensionality of data increase, the development of new, efficient processing tools has become a necessity. The present paper introduces a novel dimensionality reduction approach for fast and efficient clustering of high-dimensional data. The new methods extend random sampling and consensus (RANSAC) arguments, originally developed for robust regression tasks in computer vision, to the dimensionality reduction problem. The advocated random sketching and validation K-means (SkeVa K-means) and Divergence SkeVa algorithms can achieve high performance, with the latter affording a lower computational footprint than the former. Extensive numerical tests on synthetic and real datasets highlight the potential of the proposed algorithms and demonstrate their competitive performance relative to state-of-the-art random projection alternatives.
KW - Clustering
KW - K-means
KW - big data
KW - feature selection
KW - high-dimensional data
KW - random sampling and consensus
KW - random sketching and validation
UR - http://www.scopus.com/inward/record.url?scp=84940479443&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84940479443&partnerID=8YFLogxK
U2 - 10.1109/ACSSC.2014.7094614
DO - 10.1109/ACSSC.2014.7094614
M3 - Conference contribution
AN - SCOPUS:84940479443
T3 - Conference Record - Asilomar Conference on Signals, Systems and Computers
SP - 1046
EP - 1050
BT - Conference Record of the 48th Asilomar Conference on Signals, Systems and Computers
A2 - Matthews, Michael B.
PB - IEEE Computer Society
T2 - 48th Asilomar Conference on Signals, Systems and Computers, ACSSC 2014
Y2 - 2 November 2014 through 5 November 2014
ER -