TY - GEN
T1 - Scalable and Adaptive KNN for Regression over Graphs
AU - Barrash, Seth J
AU - Shen, Yanning
AU - Giannakis, Georgios B.
PY - 2019/12
Y1 - 2019/12
N2 - The family of k-nearest neighbor (kNN) schemes comprises simple and effective algorithms that can be used for general machine learning tasks such as classification and regression. The goal of linear kNN regression is to predict the value of an unseen datum as a linear combination of its k neighbors on a graph, which can be found via a given distance metric. Despite the simplicity and effectiveness of kNN regression, a general problem facing this approach is the proper choice of k. Most conventional kNN algorithms simply apply the same k to all data samples, meaning that the data samples are assumed to lie on a regular graph wherein each data sample is connected with k neighbors. However, a constant choice may not be optimal for data lying in a heterogeneous feature space. On the other hand, existing algorithms for adaptively choosing k usually incur high computational complexity, especially for large training datasets. To cope with this challenge, this paper introduces a novel algorithm that adaptively chooses k for each data sample while greatly reducing training complexity by actively choosing training samples. Real-data tests corroborate the efficiency and effectiveness of the novel algorithm.
AB - The family of k-nearest neighbor (kNN) schemes comprises simple and effective algorithms that can be used for general machine learning tasks such as classification and regression. The goal of linear kNN regression is to predict the value of an unseen datum as a linear combination of its k neighbors on a graph, which can be found via a given distance metric. Despite the simplicity and effectiveness of kNN regression, a general problem facing this approach is the proper choice of k. Most conventional kNN algorithms simply apply the same k to all data samples, meaning that the data samples are assumed to lie on a regular graph wherein each data sample is connected with k neighbors. However, a constant choice may not be optimal for data lying in a heterogeneous feature space. On the other hand, existing algorithms for adaptively choosing k usually incur high computational complexity, especially for large training datasets. To cope with this challenge, this paper introduces a novel algorithm that adaptively chooses k for each data sample while greatly reducing training complexity by actively choosing training samples. Real-data tests corroborate the efficiency and effectiveness of the novel algorithm.
KW - Gaussian processes
KW - k-nearest neighbors
KW - learning over graphs
KW - regression
UR - http://www.scopus.com/inward/record.url?scp=85082396188&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082396188&partnerID=8YFLogxK
U2 - 10.1109/CAMSAP45676.2019.9022509
DO - 10.1109/CAMSAP45676.2019.9022509
M3 - Conference contribution
T3 - 2019 IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, CAMSAP 2019 - Proceedings
SP - 241
EP - 245
BT - 2019 IEEE 8th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, CAMSAP 2019 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 8th IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, CAMSAP 2019
Y2 - 15 December 2019 through 18 December 2019
ER -