Fast parallel cosine K-nearest neighbor graph construction

David C. Anastasiu, George Karypis

Research output: Chapter in Book/Report/Conference proceedingConference contribution

5 Scopus citations

Abstract

The k-nearest neighbor graph is an important structure in many data mining methods for clustering, advertising, recommender systems, and outlier detection. Constructing the graph requires computing up to n2 similarities for a set of n objects. This has led researchers to seek approximate methods, which find many but not all of the nearest neighbors. In contrast, we leverage shared memory parallelism and recent advances in similarity joins to solve the problem exactly, via a filtering based approach. Our method considers all pairs of potential neighbors but quickly filters those that could not be a part of the k-nearest neighbor graph, based on similarity upper bound estimates. We evaluated our solution on several real-world datasets and found that, using 16 threads, our method achieves up to 12.9x speedup over our exact baseline and is sometimes faster even than approximate methods. Moreover, an approximate version of our method is up to 21.7x more efficient than the best approximate state-of-the-art baseline at similar high recall.

Original languageEnglish (US)
Title of host publicationProceedings of IA3 2016 - 6th Workshop on Irregular Applications
Subtitle of host publicationArchitectures and Algorithms, Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages50-53
Number of pages4
ISBN (Electronic)9781509038671
DOIs
StatePublished - Jan 25 2017
Event6th Workshop on Irregular Applications: Architectures and Algorithms, IA3 2016 - Salt Lake City, United States
Duration: Nov 13 2016 → …

Publication series

NameProceedings of IA3 2016 - 6th Workshop on Irregular Applications: Architectures and Algorithms, Held in conjunction with SC 2016: The International Conference for High Performance Computing, Networking, Storage and Analysis

Other

Other6th Workshop on Irregular Applications: Architectures and Algorithms, IA3 2016
CountryUnited States
CitySalt Lake City
Period11/13/16 → …

Fingerprint Dive into the research topics of 'Fast parallel cosine K-nearest neighbor graph construction'. Together they form a unique fingerprint.

Cite this