L2Knng: Fast exact K-nearest neighbor graph construction with L2-norm pruning

David C. Anastasiu, George Karypis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

13 Scopus citations

Abstract

The k-nearest neighbor graph is often used as a building block in information retrieval, clustering, online advertising, and recommender systems algorithms. The complexity of constructing the exact k-nearest neighbor graph is quadratic in the number of objects that are compared, and most existing methods solve the problem approximately. We present L2Knng, an efficient algorithm that finds the exact cosine similarity k-nearest neighbor graph for a set of sparse high-dimensional objects. Our algorithm quickly builds an approximate solution to the problem, identifying many of the most similar neighbors, and then uses theoretical bounds on the similarity of two vectors, based on the ℓ2-norm of part of the vectors, to find each object's exact k-neighborhood. We perform an extensive evaluation of our algorithm, comparing against both exact and approximate baselines, and demonstrate the efficiency of our method across a variety of real-world datasets and neighborhood sizes. Our approximate and exact L2Knng variants compute the k-nearest neighbor graph up to an order of magnitude faster than their respective baselines.
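
The kind of ℓ2-norm bound the abstract refers to can be illustrated with a small sketch. This is not the authors' implementation: it is a plain pairwise search that abandons a candidate early using the Cauchy-Schwarz inequality, under which the unprocessed tails of two vectors can add at most the product of their ℓ2-norms to the dot product. The function names (`normalize`, `suffix_norms`, `knn_graph`) are hypothetical, vectors are dense Python lists rather than sparse structures, and the approximate initial graph described in the abstract is omitted.

```python
import heapq
import math


def normalize(vec):
    """Scale vec to unit l2-norm so that a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def suffix_norms(vec):
    """Return a list where entry p is the l2-norm of vec[p:] (last entry is 0)."""
    out = [0.0] * (len(vec) + 1)
    acc = 0.0
    for p in range(len(vec) - 1, -1, -1):
        acc += vec[p] * vec[p]
        out[p] = math.sqrt(acc)
    return out


def knn_graph(rows, k):
    """Exact cosine k-nearest neighbor graph with l2-norm early termination.

    Returns (graph, pruned): graph[i] is a list of (similarity, neighbor)
    pairs sorted by decreasing similarity, and pruned counts the candidate
    pairs that were abandoned before their dot product was completed.
    """
    unit = [normalize(r) for r in rows]
    tails = [suffix_norms(u) for u in unit]
    graph, pruned = [], 0
    for i, x in enumerate(unit):
        heap = []  # min-heap holding the k best (similarity, neighbor) pairs
        for j, y in enumerate(unit):
            if i == j:
                continue
            # Similarity a candidate must exceed to enter a full neighborhood.
            threshold = heap[0][0] if len(heap) == k else float("-inf")
            partial, skip = 0.0, False
            for p in range(len(x)):
                partial += x[p] * y[p]
                # Cauchy-Schwarz upper bound on the final similarity; the small
                # slack guards against floating-point rounding.
                if partial + tails[i][p + 1] * tails[j][p + 1] < threshold - 1e-12:
                    skip = True
                    break
            if skip:
                pruned += 1
            elif len(heap) < k:
                heapq.heappush(heap, (partial, j))
            elif partial > heap[0][0]:
                heapq.heapreplace(heap, (partial, j))
        graph.append(sorted(heap, reverse=True))
    return graph, pruned
```

Calling `knn_graph(rows, k)` on a list of equal-length numeric vectors returns each row's neighborhood together with the number of pruned comparisons. Because a candidate is dropped only when its upper bound falls below the current k-th best similarity, the similarities in the result match those of an exhaustive search.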

Original language: English (US)
Title of host publication: CIKM 2015 - Proceedings of the 24th ACM International Conference on Information and Knowledge Management
Publisher: Association for Computing Machinery
Pages: 791-800
Number of pages: 10
ISBN (Electronic): 9781450337946
State: Published - Oct 17 2015
Event: 24th ACM International Conference on Information and Knowledge Management, CIKM 2015 - Melbourne, Australia
Duration: Oct 19 2015 → Oct 23 2015

Publication series

Name: International Conference on Information and Knowledge Management, Proceedings
Volume: 19-23-Oct-2015

Other

Other: 24th ACM International Conference on Information and Knowledge Management, CIKM 2015
Country: Australia
City: Melbourne
Period: 10/19/15 → 10/23/15

Keywords

  • Cosine similarity
  • K-nearest neighbor graph
  • Similarity search
  • Top-k
