Privacy preserving nearest neighbor search

Mark Shaneck, Yongdae Kim, Vipin Kumar

Research output: Chapter in Book/Report/Conference proceedingChapter

32 Scopus citations

Abstract

Data mining is frequently obstructed by privacy concerns. In many cases data is distributed, and bringing the data together in one place for analysis is not possible due to privacy laws (e.g. HIPAA) or policies. Privacy preserving data mining techniques have been developed to address this issue by providing mechanisms to mine the data while giving certain privacy guarantees. In this chapter we address the issue of privacy preserving nearest neighbor search, which forms the kernel of many data mining applications. To this end, we present a novel algorithm based on secure multiparty computation primitives to compute the nearest neighbors of records in horizontally distributed data. We show how this algorithm can be used in three important data mining algorithms, namely LOF outlier detection, SNN clustering, and kNN classification. We prove the security of these algorithms under the semi-honest adversarial model, and describe methods that can be used to optimize their performance. Keywords: Privacy Preserving Data Mining, Nearest Neighbor Search, Outlier Detection, Clustering, Classification, Secure Multiparty Computation

Original languageEnglish (US)
Title of host publicationMachine Learning in Cyber Trust
Subtitle of host publicationSecurity, Privacy, and Reliability
PublisherSpringer US
Pages247-276
Number of pages30
ISBN (Print)9780387887340
DOIs
StatePublished - 2009

Fingerprint

Dive into the research topics of 'Privacy preserving nearest neighbor search'. Together they form a unique fingerprint.

Cite this