Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees

Mohammad Hossain, Arjun G. Roy, Arijit Chatterjee, William Perrizo

Research output: Chapter in Book/Report/Conference proceedingConference contribution

3 Scopus citations

Abstract

In data mining applications different types of distance metrics are used to measure the closeness of two data points. Among these metrics Manhattan (L1), Euclidean (L2) and Max (L.) distances are used very frequently in various algorithms. In pTree vertical data representation Max distance can be efficiently implemented using only bitwise operations across the pTrees without any horizontal access of the data points. But many clustering and classification algorithms require computing L1 and L2 distances in order to increase 1 their accuracy. In this paper we have shown how Manhattan or L1 distance can be calculated for vertical data represented in pTrees. Similar to the Max distance this algorithm also uses only bitwise operations across various pTrees without performing any horizontal scan of the data points. As a result the algorithm works very fast on huge volume of data represented by pTrees comparing with traditional horizontal data representation. Also these algorithms enable various data mining algorithms that use pTrees to improve their accuracy without sacrificing any significant speed.

Original languageEnglish (US)
Title of host publicationProceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012
Pages65-70
Number of pages6
StatePublished - Dec 1 2012
Externally publishedYes
Event27th International Conference on Computers and Their Applications, CATA 2012 - Las Vegas, NV, United States
Duration: Mar 12 2012Mar 14 2012

Publication series

NameProceedings of the ISCA 27th International Conference on Computers and Their Applications, CATA 2012

Conference

Conference27th International Conference on Computers and Their Applications, CATA 2012
CountryUnited States
CityLas Vegas, NV
Period3/12/123/14/12

Fingerprint Dive into the research topics of 'Algorithms to calculate the Manhattan (L1) distance for vertical data represented in pTrees'. Together they form a unique fingerprint.

Cite this