TY - GEN

T1 - Calculating the squared Euclidean distance for vertical data represented in pTrees

AU - Hossain, Mohammad K.

AU - Chatterjee, Arijit

AU - Roy, Arjun G.

AU - Perrizo, William

PY - 2012

Y1 - 2012

N2 - Euclidean distance measures the natural distance between two points in space and is therefore widely used in mathematics. However, it is computationally expensive because it involves costly squaring and square root operations. An alternative is to compute the distance without taking the square root; the result is called the Squared Euclidean Distance (SED). Although SED does not satisfy the triangle inequality, it can be used to compare the distances of two points from a fixed point. For this reason, SED has been used in classification, clustering, image processing, and other areas to save computational time as well as increase accuracy. In this paper, we show how SED can be calculated for vertical data represented in pTrees. The algorithm uses only bitwise operations across pTrees, without any horizontal scan of the data points. As a result, it runs very fast on large volumes of data represented as pTrees compared with traditional horizontal data representations.

AB - Euclidean distance measures the natural distance between two points in space and is therefore widely used in mathematics. However, it is computationally expensive because it involves costly squaring and square root operations. An alternative is to compute the distance without taking the square root; the result is called the Squared Euclidean Distance (SED). Although SED does not satisfy the triangle inequality, it can be used to compare the distances of two points from a fixed point. For this reason, SED has been used in classification, clustering, image processing, and other areas to save computational time as well as increase accuracy. In this paper, we show how SED can be calculated for vertical data represented in pTrees. The algorithm uses only bitwise operations across pTrees, without any horizontal scan of the data points. As a result, it runs very fast on large volumes of data represented as pTrees compared with traditional horizontal data representations.

UR - http://www.scopus.com/inward/record.url?scp=84872003096&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84872003096&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84872003096

SN - 9781880843864

T3 - Proceedings of the 21st International Conference on Software Engineering and Data Engineering, SEDE 2012

SP - 185

EP - 189

BT - Proceedings of the 21st International Conference on Software Engineering and Data Engineering, SEDE 2012

T2 - 21st International Conference on Software Engineering and Data Engineering, SEDE 2012

Y2 - 27 June 2012 through 29 June 2012

ER -