Similarity graph-based approach to declustering problems and its application towards parallelizing grid files

Duen Ren Liu, Shashi Shekhar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

16 Scopus citations

Abstract

We propose a new similarity-based technique for declustering data. The proposed method can adapt to available information about query distributions, data distributions, data sizes and partition-size constraints. The method is based on max-cut partitioning of a similarity graph defined over the given set of data, under constraints on the partition sizes. It maximizes the chances that a pair of data items that are to be accessed together by queries are allocated to distinct disks. We show that the proposed method can achieve optimal speed-up for a query set whenever any other declustering method can achieve the optimal speed-up. Experiments in parallelizing Grid Files show that the proposed method outperforms mapping-function-based methods for interesting query distributions as well as for non-uniform data distributions.
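The idea in the abstract, placing items that are likely to be accessed together on distinct disks by cutting as much similarity-graph weight as possible under partition-size limits, can be illustrated with a small sketch. This is not the authors' algorithm: it is a simple greedy heuristic, and the function names (`greedy_decluster`, `cut_weight`), the edge-weight dictionary, and the equal-capacity rule are all assumptions made for illustration.

```python
from math import ceil


def greedy_decluster(items, similarity, num_disks):
    """Greedy sketch (an assumption, not the paper's method): place each item
    on the non-full disk where it adds the least same-disk similarity."""
    # Assumed size constraint: disks hold at most ceil(n / k) items each.
    capacity = ceil(len(items) / num_disks)
    disks = [[] for _ in range(num_disks)]
    assignment = {}

    def weight(a, b):
        # Edge weights are stored once per unordered pair; absent edges weigh 0.
        return similarity.get((a, b), similarity.get((b, a), 0.0))

    for item in items:
        best_disk, best_cost = None, None
        for d, members in enumerate(disks):
            if len(members) >= capacity:
                continue  # respect the partition-size constraint
            cost = sum(weight(item, m) for m in members)
            if best_cost is None or cost < best_cost:
                best_disk, best_cost = d, cost
        disks[best_disk].append(item)
        assignment[item] = best_disk
    return assignment


def cut_weight(assignment, similarity):
    """Total weight of edges whose endpoints sit on different disks."""
    return sum(w for (a, b), w in similarity.items()
               if assignment[a] != assignment[b])


# Hypothetical data: edge weights stand for how often two items are co-accessed.
items = ["A", "B", "C", "D"]
similarity = {("A", "B"): 3.0, ("C", "D"): 3.0, ("A", "C"): 1.0}
assignment = greedy_decluster(items, similarity, num_disks=2)
total_cut = cut_weight(assignment, similarity)
```

A larger cut weight means more co-accessed pairs are split across disks, so a query touching such a pair can read both items in parallel. A greedy pass like this carries no optimality guarantee; it only shows the objective and the size constraint working together.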

Original language: English (US)
Title of host publication: Proceedings - International Conference on Data Engineering
Publisher: IEEE
Pages: 373-381
Number of pages: 9
State: Published - Jan 1 1995
Event: Proceedings of the 1995 IEEE 11th International Conference on Data Engineering - Taipei, Taiwan
Duration: Mar 6 1995 to Mar 10 1995
