ON THE FILE DESIGN PROBLEM FOR PARTIAL MATCH RETRIEVAL.

Research output: Contribution to journal › Article

14 Scopus citations

Abstract

It is shown that the problem of designing an optimal multikey hashing scheme taking into consideration the record distribution is computationally intractable (NP-hard). Therefore, a heuristic approach is necessary. In a multikey hashing scheme, although the directory is space efficient and the search algorithm is fast, due to the insufficient information in the directory some accessed buckets may not contain any record satisfying the given query. Thus, certain retrieval effort is wasted. A new class of file structures which combine a multikey hashing scheme and an indexed descriptor technique is introduced. By adding some extra information (either record descriptors or bucket descriptors) into the directory of a multikey hashing scheme, either only those buckets which contain at least one record satisfying the given query need to be accessed or the number of accessed buckets which do not contain any record satisfying the query is reduced.
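The idea in the abstract can be illustrated with a small sketch. It is not the paper's file structure: the class `MultikeyHashFile`, the per-field address widths in `BITS`, the descriptor width `DESC_WIDTH`, and the CRC-based hash are all assumptions made for illustration. Each record is placed in a bucket whose address concatenates one small hash per field. A partial-match query (with `None` for unspecified fields) must consider every address that agrees on the specified fields. An assumed per-bucket descriptor (one bitmask per field) lets the search skip buckets that cannot hold a match.

```python
import zlib
from itertools import product

BITS = (1, 1, 1)   # assumed: address bits contributed by each field
DESC_WIDTH = 16    # assumed: descriptor bits kept per field per bucket

def _h(value, salt):
    # Deterministic hash; CRC32 is an illustrative choice, not the paper's.
    return zlib.crc32(f"{salt}:{value}".encode())

def field_addr(i, value):
    return _h(value, f"addr{i}") % (1 << BITS[i])

def field_sig(i, value):
    return 1 << (_h(value, f"sig{i}") % DESC_WIDTH)

class MultikeyHashFile:
    def __init__(self):
        self.buckets = {}      # address tuple -> list of records
        self.descriptors = {}  # address tuple -> per-field bitmasks

    def insert(self, record):
        addr = tuple(field_addr(i, v) for i, v in enumerate(record))
        self.buckets.setdefault(addr, []).append(record)
        desc = self.descriptors.setdefault(addr, [0] * len(BITS))
        for i, v in enumerate(record):
            desc[i] |= field_sig(i, v)  # remember which values occur here

    def candidate_addrs(self, query):
        # Unspecified fields (None) range over all of their address values.
        axes = [range(1 << BITS[i]) if v is None else [field_addr(i, v)]
                for i, v in enumerate(query)]
        return list(product(*axes))

    def search(self, query, use_descriptors=True):
        hits, accessed = [], 0
        for addr in self.candidate_addrs(query):
            if use_descriptors:
                desc = self.descriptors.get(addr)
                # Skip a bucket if any specified field value is absent from it.
                if desc is None or any(
                    v is not None and not desc[i] & field_sig(i, v)
                    for i, v in enumerate(query)
                ):
                    continue
            accessed += 1  # this bucket would be read from storage
            hits += [r for r in self.buckets.get(addr, [])
                     if all(q is None or q == f for q, f in zip(query, r))]
        return hits, accessed

records = [("smith", "paris", "clerk"), ("jones", "rome", "clerk"),
           ("smith", "oslo", "pilot"), ("brown", "paris", "cook")]
f = MultikeyHashFile()
for r in records:
    f.insert(r)

query = ("smith", None, None)
plain_hits, plain_accessed = f.search(query, use_descriptors=False)
desc_hits, desc_accessed = f.search(query, use_descriptors=True)
print(plain_accessed, desc_accessed, desc_hits)
```

In this sketch the descriptor check never drops a bucket that holds a matching record, so both searches return the same records, and the descriptor search reads no more buckets than the plain one. Because the per-field bitmasks are summarized over the whole bucket, some accessed buckets may still contain no match, which corresponds to the weaker of the two guarantees stated in the abstract.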

Original language: English (US)
Pages (from-to): 213-222
Number of pages: 10
Journal: IEEE Transactions on Software Engineering
Volume: SE-11
Issue number: 2
State: Published - Feb 1 1985
