TY - JOUR
T1 - On Multiple Random Accesses and Physical Data Placement in Dynamic Files
AU - Wang, Je Hao
AU - Yuen, Tak Sun
AU - Du, David Hung Chang
PY - 1987/8
Y1 - 1987/8
N2 - In the study of data storage and retrieval involving secondary storage devices, for example, magnetic disks, a simplified model of storage that is often used is that each access takes a constant amount of time. However, if some information about the accesses is known, the model should take into consideration the inherent characteristics of the storage devices. In this paper, we assume a more refined model of storage that takes into consideration the seek time, the latency time, and the transmission time of disk accesses separately. We analyze the time required to randomly access a set of records residing on a set of consecutive cylinders on a magnetic disk a number of times, say n, for n 1. This problem may arise, for example, in the processing of queries that involve several relations in a relational database system. We also analyze the more general situations in which the n operations mayrepresent retrievals, insertions, or deletions, or a combination of them. We assume that the dynamic file structure linear hashing is used for locating and organizing the records. A linear hashing file does not employ any directory and its primary data buckets are assumed to be contiguous, therefore the data area of a linear hashing file corresponds closely to the disk space on a set of consecutive cylinders. As more records are inserted into a dynamic file, the data area has to be expanded. Using the same refined model of storage, we propose a data placement method for linear hashing such that the data area expansion time can be reduced. An analysis is presented and it shows that the improvement is significant.
AB - In the study of data storage and retrieval involving secondary storage devices, for example, magnetic disks, a simplified model of storage that is often used is that each access takes a constant amount of time. However, if some information about the accesses is known, the model should take into consideration the inherent characteristics of the storage devices. In this paper, we assume a more refined model of storage that takes into consideration the seek time, the latency time, and the transmission time of disk accesses separately. We analyze the time required to randomly access a set of records residing on a set of consecutive cylinders on a magnetic disk a number of times, say n, for n 1. This problem may arise, for example, in the processing of queries that involve several relations in a relational database system. We also analyze the more general situations in which the n operations mayrepresent retrievals, insertions, or deletions, or a combination of them. We assume that the dynamic file structure linear hashing is used for locating and organizing the records. A linear hashing file does not employ any directory and its primary data buckets are assumed to be contiguous, therefore the data area of a linear hashing file corresponds closely to the disk space on a set of consecutive cylinders. As more records are inserted into a dynamic file, the data area has to be expanded. Using the same refined model of storage, we propose a data placement method for linear hashing such that the data area expansion time can be reduced. An analysis is presented and it shows that the improvement is significant.
KW - Database management
KW - data model
KW - file organization
KW - information storage and retrieval
KW - physical design
KW - query processing
UR - http://www.scopus.com/inward/record.url?scp=0023401397&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0023401397&partnerID=8YFLogxK
U2 - 10.1109/TSE.1987.233515
DO - 10.1109/TSE.1987.233515
M3 - Article
AN - SCOPUS:0023401397
SN - 0098-5589
VL - SE-13
SP - 977
EP - 987
JO - IEEE Transactions on Software Engineering
JF - IEEE Transactions on Software Engineering
IS - 8
ER -