Redefining data locality for cross-data center storage

Kwangsung Oh, Ajaykrishna Raghavan, Abhishek Chandra, Jon Weissman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

7 Scopus citations

Abstract

Many cloud applications exploit the diversity of storage options in a data center to achieve desired cost, performance, and durability tradeoffs. It is common to see applications using a combination of memory, local disk, and archival storage tiers within a single data center to meet their needs. On Amazon, for example, hot data can be kept in memory using ElastiCache, and colder data in cheaper, slower storage such as S3. For user-facing applications, a recent trend is to place data across multiple data centers to reduce the latency of users' access to their data. The conventional wisdom is that co-location of computation and storage within the same data center is key to application performance, so applications running within a data center are often still limited to accessing local data. In this paper, using experiments on Amazon, Microsoft, and Google clouds, we show that this assumption is false, and that accessing data in nearby data centers may be faster than local access at different or even the same points in the storage hierarchy. This can lead not only to better performance, but also to reduced cost and simpler consistency policies, and it calls for reconsidering data locality in multi-DC environments. This argues for an expansion of cloud storage tiers to consider non-local storage options, and has interesting implications for the design of distributed storage systems.

Original language: English (US)
Title of host publication: BigSystem 2015 - Proceedings of the 2nd International Workshop on Software-Defined Ecosystems, Part of HPDC 2015
Publisher: Association for Computing Machinery, Inc
Pages: 15-22
Number of pages: 8
ISBN (Electronic): 9781450335683
State: Published - Jun 16 2015
Event: 2nd International Workshop on Software-Defined Ecosystems, BigSystem 2015 - Portland, United States
Duration: Jun 16 2015 → …

Publication series

Name: BigSystem 2015 - Proceedings of the 2nd International Workshop on Software-Defined Ecosystems, Part of HPDC 2015

Other

Other: 2nd International Workshop on Software-Defined Ecosystems, BigSystem 2015
Country: United States
City: Portland
Period: 6/16/15 → …

Keywords

  • Data locality
  • In memory storage
  • Multi-tiered storage
  • Multi-DCs
  • Wide area storage
