Abstract
Many important cloud services require replicating massive data from one datacenter (DC) to multiple DCs. While the performance of pair-wise inter-DC data transfers has been much improved, prior solutions are insufficient to optimize bulk-data multicast, as they fail to explore the capability of servers to store and forward data, as well as the rich inter-DC overlay paths that exist in geo-distributed DCs. To take advantage of these opportunities, we present BDS, an application-level multicast overlay network for large-scale inter-DC data replication. At the core of BDS is a fully centralized architecture, allowing a central controller to maintain an up-to-date global view of the data delivery status of intermediate servers, in order to fully utilize the available overlay paths. To react quickly to network dynamics and workload churn, BDS speeds up the control algorithm by decoupling it into selection of overlay paths and scheduling of data transfers, each of which can be optimized efficiently. This enables BDS to update overlay routing decisions in near real time (e.g., every other second) at the scale of multicasting hundreds of TB of data over tens of thousands of overlay paths. A pilot deployment in one of the largest online service providers shows that BDS can achieve a 3-5× speedup over the provider’s existing system and several well-known overlay routing baselines.
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the 13th EuroSys Conference, EuroSys 2018 |
Publisher | Association for Computing Machinery, Inc |
ISBN (Electronic) | 9781450355841 |
State | Published - Apr 23 2018 |
Event | 13th EuroSys Conference, EuroSys 2018 - Porto, Portugal; Duration: Apr 23 2018 → Apr 26 2018 |
Publication series
Name | Proceedings of the 13th EuroSys Conference, EuroSys 2018 |
---|---|
Volume | 2018-January |
Conference
Conference | 13th EuroSys Conference, EuroSys 2018 |
---|---|
Country/Territory | Portugal |
City | Porto |
Period | 4/23/18 → 4/26/18 |
Bibliographical note
Funding Information: This work is supported in part by the HK RGC ECS-26200014, CRF-C703615G, HKUST PDF fund, the China 973 Program (2014CB340300), the National Natural Science Foundation of China (61472212), and EU Marie Curie Actions CROWN (FP7-PEOPLE-2013-IRSES-610524). We would like to thank our shepherd, Paolo Romano, and the anonymous EuroSys reviewers for their valuable feedback.
Publisher Copyright:
© 2018 Association for Computing Machinery.