Scalable load balancing in the presence of heterogeneous servers

Kristen Gardner, Jazeem Abdul Jaleel, Alexander Wickeham, Sherwin Doroudi

Research output: Contribution to journal, Article, peer-review

12 Scopus citations


Heterogeneity is becoming increasingly ubiquitous in modern large-scale computer systems. Developing good load balancing policies for systems whose resources have varying speeds is crucial in achieving low response times. Indeed, how best to dispatch jobs to servers is a classical and well-studied problem in the queueing literature. Yet the bulk of existing work on large-scale systems assumes homogeneous servers; unfortunately, policies that perform well in the homogeneous setting can cause unacceptably poor performance in heterogeneous systems. We adapt the “power-of-d” versions of both the Join-the-Idle-Queue and Join-the-Shortest-Queue policies to design two corresponding families of heterogeneity-aware dispatching policies, each of which is parameterized by a pair of routing probabilities. Unlike their heterogeneity-unaware counterparts, our policies use server speed information both when choosing which servers to query and when probabilistically deciding where (among the queried servers) to dispatch jobs. Both of our policy families are analytically tractable: our mean response time and queue length distribution analyses are exact as the number of servers approaches infinity, under standard assumptions. Furthermore, our policy families achieve maximal stability and outperform well-known dispatching rules – including heterogeneity-aware policies such as Shortest-Expected-Delay – with respect to mean response time.
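The two-stage idea described in the abstract (speed-aware querying followed by probabilistic dispatch among the queried servers) can be illustrated with a minimal sketch. It is not the paper's policy definition. It assumes just two server classes, "fast" and "slow", and the parameter names `query_fast_prob`, `p_idle_slow`, and `p_busy_fast`, as well as the specific tie-breaking rules, are hypothetical choices made only for illustration.

```python
import random


def dispatch(fast_queues, slow_queues, d, query_fast_prob,
             p_idle_slow, p_busy_fast, rng):
    """Choose a server for one arriving job; returns (class_name, index).

    Hypothetical two-class sketch, not the policy defined in the paper.
    fast_queues / slow_queues hold the current queue length of each server.
    """
    # Query phase: each of the d queries targets the fast class with
    # probability query_fast_prob, then a uniformly random server in it
    # (sampling with replacement, for simplicity).
    queried = []
    for _ in range(d):
        cls = "fast" if rng.random() < query_fast_prob else "slow"
        pool = fast_queues if cls == "fast" else slow_queues
        queried.append((cls, rng.randrange(len(pool))))

    def queue_length(server):
        cls, i = server
        return (fast_queues if cls == "fast" else slow_queues)[i]

    fast = [s for s in queried if s[0] == "fast"]
    slow = [s for s in queried if s[0] == "slow"]
    idle_fast = [s for s in fast if queue_length(s) == 0]
    idle_slow = [s for s in slow if queue_length(s) == 0]

    # Dispatch phase: an idle fast server is always preferred.
    if idle_fast:
        return rng.choice(idle_fast)
    # Only slow servers are idle: use one with probability p_idle_slow,
    # otherwise queue at a (busy) queried fast server if there is one.
    if idle_slow:
        if not fast or rng.random() < p_idle_slow:
            return rng.choice(idle_slow)
        return rng.choice(fast)
    # Every queried server is busy: pick the class at random.
    if fast and (not slow or rng.random() < p_busy_fast):
        return rng.choice(fast)
    return rng.choice(slow)


if __name__ == "__main__":
    rng = random.Random(0)
    choice = dispatch([0, 2, 1], [0, 0, 3, 1], d=3, query_fast_prob=0.6,
                      p_idle_slow=0.5, p_busy_fast=0.8, rng=rng)
    print(choice)
```

A heterogeneity-unaware power-of-d rule would correspond roughly to querying uniformly over all servers and ignoring the class when dispatching; the sketch differs by using the class (a stand-in for speed) in both stages.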

Original language: English (US)
Article number: 102151
Journal: Performance Evaluation
State: Published - Jan 2021

Bibliographical note

Publisher Copyright:
© 2020 Elsevier B.V.


  • Dispatching
  • Heterogeneity
  • Join-Idle-Queue
  • Join-the-Shortest-Queue
  • Load balancing
  • Power of d

