Intelligent dispatching is crucial to obtaining low response times in large-scale systems. One common scalable dispatching paradigm is the “power-of-d,” in which the dispatcher queries d servers at random and assigns the job to a server based only on the state of the queried servers. The bulk of power-of-d policies studied in the literature assume that the system is homogeneous, meaning that all servers have the same speed; meanwhile, real-world systems often exhibit server speed heterogeneity. This paper introduces a general framework for describing and analyzing heterogeneity-aware power-of-d policies. The key idea behind our framework is that dispatching policies can make use of server speed information at two decision points: when choosing which d servers to query and when assigning a job to one of those servers. Our framework explicitly separates the dispatching policy into a querying rule and an assignment rule; we consider general families of both rule types. While the strongest assignment rules incorporate both detailed queue-length information and server speed information, these rules typically are difficult to analyze. We overcome this difficulty by focusing on heterogeneity-aware assignment rules that ignore queue length information beyond idleness status. In this setting, we analyze mean response time and formulate novel optimization problems for the joint optimization of querying and assignment. We build upon our optimized policies to develop heuristic queue length-aware dispatching policies. Our heuristic policies perform well in simulation, relative to policies that have appeared in the literature.
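The separation of a dispatching policy into a querying rule and an assignment rule can be illustrated with a minimal sketch. This is one simple instance, not the paper's optimized policy. The names `Server`, `query`, `assign`, and `dispatch` are hypothetical, and so are the choices to sample with replacement using caller-supplied weights and to pick the fastest idle server. The assignment rule uses only server speed and idle/busy status, matching the restricted setting the abstract describes.

```python
import random
from dataclasses import dataclass


@dataclass
class Server:
    speed: float        # service rate (assumed units: jobs per unit time)
    queue_len: int = 0  # jobs currently at the server


def query(servers, d, weights, rng):
    """Querying rule (assumed form): draw d server indices independently,
    with replacement, with probability proportional to `weights`."""
    return rng.choices(range(len(servers)), weights=weights, k=d)


def assign(servers, queried, rng):
    """Assignment rule (assumed form): uses speed and idleness only.
    Send the job to the fastest idle queried server; if none is idle,
    pick uniformly at random among the queried servers."""
    idle = [i for i in queried if servers[i].queue_len == 0]
    if idle:
        return max(idle, key=lambda i: servers[i].speed)
    return rng.choice(queried)


def dispatch(servers, d, weights, rng):
    """Compose the two rules and place one job."""
    chosen = assign(servers, query(servers, d, weights, rng), rng)
    servers[chosen].queue_len += 1
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    # Two fast servers and four slow ones; query in proportion to speed.
    servers = [Server(2.0), Server(2.0)] + [Server(1.0) for _ in range(4)]
    weights = [s.speed for s in servers]
    for _ in range(10):
        dispatch(servers, d=2, weights=weights, rng=rng)
    print([s.queue_len for s in servers])
```

Keeping the two rules as separate functions means either one can be swapped out independently, for example by changing `weights` to bias querying toward fast servers or by replacing the fallback in `assign` with a queue-length-aware choice.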
Bibliographical note
Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
- Mean Field Analysis