Non-Myopic Knowledge Gradient Policy for Ranking and Selection

Kexin Qin, L. Jeff Hong, Weiwei Fan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We consider the ranking and selection (R&S) problem with a fixed simulation budget, in which the budget is allocated sequentially. Deriving the optimal sampling procedure for this problem amounts to solving a stochastic dynamic program that is intractable. To overcome this difficulty, existing R&S procedures are often designed from a myopic viewpoint. However, these myopic procedures are only single-step optimal and may perform poorly on general sequential R&S problems. In this paper, we therefore combine two popular lookahead strategies to design a non-myopic knowledge gradient (KG) procedure. To streamline the computation of the procedure, we also propose a modified Monte Carlo tree search method designed specifically for the R&S context. We show that the new procedure can outperform the classic KG.
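
For orientation, the classic (myopic) KG rule that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration of the standard single-step KG factor under assumed independent normal beliefs with known sampling variances. It is not the non-myopic procedure or the tree search proposed in the paper, and the names `kg_factors`, `kg_choose`, `mu`, `var`, and `noise_var` are hypothetical.

```python
import math


def _norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kg_factors(mu, var, noise_var):
    """Single-step KG factor for each alternative.

    Assumptions (not from the source): independent normal beliefs with
    posterior means `mu` and variances `var`, and known sampling
    variances `noise_var`. Requires at least two alternatives.
    """
    k = len(mu)
    factors = []
    for i in range(k):
        # Std. dev. of the change in the posterior mean of alternative i
        # caused by one additional sample of i.
        sigma_tilde = var[i] / math.sqrt(var[i] + noise_var[i])
        if sigma_tilde == 0.0:
            factors.append(0.0)  # nothing left to learn about i
            continue
        best_other = max(mu[j] for j in range(k) if j != i)
        z = -abs(mu[i] - best_other) / sigma_tilde
        # Expected one-step improvement in the best posterior mean.
        factors.append(sigma_tilde * (z * _norm_cdf(z) + _norm_pdf(z)))
    return factors


def kg_choose(mu, var, noise_var):
    """Index of the alternative with the largest KG factor."""
    f = kg_factors(mu, var, noise_var)
    return max(range(len(f)), key=f.__getitem__)
```

Calling `kg_choose(mu, var, noise_var)` once per sampling step, then updating the beliefs with the new observation, gives the usual sequential myopic loop. A non-myopic policy would instead value each sample by looking more than one step ahead.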

Original language: English (US)
Title of host publication: Proceedings of the 2022 Winter Simulation Conference, WSC 2022
Editors: B. Feng, G. Pedrielli, Y. Peng, S. Shashaani, E. Song, C.G. Corlu, L.H. Lee, E.P. Chew, T. Roeder, P. Lendermann
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3051-3062
Number of pages: 12
ISBN (Electronic): 9798350309713
DOIs
State: Published - 2022
Externally published: Yes
Event: 2022 Winter Simulation Conference, WSC 2022 - Guilin, China
Duration: Dec 11 2022 to Dec 14 2022

Publication series

Name: Proceedings - Winter Simulation Conference
Volume: 2022-December
ISSN (Print): 0891-7736

Conference

Conference: 2022 Winter Simulation Conference, WSC 2022
Country/Territory: China
City: Guilin
Period: 12/11/22 to 12/14/22

Bibliographical note

Publisher Copyright:
© 2022 IEEE.
