TY - GEN
T1 - Intent term weighting in e-commerce queries
AU - Manchanda, Saurav
AU - Sharma, Mohit
AU - Karypis, George
PY - 2019/11/3
Y1 - 2019/11/3
AB - E-commerce search engines can fail to retrieve results that satisfy a query's product intent because: (i) conventional retrieval approaches, such as BM25, may ignore the important terms in queries owing to their low inverse document frequency (IDF), and (ii) for long queries, as is usually the case with rare queries (i.e., tail queries), they may fail to determine the relevant terms that are representative of the query's product intent. In this paper, we leverage the historical query reformulation logs of a large e-retailer (walmart.com) to develop a distant-supervision-based approach to identify the relevant terms that characterize the query's product intent. The key idea underpinning our approach is that the terms retained in the reformulation of a query are more important in describing the query's product intent than the discarded terms. We also use the fact that the significance of a term depends on its context (other terms in the neighborhood) in the query to determine the term's importance towards the query's product intent. We show that identifying and emphasizing the terms that define the query's product intent leads to a 3% improvement in ranking and outperforms the context-unaware baselines.
KW - Query intent
KW - Query refinement
KW - Query reformulation
KW - Term weighting
UR - http://www.scopus.com/inward/record.url?scp=85075453052&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85075453052&partnerID=8YFLogxK
U2 - 10.1145/3357384.3358151
DO - 10.1145/3357384.3358151
M3 - Conference contribution
AN - SCOPUS:85075453052
T3 - International Conference on Information and Knowledge Management, Proceedings
SP - 2345
EP - 2348
BT - CIKM 2019 - Proceedings of the 28th ACM International Conference on Information and Knowledge Management
PB - Association for Computing Machinery
T2 - 28th ACM International Conference on Information and Knowledge Management, CIKM 2019
Y2 - 3 November 2019 through 7 November 2019
ER -