Q2P: Discovering query templates via autocompletion

Wensheng Wu, Weiyi Meng, Weifeng Su, Guangyou Zhou, Yao Yi Chiang

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

We present Q2P, a system that discovers query templates from search engines via their query autocompletion services. Q2P differs from existing work in that it does not rely on search engine query logs, which are typically not readily available. Q2P is also unique in that it uses a trie to store queries sampled from a search engine economically and employs a beam-search strategy that focuses the expansion of the trie on its most promising nodes. Furthermore, Q2P leverages the trie-based storage of the query sample to discover query templates using only two passes over the trie. Q2P is a key part of our ongoing project Deep2Q on template-driven data integration on the Deep Web, where the templates learned by Q2P guide the integration process. Experimental results on four major search engines indicate that (1) Q2P sends only a moderate number of queries (ranging from 597 to 1,135) to the engines while obtaining a significant number of completions per query (ranging from 4.2 to 8.5 on average); and (2) a significant number of templates (ranging from 8 to 32 when the minimum support for frequent templates is set to 1%) can be discovered from the samples.
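The trie storage and beam-focused expansion described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the word-level trie, the scoring rule (number of sampled queries stored under a node), the names `QueryTrie`, `beam_expand`, and `fake_complete`, and the tiny in-memory corpus standing in for a real autocompletion service are all assumptions made for illustration. The two-pass template discovery step is not covered here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class TrieNode:
    children: Dict[str, "TrieNode"] = field(default_factory=dict)
    is_query: bool = False  # True if a sampled query ends at this node


class QueryTrie:
    """Word-level trie: queries sharing a prefix share the same path."""

    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, query: str) -> None:
        node = self.root
        for word in query.split():
            node = node.children.setdefault(word, TrieNode())
        node.is_query = True

    def find(self, prefix: str) -> Optional[TrieNode]:
        node = self.root
        for word in prefix.split():
            if word not in node.children:
                return None
            node = node.children[word]
        return node

    def queries(self) -> List[str]:
        out: List[str] = []

        def walk(node: TrieNode, path: List[str]) -> None:
            if node.is_query:
                out.append(" ".join(path))
            for word, child in sorted(node.children.items()):
                walk(child, path + [word])

        walk(self.root, [])
        return out


def count_queries(node: TrieNode) -> int:
    """Number of sampled queries stored at or below this node."""
    return int(node.is_query) + sum(count_queries(c) for c in node.children.values())


def beam_expand(trie: QueryTrie, complete: Callable[[str], List[str]],
                seed: str, beam_width: int = 2, depth: int = 2) -> int:
    """Grow the trie from `seed`; return how many prefixes were sent."""
    sent = 0
    beam = [seed]
    for _ in range(depth):
        # Send every prefix in the current beam to the completion service.
        for prefix in beam:
            for completion in complete(prefix):
                trie.insert(completion)
            sent += 1
        # Candidate prefixes are one word longer than the beam prefixes.
        candidates = []
        for prefix in beam:
            node = trie.find(prefix)
            if node is None:
                continue
            for word, child in node.children.items():
                candidates.append((count_queries(child), f"{prefix} {word}"))
        # Assumed heuristic: keep the nodes with the most queries beneath them.
        candidates.sort(key=lambda c: (-c[0], c[1]))
        beam = [prefix for _, prefix in candidates[:beam_width]]
        if not beam:
            break
    return sent


# Hypothetical stand-in for an autocompletion service (no network access).
CORPUS = [
    "cheap flights to paris",
    "cheap flights to rome",
    "cheap flights from boston",
    "cheap hotels in paris",
    "cheap hotels in rome",
    "cheap cars",
]


def fake_complete(prefix: str, limit: int = 4) -> List[str]:
    return [q for q in CORPUS if q.startswith(prefix + " ")][:limit]
```

Calling `beam_expand(QueryTrie(), fake_complete, "cheap")` expands only the highest-scoring prefixes in each round, so the number of requests stays bounded by the beam width times the depth, at the cost of possibly missing queries outside the beam.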

Original language: English (US)
Article number: 10
Journal: ACM Transactions on the Web
Volume: 10
Issue number: 2
State: Published - May 2016
Externally published: Yes

Bibliographical note

Funding Information:
This work was supported in part by the Guangdong Natural Science Foundation (Grant No. S2013010016852), a BNU-HKBU United International College internal grant, and the National Natural Science Foundation of China (Grant Nos. 61303180 and 61573163).

Publisher Copyright:
© 2016 ACM.

Keywords

  • Autocompletion
  • Pattern discovery
  • Query templates
  • Search engines
  • Trie
