Fantastic answers and where to find them: Immersive question-directed visual attention

Ming Jiang, Shi Chen, Jinhui Yang, Qi Zhao

Research output: Contribution to journal › Conference article › peer-review

8 Scopus citations


While most visual attention studies focus on bottom-up attention with restricted field-of-view, real-life situations are filled with embodied vision tasks. The role of attention is more significant in the latter due to the information overload, and attention to the most important regions is critical to the success of tasks. The effects of visual attention on task performance in this context have also been widely ignored. This research addresses a number of challenges to bridge this research gap, on both the data and model aspects. Specifically, we introduce the first dataset of top-down attention in immersive scenes. The Immersive Question-directed Visual Attention (IQVA) dataset features visual attention and corresponding task performance (i.e., answer correctness). It consists of 975 questions and answers collected from people viewing 360° videos in a head-mounted display. Analyses of the data demonstrate a significant correlation between people's task performance and their eye movements, suggesting the role of attention in task performance. With that, a neural network is developed to encode the differences of correct and incorrect attention and jointly predict the two. The proposed attention model for the first time takes into account answer correctness, whose outputs naturally distinguish important regions from distractions. This study with new data and features may enable new tasks that leverage attention and answer correctness, and inspire new research that reveals the process behind decision making in performing various tasks.

Original language: English (US)
Article number: 9157348
Pages (from-to): 2977-2986
Number of pages: 10
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
State: Published - 2020
Event: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020 - Virtual, Online, United States
Duration: Jun 14 2020 to Jun 19 2020

Bibliographical note

Funding Information:
This work is supported by NSF Grants 1908711 and 1849107.

Publisher Copyright:
© 2020 IEEE

