SALICON: Reducing the semantic gap in saliency prediction by adapting deep neural networks

Xun Huang, Chengyao Shen, Xavier Boix, Qi Zhao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

226 Scopus citations

Abstract

Saliency in Context (SALICON) is an ongoing effort that aims to understand and predict visual attention. Conventional saliency models typically rely on low-level image statistics to predict human fixations. While these models perform significantly better than chance, there is still a large gap between model prediction and human behavior. This gap is largely due to the limited capability of models in predicting eye fixations with strong semantic content, the so-called semantic gap. This paper presents a focused study to narrow the semantic gap with an architecture based on deep neural networks (DNNs). It leverages the representational power of high-level semantics encoded in DNNs pretrained for object recognition. Its two key components are (1) fine-tuning the DNNs fully convolutionally with an objective function based on saliency evaluation metrics and (2) integrating information at different image scales. We compare our method with 14 saliency models on 6 public eye tracking benchmark datasets. Results demonstrate that our DNNs can automatically learn features specific to saliency prediction that surpass the state of the art by a large margin. In addition, our model ranks top to date under all seven metrics on the MIT300 challenge set.
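As an illustration of what an objective "based on saliency evaluation metrics" could look like, the sketch below computes the Kullback-Leibler divergence between a ground-truth fixation map and a predicted saliency map, both normalized to sum to 1. The abstract does not say which metric the authors optimize, so the choice of KL divergence here is an assumption, and the function names `normalize` and `kl_divergence` are hypothetical. This is a minimal pure-Python sketch, not the paper's implementation.

```python
import math


def normalize(saliency_map, eps=1e-8):
    """Flatten a 2-D map (list of rows) and scale it so the values sum to ~1."""
    flat = [float(v) for row in saliency_map for v in row]
    if any(v < 0 for v in flat):
        raise ValueError("saliency values must be non-negative")
    total = sum(flat) + eps  # eps guards against an all-zero map
    return [v / total for v in flat]


def kl_divergence(fixation_map, predicted_map, eps=1e-8):
    """KL divergence of the prediction from the ground-truth fixation map.

    Lower is better; the value is close to 0 when the two maps match.
    The eps terms keep the logarithm finite where either map is zero.
    """
    q = normalize(fixation_map, eps)   # ground-truth distribution
    p = normalize(predicted_map, eps)  # predicted distribution
    if len(q) != len(p):
        raise ValueError("maps must have the same number of pixels")
    return sum(qi * math.log(qi / (pi + eps) + eps) for qi, pi in zip(q, p))


# Toy 2x2 maps (assumed data, for illustration only).
uniform = [[1, 1], [1, 1]]
peaked = [[1, 0], [0, 0]]

matching = kl_divergence(peaked, peaked)     # prediction equals ground truth
mismatched = kl_divergence(peaked, uniform)  # prediction spreads mass evenly
```

In this toy case the mismatched prediction scores noticeably worse than the matching one, which is the behavior a training loss needs. Used for fine-tuning, such a metric would have to be expressed in a differentiable framework rather than plain Python.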

Original language: English (US)
Title of host publication: 2015 International Conference on Computer Vision, ICCV 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 262-270
Number of pages: 9
ISBN (Electronic): 9781467383912
DOIs: https://doi.org/10.1109/ICCV.2015.38
State: Published - Feb 17 2015
Event: 15th IEEE International Conference on Computer Vision, ICCV 2015 - Santiago, Chile
Duration: Dec 11 2015 to Dec 18 2015

Publication series

Name: Proceedings of the IEEE International Conference on Computer Vision
Volume: 2015 International Conference on Computer Vision, ICCV 2015
ISSN (Print): 1550-5499

Other

Other: 15th IEEE International Conference on Computer Vision, ICCV 2015
Country: Chile
City: Santiago
Period: 12/11/15 to 12/18/15


  • Cite this

    Huang, X., Shen, C., Boix, X., & Zhao, Q. (2015). SALICON: Reducing the semantic gap in saliency prediction by adapting deep neural networks. In 2015 International Conference on Computer Vision, ICCV 2015 (pp. 262-270). [7410395] (Proceedings of the IEEE International Conference on Computer Vision; Vol. 2015 International Conference on Computer Vision, ICCV 2015). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICCV.2015.38