TY - GEN
T1 - Saliency prediction with scene structural guidance
AU - Liang, Haoran
AU - Jiang, Ming
AU - Liang, Ronghua
AU - Zhao, Qi
PY - 2017/11/27
Y1 - 2017/11/27
N2 - Previous work has suggested the role of scene information in directing gaze. The structure of a scene provides global contextual information that complements local object information in saliency prediction. In this study, we explore how scene envelope properties such as openness, depth, and perspective affect visual attention in natural outdoor images. To facilitate this study, an eye-tracking dataset is first built, consisting of 500 natural scene images and fixation data from 15 subjects free-viewing the images. We make observations on scene layout properties and propose a set of scene structural features related to visual attention. We further integrate features from deep neural networks and use the set of complementary features for saliency prediction. Our features are independent of, and can work together with, many computational modules; this work demonstrates the use of multiple kernel learning (MKL) as an example to integrate features at low and high levels. Experimental results demonstrate that our model outperforms existing methods and that our scene structural features can improve the performance of other saliency models in outdoor scenes.
AB - Previous work has suggested the role of scene information in directing gaze. The structure of a scene provides global contextual information that complements local object information in saliency prediction. In this study, we explore how scene envelope properties such as openness, depth, and perspective affect visual attention in natural outdoor images. To facilitate this study, an eye-tracking dataset is first built, consisting of 500 natural scene images and fixation data from 15 subjects free-viewing the images. We make observations on scene layout properties and propose a set of scene structural features related to visual attention. We further integrate features from deep neural networks and use the set of complementary features for saliency prediction. Our features are independent of, and can work together with, many computational modules; this work demonstrates the use of multiple kernel learning (MKL) as an example to integrate features at low and high levels. Experimental results demonstrate that our model outperforms existing methods and that our scene structural features can improve the performance of other saliency models in outdoor scenes.
KW - Eye-tracking dataset
KW - Scene envelope
KW - Visual saliency
UR - http://www.scopus.com/inward/record.url?scp=85044372166&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85044372166&partnerID=8YFLogxK
U2 - 10.1109/SMC.2017.8123170
DO - 10.1109/SMC.2017.8123170
M3 - Conference contribution
AN - SCOPUS:85044372166
T3 - 2017 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2017
SP - 3483
EP - 3488
BT - 2017 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Conference on Systems, Man, and Cybernetics, SMC 2017
Y2 - 5 October 2017 through 8 October 2017
ER -