Abstract
Top-down attention plays an important role in guiding human attention in real-world scenarios, but comparatively little effort in the computational modeling of visual attention has been devoted to it. Inspired by the mechanisms of top-down attention in human visual perception, we propose a multi-layer linear model of top-down attention that actively modulates bottom-up saliency maps. The first layer is a linear regression model that combines bottom-up saliency maps over various visual features and objects. A context-dependent upper layer is introduced to adaptively tune the parameters of the lower-layer model. Finally, a mask of selection history is applied to the fused attention map to bias attention selection toward task-related regions. An efficient learning algorithm with single-pass polynomial complexity is derived. We evaluate our model on a set of natural egocentric videos captured by wearable glasses in real-world environments. Our model outperforms the baseline and state-of-the-art bottom-up saliency models.
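The fusion pipeline described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function and variable names (`fuse_attention`, `W_context`, `history_mask`) and the choice of a matrix-vector map from context to lower-layer weights are assumptions for the sake of a runnable example.

```python
import numpy as np

def fuse_attention(saliency_maps, context, W_context, history_mask):
    """Hypothetical sketch of the three-stage model in the abstract.

    saliency_maps : (K, H, W) array, one bottom-up map per feature/object
    context       : (C,) context descriptor for the current frame
    W_context     : (K, C) upper-layer matrix tuning the lower-layer weights
    history_mask  : (H, W) multiplicative bias from selection history
    """
    # Upper layer: context-dependent tuning of the lower layer's weights.
    weights = W_context @ context                         # shape (K,)
    # Lower layer: linear combination of the bottom-up saliency maps.
    fused = np.tensordot(weights, saliency_maps, axes=1)  # shape (H, W)
    # Selection-history mask biases attention toward task-related regions.
    return fused * history_mask

# Toy usage with random maps (K features, 8x8 frames).
K, C, H, W = 3, 4, 8, 8
rng = np.random.default_rng(0)
maps = rng.random((K, H, W))
attn = fuse_attention(maps, rng.random(C), rng.random((K, C)), np.ones((H, W)))
```

The sketch keeps the model linear end to end, which is what makes a single-pass learning algorithm (e.g. least squares over the weights) plausible.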
Original language | English (US) |
---|---|
Title of host publication | 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings |
Publisher | IEEE Computer Society |
Pages | 3470-3474 |
Number of pages | 5 |
ISBN (Electronic) | 9781509021758 |
DOIs | |
State | Published - Jul 2 2017 |
Event | 24th IEEE International Conference on Image Processing, ICIP 2017 - Beijing, China Duration: Sep 17 2017 → Sep 20 2017 |
Publication series
Name | Proceedings - International Conference on Image Processing, ICIP |
---|---|
Volume | 2017-September |
ISSN (Print) | 1522-4880 |
Other
Other | 24th IEEE International Conference on Image Processing, ICIP 2017 |
---|---|
Country/Territory | China |
City | Beijing |
Period | 9/17/17 → 9/20/17 |
Bibliographical note
Publisher Copyright: © 2017 IEEE.
Keywords
- Ego-centric
- Real-world
- Visual attention