Abstract
A novel deep convolutional neural network is proposed to predict gaze on the current frame in egocentric videos. Inspired by the human visual system, we introduce a fovea module responsible for sharp central vision and name our model the Foveated Neural Network (FNN). The retina-like visual inputs from the region of interest on the previous frame are analysed and encoded. The hidden representations of the previous frame are fused with the feature maps of the current frame, and this fusion guides the gaze prediction on the current frame. To capture motion, we also include the dense optical flow between these adjacent frames as an additional input. Experimental results show that FNN outperforms state-of-the-art algorithms on a publicly available egocentric dataset. The analysis of FNN demonstrates that both the hidden representations of the foveated visual input from the previous frame and the motion information between adjacent frames are effective in improving gaze prediction performance in egocentric videos.
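The abstract does not say how the retina-like input around the previous gaze point is built. As a rough illustration only, the sketch below assumes one common way to approximate foveated vision: concentric crops of increasing size centred on the gaze location, each average-pooled to the same resolution, so the centre is sharp and the periphery is coarse. The function names (`crop_centered`, `downsample`, `foveate`) and the parameters `fovea_size` and `levels` are hypothetical and are not taken from the paper.

```python
import numpy as np


def crop_centered(frame, cx, cy, size):
    """Crop a size x size window centred on (cx, cy), zero-padding past the borders."""
    half = size // 2
    padded = np.pad(frame, ((half, half), (half, half), (0, 0)), mode="constant")
    # Original pixel (cy, cx) sits at (cy + half, cx + half) after padding,
    # so a window starting at (cy, cx) in padded coordinates is centred on it.
    return padded[cy:cy + size, cx:cx + size]


def downsample(patch, factor):
    """Average-pool a (S*factor, S*factor, C) patch down to (S, S, C)."""
    s = patch.shape[0] // factor
    return patch.reshape(s, factor, s, factor, -1).mean(axis=(1, 3))


def foveate(frame, gaze_xy, fovea_size=32, levels=3):
    """Build a retina-like stack of patches around a gaze point (assumed scheme).

    Level 0 is a full-resolution crop of fovea_size pixels; each further level
    doubles the field of view and halves the resolution.
    Returns an array of shape (levels, fovea_size, fovea_size, C).
    """
    cx, cy = gaze_xy
    frame = frame.astype(np.float32)
    patches = []
    for level in range(levels):
        factor = 2 ** level
        patch = crop_centered(frame, cx, cy, fovea_size * factor)
        patches.append(downsample(patch, factor))
    return np.stack(patches)


if __name__ == "__main__":
    previous_frame = np.random.rand(120, 160, 3)  # H x W x C placeholder frame
    retina_input = foveate(previous_frame, gaze_xy=(80, 60))
    print(retina_input.shape)
```

In a pipeline like the one the abstract describes, a stack of this kind would be encoded by the fovea module before being fused with the current frame's feature maps; the encoder and the fusion step are not sketched here because the abstract gives no detail about them.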
| Original language | English (US) |
|---|---|
| Title of host publication | 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings |
| Publisher | IEEE Computer Society |
| Pages | 3720-3724 |
| Number of pages | 5 |
| ISBN (Electronic) | 9781509021758 |
| State | Published - Jul 2 2017 |
| Event | 24th IEEE International Conference on Image Processing, ICIP 2017 - Beijing, China; Duration: Sep 17 2017 → Sep 20 2017 |
Publication series
| Name | Proceedings - International Conference on Image Processing, ICIP |
|---|---|
| Volume | 2017-September |
| ISSN (Print) | 1522-4880 |
Other
| Other | 24th IEEE International Conference on Image Processing, ICIP 2017 |
|---|---|
| Country/Territory | China |
| City | Beijing |
| Period | 9/17/17 → 9/20/17 |
Bibliographical note
Funding Information: This work was supported by the Reverse Engineering Visual Intelligence for cognitive Enhancement (REVIVE) programme funded by the Joint Council Office of A*STAR, Singapore.
Publisher Copyright: © 2017 IEEE.
Keywords
- Egocentric Videos
- Fovea
- Gaze
- Saliency
- Visual Attention
Title
Foveated neural network: Gaze prediction on egocentric videos