Foveated neural network: Gaze prediction on egocentric videos

Mengmi Zhang, Keng Teck Ma, Joo Hwee Lim, Qi Zhao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A novel deep convolutional neural network is proposed to predict gaze on the current frame in egocentric videos. Inspired by the human visual system, we introduce a fovea module responsible for sharp central vision and name our model the Foveated Neural Network (FNN). The retina-like visual inputs from the region of interest on the previous frame are analysed and encoded. The fusion of the hidden representations of the previous frame and the feature maps of the current frame guides the gaze prediction on the current frame. To model motion, we also include the dense optical flow between these adjacent frames as an additional input. Experimental results show that FNN outperforms the state-of-the-art algorithms on the publicly available egocentric dataset. The analysis of FNN demonstrates that the hidden representations of the foveated visual input from the previous frame, as well as the motion information between adjacent frames, are effective in improving gaze prediction performance in egocentric videos.
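The abstract does not say how the region of interest on the previous frame is extracted. As a rough illustration only, the sketch below assumes a fixed-size square window centred on the previous gaze point and shifted inward at the frame border. The function name `foveal_crop` and the parameters `gaze_rc` and `size` are hypothetical and are not taken from the paper.

```python
def foveal_crop(frame, gaze_rc, size):
    """Return a size x size patch of `frame` centred on `gaze_rc`.

    frame   : list of rows (each a list of pixel values), H x W
    gaze_rc : (row, col) gaze location on the previous frame
    size    : side length of the square patch (assumed, not from the paper)

    Near the border the window is shifted inward rather than padded,
    which is one of several reasonable choices.
    """
    height, width = len(frame), len(frame[0])
    if size > height or size > width:
        raise ValueError("patch is larger than the frame")
    half = size // 2
    # Clamp the top-left corner so the window stays inside the frame.
    top = min(max(gaze_rc[0] - half, 0), height - size)
    left = min(max(gaze_rc[1] - half, 0), width - size)
    return [row[left:left + size] for row in frame[top:top + size]]


# Toy 6 x 8 "frame" whose pixel value encodes its position (row * 10 + col).
toy_frame = [[r * 10 + c for c in range(8)] for r in range(6)]
patch = foveal_crop(toy_frame, gaze_rc=(3, 4), size=3)
```

In a full model, a patch like this would be the input that the fovea module encodes, while the current frame and the dense optical flow between the two frames would be supplied separately. How the three are fused is described only at a high level in the abstract, so it is left out here.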

Original language: English (US)
Title of host publication: 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings
Publisher: IEEE Computer Society
Pages: 3720-3724
Number of pages: 5
ISBN (Electronic): 9781509021758
State: Published - Feb 20 2018
Event: 24th IEEE International Conference on Image Processing, ICIP 2017 - Beijing, China
Duration: Sep 17 2017 to Sep 20 2017

Publication series

Name: Proceedings - International Conference on Image Processing, ICIP
Volume: 2017-September
ISSN (Print): 1522-4880

Other

Other: 24th IEEE International Conference on Image Processing, ICIP 2017
Country: China
City: Beijing
Period: 9/17/17 to 9/20/17

Bibliographical note

Funding Information:
This work was supported by the Reverse Engineering Visual Intelligence for cognitive Enhancement (REVIVE) programme funded by the Joint Council Office of A*STAR, Singapore.

Publisher Copyright:
© 2017 IEEE.

Copyright:
Copyright 2018 Elsevier B.V., All rights reserved.

Keywords

  • Egocentric Videos
  • Fovea
  • Gaze
  • Saliency
  • Visual Attention
