A multimodal complexity comprehension-time framework for automated presentation synthesis

Harini Sridharan, Ankur Mani, Hari Sundaram

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations

Abstract

In this paper, we present a joint multimodal (audio, visual, and text) framework that maps the informational complexity of media elements to comprehension time. The problem is important for interactive multimodal presentations. We propose that the joint comprehension time is a function of the media's Kolmogorov complexity. For audio and images, the complexity is estimated using a lossless universal coding scheme; the text complexity is derived by analyzing the sentence structure. For all three channels, we conduct user studies to map media complexity to comprehension time. For estimating the joint comprehension time, we assume channel independence, resulting in a conservative comprehension-time estimate. The times for the visual channels (text and images) are deemed additive, and the joint time is then the maximum of the visual and auditory comprehension times. The user studies indicate that the model works very well when compared with fixed-time multimodal presentations.
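As a rough sketch of the model's structure, and not the paper's implementation, the snippet below uses zlib compression as a stand-in for the lossless universal coding scheme and placeholder linear coefficients in place of the user-study calibration; all function names and numeric values are illustrative assumptions. Note that the paper derives text complexity from sentence structure, whereas the sketch compresses all three channels only to stay self-contained.

```python
import zlib


def complexity_proxy(data: bytes) -> int:
    """Approximate Kolmogorov complexity by the length of a lossless
    encoding; zlib stands in here for the paper's universal coding scheme."""
    return len(zlib.compress(data, level=9))


def comprehension_time(complexity: int, slope: float, intercept: float) -> float:
    """Hypothetical linear complexity-to-time mapping. The paper calibrates
    a per-channel mapping from user studies; these coefficients are
    placeholders, not the published values."""
    return intercept + slope * complexity


def joint_comprehension_time(text_time: float, image_time: float,
                             audio_time: float) -> float:
    """Visual channels (text and images) are additive; the joint time is
    the maximum of the visual and auditory times, which under the channel
    independence assumption yields a conservative estimate."""
    visual_time = text_time + image_time
    return max(visual_time, audio_time)


if __name__ == "__main__":
    # Placeholder media bytes; real inputs would be decoded media elements.
    text = "An example sentence for the text channel.".encode()
    image = bytes(range(256)) * 4
    audio = bytes([0, 1] * 512)

    t_text = comprehension_time(complexity_proxy(text), slope=0.01, intercept=0.5)
    t_image = comprehension_time(complexity_proxy(image), slope=0.005, intercept=0.3)
    t_audio = comprehension_time(complexity_proxy(audio), slope=0.008, intercept=0.4)
    print(joint_comprehension_time(t_text, t_image, t_audio))
```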

Original language: English (US)
Title of host publication: IEEE International Conference on Multimedia and Expo, ICME 2005
Pages: 1042-1045
Number of pages: 4
DOIs
State: Published - 2005
Event: IEEE International Conference on Multimedia and Expo, ICME 2005 - Amsterdam, Netherlands
Duration: Jul 6, 2005 - Jul 8, 2005

Publication series

Name: IEEE International Conference on Multimedia and Expo, ICME 2005
Volume: 2005

Other

Other: IEEE International Conference on Multimedia and Expo, ICME 2005
Country/Territory: Netherlands
City: Amsterdam
Period: 7/6/05 - 7/8/05
