This article addresses the classification of human activities from video. Our approach uses motion features that are computed very efficiently and subsequently projected into a lower-dimensional space where matching is performed. Each action is represented as a manifold in this lower-dimensional space, and matching is done by comparing these manifolds. To demonstrate the effectiveness of this approach, it was evaluated on a large data set of similar actions, each performed by many different actors. Classification results were very accurate and show that this approach is robust to challenges such as variations in performers' physical attributes, color of clothing, and style of motion. An important result of this article is that the recovery of the three-dimensional properties of a moving person, or even the two-dimensional tracking of the person's limbs, need not precede action recognition.
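The pipeline described above can be sketched in a few lines. This is a minimal illustration only, not the authors' method: the abstract does not specify the projection or the manifold-comparison metric, so PCA is assumed for the low-dimensional embedding and a symmetric Hausdorff distance for comparing manifolds (each action sequence is treated as a point set of embedded per-frame feature vectors).

```python
import numpy as np

def fit_projection(features, k):
    """PCA fit (an assumed choice of projection): returns the mean and
    the top-k principal directions of the training feature vectors."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    return mean, vt[:k]

def embed(features, mean, basis):
    """Project per-frame feature vectors into the low-dimensional space."""
    return (features - mean) @ basis.T

def manifold_distance(a, b):
    """Symmetric Hausdorff distance between two embedded point sets
    (one assumed way to compare action manifolds)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

def classify(query_emb, reference_embs):
    """Label a query sequence with the nearest reference manifold."""
    return min(reference_embs,
               key=lambda lbl: manifold_distance(query_emb, reference_embs[lbl]))
```

Note that nothing here tracks limbs or reconstructs 3-D pose: whole-frame motion features go straight into the embedding, consistent with the article's point that such recovery need not precede recognition.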
Funding Information:
We would like to thank the anonymous reviewers for their valuable comments. This work has been partially supported by the Minnesota Department of Transportation and the National Science Foundation through grants #CMS-0127893 and #IIS-0219863.
- Articulated motion
- Human tracking
- Motion recognition