Abstract
Machine learning techniques for computer vision applications like object recognition, scene classification, etc., require a large number of training samples for satisfactory performance. Especially when classification is to be performed over many categories, providing enough training samples for each category is infeasible. This paper describes new ideas in multiclass active learning to deal with the training bottleneck, making it easier to train large multiclass image classification systems. First, we propose a new interaction modality for training which requires only yes-no type binary feedback instead of a precise category label. The modality is especially powerful in the presence of hundreds of categories. For the proposed modality, we develop a Value-of-Information (VOI) algorithm that chooses informative queries while also considering user annotation cost. Second, we propose an active selection measure that works with many categories and is extremely fast to compute. This measure is employed to perform a fast seed search before computing VOI, resulting in an algorithm that scales linearly with dataset size. Third, we use locality sensitive hashing to provide a very fast approximation to active learning, which gives sublinear time scaling, allowing application to very large datasets. The approximation provides up to two orders of magnitude speedups with little loss in accuracy. Thorough empirical evaluation of classification accuracy, noise sensitivity, imbalanced data, and computational performance on a diverse set of image datasets demonstrates the strengths of the proposed algorithms.
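As a rough illustration of what a cheap active selection measure over many categories can look like, the sketch below picks the unlabeled sample whose two most probable classes are closest together. The abstract does not name the measure it proposes, so this margin criterion is an assumption used only for illustration; the function names and the probability values are likewise hypothetical.

```python
def margin(probs):
    """Gap between the two largest class probabilities (smaller = more uncertain)."""
    top, second = sorted(probs, reverse=True)[:2]
    return top - second


def select_query(pool_probs):
    """Return the index of the unlabeled sample with the smallest margin."""
    return min(range(len(pool_probs)), key=lambda i: margin(pool_probs[i]))


# Hypothetical class-probability estimates for three unlabeled images.
pool = [
    [0.90, 0.05, 0.05],  # classifier is confident
    [0.40, 0.38, 0.22],  # top two classes are nearly tied
    [0.60, 0.30, 0.10],
]
query_index = select_query(pool)  # index of the sample to show the annotator
```

A measure of this kind looks only at the two largest probabilities for each sample, so its cost per sample does not depend on comparing every pair of categories, and one pass over the pool scales linearly with dataset size.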
Original language | English (US) |
---|---|
Article number | 6127880 |
Pages (from-to) | 2259-2273 |
Number of pages | 15 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Volume | 34 |
Issue number | 11 |
State | Published - 2012 |
Bibliographical note
Funding Information: Diploma degree in electrical and computer engineering from the National Technical University of Athens, Greece, in 1987, the MSEE degree in electrical engineering from Carnegie Mellon University (CMU), Pittsburgh, Pennsylvania, in 1988, and the PhD degree in electrical and computer engineering from Carnegie Mellon University in 1992. Currently, he is a Distinguished McKnight University Professor in the Department of Computer Science at the University of Minnesota and the director of the Center for Distributed Robotics and SECTTRA. His research interests include robotics, computer vision, sensors for transportation applications, and control. He has authored or coauthored more than 280 journal and conference papers in the above areas (67 refereed journal papers). He was a finalist for the Anton Philips Award for Best Student Paper at the 1991 IEEE International Conference on Robotics and Automation and the recipient of the Best Video Award at the 2000 IEEE International Conference on Robotics and Automation. Furthermore, he was a recipient of the Kritski fellowship in 1986 and 1987. He was a McKnight Land-Grant Professor at the University of Minnesota for the period 1995-1997 and has received the US National Science Foundation (NSF) Research Initiation and Early Career Development Awards. He was also awarded the Faculty Creativity Award by the University of Minnesota. One of his papers (coauthored by O. Masoud) was awarded the IEEE VTS 2001 Best Land Transportation Paper Award. Finally, he has received grants from the US Defense Advanced Research Projects Agency (DARPA), DHS, US Army, US Air Force, Sandia National Laboratories, NSF, Lockheed Martin, Microsoft, INEEL, USDOT, MN/DOT, Honeywell, and 3M (more than $20M). He is a fellow of the IEEE.
Funding Information:
This work was supported in part by the US National Science Foundation (NSF) through grants #IIP-0443945, #IIP-0726109, #CNS-0708344, #CNS-0821474, #IIP-0934327, #CNS-1039741, #IIS-1017344, #IIP-1032018, and #SMA-1028076, the Minnesota Department of Transportation, and the ITS Institute at the University of Minnesota. The authors thank Professor Kristen Grauman for providing kernel matrices for Caltech-101 data, and the anonymous reviewers for their helpful suggestions.
Keywords
- Active learning
- multiclass classification
- object recognition
- scalable machine learning