TY - GEN
T1 - Online ℓ1-dictionary learning with application to novel document detection
AU - Kasiviswanathan, Shiva Prasad
AU - Wang, Huahua
AU - Banerjee, Arindam
AU - Melville, Prem
PY - 2012/12/1
Y1 - 2012/12/1
N2 - Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online ℓ1-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the ℓ1-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on alternating directions method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.
AB - Given their pervasive use, social media, such as Twitter, have become a leading source of breaking news. A key task in the automated identification of such news is the detection of novel documents from a voluminous stream of text documents in a scalable manner. Motivated by this challenge, we introduce the problem of online ℓ1-dictionary learning where, unlike traditional dictionary learning, which uses squared loss, the ℓ1-penalty is used for measuring the reconstruction error. We present an efficient online algorithm for this problem based on alternating directions method of multipliers, and establish a sublinear regret bound for this algorithm. Empirical results on news-stream and Twitter data show that this online ℓ1-dictionary learning algorithm for novel document detection gives more than an order of magnitude speedup over the previously known batch algorithm, without any significant loss in quality of results.
UR - http://www.scopus.com/inward/record.url?scp=84877755328&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84877755328&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84877755328
SN - 9781627480031
T3 - Advances in Neural Information Processing Systems
SP - 2258
EP - 2266
BT - Advances in Neural Information Processing Systems 25
T2 - 26th Annual Conference on Neural Information Processing Systems 2012, NIPS 2012
Y2 - 3 December 2012 through 6 December 2012
ER -