Classification of Spanish election tweets (COSET) 2017: Classifying tweets using character and word level features

Ankush Khandelwal, Sahil Swami, Syed S. Akhtar, M. Shrivastava

Research output: Contribution to journal › Conference article › peer-review

6 Scopus citations

Abstract

This paper describes the submission of the International Institute of Information Technology, Hyderabad, to the task Classification Of Spanish Election Tweets (COSET), held as part of IBEREVAL-2017[1]. The task is to classify Spanish election tweets into political, policy, personal, campaign and other issues. Our system uses Support Vector Machines with a radial basis function kernel to classify tweets. We explore character- and word-level features together with word embeddings, train the classification model on them, and present the results. Our best run achieves an F1-macro score of 0.6054 on the test corpus for the first phase and 0.8509 for the second phase.
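The ingredients named in the abstract (character- and word-level features, a radial basis function kernel, and the F1-macro score) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the function names, the n-gram sizes, the `gamma` value, and the example tweets are all assumptions made for illustration, and the sketch computes a kernel value and a metric without training an SVM or using word embeddings.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams (n=3 is an assumed, illustrative size)."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def word_ngrams(text, n=1):
    """Count word n-grams after naive whitespace tokenisation."""
    tokens = text.lower().split()
    return Counter(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def features(text):
    """Merge character- and word-level counts into one sparse feature map."""
    feats = Counter()
    for gram, count in char_ngrams(text).items():
        feats["c:" + gram] = count
    for gram, count in word_ngrams(text).items():
        feats["w:" + gram] = count
    return feats


def rbf_kernel(x, y, gamma=0.1):
    """RBF kernel exp(-gamma * ||x - y||^2) on sparse count vectors.

    gamma=0.1 is an arbitrary illustrative value, not one from the paper.
    """
    keys = set(x) | set(y)
    sq_dist = sum((x.get(k, 0) - y.get(k, 0)) ** 2 for k in keys)
    return math.exp(-gamma * sq_dist)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical tweets, invented for this example only.
a = features("vota por el cambio")
b = features("vota por la reforma")
similarity = rbf_kernel(a, b)

# Hypothetical gold and predicted labels drawn from the task's categories.
gold = ["political", "policy", "policy"]
pred = ["political", "policy", "personal"]
score = macro_f1(gold, pred)
```

Because F1-macro averages the per-class scores without weighting by class frequency, a class that is predicted but never correct (here `personal`) pulls the average down as much as any frequent class would.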

Original language: English (US)
Pages (from-to): 49-54
Number of pages: 6
Journal: CEUR Workshop Proceedings
Volume: 1881
State: Published - Jan 1 2017
Externally published: Yes
Event: 2nd Workshop on Evaluation of Human Language Technologies for Iberian Languages, IberEval 2017 - Murcia, Spain
Duration: Sep 19 2017 → …

Keywords

  • Classification
  • Decision tree
  • Extra tree
  • Machine Learning
  • Radial basis function kernel
  • Random forest
  • SVM
  • Twitter
  • Word2vec
