Generating semantic annotations for research datasets

Ayush Singhal, Jaideep Srivastava

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

2 Scopus citations


Annotations are important for the description of any object. They convey an understanding of the object in summary form. Annotations, unlike tags, are a structured form of metadata. The best structured information is prepared by humans. However, given the large volume and variety of objects such as images, videos and documents, to name a few, it is practically impossible to manually annotate all the objects in the world. In such a situation, developing automated approaches that ascribe semantically correct and structured annotations is an extremely important task. In this paper we pose the novel problem of semantic annotation of research datasets. The explosion in the use of social media and various electronic devices has led to the collection of a large number of datasets for scientific research. Although most of these datasets are available online, the lack of semantic annotations/metadata and the lack of a unified public repository have made it difficult for researchers to browse the datasets, even with popular search engines. In this work we propose an algorithmic approach to automate the task of annotating datasets in a structured and semantic manner. We use knowledge from the World Wide Web and from organized knowledge bases such as DBpedia, YAGO, Freebase and WordNet to derive context and annotations for the research datasets. The proposed approach is evaluated on two real-world dataset collections, namely the UCI dataset repository and the SNAP dataset collection. Using various experimental setups, we show that the proposed approach outperforms the baseline approaches. We also perform a case study comparing our results with the Google search engine. We find that using the semantic annotations increases search accuracy by 18% over normal search for datasets.

Original language: English (US)
Title of host publication: 4th International Conference on Web Intelligence, Mining and Semantics, WIMS 2014
Publisher: Association for Computing Machinery
ISBN (Print): 9781450325387
State: Published - 2014
Event: 4th International Conference on Web Intelligence, Mining and Semantics, WIMS 2014 - Thessaloniki, Greece
Duration: Jun 2 2014 to Jun 4 2014

Publication series

Name: ACM International Conference Proceeding Series




  • search engines
  • semantic annotation
  • summarization of Web data
  • web mining

