Integration and automation of data preparation and data mining

Shrikanth Narayanan, Ayush Jaiswal, Yao Yi Chiang, Yanhui Geng, Craig A. Knoblock, Pedro Szekely

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Data mining tasks typically require significant effort in data preparation to find, transform, integrate, and prepare the data for the relevant data mining tools. In addition, the work performed in data preparation is often not recorded and is difficult to reproduce from the raw data. In this paper we present an integrated approach to data preparation and data mining that combines the two steps into a single process and maintains detailed metadata about the data sources, the steps in the process, and the resulting learned classifier produced by the data mining algorithms. We present results on an example scenario, which show that our approach provides a significant reduction in the time it takes to perform a data mining task.

Original language: English (US)
Title of host publication: Proceedings - 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014
Editors: Zhi-Hua Zhou, Wei Wang, Ravi Kumar, Hannu Toivonen, Jian Pei, Joshua Zhexue Huang, Xindong Wu
Publisher: IEEE Computer Society
Pages: 1076-1085
Number of pages: 10
Edition: January
ISBN (Electronic): 9781479942749
State: Published - Jan 26 2015
Externally published: Yes
Event: 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014 - Shenzhen, China
Duration: Dec 14 2014 → …

Publication series

Name: IEEE International Conference on Data Mining Workshops, ICDMW
Number: January
Volume: 2015-January
ISSN (Print): 2375-9232
ISSN (Electronic): 2375-9259

Other

Other: 14th IEEE International Conference on Data Mining Workshops, ICDMW 2014
Country/Territory: China
City: Shenzhen
Period: 12/14/14 → …

Bibliographical note

Publisher Copyright:
© 2014 IEEE.
