Warehouse creation a potential roadblock to data warehousing

Jaideep Srivastava, Ping Yao Chen

Research output: Contribution to journal, Article, peer-review

19 Scopus citations

Abstract

Data warehousing is gaining in popularity as organizations realize the benefits of being able to perform sophisticated analyses of their data. Recent years have seen the introduction of a number of data-warehousing engines, from both established database vendors and new players. The engines themselves are relatively easy to use and come with a good set of end-user tools. However, there is one key stumbling block to the rapid development of data warehouses, namely warehouse population. Specifically, problems arise in populating a warehouse with existing data, since that data exhibits various types of heterogeneity. Given the lack of good tools, this task has generally been performed by system integrators, e.g., software consulting organizations that have developed in-house tools and processes for the task. The general conclusion is that the task has proven to be labor-intensive, error-prone, and generally frustrating, leading a number of warehousing projects to be abandoned midway through development. The picture is not as grim as it appears, however. The problems being encountered in warehouse creation are very similar to those encountered in data integration, and these problems have been studied for about two decades. Still, not all problems relevant to warehouse creation have been solved, and a number of research issues remain. The principal goal of this paper is to identify the common issues in data integration and data-warehouse creation. We hope this will lead 1) developers of warehouse creation tools to examine and, where appropriate, incorporate the techniques developed for data integration, and 2) researchers in both the data integration and the data warehousing communities to address the open research issues in this important area.

Original language: English (US)
Pages (from-to): 118-126
Number of pages: 9
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 11
Issue number: 1
State: Published - 1999

Bibliographical note

Funding Information:
This work is supported, in part, by the U.S. Department of Transportation through Grant No. USDOT/DTRS93-G-0017 to the University of Minnesota.

Keywords

  • Attribute value conflict
  • Data integration
  • Data mining
  • Data warehouse
  • Entity identification
