Recovering information from summary data

Christos Faloutsos, H. V. Jagadish, N. D. Sidiropoulos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

40 Scopus citations

Abstract

Data is often stored in summarized form, as a histogram of aggregates (COUNTs, SUMs, or AVeraGes) over specified ranges. We study how to estimate the original detail data from the stored summary. We formulate this task as an inverse problem, specifying a well-defined cost function that has to be optimized under constraints. We show that our formulation includes the uniformity and independence assumptions as a special case, and that it can achieve better reconstruction results if we maximize the smoothness as opposed to the uniformity. In our experiments on real and synthetic datasets, the proposed method almost consistently outperforms its competitor, improving the root-mean-square error by up to 20 percent for stock price data, and up to 90 percent for smoother datasets. Finally, we show how to apply this theory to a variety of database problems that involve partial information, such as OLAP, data warehousing, and histograms in query optimization.
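
The inverse-problem view described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a one-dimensional histogram of SUMs over equal-width buckets, takes the sum of squared values as a stand-in cost for uniformity and the sum of squared first differences as a stand-in cost for smoothness, and solves the constrained least-squares problem through its KKT linear system. The function name `reconstruct` and its parameters are hypothetical.

```python
import numpy as np

def reconstruct(bucket_sums, bucket_width, smooth=True):
    """Estimate detail values x from per-bucket SUMs (illustrative sketch).

    Solves  min ||D x||^2  subject to  A x = b,  where b holds the bucket
    sums and D is either the identity (assumed uniformity cost) or the
    first-difference operator (assumed smoothness cost).
    """
    b = np.asarray(bucket_sums, dtype=float)
    m = b.size                 # number of buckets
    n = m * bucket_width       # number of detail cells
    # A[i, j] = 1 when detail cell j falls inside bucket i.
    A = np.kron(np.eye(m), np.ones((1, bucket_width)))
    # D is (n-1) x n first differences for smoothness, identity otherwise.
    D = np.diff(np.eye(n), axis=0) if smooth else np.eye(n)
    Q = D.T @ D
    # KKT system: [[2Q, A^T], [A, 0]] [x; lam] = [0; b]
    K = np.block([[2.0 * Q, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([np.zeros(n), b])
    return np.linalg.solve(K, rhs)[:n]

# Usage: summarize a linear ramp into two buckets, then reconstruct it
# under each cost and compare against the known detail data.
truth = np.arange(8.0)
sums = truth.reshape(2, 4).sum(axis=1)
flat = reconstruct(sums, 4, smooth=False)    # constant within each bucket
curved = reconstruct(sums, 4, smooth=True)   # varies within each bucket
```

With the identity cost, the minimizer spreads each bucket's sum evenly over its cells, which is the uniformity assumption. With the first-difference cost, the estimate is allowed to vary inside a bucket while still reproducing every stored sum exactly.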

Original language: English (US)
Title of host publication: Proceedings of the 23rd International Conference on Very Large Databases, VLDB 1997
Editors: Fred Lochovsky, Michael J. Carey, Matthias Jarke, Klaus R. Dittrich, Pericles Loucopoulos, Manfred A. Jeusfeld
Publisher: Morgan-Kaufmann
Pages: 36-45
Number of pages: 10
ISBN (Electronic): 1558604707, 9781558604704
State: Published - Jan 1 1997
Event: 23rd International Conference on Very Large Databases, VLDB 1997 - Athens, Greece
Duration: Aug 26 1997 → Aug 29 1997

Publication series

Name: Proceedings of the 23rd International Conference on Very Large Databases, VLDB 1997

Other

Other: 23rd International Conference on Very Large Databases, VLDB 1997
Country: Greece
City: Athens
Period: 8/26/97 → 8/29/97
