TY - JOUR
T1 - Validation sequence optimization
T2 - A theoretical approach
AU - Adomavicius, Gediminas
AU - Tuzhilin, Alexander
PY - 2007
Y1 - 2007
N2 - The need to validate large amounts of data with the help of the domain expert arises naturally in many data-intensive applications, including data mining, data stream, and database-related applications. This paper presents a general validation approach that generalizes different expert-driven validation methods developed for specialized validation problems. In particular, we model the validation process as a sequence of validation operators, explore various properties of such sequences, and present theoretical results that provide for better understanding of the validation process. We also address the problem of selecting the best validation sequence among the class of equivalent sequence permutations. We demonstrate that this optimization problem is NP-hard and present two heuristic algorithms for improving validation sequences.
KW - Computational complexity
KW - Data mining
KW - Dynamic programming
KW - Heuristic algorithms
KW - Sequence optimization
KW - Validation
KW - Validation operators
KW - Validation sequences
UR - http://www.scopus.com/inward/record.url?scp=61349159260&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=61349159260&partnerID=8YFLogxK
U2 - 10.1287/ijoc.1050.0153
DO - 10.1287/ijoc.1050.0153
M3 - Article
AN - SCOPUS:61349159260
SN - 1091-9856
VL - 19
SP - 185
EP - 200
JO - INFORMS Journal on Computing
JF - INFORMS Journal on Computing
IS - 2
ER -