TY - GEN
T1 - Support envelopes
T2 - KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
AU - Steinbach, Michael
AU - Tan, Pang Ning
AU - Kumar, Vipin
PY - 2004
Y1 - 2004
N2 - This paper introduces support envelopes - a new tool for analyzing association patterns - and illustrates some of their properties, applications, and possible extensions. Specifically, the support envelope for a transaction data set and a specified pair of positive integers (m, n) consists of the items and transactions that need to be searched to find any association pattern involving m or more transactions and n or more items. For any transaction data set with M transactions and TV items, there is a unique lattice of at most M * N support envelopes that captures the structure of the association patterns in that data set. Because support envelopes are not encumbered by a support threshold, this support lattice provides a complete view of the association structure of the data set, including association patterns that have low support. Furthermore, the boundary of the support lattice - the support boundary - has at most min(M, N) envelopes and is especially interesting since it bounds the maximum sizes of potential association patterns - not only for frequent, closed, and maximal itemsets, but also for patterns, such as error-tolerant itemsets, that are more general. The association structure can be represented graphically as a two-dimensional scatter plot of the (m, n) values associated with the support envelopes of the data set, a feature that is useful in the exploratory analysis of association patterns. Finally, the algorithm to compute support envelopes is simple and computationally efficient, and it is straightforward to parallelize the process of finding all the support envelopes.
AB - This paper introduces support envelopes - a new tool for analyzing association patterns - and illustrates some of their properties, applications, and possible extensions. Specifically, the support envelope for a transaction data set and a specified pair of positive integers (m, n) consists of the items and transactions that need to be searched to find any association pattern involving m or more transactions and n or more items. For any transaction data set with M transactions and TV items, there is a unique lattice of at most M * N support envelopes that captures the structure of the association patterns in that data set. Because support envelopes are not encumbered by a support threshold, this support lattice provides a complete view of the association structure of the data set, including association patterns that have low support. Furthermore, the boundary of the support lattice - the support boundary - has at most min(M, N) envelopes and is especially interesting since it bounds the maximum sizes of potential association patterns - not only for frequent, closed, and maximal itemsets, but also for patterns, such as error-tolerant itemsets, that are more general. The association structure can be represented graphically as a two-dimensional scatter plot of the (m, n) values associated with the support envelopes of the data set, a feature that is useful in the exploratory analysis of association patterns. Finally, the algorithm to compute support envelopes is simple and computationally efficient, and it is straightforward to parallelize the process of finding all the support envelopes.
KW - Association analysis
KW - Error-tolerant itemsets
KW - Formal concept analysis
KW - Support envelope
UR - http://www.scopus.com/inward/record.url?scp=12244289922&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=12244289922&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:12244289922
SN - 1581138881
SN - 9781581138887
T3 - KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 296
EP - 305
BT - KDD-2004 - Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
A2 - Kohavi, R.
A2 - Gehrke, J.
A2 - DuMouchel, W.
A2 - Ghosh, J.
Y2 - 22 August 2004 through 25 August 2004
ER -