Abstract
This paper explores a novel framework for building regression models using association rules. The model consists of an ordered set of IF-THEN rules, where the rule consequent is the predicted value of the target attribute. The approach consists of two steps: (1) extraction of association rules, and (2) construction of the rule-based regression model. We propose a pruning scheme for redundant and insignificant rules in the rule extraction step, and also a number of heuristics for building regression models. This approach allows discovery of global patterns and offers resistance to noise, while building relatively simple models. We perform a comparative study of the performance of our approach, RBA, against CART and Cubist using 21 real-world data sets. Our experimental results suggest that RBA outperforms Cubist and is as good as CART on many data sets. More importantly, there are situations where RBA is significantly better than CART, especially when the number of noise dimensions in the data is large.
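To make the model form concrete, the following is a minimal sketch of an ordered set of IF-THEN rules whose consequents are numeric predictions. It is not the authors' implementation. The first-match prediction policy, the `default` fallback value, the `Rule` and `predict` names, and the example attributes (`rooms`, `age`) are all assumptions introduced here for illustration; the abstract does not specify how rules are extracted, pruned, or ordered.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A record maps attribute names to numeric values (hypothetical schema).
Record = Dict[str, float]


@dataclass
class Rule:
    """IF condition(record) holds THEN predict `value` for the target."""
    condition: Callable[[Record], bool]
    value: float


def predict(rules: List[Rule], record: Record, default: float) -> float:
    """Scan the rules in order and return the consequent of the first match.

    First-match semantics and the `default` fallback are assumptions of
    this sketch, not details taken from the paper.
    """
    for rule in rules:
        if rule.condition(record):
            return rule.value
    return default


# Hypothetical ordered rule list; more specific rules come first.
rules = [
    Rule(lambda r: r["rooms"] >= 6 and r["age"] < 20, 30.0),
    Rule(lambda r: r["rooms"] >= 6, 24.0),
    Rule(lambda r: r["age"] >= 50, 12.0),
]

if __name__ == "__main__":
    print(predict(rules, {"rooms": 7, "age": 10}, default=18.0))
    print(predict(rules, {"rooms": 3, "age": 30}, default=18.0))
```

Because the rules are ordered, a record that satisfies several antecedents receives the prediction of the earliest matching rule, so the order in which the model-building step arranges the rules directly determines the output.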
Original language | English (US) |
---|---|
Title of host publication | Proceedings of the Fourth SIAM International Conference on Data Mining |
Editors | M.W. Berry, U. Dayal, C. Kamath, D. Skillicorn |
Pages | 210-221 |
Number of pages | 12 |
State | Published - Jun 22 2004 |
Event | Proceedings of the Fourth SIAM International Conference on Data Mining - Lake Buena Vista, FL, United States; Duration: Apr 22 2004 → Apr 24 2004 |
Keywords
- Quantitative association rules
- Regression
- Rule-based learning