Abstract
This paper investigates the impact of rating data characteristics on the performance of recommendation algorithms. We focus on three groups of data characteristics: rating density, rating frequency distribution, and rating value distribution. We introduce a "window sampling" procedure that can effectively manipulate the characteristics of rating samples, and we apply a regression model to uncover the relationships between data characteristics and recommendation accuracy. Our experimental results show that recommendation accuracy is highly influenced by the structural characteristics of rating data and that the effects of these characteristics are consistent across different recommendation techniques. Understanding how data characteristics affect recommendation performance has practical significance: it can enable recommender system designers to estimate the expected performance of their system in advance and thus to direct data collection efforts toward maximizing recommendation performance.
Original language | English (US) |
---|---|
Title of host publication | Proceedings of 20th Annual Workshop on Information Technologies and Systems |
Publisher | Social Science Research Network |
State | Published - Jan 1 2010 |
Event | 20th Annual Workshop on Information Technologies and Systems, WITS 2010 - St. Louis, MO, United States |
Duration | Dec 11 2010 → Dec 12 2010 |