Impact of data characteristics on recommender systems performance

Gediminas Adomavicius, Young Ok Kwon, Jingjing Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citations

Abstract

This paper investigates the impact of rating data characteristics on the performance of recommendation algorithms. We focus on three groups of data characteristics: rating density, rating frequency distribution, and rating value distribution. We introduce a "window sampling" procedure that can effectively manipulate the characteristics of rating samples, and we apply a regression model to uncover the relationships between data characteristics and recommendation accuracy. Our experimental results show that recommendation accuracy is highly influenced by structural characteristics of the rating data, and that the effects of data characteristics are consistent across different recommendation techniques. Understanding how data characteristics affect recommendation performance has practical significance: it can enable recommender system designers to estimate the expected performance of their system in advance and, thus, to direct data collection efforts to maximize recommendation performance.
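
As an illustration of the three groups of data characteristics named in the abstract, the following minimal sketch computes one possible measure for each from a list of ratings. The specific measures (density as the filled fraction of the user-item matrix, a Gini coefficient over per-item rating counts, and the population variance of rating values), the function names, and the toy data are assumptions made for illustration; the abstract does not specify which measures the paper uses, and the "window sampling" procedure is not reproduced here.

```python
from collections import Counter
from statistics import pvariance


def gini(counts):
    """Gini coefficient of non-negative counts (0.0 means perfectly even)."""
    xs = sorted(counts)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula over the ascending-sorted values.
    weighted = sum((rank + 1) * x for rank, x in enumerate(xs))
    return (2 * weighted) / (n * total) - (n + 1) / n


def rating_characteristics(ratings):
    """Summarize a list of (user, item, value) triples.

    Assumes at most one rating per (user, item) pair and a non-empty list.
    """
    users = {u for u, _, _ in ratings}
    items = {i for _, i, _ in ratings}
    item_counts = Counter(i for _, i, _ in ratings)
    values = [v for _, _, v in ratings]
    return {
        # Rating density: share of the user-item matrix that is filled.
        "density": len(ratings) / (len(users) * len(items)),
        # Rating frequency distribution: concentration of ratings on items.
        "item_frequency_gini": gini(item_counts.values()),
        # Rating value distribution: spread of the rating values.
        "rating_variance": pvariance(values),
    }


# Hypothetical toy sample: 3 users, 2 items, 4 ratings.
sample = [("u1", "a", 5), ("u1", "b", 3), ("u2", "a", 4), ("u3", "a", 1)]
stats = rating_characteristics(sample)
```

Characteristics computed this way for many rating samples could then serve as explanatory variables in a regression against a measured accuracy metric, which is the kind of analysis the abstract describes.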

Original language: English (US)
Title of host publication: Proceedings of 20th Annual Workshop on Information Technologies and Systems
Publisher: Social Science Research Network
State: Published - Jan 1 2010
Event: 20th Annual Workshop on Information Technologies and Systems, WITS 2010 - St. Louis, MO, United States
Duration: Dec 11 2010 → Dec 12 2010

Other

Other: 20th Annual Workshop on Information Technologies and Systems, WITS 2010
Country/Territory: United States
City: St. Louis, MO
Period: 12/11/10 → 12/12/10
