To engage visitors to a Web site at a very early stage (i.e., before
registration or authentication), personalization tools must rely
primarily on clickstream data captured in Web server logs. The lack of
explicit user ratings as well as the sparse nature and the large volume
of data in such a setting poses serious challenges to standard
collaborative filtering techniques in terms of scalability and
performance. Web usage mining techniques such as clustering that rely
on offline pattern discovery from user transactions can be used to
improve the scalability of collaborative filtering; however, this is
often at the cost of reduced recommendation accuracy. In this paper we
propose effective and scalable techniques for Web personalization based
on association rule discovery from usage data. Through detailed
experimental evaluation on real usage data, we show that the proposed
methodology can achieve better recommendation effectiveness, while
maintaining a computational advantage over direct approaches to
collaborative filtering such as the k-nearest-neighbor strategy.
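
To make the general idea concrete, the following is a minimal sketch of recommendation from association rules mined over usage sessions. It is not the method proposed in the paper: it assumes each session is simply a list of visited page identifiers, it considers only pairwise rules (one page implies another), and the function names `mine_pair_rules` and `recommend` as well as the threshold values are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_pair_rules(sessions, min_support=0.3, min_confidence=0.6):
    """Mine pairwise rules x -> y from sessions (lists of page ids).

    Support is the fraction of sessions containing both pages;
    confidence is count(x and y) / count(x). Illustrative only.
    """
    n = len(sessions)
    item_counts = Counter()
    pair_counts = Counter()
    for session in sessions:
        pages = set(session)  # ignore repeat visits within a session
        item_counts.update(pages)
        pair_counts.update(
            frozenset(pair) for pair in combinations(sorted(pages), 2)
        )

    rules = {}
    for pair, count in pair_counts.items():
        if count / n < min_support:
            continue  # pair is not frequent enough
        a, b = tuple(pair)
        for x, y in ((a, b), (b, a)):
            confidence = count / item_counts[x]
            if confidence >= min_confidence:
                rules[(x, y)] = confidence
    return rules


def recommend(rules, active_session, top_n=3):
    """Rank unseen pages by the best confidence of any matching rule."""
    seen = set(active_session)
    scores = {}
    for (x, y), confidence in rules.items():
        if x in seen and y not in seen:
            scores[y] = max(scores.get(y, 0.0), confidence)
    # Sort by descending confidence, breaking ties alphabetically.
    return sorted(scores, key=lambda page: (-scores[page], page))[:top_n]
```

In this sketch the rules are computed offline from historical sessions, and only the lightweight `recommend` lookup runs against a visitor's active session, which is the kind of split between offline pattern discovery and online matching that the abstract describes.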