The recommendation engine is the online component of a Web
personalization system. The task of the recommendation engine is to
compute a recommendation set for the current (active) user
session, consisting of the objects (links, ads, text, products, etc.) that
most closely match the current user profile. The essential aspect of
computing a recommendation set for a user is the matching of the current
user's activity against aggregate usage profiles. The recommended
objects are added to the last page in the active session accessed by
the user before that page is sent to the browser. Maintaining a history
depth is important because most users navigate several paths leading to
independent pieces of information within a session. In many cases these
sub-sessions have a length of no more than 2 or 3 references. We
capture the user history depth within a sliding window over the current
session. The sliding window of size n over the active session allows
only the last n visited pages to influence the recommendation value
of items in the recommendation set. Finally, the structural
characteristics of the site or prior domain knowledge can also be used
to associate an additional measure of significance with each pageview
in the user's active session.
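The sliding window described above can be sketched in a few lines of Python. This is only an illustration: the `ActiveSession` class, its method names, and the default significance weight of 1.0 are assumptions introduced here, not part of the described system.

```python
from collections import deque


class ActiveSession:
    """Hypothetical helper: keeps only the last n pageviews of a session."""

    def __init__(self, window_size):
        # A deque with maxlen drops the oldest pageview once the window is full.
        self.window = deque(maxlen=window_size)

    def visit(self, pageview):
        self.window.append(pageview)

    def weights(self, significance=None):
        """Return {pageview: s_i} for the pageviews inside the window.

        `significance` is an optional {pageview: weight} map standing in for
        site structure or domain knowledge; missing pages default to 1.0
        (an assumption made for this sketch).
        """
        significance = significance or {}
        return {p: significance.get(p, 1.0) for p in self.window}


session = ActiveSession(window_size=3)
for page in ["home", "products", "cart", "faq"]:
    session.visit(page)

# "home" has slid out of the window; "cart" carries extra significance.
print(session.weights({"cart": 2.0}))
```

With a window of size 3, only the three most recent pageviews contribute to the session vector, so earlier sub-sessions stop influencing the recommendations.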
In our proposed architecture, both content and usage profiles are
represented as sets of pageview-weight pairs. This will allow for both
the active session and the profiles to be treated as n-dimensional
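vectors over the space of pageviews in the site. Thus, given a content
or usage profile C, we can represent C as a vector
\[ C = \langle w_1^C, w_2^C, \ldots, w_n^C \rangle, \]
where $w_i^C = weight(p_i, C)$ if pageview $p_i \in C$, and $w_i^C = 0$ otherwise.
Similarly, the current active session S is also represented as a
vector
\[ S = \langle s_1, s_2, \ldots, s_n \rangle, \]
where $s_i$ is a significance weight associated with the corresponding
pageview reference, if the user has accessed $p_i$ in this session, and
$s_i = 0$ otherwise. We can compute the profile matching score using a
similarity function such as the normalized cosine measure for vectors:
\[ match(S, C) = \frac{\sum_{k} w_k^C \cdot s_k}{\sqrt{\sum_{k} (s_k)^2 \times \sum_{k} (w_k^C)^2}} \]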
Note that the matching score is normalized for the size of the clusters
and the active session. This corresponds to the intuitive notion that
we should see more of the user's active session before obtaining a
better match with a larger cluster representing a user profile. Given
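a profile C and an active session S, a recommendation score,
Rec(S, p), is computed for each pageview p in C as follows:
\[ Rec(S, p) = \sqrt{weight(p, C) \cdot match(S, C)}, \]
where weight(p, C) is the weight of p in C and match(S, C) is the profile matching score.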
If the pageview p is in the current active session, then its
recommendation value is set to zero. We obtain the usage recommendation
set, UREC(S), for the current active session S by collecting from each
usage profile all pageviews whose recommendation score satisfies a
minimum recommendation threshold $\rho$, i.e.,
\[ UREC(S) = \{ p \mid p \in C,\ C \in UP,\ Rec(S, p) \geq \rho \}, \]
where UP is the collection of all usage profiles. Furthermore, for
each pageview that is contributed by several usage profiles, we use its
maximal recommendation score from all of the contributing profiles.
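The matching and thresholding steps above can be illustrated with a minimal Python sketch. It assumes that profiles and sessions are stored as sparse `{pageview: weight}` dictionaries with non-negative weights, that only visited pageviews appear as session keys, and that the recommendation score is the square root of the product of the pageview's profile weight and the matching score. The function names and sample data are hypothetical.

```python
import math


def match(session, profile):
    """Normalized cosine similarity of two sparse {pageview: weight} maps."""
    dot = sum(w * session.get(p, 0.0) for p, w in profile.items())
    norm = math.sqrt(sum(s * s for s in session.values())
                     * sum(w * w for w in profile.values()))
    return dot / norm if norm else 0.0


def rec(session, profile, pageview):
    """Recommendation score of one pageview of `profile` for `session`."""
    if pageview in session:
        return 0.0  # pages already in the active session are not recommended
    return math.sqrt(profile[pageview] * match(session, profile))


def recommendation_set(session, profiles, threshold):
    """Pageviews scoring at least `threshold`, with the maximal score per page."""
    scores = {}
    for profile in profiles:
        for pageview in profile:
            score = rec(session, profile, pageview)
            if score >= threshold:
                scores[pageview] = max(score, scores.get(pageview, 0.0))
    return scores


# Made-up example data: two usage profiles and a two-page active session.
session = {"a": 1.0, "b": 1.0}
profiles = [
    {"a": 1.0, "b": 0.5, "c": 0.8},
    {"b": 1.0, "d": 0.2},
]
urec = recommendation_set(session, profiles, threshold=0.5)
print(sorted(urec))
```

In this toy data only pageview "c" clears the threshold: "d" has a low weight in its profile, and "a" and "b" are already part of the active session.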
In a similar manner, we can obtain the content recommendation set
CREC(S) from content profiles. Different methods can be used for
combining the two recommendation sets depending on the goals of
personalization and the requirements of the site. In our case, for each
pageview we take the maximum recommendation value across the two
recommendation sets. This allows, for example, content profiles to
contribute to the recommendation set even if no matching usage profile
is available and vice versa.
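The maximum-based combination can be sketched as follows, assuming each recommendation set is held as a `{pageview: score}` dictionary. The function name and the sample scores are made up for illustration.

```python
def combine_recommendations(urec, crec):
    """Merge two {pageview: score} maps, keeping the larger score per pageview."""
    combined = dict(urec)
    for pageview, score in crec.items():
        combined[pageview] = max(score, combined.get(pageview, 0.0))
    return combined


# Hypothetical usage and content recommendation sets.
urec = {"c": 0.79, "d": 0.55}
crec = {"d": 0.70, "e": 0.60}

combined = combine_recommendations(urec, crec)
ranked = sorted(combined.items(), key=lambda item: item[1], reverse=True)
print(ranked)
```

Because a pageview missing from one set simply keeps its score from the other, either kind of profile can contribute recommendations on its own, for instance when `urec` is empty.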
Bamshad Mobasher
2000-08-14