User-Based Collaborative Filtering with K-NN

How to Compute Predictions:

Suppose that we have a new target user NU and we want to compute the predicted rating for NU on a target item It (an item NU has not rated).

Assume that we have identified the K nearest neighbors, U1, U2, ..., UK, of NU. Let us denote the rating given by user Ui to an item Ij by r(Ui, Ij), and the similarity of user Ui to user NU by sim(NU, Ui). Typically, this similarity is computed as the Pearson correlation between the two users' ratings.
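
As an illustration of the similarity measure, here is a minimal sketch of a Pearson correlation between two users. It assumes each user's ratings are stored as a dict from item id to rating, and it computes the means over co-rated items only; both choices are assumptions made for this example (other variants use each user's overall mean), and the function name pearson_similarity is hypothetical.

```python
from math import sqrt


def pearson_similarity(ratings_a, ratings_b):
    """Pearson correlation between two users over their co-rated items.

    Each argument is a dict mapping item id -> rating (assumed layout).
    Returns 0.0 when the correlation is undefined.
    """
    common = [item for item in ratings_a if item in ratings_b]
    if len(common) < 2:
        return 0.0  # too few co-rated items to correlate

    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)

    numerator = sum(
        (ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common
    )
    spread_a = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common))
    spread_b = sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))

    if spread_a == 0 or spread_b == 0:
        return 0.0  # constant ratings: correlation is undefined
    return numerator / (spread_a * spread_b)


# Made-up ratings: U1 rates every shared item exactly one point below NU.
nu = {"I1": 5, "I2": 3, "I3": 4}
u1 = {"I1": 4, "I2": 2, "I3": 3, "I4": 5}
sim_nu_u1 = pearson_similarity(nu, u1)
```

In this made-up example the two users' ratings move together on every shared item, so the similarity comes out at (approximately) 1, the maximum possible value.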

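Using the weighted sum approach, the predicted rating of NU on the target item It, written here as pred(NU, It), can be computed as follows:

    pred(NU, It) = [ sim(NU, U1) * r(U1, It) + ... + sim(NU, UK) * r(UK, It) ] / [ sim(NU, U1) + ... + sim(NU, UK) ]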

In other words, the ratings of the K neighbors are weighted by their similarity to the target user, and the sum of all these weighted ratings is divided by the sum of all the similarities across the K neighbors.
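
The weighted sum described above can be sketched in a few lines of Python. The function name and the list-of-pairs input format are assumptions made for this example, and the similarity and rating values are made up.

```python
def weighted_sum_prediction(neighbors):
    """Predict a rating from (similarity, rating) pairs, one per neighbor.

    Returns None when the similarities sum to zero, since the weighted
    average is undefined in that case.
    """
    total_similarity = sum(sim for sim, _ in neighbors)
    if total_similarity == 0:
        return None
    weighted_ratings = sum(sim * rating for sim, rating in neighbors)
    return weighted_ratings / total_similarity


# K = 3 neighbors: (sim(NU, Ui), r(Ui, It)) with illustrative values.
neighbors = [(0.9, 4), (0.6, 5), (0.3, 2)]
prediction = weighted_sum_prediction(neighbors)
```

Here the prediction is (0.9*4 + 0.6*5 + 0.3*2) / (0.9 + 0.6 + 0.3) = 7.2 / 1.8 = 4.0, up to floating-point rounding. The most similar neighbors pull the prediction toward their own ratings.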

Important Notes:

  1. Generally, once the K neighbors are identified, those whose correlation with the target user is less than or equal to 0 are filtered out. So, in practice, the prediction may be computed with fewer than K neighbors (only those with similarity greater than 0 are considered).
  2. When computing the prediction (i.e., the weighted average), only those neighbors that have actually rated the target item It are considered. For example, suppose K = 3, and U1, U2, and U3 are the nearest neighbors of the target user NU. Suppose that only U1 and U3 have rated item It. Then, the predicted rating for NU is computed using only U1 and U3:

        pred(NU, It) = [ sim(NU, U1) * r(U1, It) + sim(NU, U3) * r(U3, It) ] / [ sim(NU, U1) + sim(NU, U3) ]
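
Both notes can be folded into the prediction step. The following is a minimal sketch, assuming each neighbor is given as a (similarity, ratings dict) pair; the function name, the data layout, and all numeric values are assumptions made for this example.

```python
def predict_rating(target_item, neighbors):
    """Weighted-average prediction over the usable neighbors only.

    neighbors: list of (similarity, ratings) pairs, where ratings is a
    dict mapping item id -> rating (assumed layout).
    A neighbor is usable if its similarity is greater than 0 (note 1)
    and it has rated the target item (note 2).
    Returns None when no usable neighbor remains.
    """
    usable = [
        (sim, ratings[target_item])
        for sim, ratings in neighbors
        if sim > 0 and target_item in ratings
    ]
    if not usable:
        return None
    total_similarity = sum(sim for sim, _ in usable)
    return sum(sim * rating for sim, rating in usable) / total_similarity


# K = 3: U2 has not rated "It", so only U1 and U3 contribute.
u1 = (0.8, {"It": 4, "I1": 5})
u2 = (0.5, {"I1": 3})
u3 = (0.2, {"It": 2})
prediction = predict_rating("It", [u1, u2, u3])
```

With these made-up values the prediction is (0.8*4 + 0.2*2) / (0.8 + 0.2) = 3.6, up to floating-point rounding. U2's similarity of 0.5 plays no role because U2 has no rating for the target item. Returning None when no usable neighbor remains is one possible convention; a real system would typically fall back to some default, such as the target user's mean rating.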