Recommender system: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Citation bot (talk | contribs)
m [419]Add: postscript. Tweak: title, doi. | User-activated.
Line 22: Line 22:
Recommender systems are a useful alternative to [[search algorithm]]s since they help users discover items they might not have found by themselves. Interestingly enough, recommender systems are often implemented using search engines indexing non-traditional data.
Recommender systems are a useful alternative to [[search algorithm]]s since they help users discover items they might not have found by themselves. Interestingly enough, recommender systems are often implemented using search engines indexing non-traditional data.


Montaner provides the first overview of recommender systems, from an intelligent agents perspective.<ref>{{citation |last = Montaner |first = M. |last2 = Lopez |last3 = de la Rosa |first3 = J. L. |url = http://www.springerlink.com/content/kk844421t5466k35/ |title = A Taxonomy of Recommender Agents on the Internet |journal = Artificial Intelligence Review |volume = 19 |issue = 4 |date = June 2003 |pages = 285–330 |doi = 10.1023/A:1022850703159 |first2 = B. }}.</ref> Adomavicius provides a new overview of recommender systems.<ref name="Toward the Next Generation of Recommender Systems">{{citation |last = Adomavicius |first = G. |last2 = Tuzhilin |first2 = A. |url = http://portal.acm.org/citation.cfm?id=1070611.1070751 |title = Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions |journal = IEEE Transactions on Knowledge and Data Engineering |volume = 17 |issue = 6 |date = June 2005 |pages = 734–749 |doi = 10.1109/TKDE.2005.99 }}.</ref> Herlocker provides an additional overview of evaluation techniques for recommender systems.<ref>{{citation |last = Herlocker |first = J. L. |last2 = Konstan |first2 = J. A. |last3 = Terveen |first3 = L. G. |last4 = Riedl |first4 = J. T. |date = January 2004 |url = http://portal.acm.org/citation.cfm?id=963772 |title = Evaluating collaborative filtering recommender systems |journal = ACM Trans. Inf. Syst. |volume = 22 |issue = 1 |pages = 5–53 |doi = 10.1145/963770.963772 }}.</ref>
Montaner provides the first overview of recommender systems, from an intelligent agents perspective.<ref>{{Cite journal |last = Montaner |first = M. |last2 = Lopez |last3 = de la Rosa |first3 = J. L. |url = http://www.springerlink.com/content/kk844421t5466k35/ |title = A Taxonomy of Recommender Agents on the Internet |journal = Artificial Intelligence Review |volume = 19 |issue = 4 |date = June 2003 |pages = 285–330 |doi = 10.1023/A:1022850703159 |first2 = B. |postscript = <!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->{{inconsistent citations}} }}.</ref> Adomavicius provides a new overview of recommender systems.<ref name="Toward the Next Generation of Recommender Systems">{{Cite journal |last = Adomavicius |first = G. |last2 = Tuzhilin |first2 = A. |url = http://portal.acm.org/citation.cfm?id=1070611.1070751 |title = Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions |journal = IEEE Transactions on Knowledge and Data Engineering |volume = 17 |issue = 6 |date = June 2005 |pages = 734–749 |doi = 10.1109/TKDE.2005.99 |postscript = <!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->{{inconsistent citations}} }}.</ref> Herlocker provides an additional overview of evaluation techniques for recommender systems.<ref>{{Cite journal |last = Herlocker |first = J. L. |last2 = Konstan |first2 = J. A. |last3 = Terveen |first3 = L. G. |last4 = Riedl |first4 = J. T. |date = January 2004 |url = http://portal.acm.org/citation.cfm?id=963772 |title = Evaluating collaborative filtering recommender systems |journal = ACM Trans. Inf. Syst. |volume = 22 |issue = 1 |pages = 5–53 |doi = 10.1145/963770.963772 |postscript = <!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->{{inconsistent citations}} }}.</ref>


Recommender system is an active research area in the [[data mining]] and [[machine learning]] areas. Some conferences such as RecSys, SIGIR, KDD have it as a topic.
Recommender system is an active research area in the [[data mining]] and [[machine learning]] areas. Some conferences such as RecSys, SIGIR, KDD have it as a topic.
Line 30: Line 30:
{{Main|Collaborative filtering}}
{{Main|Collaborative filtering}}
One approach to the design of recommender systems that has seen wide use is [[collaborative filtering]].<ref name="Breese98"> {{cite conference |author = John S. Breese, David Heckerman, and Carl Kadie |year = 1998 |title = Empirical analysis of predictive algorithms for collaborative filtering |booktitle = In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (UAI'98) }}
One approach to the design of recommender systems that has seen wide use is [[collaborative filtering]].<ref name="Breese98"> {{cite conference |author = John S. Breese, David Heckerman, and Carl Kadie |year = 1998 |title = Empirical analysis of predictive algorithms for collaborative filtering |booktitle = In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (UAI'98) }}
</ref> Collaborative filtering methods are based on collecting and analyzing a large amount of information on users’ behaviors, activities or preferences and predicting what users will like based on their similarity to other users. A key advantage of the collaborative filtering approach is that it does not rely on machine analyzable content and therefore it is capable of accurately recommending complex items such as movies without requiring an "understanding" of the item itself. Many algorithms have been used in measuring user similarity or item similarity in recommender systems. For example, the [[k-nearest neighbor algorithm|k-nearest neighborhood]] (k-NN) approach.<ref>{{citation |last = Sarwar |first = B. |last2 = Karypis |first2 = G. |last3 = Konstan |first3 = J. |last4 = Riedl |first4 = J. |year = 2000 |url = http://glaros.dtc.umn.edu/gkhome/node/122 |title = Application of Dimensionality Reduction in Recommender System A Case Study }},</ref> and the [[Pearson correlation|Pearson Correlation]].
</ref> Collaborative filtering methods are based on collecting and analyzing a large amount of information on users’ behaviors, activities or preferences and predicting what users will like based on their similarity to other users. A key advantage of the collaborative filtering approach is that it does not rely on machine analyzable content and therefore it is capable of accurately recommending complex items such as movies without requiring an "understanding" of the item itself. Many algorithms have been used in measuring user similarity or item similarity in recommender systems. For example, the [[k-nearest neighbor algorithm|k-nearest neighborhood]] (k-NN) approach.<ref>{{Cite web |last = Sarwar |first = B. |last2 = Karypis |first2 = G. |last3 = Konstan |first3 = J. |last4 = Riedl |first4 = J. |year = 2000 |url = http://glaros.dtc.umn.edu/gkhome/node/122 |title = Application of Dimensionality Reduction in Recommender System A Case Study |postscript = <!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->{{inconsistent citations}} }},</ref> and the [[Pearson correlation|Pearson Correlation]].


When building a model from a user's profile, a distinction is often made between explicit and [[implicit data collection|implicit]] forms of [[data collection]].
When building a model from a user's profile, a distinction is often made between explicit and [[implicit data collection|implicit]] forms of [[data collection]].
Line 42: Line 42:
Examples of [[implicit data collection]] include the following:
Examples of [[implicit data collection]] include the following:
* Observing the items that a user views in an online store.
* Observing the items that a user views in an online store.
* Analyzing item/user viewing times<ref>{{citation |last = Parsons |first = J. |last2 = Ralph |first2 = P. |last3 = Gallagher |first3 = K. |date = July 2004 |title = Using viewing time to infer user preference in recommender systems. |publisher = AAAI Workshop in Semantic Web Personalization, San Jose, California }}.</ref>
* Analyzing item/user viewing times<ref>{{Cite document |last = Parsons |first = J. |last2 = Ralph |first2 = P. |last3 = Gallagher |first3 = K. |date = July 2004 |title = Using viewing time to infer user preference in recommender systems |publisher = AAAI Workshop in Semantic Web Personalization, San Jose, California |postscript = <!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->{{inconsistent citations}} }}.</ref>
* Keeping a record of the items that a user purchases online.
* Keeping a record of the items that a user purchases online.
* Obtaining a list of items that a user has listened to or watched on his/her computer.
* Obtaining a list of items that a user has listened to or watched on his/her computer.
Line 57: Line 57:
* Sparsity: The number of items sold on major e-commerce sites is extremely large. The most active users will only have rated a small subset of the overall database. Thus, even the most popular items have very few ratings.
* Sparsity: The number of items sold on major e-commerce sites is extremely large. The most active users will only have rated a small subset of the overall database. Thus, even the most popular items have very few ratings.


A particular type of collaborative filtering algorithm uses [[matrix factorization]], a [[low rank approximation|low-rank matrix approximation]] technique.<ref>I. Markovsky, Low-Rank Approximation: Algorithms, Implementation, Applications, Springer, 2012, ISBN 978-1-4471-2226-5</ref><ref>{{citation |last = Takács |first = G. |last2 = Pilászy |first2 = I. |last3 = Németh |first3 = B. |last4 = Tikk |first4 = D. |url = http://www.jmlr.org/papers/volume10/takacs09a/takacs09a.pdf |title = Scalable Collaborative Filtering Approaches for Large Recommender Systems |journal = Journal of Machine Learning Research |volume = 10 |month = March |pages = 623–656 |year = 2009 }}</ref><ref>{{cite conference |last = Rennie |first = J. |last2 = Srebro |first2 = N. |url = http://people.csail.mit.edu/jrennie/papers/icml05-mmmf.pdf |format = PDF |title = Fast Maximum Margin Matrix Factorization for Collaborative Prediction |booktitle = Proceedings of the 22nd Annual International Conference on Machine Learning |year = 2005 |editor = Luc De Raedt, Stefan Wrobel |publisher = ACM Press |conferenceurl = http://icml.ais.fraunhofer.de/icml.php }}</ref>
A particular type of collaborative filtering algorithm uses [[matrix factorization]], a [[low rank approximation|low-rank matrix approximation]] technique.<ref>I. Markovsky, Low-Rank Approximation: Algorithms, Implementation, Applications, Springer, 2012, ISBN 978-1-4471-2226-5</ref><ref>{{Cite journal |last = Takács |first = G. |last2 = Pilászy |first2 = I. |last3 = Németh |first3 = B. |last4 = Tikk |first4 = D. |url = http://www.jmlr.org/papers/volume10/takacs09a/takacs09a.pdf |title = Scalable Collaborative Filtering Approaches for Large Recommender Systems |journal = Journal of Machine Learning Research |volume = 10 |month = March |pages = 623–656 |year = 2009 |postscript = <!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->{{inconsistent citations}} }}</ref><ref>{{cite conference |last = Rennie |first = J. |last2 = Srebro |first2 = N. |url = http://people.csail.mit.edu/jrennie/papers/icml05-mmmf.pdf |format = PDF |title = Fast Maximum Margin Matrix Factorization for Collaborative Prediction |booktitle = Proceedings of the 22nd Annual International Conference on Machine Learning |year = 2005 |editor = Luc De Raedt, Stefan Wrobel |publisher = ACM Press |conferenceurl = http://icml.ais.fraunhofer.de/icml.php }}</ref>


=== Content-based filtering ===
=== Content-based filtering ===
Line 80: Line 80:
== Mobile Recommender Systems ==
== Mobile Recommender Systems ==


One growing area of research in the area of recommender systems is [[mobile recommender systems]]. With the increasing ubiquity of internet-accessing [[smart phones]], it is now possible to offer personalized, context-sensitive recommendations. This is a particularly difficult area of research as mobile data is more complex than recommender systems often have to deal with (it is heterogeneous, noisy, requires spatial and temporal auto-correlation, and has validation and generality problems <ref name="taxirecommender">{{cite conference |author = Yong Ge, Hui Xiong, Alexander Tuzhilin, Keli Xiao, Marco Gruteser,Michael J. Pazzani |year = 2010 |title = An Energy-Efficient Mobile Recommender System |booktitle = Proceedings of the 16th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining |url = http://pegasus.rutgers.edu/~yongge/KDD10-MRS.pdf |publisher = [[Association for Computing Machinery|ACM]] |location = [[New York City|New York City, New York]] |pages = 899-908 |accessdate = 2011-11-17 }}
One growing area of research in the area of recommender systems is [[mobile recommender systems]]. With the increasing ubiquity of internet-accessing [[smart phones]], it is now possible to offer personalized, context-sensitive recommendations. This is a particularly difficult area of research as mobile data is more complex than recommender systems often have to deal with (it is heterogeneous, noisy, requires spatial and temporal auto-correlation, and has validation and generality problems <ref name="taxirecommender">{{cite conference |author = Yong Ge, Hui Xiong, Alexander Tuzhilin, Keli Xiao, Marco Gruteser,Michael J. Pazzani |year = 2010 |title = An Energy-Efficient Mobile Recommender System |booktitle = Proceedings of the 16th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining |url = http://pegasus.rutgers.edu/~yongge/KDD10-MRS.pdf |publisher = [[Association for Computing Machinery|ACM]] |location = [[New York City|New York City, New York]] |pages = 899–908 |accessdate = 2011-11-17 }}
</ref>). Additionally, mobile recommender systems suffer from a transplantation problem - recommendations may not apply in all regions (for instance, it would be unwise to recommend a recipe in an area where all of the ingredients may not be available).
</ref>). Additionally, mobile recommender systems suffer from a transplantation problem - recommendations may not apply in all regions (for instance, it would be unwise to recommend a recipe in an area where all of the ingredients may not be available).


Line 108: Line 108:


Much research has been conducted on ongoing privacy issues in this space. Ramakrishnan ''et al.'' have conducted an extensive overview of the trade-offs between personalization and privacy and found that the combination of weak ties (an unexpected connection that provides serendipitous recommendations) and other data sources can be used to uncover identities of users in an anonymized dataset.<ref name="privacyoverview">
Much research has been conducted on ongoing privacy issues in this space. Ramakrishnan ''et al.'' have conducted an extensive overview of the trade-offs between personalization and privacy and found that the combination of weak ties (an unexpected connection that provides serendipitous recommendations) and other data sources can be used to uncover identities of users in an anonymized dataset.<ref name="privacyoverview">
{{cite journal |author = Naren Ramakrishnan, Benjamin J. Keller, Batul J. Mirza, Ananth Y. Grama, George Karypis |year = 2001 |title = Privacy Risks in Recommender Systems |journal = IEEE Internet Computing |volume = 5 |issue = 6 |url = http://citeseer.ist.psu.edu/schein02methods.html |isbn = 1-58113-561-0 |publisher = [[IEEE Educational Activities Department]] |location = Piscataway, NJ |pages = 54–62 }}
{{cite journal |author = Naren Ramakrishnan, Benjamin J. Keller, Batul J. Mirza, Ananth Y. Grama, George Karypis |year = 2001 |title = Privacy Risks in Recommender Systems |journal = IEEE Internet Computing |volume = 5 |issue = 6 |url = http://citeseer.ist.psu.edu/schein02methods.html |isbn = 1-58113-561-0 |publisher = [[IEEE Educational Activities Department]] |location = Piscataway, NJ |pages = 54–62 |doi = 10.1109/4236.968832 }}
</ref>
</ref>



Revision as of 02:57, 7 March 2013

Recommender systems or recommendation systems (sometimes replacing "system" with a synonym such as platform or engine) are a subclass of information filtering system that seek to predict the 'rating' or 'preference' that a user would give to an item (such as music, books, or movies) or social element (e.g. people or groups) they had not yet considered, using a model built from the characteristics of an item (content-based approaches) or the user's social environment (collaborative filtering approaches).[1][2]

Recommender systems have become extremely common in recent years. A few examples of such systems:

  • When viewing a product on Amazon.com, the store will recommend additional items based on a matrix of what other shoppers bought along with the currently selected item.[3]
  • Pandora Radio takes an initial input of a song or musician and plays music with similar characteristics (based on a series of keywords attributed to the inputted artist or piece of music). The stations created by Pandora can be refined through user feedback (emphasizing or deemphasizing certain characteristics).
  • Netflix offers predictions of movies that a user might like to watch based on the user's previous ratings and watching habits (as compared to the behavior of other users), also taking into account the characteristics (such as the genre) of the film.

Overview

Recommender systems typically produce a list of recommendations in one of two ways - through collaborative or content-based filtering. Collaborative filtering approaches to build a model from a user's past behavior (items previously purchased or selected and/or numerical ratings given to those items) as well as similar decisions made by other users, then use that model to predict items (or ratings for items) that the user may have an interest in.[4] Content-based filtering approaches utilize a series of discrete characteristics of an item in order to recommend additional items with similar properties.[5] These approaches are often combined (see Hybrid Recommender Systems).

The differences between collaborative and content-based filtering can be demonstrated by comparing two popular music recommender systems - Last.fm and Pandora Radio.

  • Pandora uses the properties of a song or artist (a subset of the 400 attributes provided by the Music Genome Project) in order to seed a "station" that plays music with similar properties. User feedback is used to refine the station's results, deemphasizing certain attributes when a user "dislikes" a particular song and emphasizing other attributes when a user "likes" a song. This is an example of a content-based approach.
  • Last.fm creates a "station" of recommended songs by observing what bands and individual tracks that the user has listened to on a regular basis and comparing those against the listening behavior of other users. Last.fm will play tracks that do not appear in the user's library, but are often played by other users with similar interests. As this approach leverages the behavior of users, it is an example of a collaborative filtering technique.

Each type of system has its own strengths and weaknesses. In the above example, Last.fm requires a large amount of information on a user in order to make accurate recommendations. This is an example of the cold start problem, and is common in collaborative filtering systems.[6] While Pandora needs very little information to get started, it is far more limited in scope (for example, it can only make recommendations that are similar to the original seed).

Recommender systems are a useful alternative to search algorithms since they help users discover items they might not have found by themselves. Interestingly enough, recommender systems are often implemented using search engines indexing non-traditional data.

Montaner provides the first overview of recommender systems, from an intelligent agents perspective.[7] Adomavicius provides a new overview of recommender systems.[8] Herlocker provides an additional overview of evaluation techniques for recommender systems.[9]

Recommender system is an active research area in the data mining and machine learning areas. Some conferences such as RecSys, SIGIR, KDD have it as a topic.

Approaches

Collaborative filtering

One approach to the design of recommender systems that has seen wide use is collaborative filtering.[10] Collaborative filtering methods are based on collecting and analyzing a large amount of information on users’ behaviors, activities or preferences and predicting what users will like based on their similarity to other users. A key advantage of the collaborative filtering approach is that it does not rely on machine analyzable content and therefore it is capable of accurately recommending complex items such as movies without requiring an "understanding" of the item itself. Many algorithms have been used in measuring user similarity or item similarity in recommender systems. For example, the k-nearest neighborhood (k-NN) approach.[11] and the Pearson Correlation.

When building a model from a user's profile, a distinction is often made between explicit and implicit forms of data collection.

Examples of explicit data collection include the following:

  • Asking a user to rate an item on a sliding scale.
  • Asking a user to rank a collection of items from favorite to least favorite.
  • Presenting two items to a user and asking him/her to choose the better one of them.
  • Asking a user to create a list of items that he/she likes.

Examples of implicit data collection include the following:

  • Observing the items that a user views in an online store.
  • Analyzing item/user viewing times[12]
  • Keeping a record of the items that a user purchases online.
  • Obtaining a list of items that a user has listened to or watched on his/her computer.
  • Analyzing the user's social network and discovering similar likes and dislikes

The recommender system compares the collected data to similar and dissimilar data collected from others and calculates a list of recommended items for the user. Several commercial and non-commercial examples are listed in the article on collaborative filtering systems.

One of the most famous examples of collaborative filtering is item-to-item collaborative filtering (people who buy x also buy y), an algorithm popularized by Amazon.com's recommender system.[3] Other examples include:

  • As previously detailed, Last.fm recommends music based on a comparison of the listening habits of similar users.
  • Facebook, MySpace, LinkedIn, and other social networks use collaborative filtering to recommend new friends, groups, and other social connections (by examining the network of connections between a user and their friends).[1]

Collaborative filtering approaches often suffer from three problems: cold start, scalability, and sparsity.[13]

  • Cold Start: These systems often require a large amount of existing data on a user in order to make accurate recommendations.
  • Scalability: In many of the environments that these systems make recommendations in, there are millions of users and products. Thus, a large amount of computation power is often necessary to calculate recommendations.
  • Sparsity: The number of items sold on major e-commerce sites is extremely large. The most active users will only have rated a small subset of the overall database. Thus, even the most popular items have very few ratings.

A particular type of collaborative filtering algorithm uses matrix factorization, a low-rank matrix approximation technique.[14][15][16]

Content-based filtering

Another common approach when designing recommender systems is content-based filtering. Content-based filtering methods are based on information about and characteristics of the items that are going to be recommended. In other words, these algorithms try to recommend items that are similar to those that a user liked in the past (or is examining in the present). In particular, various candidate items are compared with items previously rated by the user and the best-matching items are recommended. This approach has its roots in information retrieval and information filtering research.

Basically, these methods use an item profile (i.e., a set of discrete attributes and features) characterizing the item within the system. The system creates a content-based profile of users based on a weighted vector of item features. The weights denote the importance of each feature to the user and can be computed from individually rated content vectors using a variety of techniques. Simple approaches use the average values of the rated item vector while other sophisticated methods use machine learning techniques such as Bayesian Classifiers, cluster analysis, decision trees, and artificial neural networks in order to estimate the probability that the user is going to like the item.

Direct feedback from a user, usually in the form of a like or dislike button, can be used to assign higher or lower weights on the importance of certain attributes (using Rocchio Classification or other similar techniques).

A key issue with content-based filtering is whether the system is able to learn user preferences from user's actions regarding one content source and use them across other content types. When the system is limited to recommending content of the same type as the user is already using, the value from the recommendation system is significantly less than when other content types from other services can be recommended. For example, recommending news articles based on browsing of news is useful, but it's much more useful when music, videos, products, discussions etc. from different services can be recommended based on news browsing. Such cross-content recommendation has been productised by Leiki.

As previously detailed, Pandora Radio is a popular example of a content-based recommender system that plays music with similar characteristics to that of a song provided by the user as an initial seed. There are also a large number of content-based recommender systems aimed at providing movie recommendations, a few such examples include Rotten Tomatoes, Internet Movie Database, Jinni, Rovi Corporation and See This Next.

Hybrid Recommender Systems

Recent research has demonstrated that a hybrid approach, combining collaborative filtering and content-based filtering could be more effective in some cases. Hybrid approaches can be implemented in several ways: by making content-based and collaborative-based predictions separately and then combining them; by adding content-based capabilities to a collaborative-based approach (and vice versa); or by unifying the approaches into one model (see [8] for a complete review of recommender systems). Several studies empirically compare the performance of the hybrid with the pure collaborative and content-based methods and demonstrate that the hybrid methods can provide more accurate recommendations than pure approaches. These methods can also be used to overcome some of the common problems in recommender systems such as cold start and the sparsity problem.

Netflix and See This Next are good examples of hybrid systems. They make recommendations by comparing the watching and searching habits of similar users (i.e. collaborative filtering) as well as by offering movies that share characteristics with films that a user has rated highly (content-based filtering).

Mobile Recommender Systems

One growing area of research in the area of recommender systems is mobile recommender systems. With the increasing ubiquity of internet-accessing smart phones, it is now possible to offer personalized, context-sensitive recommendations. This is a particularly difficult area of research as mobile data is more complex than recommender systems often have to deal with (it is heterogeneous, noisy, requires spatial and temporal auto-correlation, and has validation and generality problems [17]). Additionally, mobile recommender systems suffer from a transplantation problem - recommendations may not apply in all regions (for instance, it would be unwise to recommend a recipe in an area where all of the ingredients may not be available).

One example of a mobile recommender system is one that offers potentially profitable driving routes for taxi drivers in a city.[17] This system takes as input data in the form of GPS traces of the routes that taxi drivers took while working, which include location (latitude and longitude), time stamps, and operational status (with or without passengers). It then recommends a list of pickup points along a route that will lead to optimal occupancy times and profits. This type of system is obviously location-dependent, and as it must operate on a handheld or embedded device, the computation and energy requirements must remain low.

An other example of mobile recommendation is what (Bouneffouf et al., 2012) developed for professional users. This system takes as input data the GPS traces of the user and his agenda to suggest him suitable information depending on his situation and interests. The system uses machine learning techniques and reasoning process in order to adapt dynamically the mobile recommender system to the evolution of the user’s interest. The author called his algorithm hybrid-ε-greedy [18].

The Netflix Prize

One of the key events that energized research in recommender systems was the Netflix prize. From 2006 to 2009, Netflix sponsored a competition, offering a grand prize of $1,000,000 to the team that could take an offered dataset of over 100 million movie ratings and return recommendations that were 10% more accurate than those offered by the company's existing recommender system. This competition energized the search for new and more accurate algorithms. On 21 September 2009, the grand prize of US$1,000,000 was given to the BellKor's Pragmatic Chaos team.

The most accurate algorithm in 2007 used an ensemble method of 107 different algorithmic approaches, blended into a single prediction:[19]

Predictive accuracy is substantially improved when blending multiple predictors. Our experience is that most efforts should be concentrated in deriving substantially different approaches, rather than refining a single technique. Consequently, our solution is an ensemble of many methods.

Many benefits accrued to the web due to the NetFlix project. Some teams have taken their technology and applied it to other markets, such as 4-Tell, Inc.'s NetFlix project-derived solution for ecommerce websites.

A second contest was planned, but was ultimately canceled in response to an ongoing lawsuit and concerns from the Federal Trade Commission.[20]

Privacy Concerns

Building user profiles using collaborative filtering can be problematic from a privacy point of view. Many European countries have a strong culture of data privacy and every attempt to introduce any level of user profiling can result in a negative customer response.

A number of privacy issues arose around the dataset offered by Netflix for the Netflix Prize competition. Although the data sets were anonymized in order to preserve customer privacy, in 2007, two researchers from the University of Texas were able to identify individual users by matching the data sets with film ratings on the Internet Movie Database.[21] As a result, in December 2009, an anonymous Netflix user sued Netflix in Doe v. Netflix, alleging that Netflix had violated U.S. fair trade laws and the Video Privacy Protection Act by releasing the datasets.[22] This led in part to the cancellation of a second Netflix Prize competition in 2010.[20]

Much research has been conducted on ongoing privacy issues in this space. Ramakrishnan et al. have conducted an extensive overview of the trade-offs between personalization and privacy and found that the combination of weak ties (an unexpected connection that provides serendipitous recommendations) and other data sources can be used to uncover identities of users in an anonymized dataset.[23]

Performance Measures

Evaluation is important in assessing the effectiveness of recommendation algorithms. The commonly used metrics are the Mean Squared Error and Root Mean Squared Error. The latter was used in the Netflix Prize. The information retrieval metrics such as Precision, Recall, or DCG are useful to assess the quality of a recommendation method. Recently, the diversity, novelty and coverage are also considered as an important aspects in evaluation.[24]

See also

References

  1. ^ a b Francesco Ricci and Lior Rokach and Bracha Shapira, Introduction to Recommender Systems Handbook, Recommender Systems Handbook, Springer, 2011, pp. 1-35
  2. ^ How Computers Know What We Want — Before We Do
  3. ^ a b Collaborative Recommendations Using Item-to-Item Similarity Mappings
  4. ^ Prem Melville and Vikas Sindhwani, Recommender Systems, Encyclopedia of Machine Learning, 2010.
  5. ^ R. J. Mooney and L. Roy (1999). "Content-based book recommendation using learning for text categorization". In Workshop Recom. Sys.: Algo. and Evaluation. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)
  6. ^ Andrew I. Schein, Alexandrin Popescul, Lyle H. Ungar, David M. Pennock (2002). "Methods and Metrics for Cold-Start Recommendations". Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2002). New York City, New York: ACM. pp. 253–260. ISBN 1-58113-561-0. Retrieved 2008-02-02. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)CS1 maint: multiple names: authors list (link)
  7. ^ Montaner, M.; Lopez, B.; de la Rosa, J. L. (June 2003). "A Taxonomy of Recommender Agents on the Internet". Artificial Intelligence Review. 19 (4): 285–330. doi:10.1023/A:1022850703159Template:Inconsistent citations{{cite journal}}: CS1 maint: postscript (link).
  8. ^ a b Adomavicius, G.; Tuzhilin, A. (June 2005). "Toward the Next Generation of Recommender Systems: A Survey of the State-of-the-Art and Possible Extensions". IEEE Transactions on Knowledge and Data Engineering. 17 (6): 734–749. doi:10.1109/TKDE.2005.99Template:Inconsistent citations{{cite journal}}: CS1 maint: postscript (link).
  9. ^ Herlocker, J. L.; Konstan, J. A.; Terveen, L. G.; Riedl, J. T. (January 2004). "Evaluating collaborative filtering recommender systems". ACM Trans. Inf. Syst. 22 (1): 5–53. doi:10.1145/963770.963772Template:Inconsistent citations{{cite journal}}: CS1 maint: postscript (link).
  10. ^ John S. Breese, David Heckerman, and Carl Kadie (1998). "Empirical analysis of predictive algorithms for collaborative filtering". In Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence (UAI'98). {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)CS1 maint: multiple names: authors list (link)
  11. ^ Sarwar, B.; Karypis, G.; Konstan, J.; Riedl, J. (2000). "Application of Dimensionality Reduction in Recommender System A Case Study"Template:Inconsistent citations{{cite web}}: CS1 maint: postscript (link),
  12. ^ Parsons, J.; Ralph, P.; Gallagher, K. (July 2004). "Using viewing time to infer user preference in recommender systems" (Document). AAAI Workshop in Semantic Web Personalization, San Jose, CaliforniaTemplate:Inconsistent citations{{cite document}}: CS1 maint: postscript (link).
  13. ^ Sanghack Lee and Jihoon Yang and Sung-Yong Park, Discovery of Hidden Similarity on Collaborative Filtering to Overcome Sparsity Problem, Discovery Science, 2007.
  14. ^ I. Markovsky, Low-Rank Approximation: Algorithms, Implementation, Applications, Springer, 2012, ISBN 978-1-4471-2226-5
  15. ^ Takács, G.; Pilászy, I.; Németh, B.; Tikk, D. (2009). "Scalable Collaborative Filtering Approaches for Large Recommender Systems" (PDF). Journal of Machine Learning Research. 10: 623–656Template:Inconsistent citations {{cite journal}}: Unknown parameter |month= ignored (help)CS1 maint: postscript (link)
  16. ^ Rennie, J.; Srebro, N. (2005). "Fast Maximum Margin Matrix Factorization for Collaborative Prediction" (PDF). In Luc De Raedt, Stefan Wrobel (ed.). Proceedings of the 22nd Annual International Conference on Machine Learning. ACM Press. {{cite conference}}: External link in |conferenceurl= (help); Unknown parameter |booktitle= ignored (|book-title= suggested) (help); Unknown parameter |conferenceurl= ignored (|conference-url= suggested) (help)
  17. ^ a b Yong Ge, Hui Xiong, Alexander Tuzhilin, Keli Xiao, Marco Gruteser,Michael J. Pazzani (2010). "An Energy-Efficient Mobile Recommender System" (PDF). Proceedings of the 16th ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining. New York City, New York: ACM. pp. 899–908. Retrieved 2011-11-17. {{cite conference}}: Unknown parameter |booktitle= ignored (|book-title= suggested) (help)CS1 maint: multiple names: authors list (link)
  18. ^ Bouneffouf, Djallel (2012), "Following the User's Interests in Mobile Context-Aware Recommender Systems: The Hybrid-e-greedy Algorithm", Proceedings of the 2012 26th International Conference on Advanced Information Networking and Applications Workshops (PDF), Lecture Notes in Computer Science, IEEE Computer Society, pp. 657--662, ISBN [[Special:BookSources/{978-0-7695-4652-0|{978-0-7695-4652-0]] {{citation}}: Check |isbn= value: invalid character (help).
  19. ^ R. Bell, Y. Koren, C. Volinsky (2007). "The BellKor solution to the Netflix Prize" (PDF).{{cite web}}: CS1 maint: multiple names: authors list (link)
  20. ^ a b "Netflix Prize Update". Netflix Prize Forum. 2010-03-12.
  21. ^ Rise of the Netflix Hackers
  22. ^ Netflix Spilled Your Brokeback Mountain Secret, Lawsuit Claims
  23. ^ Naren Ramakrishnan, Benjamin J. Keller, Batul J. Mirza, Ananth Y. Grama, George Karypis (2001). "Privacy Risks in Recommender Systems". IEEE Internet Computing. 5 (6). Piscataway, NJ: IEEE Educational Activities Department: 54–62. doi:10.1109/4236.968832. ISBN 1-58113-561-0.{{cite journal}}: CS1 maint: multiple names: authors list (link)
  24. ^ Lathia, N., Hailes, S., Capra, L., Amatriain, X.: Temporal diversity in recommender systems. In: Proceeding of the 33rd International ACMSIGIR Conference on Research and Development in Information Retrieval, SIGIR 2010, pp. 210–217. ACM, New York

Further reading

Books
Scientific articles

External links