Less than five years old, the concept of collaborative filtering has already spawned dozens of publicly available systems, several experimental proprietary systems, and even a few commercially available systems. On Saturday, March 16, 1996, 50 researchers in the academic and business worlds gathered at the University of California-Berkeley to exchange ideas and experiences about these emerging filtering tools. The workshop was organized by the School of Information Management and Systems (SIMS), University of California at Berkeley and the Fisher Center for Information Technology and Management at the Haas School of Business; and sponsored by Infonautics Corporation and Verity Inc.
Acknowledging that far more systems and applications were under development than could be presented during a one-day workshop, Paul Resnick (AT&T Public Policy Research) observed that the invited speakers represented a cross-section of work in progress. Resnick, along with Hal Varian (SIMS), opened the day's proceedings.
The systems presented were: Group Asynchronous Browser, "GAB" (Bellcore); GroupLens (University of Minnesota); Do-I-Care (University of California-Irvine); Pointers and Digests (Lotus); and People Helping One Another Know Stuff, "Phoaks" (AT&T Research). Links to many of the systems may be found at: http://www.sims.berkeley.edu/resources/collab.
Additionally, an application called UPrint1 (Xerox Parc) and two enabling platforms for collaborative filtering were discussed.
This summary of proceedings provides an overview of the systems, application and infrastructure discussed at the workshop; highlights the development concerns they share in common; and touches upon some issues up ahead. The summary concludes with a list of next steps suggested at the workshop and information about joining a Web forum on collaborative filtering.
As demonstrated at the workshop, collaborative filtering can streamline research, improve retrieval precision, reduce the amount of time spent looking for significant changes on a favorite Web page, and even aid in the selection of films and videos.
On a philosophical level, collaborative filtering (as suggested by a few workshop participants) could lead to "information empowerment." During a discussion period, Marvin Weinberger (Infonautics Corporation) observed that the work being done in collaborative filtering is "really about trying to make systems smart enough so that they really know you as an individual; so you are empowered to navigate effectively." As part of the GroupLens presentation, John Riedl (University of Minnesota) noted a distinction between filtering in Internet media and filtering in "more traditional media." According to Riedl, the availability of collaborative filtering means it is "no longer the case that we give our quality decisions over to the hands of a publisher."
To guide workshop discussion, Paul Resnick proposed the following working definition of collaborative filtering: "Guiding people's choices of what to read, what to look at, what to watch, what to listen to (the filtering part); and doing that guidance based on information gathered from some other people (the collaborative part)." Resnick also challenged participants to develop a "catchier" term.
Each system summary below provides an overall description of the system; what issue(s) the system seeks to resolve and/or the system's comparative advantages; issues under study; and system status.
GAB -- GAB is part of a group of collaborative filtering activities at Bellcore called "FINE" (pronounced "fee'nay," standing for Find It Now Eugene). Also included among that grouping are the Bellcore Video Recommender (a recommendation-based system whereby individuals rate videos), and Bellcore Advisor, which utilizes standard information retrieval techniques now used to find documents, to find people instead.
As described by Mark Rosenstein (Bellcore), GAB uses hierarchical decompositions made by individuals and provides browsing facilities over Web pages. GAB enables users to find equivalence classes of Web pages (i.e., under what class or classes of items their favorite URLs appear). Once those equivalence classes are located, GAB provides users with browseable interfaces of those representations.
GAB also enables a user to select and specify a personal grouping of others' opinions or choices to follow (e.g., hotlists, bookmarks).
For the middle level user who is interested in a particular domain, has some knowledge of Web space for that domain and is willing to let others be more active in searching out sources, GAB is ideal. However, for the novice who has little experience with a domain or the expert who has considerable familiarity, GAB is not as useful a tool.
Additionally, GAB's ability to reach into someone else's bookmarks and extract information (e.g., perhaps a personal bookmark on medical information) raises privacy concerns. To overcome this problem, however, users could edit in properties of bookmark headings to be shared (using a keyword, "public") so that when the GAB search engine looked for "bookmarks," only those categories marked "public" would be indexed. Further, since the "sociology of GAB" is such that sharing would be confined to within workgroups or social groups (where individuals already know each other), privacy may not be as large a concern as it would be in large scale sharing.
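The "public" keyword safeguard described above can be illustrated with a minimal sketch. The BookmarkFolder type, its field names and the example folders are assumptions made for illustration only; the workshop did not describe how GAB actually stores bookmark headings or their properties.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class BookmarkFolder:
    """Hypothetical stand-in for one bookmark heading and its links."""
    heading: str
    urls: List[str]
    properties: Set[str] = field(default_factory=set)


def shareable_folders(folders):
    """Keep only the folders whose properties include the keyword 'public'."""
    return [f for f in folders if "public" in f.properties]


folders = [
    # No 'public' property, so an indexer would skip this folder.
    BookmarkFolder("Medical information", ["http://example.org/clinic"]),
    BookmarkFolder(
        "Collaborative filtering",
        ["http://www.sims.berkeley.edu/resources/collab"],
        {"public"},
    ),
]

indexed = shareable_folders(folders)
print([f.heading for f in indexed])
```

In this sketch, sharing is opt-in: a folder with no properties at all is treated as private.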
Status: Work on GAB is supported by an ARPA grant and research is ongoing. At a recent WWW conference, Bellcore announced several products now under development (see http://www.bellcore.com/WWWCONF/ARTICLES/04/adaptx.html).
For general information regarding Bellcore collaborative filtering activities, see http://community.bellcore.com/navigation/; for specific information regarding the Bellcore Video Recommender, see http://community.bellcore.com/navigation/videos.html.
GroupLens -- Concern regarding quality decisions for information retrieval as well as the burgeoning amount of information to be retrieved inspired John Riedl and Paul Resnick four years ago to develop a system now known as "GroupLens." Using Usenet newsgroups as the domain, the GroupLens developers designed an architecture structured around a server (nicknamed the "better bit bureau") that collects ratings from individuals and then, based on those ratings, produces predictions of quality for those individuals. According to a GroupLens flyer, the system "combines your opinions about articles you have already read with the opinions of others who have done likewise and gives you a personalized prediction for each unread news article. The prediction is on a scale from 1-5, and indicates to you how likely you are to find the article useful."
A key feature of GroupLens is its open architecture. This feature allows other researchers to create clients that work with GroupLens servers or to even replace those servers if improvements can be suggested.
The major limitation to further development, said Joe Konstan (University of Minnesota), is "bigness." With a potential 22 million Usenet users reading and rating articles, Usenet newsgroups constitute a huge database. Another concern is heterogeneity, and whether work performed in one domain can be carried over into another. A third matter pertains to startup; that is, whether folks will continue participating (recommending) if they do not obtain "instant gratification."
Status: A GroupLens pilot test began February 8, 1996 with a goal of creating a large testbed of data containing news article ratings (as entered by users) as well as the full text of articles rated. According to Brad Miller (University of Minnesota), three specific research issues are now under study: (1) how well GroupLens scales in terms of number of users the "better bit bureau" can support; (2) which algorithms work best; and (3) whether there are any effective surrogates to users entering ratings. For more information, see http://www.cs.umn.edu/Research/GroupLens.
Do-I-Care -- Letting a user know when to revisit a favorite Web page and alerting a user to an interesting change on that page are the primary functions of Do-I-Care. In addition to collaborative filtering, Do-I-Care is concerned with the issue of re-discovery and the utility of machine learning in agents. Do-I-Care is a "sister project" to Mike Pazzani's Syskill and Webert discovery agent.
A Do-I-Care user trains an agent over time to look for the kinds of changes in which he or she would be interested. This training is achieved by a single user indicating that he or she "cares" or "doesn't care" about an item. The technology uses a simple Bayesian classifier and simple text parser that looks for key words. Agents can be cascaded, allowing collaborative use. An individual can use others' efforts for free.
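As a rough illustration of the "cares" / "doesn't care" training described above, the following is a minimal sketch of a naive Bayes classifier over whitespace-separated key words. The class name, the labels, the add-one smoothing and the sample texts are all assumptions; this is not the Do-I-Care implementation, whose parser and classifier details were not presented.

```python
from collections import Counter
from math import log


class CareClassifier:
    """Toy naive Bayes over key words with two labels (hypothetical)."""

    LABELS = ("care", "dont_care")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        """Record one user judgment about a changed piece of text."""
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def classify(self, text):
        """Return the label with the higher smoothed log-probability."""
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, float("-inf")
        for label in self.LABELS:
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Add-one smoothing on the prior and on each word likelihood;
            # the extra +1 in the denominator reserves mass for unseen words.
            score = log((self.doc_counts[label] + 1) / (total_docs + 2))
            for word in text.lower().split():
                score += log((counts[word] + 1) / (total_words + len(vocab) + 1))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


agent = CareClassifier()
agent.train("fare sale to boston", "care")
agent.train("new fare sale announced", "care")
agent.train("site maintenance notice", "dont_care")
agent.train("page layout updated", "dont_care")

print(agent.classify("fare sale this weekend"))
print(agent.classify("site maintenance tonight"))
```

With only four training examples the sketch is far too small to say anything about accuracy; it is meant only to show the shape of the train-then-classify loop.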
Mark Ackerman (University of California-Irvine) reported that to date, Do-I-Care has been able to achieve 70-95% accuracy after training 10 to 20 times. Some agents, such as one Ackerman has used to track airline fare sales, have already achieved 100% accuracy.
One concern relates to privacy. Unlike some other systems, Do-I-Care involves the explicit rating of other people's work; an act which in itself may be organizationally or socially problematic once those ratings are shared.
Status: A short paper on Do-I-Care was presented at CHI '96. A longer paper is in process. For more information, see http://www.ics.uci.edu/CORPS/dica.html.
Pointers & Digests -- Lotus is developing a form of attributed filtering which, according to Kate Ehrlich (Lotus), is "based on an implicit social contract we believe exists in small workgroups."
As Ehrlich explained, the familiarity which stems from belonging to a workgroup (and theoretically, from attributed filtering), provides two inherent advantages: (1) when one receives a recommendation, one can evaluate it against what is known about the other person; and (2) individuals are more likely to make recommendations to people they know.
Another contrast is the fact that Pointers supports those who send recommendations (in fact, only "recommenders," not receivers, require software); queries cannot be made against the system. For example, if a Notes user is reading a newswire database and comes across an article about librarians, that user can recommend the article by adding comments to a form and then mailing that form to one or more colleagues. The colleague then receives a semi-structured e-mail message containing the title of the article, the name of the database in which it was a hypertext link, and the name and comments of the recommender.
A key feature of Pointers is that it is not tied to a particular domain; i.e., any Notes database to which a particular user has access is a possible source for sending messages. Further, it is a "push" model, in the sense that it is pushing information "out" to other users (whether or not those users request it) by identifying sources and documents.
In fact, incorporating the "push" model of Pointers with the "pull" model of more conventional filtering and retrieval systems is an issue now under study.
Another Lotus system, called Information Digest, is also under development. Information Digest provides lists of recommendations organized into sections. Several issues, however, require further analysis. For example, Digest is not as useful for on-demand searches unless there is a large enough pool of recommendations available.
Status: Both Lotus Notes V3 and Notes V4 have the capability of creating Pointers and Information Digests. A first version works with Notes V3, and a second version works with Notes V4. A modified version of Pointers is being shipped with the Web Browser in Lotus Notes V4. For additional information, see http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/ke_bdy.htm.
Phoaks -- Yet another filtering technique, one called Phoaks, uses frequency-of-mention data within Usenet newsgroups (e.g., how often people mention URLs; why they mention URLs). Will Hill and his colleagues at AT&T Research are now evaluating these data for their potential to recommend Web resources. According to Hill, their "bias" is the following: how far can they go without having to ask a user for any data?
One of Phoaks' primary advantages stems from its "one person, one vote" approach. Because the system will only accept one recommendation per individual, it prevents any single person from "spamming" the system with multiple recommendations.
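The "one person, one vote" rule can be sketched as counting distinct posters per URL rather than raw mentions. The function name, the (poster, url) input format and the sample addresses below are assumptions for illustration; how Phoaks extracts mentions from Usenet articles was not described in this detail.

```python
from collections import defaultdict


def tally_recommendations(mentions):
    """Count distinct posters per URL from (poster, url) pairs.

    Repeated mentions of a URL by the same poster count only once.
    """
    voters = defaultdict(set)
    for poster, url in mentions:
        voters[url].add(poster)
    return {url: len(posters) for url, posters in voters.items()}


mentions = [
    ("alice@example.org", "http://www.w3.org/PICS"),
    ("alice@example.org", "http://www.w3.org/PICS"),  # repeat, ignored
    ("alice@example.org", "http://www.w3.org/PICS"),  # repeat, ignored
    ("bob@example.org", "http://www.w3.org/PICS"),
    ("bob@example.org", "http://www.phoaks.com/phoaks/"),
]

votes = tally_recommendations(mentions)
print(votes)
```

Here the first URL receives two votes despite four mentions, so a single prolific poster cannot inflate a resource's standing.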
Some of the issues now under study include: human interfaces to the social filtering data (i.e., what is useful and useable); privacy vs. connectedness design issues; dependence upon noncompetition; credit-assignment to recommenders, issues of credentials; interoperable systems (e.g., providing PICS server to Phoaks' data).
Status: An early prototype and field tests are on the Internet (see http://www.phoaks.com/phoaks/).
UPrint1 is a "pull technology" that will enable users to print out any book, any time, anywhere. Rather than visit a bookstore to buy a book (or even track down a hard-to-find book), UPrint1 will allow a user to order from a virtual bookstore and specify the location where the publication should be printed (e.g., a local printshop, or even the user's home). UPrint1 will also support customized printing such as braille and large print editions.
UPrint1's development is premised on the overall trend towards digital (books included) and Xerox Parc's prediction that virtual (online) bookstores will eventually outrival physical ones in popularity and convenience because of their advantages (e.g., easier searches, greater number of books from which to choose, anytime/anywhere accessibility, group interactions/chatrooms, auto-recommendation).
"The research problem," says Goldberg, "is making this happen." If the enabler for UPrint1 is the virtual bookstore, then auto-recommendation is what will make both the bookstore and UPrint1 attractive options. Auto-recommendation, which models the informal process many use today to identify books of interest (seeking recommendations from friends, colleagues), has already been used for recommending music and videos (e.g., Firefly).
However, in contrast to Firefly, UPrint1's prospective data is considerably bigger. With about one million books in print, "size," said Goldberg, is the biggest potential "showstopper" to launching UPrint1. Other concerns include: how to solicit recommendations (should recommenders type in their favorites, or should they choose from a list); bootstrapping (i.e., developing an initial list of recommendations); which algorithm to use; whether to correlate findings over an entire list of recommendations, or only within certain genres; and user control issues (will users be "blindly" correlated or will they have some control regarding with whom they are correlated?).
Status: Development of UPrint1 is just getting underway.
PICS -- Originally developed as a technical solution to the growing concern about children viewing indecent material online, PICS, or "Platform for Internet Content Selection," could, according to Paul Resnick, serve as a platform for collaborative filtering systems.
Instead of blocking undesirable material outright, PICS provides a way for parents to define and select which rating service(s) they'd like to use in filtering the content seen by their children. Significantly, PICS enables the rating labels and the rating label software to remain separate.
What is the tie to collaborative filtering? With many companies now "building in" this technology, said Resnick, it may be useful for collaborative filtering system developers to think about "piggybacking."
Status: PICS developers have been encouraged by the number of recent product announcements regarding label reading software, rating services and label bureaus. However, noted Resnick, there is currently a need to develop tools for entering the rating data. For more information, see http://www.w3.org/PICS.
Meta-Information Protocols and Architectures -- Martin Roscheisen (Stanford University) discussed the kinds of protocols necessary for an infrastructure of third-party meta-information (e.g., annotations and certification of arbitrary attributes). In particular, Roscheisen focused on the protocols now being developed as part of the Stanford Integrated Digital Libraries project.
A prototype for a generalized form of shared Web annotations called "ComMentor" (http://pcd.stanford.edu/ComMentor), was completed in early 1995. Some of those concepts can now be found in other systems such as PICS. A generalization of this third-party meta-information protocol is now being finalized as the Stanford Interop protocol for digital libraries interoperability (http://diglib.stanford.edu/~testbed).
Roscheisen also referred to a system called "Grassroots" which provides non-anonymous collaborative filtering in a way that integrates an entire set of currently disparate forms of interfaces such as e-mail, newsgroups, hypermail, etc. (see http://diglib.stanford.edu/Grassroots).
Status: The Stanford Interop protocol is currently being tested in collaboration with the Digital Library projects at the University of Michigan, University of California-Santa Barbara, and the University of Illinois.
While questions at the workshop covered a wide range of topics, three general categories of concern dominated the proceedings:
Notably, several speakers alluded to the problem of getting to critical mass (i.e., getting enough recommenders, and hence recommendations, to ultimately generate statistically meaningful predictions) as a significant obstacle to system start-up.
Many workshop speakers also observed that maintaining recommender participation could be equally difficult. During start-up, initial recommenders receive no immediate payoff, and over time, may lose interest, become frustrated with the messages they see, or resent "free riders" so much that they decide to become one.
According to Chris Avery, Harvard University, economic theory explains why the day may come when recommenders cease recommending. It also cautions system developers to think about future systems where compensation might be used to encourage continued participation.
Collaborative filtering may be thought of as a public good in the sense that the value provided -- i.e., the recommendation -- is created by one person's efforts, and others are helped. Since some are producers and some are consumers (and no one can be both a producer and a consumer), this set of circumstances creates an instant class of free-riders as well as a category of folks who must accept a negative expected value to their actions; i.e., that there will be no immediate payoff to their actions, but in the long term, there could be a substantial benefit (identification of the one article that helps everyone). The general social surplus from providing the evaluations, explained Avery, would be maximized as long as the incremental benefit from each new evaluation continued to outweigh the cost.
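The surplus condition can be written out as a short worked equation. The symbols are assumptions introduced here, not notation from Avery's talk: B(n) is the total benefit to the community from the first n evaluations, c is the cost a recommender bears for each evaluation, and S(n) is the resulting social surplus.

```latex
\begin{align*}
  S(n) &= B(n) - c\,n \\
  S(n) - S(n-1) &= \bigl[B(n) - B(n-1)\bigr] - c
\end{align*}
% Surplus keeps rising while the incremental benefit exceeds the cost:
%   B(n) - B(n-1) > c.
% If the incremental benefit shrinks as n grows (an added assumption),
% the surplus-maximizing number of evaluations is
\[
  n^{*} = \max\bigl\{\, n : B(n) - B(n-1) \ge c \,\bigr\}.
\]
```

Under these assumptions, the difficulty Avery raised is that the cost c falls on the individual recommender while the benefit B(n) - B(n-1) is spread across all readers, so a recommender acting alone may stop short of n*.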
As Avery pointed out, Internet civic-mindedness may ensure that recommenders look past this negative expected value and continue to recommend for the greater good of the community. However, since economic theory "anticipates" the worst, said Avery, it is wise to consider some of the "pitfalls" arising from the mix of incentives involved in deciding whether to read an article or wait for someone else to read it.
In the future, market pricing might eventually subsidize and coordinate evaluations. Such pricing schemes are likely to be complex, but could be adapted over time as agents learned more about the folks involved.
Nearly all workshop speakers (as well as many participants) were concerned about reliability.
Some questions pertained to genre (whether results from one category of interest could be carried over into another).
Other questions challenged the ability of systems to overcome deliberate attempts to confound results. For example, in a discussion about recipes, could vegetarians manipulate results by assigning unfavorable ratings to meat recipes? In the case of GAB, for example, misleading the system would only hurt one's reputation, explained Mark Rosenstein (reputation, in this instance, was defined as the set of correlations one has with others within a group).
The greatest number of reliability concerns pertained to the accuracy of various algorithms in predicting results. During the GroupLens presentation, Brad Miller explained how developers were concerned that conditional probability might not be yielding the best results. His group decided to write and run a program with synthetic users. Their results were as follows: with lots of data, conditional probability worked well; with a certain number of ratings, the Pearson algorithm was indicated, and with a few ratings, or for instances with users having a low correlation to others, an adaptive algorithm worked best.
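For readers unfamiliar with the Pearson approach mentioned above, the following is a minimal sketch of a correlation-weighted prediction on a 1-5 scale. It is a generic textbook formulation under stated assumptions, not the code GroupLens ran: the function names, the sample users and ratings, the choice to weight by absolute correlation, and the clamping to the 1-5 range are all illustrative.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over items both users rated; 0.0 if undefined."""
    common = [item for item in a if item in b]
    if len(common) < 2:
        return 0.0
    mean_a = sum(a[i] for i in common) / len(common)
    mean_b = sum(b[i] for i in common) / len(common)
    cov = sum((a[i] - mean_a) * (b[i] - mean_b) for i in common)
    var_a = sum((a[i] - mean_a) ** 2 for i in common)
    var_b = sum((b[i] - mean_b) ** 2 for i in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def predict(ratings, user, item):
    """User's mean rating plus correlation-weighted deviations of others."""
    mine = ratings[user]
    my_mean = sum(mine.values()) / len(mine)
    weighted_sum = 0.0
    weight_total = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        w = pearson(mine, theirs)
        their_mean = sum(theirs.values()) / len(theirs)
        weighted_sum += w * (theirs[item] - their_mean)
        weight_total += abs(w)
    if weight_total == 0:
        return my_mean  # no usable neighbors: fall back to the user's mean
    return min(5.0, max(1.0, my_mean + weighted_sum / weight_total))


# Hypothetical ratings: user -> {article id -> rating on a 1-5 scale}.
ratings = {
    "ann": {"a1": 5, "a2": 1, "a3": 4},
    "bob": {"a1": 4, "a2": 2, "a3": 5, "a4": 5},
    "cat": {"a1": 1, "a2": 5, "a3": 2, "a4": 1},
}

prediction = predict(ratings, "ann", "a4")
print(round(prediction, 2))
```

In this toy data, "ann" agrees with "bob" and disagrees with "cat", so both the high rating from "bob" and the low rating from "cat" push the prediction for article "a4" above ann's own average. The sketch also shows why sparse data is a problem: with fewer than two articles in common, the correlation is undefined and the neighbor contributes nothing.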
David Heckerman of Microsoft discussed a clustering method that overcomes some prime concerns about clustering (e.g., deciding which distance metric to use; determining the number of clusters or classes; handling missing data). The method discussed by Heckerman is a "Bayesian approach" and very quickly predicts user preferences (the example demonstrated at the workshop used the domain of Nielsen television network programming).
Presenters acknowledged that a determined individual could, in fact, identify a pattern of response among or between different interest groups, thereby identifying a particular user and cracking an otherwise secure pseudonym. Nevertheless, many systems have implemented safeguards that ensure a relatively high degree of privacy. In GroupLens, the system only knows a user's pseudonym. Users can choose to have pseudonyms that reveal their true identities and are in complete control of deciding whether to do so.
As discussed above, GAB participants can protect their private bookmarks from being accessed using a keyword to edit in properties of bookmark headings to be shared.
With many of the technical issues now being raised and tackled, several of the sociological implications of collaborative filtering are beginning to surface. Will collaborative filtering, as John Riedl suggested, "democratize" the information quality process, or will it result in social fragmentation? "Will the global village fracture into tribes?" asked Paul Resnick.
At Lotus, researchers are studying the social dynamics of attributed filtering as well as the second order effect of forming interest groups around ratings.
In closing the proceedings, Paul Resnick suggested several activities:
To promote continued discussion of these activities, a listserver was formed:
collab@sims.berkeley.edu
New participants may join by sending an e-mail to: majordomo@sims.berkeley.edu
In the body of the e-mail message, the words "subscribe collab" should be entered (the subject line may be left blank).
Once a subscription request is received, the list management software will send an automated greeting message with further information. The discussions are archived at: http://www.sims.berkeley.edu/resources/collab
This summary was prepared by Louise A. Arnheim for the Coalition for Networked Information (CNI). Ms. Arnheim is a consultant who specializes in writing and editing materials on telecommunications and information technology. She may be reached at LArnheim@aol.com.