NEWS RELEASE, 5/18/99
Need a laugh? Try a UC Berkeley website that recommends jokes tailored to your sense of humor
By Robert Sanders, Public Affairs
BERKELEY--Any comic will tell you audiences are fickle and unpredictable, but that hasn't stopped a University of California, Berkeley, professor from trying to predict what jokes you'll find funny.

Ken Goldberg has created a website, Jester 2.0 - Jokes for Your Sense of Humor (http://shadow.ieor.berkeley.edu/humor/), that recommends jokes based on how you rate a set of sample jokes.

"If amazon.com can predict the books you'll like and moviefinder.com can try to pick movies you'll enjoy, why not jokes?" said Goldberg, an associate professor of industrial engineering and operations research.

Jokes, in fact, are ideal for the method Goldberg uses to tailor them to a person's sense of humor. Called "collaborative filtering" or "recommender systems," the technique arose as a way to group people with similar preferences. By recording their reactions to a standard set of movies or books, for example, people collaborate in filtering through the many available movies and books to find new ones to recommend to those with similar likes and dislikes.

With Jester 2.0, each person is given 15 jokes to rate on a sliding scale from Not Funny to Very Funny. Based on these responses, people who express similar opinions are lumped together.

"The idea is, if my set of ratings is similar to someone else's, and that person likes a certain joke, then I am likely to enjoy that joke, too," Goldberg said.

Similar filtering is used to recommend movies, books, videos and other consumer items. In these situations, though, the method often breaks down because not everyone has seen or read, and thus has an opinion about, each movie or book in a standard sample. Jokes don't have that problem.

"With jokes, you can form an opinion in about 30 seconds. That was the motive behind using this with jokes," Goldberg said.

Collaborative filtering also works best when the number of things being rated is smaller than the number of people doing the rating.
The number of jokes in the Jester 2.0 database is somewhat over 100, while some 10,000 people have used the site to date.

The first version of the website arose in one of Goldberg's classes, Industrial Engineering and Operations Research 215: Analysis and Design of Databases, when student Mark DiGiovanni created Jester 1.0. Later, other students - Hiro Narita and, now, Dhruv Gupta and Chris Perkins - improved the site, rolling out Jester 2.0 in March.

Goldberg's team is not merely repeating what websites like amazon.com have done, however. Jester is an experiment to come up with faster, more efficient ways of filtering.

"Our goal is to make collaborative filtering fast," he said. "We think we have found an important simplification that is faster than the method used in most collaborative filtering."

In signing up for Jester 2.0, each person expresses an opinion about each of 15 sample jokes. Some are relatively tame:

Q: If a person who speaks three languages is called "tri-lingual," and a person who speaks two languages is called "bi-lingual," what do you call a person who only speaks one language?
A: American!

Many other jokes, however, involve sexual innuendo or ethnic and religious humor - jokes that some may find offensive. Many even worse jokes were thrown out. Of 15,000 jokes recommended for addition to the website after the debut of Jester 1.0, only 30 survived the cut.

The first 10 responses are used to assign a person to a spot in a 10-dimensional "joke space," as Goldberg calls it. (The last five jokes are new ones thrown in to test audience response.) The trick is to identify clusters within this 10-dimensional space.

Most collaborative filtering is done using a technique called "nearest neighbor" analysis, where the computer calculates how far you are in joke space from other people and lumps you with your nearest neighbors.
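
For readers who want to see the nearest-neighbor idea concretely, here is a minimal sketch in Python. It is an illustration only, not the code behind Jester or any commercial site. The function names, the numeric rating scale (-10 to 10), the use of Euclidean distance and the tiny data set are all assumptions made for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length rating vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend(new_ratings, users, k=2):
    """Pick the unseen joke that the k nearest users rated highest on average.

    users is a list of (sample_ratings, other_ratings) pairs, a
    hypothetical layout chosen for this sketch.
    """
    # Every stored user is compared with the newcomer, so the work grows
    # with the number of raters as well as the number of sample jokes.
    nearest = sorted(users, key=lambda u: distance(new_ratings, u[0]))[:k]
    n_other = len(nearest[0][1])
    means = [sum(u[1][j] for u in nearest) / len(nearest) for j in range(n_other)]
    best = max(range(n_other), key=lambda j: means[j])
    return best, means[best]


# Hypothetical data: 3 sample jokes and 2 further jokes per user,
# rated on an assumed scale from -10 (Not Funny) to 10 (Very Funny).
users = [
    ([8.0, 7.0, -3.0], [9.0, -2.0]),
    ([7.0, 9.0, -4.0], [6.0, 1.0]),
    ([-6.0, -8.0, 5.0], [-7.0, 8.0]),
]
best, score = recommend([9.0, 8.0, -2.0], users, k=2)
```

In this made-up example the newcomer's ratings sit closest to the first two users, so the sketch recommends the first of the two further jokes, the one those neighbors liked.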
Goldberg's team uses principal component analysis - an old technique used in pattern recognition - that involves reducing the 10-dimensional joke space to two dimensions in a way that preserves the cluster information. Using this type of analysis, the difficulty of defining clusters and, thus, the computation time, increases only linearly with the number of jokes. With the alternative method, nearest neighbor analysis, the time required to compute clusters increases as the product of the number of jokes and the number of raters.

Goldberg and his team are now using their linear model to predict jokes, although they are looking at alternatives too.

While the mathematics has been challenging, Goldberg finds the data they've acquired on joke preferences just as fascinating.

"We found some correlation between those who like Clinton jokes and those who like feminist jokes, for example," he said. "What does that mean? We don't know. We're trying to stay in the statistical realm and concentrate on efficiency, but we'd love to share our data with research groups studying the psychology of jokes."
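
As a rough illustration of the principal component analysis mentioned earlier, the Python sketch below projects a 10-dimensional set of ratings down to two dimensions. It is not the Jester team's implementation. It assumes the NumPy library is available, and the function name project_to_2d, the synthetic ratings and the two artificial "taste" groups are invented for the example.

```python
import numpy as np


def project_to_2d(ratings):
    """Project an (n_users, n_jokes) rating matrix onto its top two
    principal components. Returns the 2-D points and the components."""
    centered = ratings - ratings.mean(axis=0)
    # Singular value decomposition of the centered data: the rows of vt
    # are the principal directions, ordered by how much variance they explain.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:2]
    return centered @ components.T, components


# Hypothetical data: two groups of 20 users with opposite tastes across
# 10 sample jokes, plus random noise (fixed seed for repeatability).
rng = np.random.default_rng(0)
taste = np.linspace(-1.0, 1.0, 10)
group_a = 5 * taste + rng.normal(0, 1, size=(20, 10))
group_b = -5 * taste + rng.normal(0, 1, size=(20, 10))
ratings = np.vstack([group_a, group_b])

points, components = project_to_2d(ratings)
# Each row of points is one user's position in the reduced 2-D "joke space".
```

With this synthetic data the two groups land on opposite sides of the first axis, which is the sense in which a two-dimensional projection can preserve cluster information.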