Some researchers (from Singapore!) recently requested access to the data from our Facts or Friends paper. I went ahead and bundled up a dataset that’s suitable for distribution. If anyone out there is interested in checking out this small-ish dataset (~1100 codings across ~500 questions from 3 Q&A sites) just send me an email. I’d be happy to share.
I wrote a paper this past fall that I hope you will read and enjoy. I welcome feedback.
Tens of thousands of questions are asked and answered every day on social question and answer (Q&A) Web sites such as Yahoo! Answers. While these sites generate an enormous volume of searchable data, the problem of determining which questions and answers are of archival quality has grown. One major component of this problem is the prevalence of conversational questions, identified by both Q&A sites and the academic literature as questions that are intended simply to start discussion. For example, a conversational question such as “do you believe in evolution?” might successfully engage users in discussion, but probably will not yield a useful web page for users searching for information about evolution. Using data from three popular Q&A sites, we confirm that humans can reliably distinguish between these conversational questions and other informational questions, and present evidence that conversational questions typically have much lower potential archival value than informational questions. Further, we explore the use of machine learning techniques to automatically classify questions as conversational or informational, learning in the process about categorical, linguistic, and social differences between different question types. Our algorithms approach human performance, attaining 89.7% classification accuracy in our experiments.
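To give a flavor of what such a classifier looks like: here's a minimal sketch of distinguishing conversational from informational questions using lexical cues. The cue phrases and the scoring rule below are my own illustrative guesses, not the feature set or models from the paper (which are far richer and reach the accuracy reported above).

```python
# Hypothetical sketch: classify a question as conversational vs. informational
# by counting simple cue phrases. The cue lists are invented for illustration;
# they are not the paper's actual features.

CONVERSATIONAL_CUES = ("do you", "what do you think", "anyone else", "agree with")
INFORMATIONAL_CUES = ("how do i", "how can i", "what is", "where can i")

def classify_question(text: str) -> str:
    t = text.lower()
    conversational = sum(cue in t for cue in CONVERSATIONAL_CUES)
    informational = sum(cue in t for cue in INFORMATIONAL_CUES)
    return "conversational" if conversational > informational else "informational"

print(classify_question("Do you believe in evolution?"))       # conversational
print(classify_question("How do I cite a conference paper?"))  # informational
```

A real system would replace the hand-picked cues with learned features (n-grams, category, asker history, and so on) and a trained model.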
I wrote this paper with Daniel Moy, an undergraduate at the University of Minnesota, and Joe Konstan, my advisor. Also, many of you (my friends and colleagues) helped this research by coding data – thanks!
This paper will be published in the proceedings of CHI 2009, and is currently nominated for a best paper award, which is a real honor. It's also worth noting that our Q&A paper at last year's CHI conference was wedged into a session with a medley of completely unrelated work (e.g., public interactive displays). This year, Q&A has its own session with some outstanding researchers.
For some reason, Q&A sites are mostly overlooked by analysts and journalists. One recent exception was an article from Slate, comparing Yahoo! Answers (unfavorably) with Wikipedia. A good read, if you’re interested in Q&A, social search, or mass collaboration web sites.
This morning, though, while catching up on my Technology Review RSS feed, I saw some analysis of Microsoft’s move to acquire Yahoo! A quoted analyst observes that this acquisition isn’t just about search, but also about Yahoo’s social properties, some of which are best in class. The interesting fact is that the two properties cited are Flickr and del.icio.us. Yes, important sites. I use both of them. What about Yahoo! Answers? Check this out:
Yahoo! Answers gets about as much traffic as Flickr, and about an order of magnitude more traffic than del.icio.us. Also, note that while Flickr is a leader in its field (although Photobucket, Smugmug, Google’s Picasa, and Kodak’s offerings are strong competitors), Yahoo! Answers blows the rest of the Q&A field away in terms of activity. No other Q&A site in the US comes close.
I will speculate that the reason for ignoring Q&A sites to date has to do with the demographics of their users. While Flickr and del.icio.us are founded on relatively tech-savvy, relatively geeky users (i.e., the same demographic as technology writers), Q&A sites are often frequented by high-schoolers, stay-at-home moms, and other demographics that are typically ignored by the technology press.
What do you think? What are other reasons for ignoring Q&A sites?
Shilad, Dan, and I wrote a short paper called “Supporting Social Recommendations with Activity-Balanced Clustering”. I presented the paper at Recommender Systems 2007 here in Minneapolis. The idea is that standard clustering algorithms such as k-means simply don’t work well if you’re trying to create groups of users that manifest similar levels of activity. In MovieLens, k-means placed 84% of the active users in a single cluster. Thus, we mashed up some techniques from the clustering literature to create an algorithm that improved on k-means in terms of cluster quality and computational complexity, while giving us the balanced properties we sought. One of the key insights is that stable matching algorithms are a clever way to evenly divide sets of things based on similarity metrics.
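To illustrate the capacity idea behind that insight, here's a toy sketch: give every cluster a fixed capacity, then fill clusters with the most-similar users first. This is a simplified greedy stand-in for the matching-based approach, not the paper's actual algorithm, and the similarity numbers are made up for illustration.

```python
# Hedged sketch: balanced cluster assignment with fixed per-cluster capacity.
# similarity[u][c] = (made-up) similarity of user u to cluster centroid c.

def balanced_assign(similarity, k, capacity):
    # Walk (user, cluster) pairs from most to least similar, filling each
    # cluster only until it hits its capacity. Every user ends up in the
    # most-similar cluster that still has room.
    pairs = sorted(
        ((similarity[u][c], u, c)
         for u in range(len(similarity)) for c in range(k)),
        reverse=True)
    assigned, counts = {}, [0] * k
    for _, u, c in pairs:
        if u not in assigned and counts[c] < capacity:
            assigned[u] = c
            counts[c] += 1
    return assigned

sim = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.6]]
print(balanced_assign(sim, k=2, capacity=2))  # two users per cluster
```

Note that user 2 would prefer cluster 0 but gets bumped to cluster 1 once cluster 0 fills up; a stable-matching formulation handles exactly this kind of contention in a principled way.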
I received lots of feedback from conference attendees. One of the most interesting questions (that I have not investigated) is whether there are some unique properties of the users that I clustered that led to the severe lack of balance with k-means. I don’t think so – the 20,000 or so MovieLens users that we used as our data have two characteristics that seem particularly regular:
- Power-law activity distribution
- Normal distribution of ratings (1-5 scale)
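The power-law point is easy to demonstrate with synthetic data. The sketch below (my own toy, not the MovieLens data) draws per-user activity from a heavy-tailed distribution and runs a bare-bones one-dimensional k-means over it; the handful of hyperactive outliers drag the upper centroids away, leaving the bulk of users lumped into a single cluster, much like the 84% result above.

```python
import random

# Toy demonstration with synthetic power-law activity counts.
random.seed(42)
activity = [random.paretovariate(1.5) for _ in range(2000)]  # heavy-tailed

def kmeans_1d(xs, k, iters=20):
    # Initialize centroids evenly across the data range, then iterate
    # the standard assign/update steps.
    lo, hi = min(xs), max(xs)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda c: abs(x - centroids[c]))
            clusters[nearest].append(x)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters

clusters = kmeans_1d(activity, k=5)
largest = max(len(c) for c in clusters) / len(activity)
print(f"largest cluster holds {largest:.0%} of users")  # well over half
```

So the imbalance falls out of the activity distribution itself, which is why I doubt there's anything peculiar about the specific users we clustered.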
Another person suggested that I simply didn’t tune k-means (and the other algorithms I tried) well enough. Well, I turned plenty of knobs, many times. In the process of all that knob turning, I realized that standard clustering metrics are a fairly bad way of gauging the “feel” of a cluster of users. I’d like to dig more into clustering metrics that reflect the different ways that groups of people might come together.