On Tuesday night Marc Smith and I will be presenting the poster for our paper, “Distinguishing Knowledge vs Social Capital in Social Media with Roles and Context,” at the International Conference on Weblogs and Social Media. You can find the poster, co-authored with Marc Smith (Telligent Systems), Lise Getoor (University of Maryland) and Howard T. Welser (Ohio University), here. The full text of the paper has more information, but the poster is a good summary of its key concepts:
- What roles do people play in social media?
- What contexts shape user behavior in social media?
- How can we leverage roles and context together to predict future user behavior (in terms of contribution type) from past user behavior?
The specific research question being addressed is: can we predict whether a particular contribution to Live Q&A (a Microsoft-sponsored community question answering site) will contain factual information, or discussion / chat? It is possible to do the prediction based on the text of the contribution, but such an approach focuses entirely on the content, and not on the actor: the user who is making the contribution.

If we leverage actor-centric information, we find we can build a decent predictor at very low cost with very few variables. That information comes in two forms. The first is role: does the user play the part of an “answer person” or a “discussion person” in the community? The second is context: is the user contributing in a discussion-oriented context, such as implied by tagging the contribution as “fun,” or a fact-oriented context, such as implied by tagging the contribution as “math”? If we use just role information or just context information, we do reasonably well… but if we use both, we do *much* better.

While the question we’re answering here is quite specific, the advantage of our approach is that it can be applied to almost any social media context, meaning any place online where users can both contribute content and interact with others. We could just as easily create a predictor for contributions to Yahoo! Answers, or even to Wikipedia (if we had the relevant data). This is definitely food for thought / opportunity for future work.
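To make the role-plus-context idea a little more concrete, here is a minimal sketch of what such a predictor could look like. It is not the model from the paper: it simply estimates, from past contributions, how often each (role, context) pair produced factual content, with add-one smoothing. The function name `train`, the role and tag labels, and the toy history are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def train(history, alpha=1.0):
    """Build a predictor from past contributions.

    history: iterable of (role, context, is_factual) tuples (hypothetical format).
    alpha:   smoothing constant, so unseen pairs fall back to 0.5.
    Returns a function predict(role, context) -> estimated P(factual).
    """
    # (role, context) -> [count of discussion/chat, count of factual]
    counts = defaultdict(lambda: [0, 0])
    for role, context, is_factual in history:
        counts[(role, context)][int(is_factual)] += 1

    def predict(role, context):
        discussion, factual = counts.get((role, context), (0, 0))
        return (factual + alpha) / (factual + discussion + 2 * alpha)

    return predict


# Toy, made-up history: not data from Live Q&A.
history = (
    [("answer_person", "math", True)] * 3
    + [("answer_person", "fun", True), ("answer_person", "fun", False)]
    + [("discussion_person", "math", True), ("discussion_person", "math", False)]
    + [("discussion_person", "fun", False)] * 3
)

predict = train(history)
p_answer_math = predict("answer_person", "math")
p_discussion_fun = predict("discussion_person", "fun")
p_unseen = predict("answer_person", "travel")
```

In this toy setup an “answer person” posting under “math” gets a high estimated probability of contributing facts, a “discussion person” posting under “fun” gets a low one, and a pair never seen before falls back to an even guess. Note that the predictor needs only two variables per contribution and never looks at the text.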