Friday, November 02, 2007

Content-based Social Network Analysis of Online Communities - Social Networking Conference @ U of T

Content-based Social Network Analysis of Online Communities
Anatoliy Gruzd and Caroline Haythornthwaite
School of Library and Information Science, University of Illinois

In this talk, they analyze online communities such as bulletin boards to gain more information and insight about nodes, relations and ties. Very few systems look at relational information, so they focus on node and tie discovery. Their goal is to identify who the actors in the network are, and their approach is to use natural language processing to enhance current techniques for building social networks. So how do you obtain a social network from an online community? There are two methods. First, you can build a chain network, which is based on the chain of posts and comments (like what I do for my PhD research). One problem with the chain network (which I have also encountered) is deciding what the third commenter relates to: are they commenting on the original posting or on the previous comment? One way around this is to compare the commenter's tie strength to the previous commenter and to the poster, and use that to decide whom they are addressing. The second method is to build a name network by pulling names from the body of the text. This is where the NLP comes into play.
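To make the chain network idea concrete, here is a minimal sketch of how I read it, not the speakers' actual implementation. It assumes each thread is just a list of author names in posting order (the first one being the original poster), and the function name `build_chain_network` is my own. Each reply is linked either to the previous commenter or to the original poster, whichever the replier already has the stronger tie with; when the two are equal, it falls back to the previous commenter.

```python
from collections import Counter


def build_chain_network(threads):
    """Build a weighted, directed chain network.

    threads: list of threads; each thread is a list of author names in
             posting order, with the original poster first (assumed format).
    Returns a Counter mapping (replier, target) -> number of replies.
    """
    ties = Counter()
    for authors in threads:
        if len(authors) < 2:
            continue  # no replies, so no ties
        poster = authors[0]
        for i in range(1, len(authors)):
            replier = authors[i]
            previous = authors[i - 1]
            # A reply can address the previous commenter or the original poster,
            # but not the replier themselves.
            candidates = [c for c in (previous, poster) if c != replier]
            if not candidates:
                continue
            # Tie strength so far, counted in both directions.
            # max() keeps the first candidate on a draw, i.e. the previous commenter.
            target = max(
                candidates,
                key=lambda c: ties[(replier, c)] + ties[(c, replier)],
            )
            ties[(replier, target)] += 1
    return ties
```

For example, if bob has already replied to alice in an earlier thread, then bob's comment in a later thread that alice started is attributed to alice even when carol commented in between.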

The idea behind the name network is to make use of the node information in the text of a posting. This raises a few questions. How do you disambiguate names and nicknames in the text and recognize the ones that refer to the same person? How do you know whether a name is the subject of the message, that is, someone being discussed? To answer these, they hand-coded the items to see what categories of names occur. They then compared the name network with the chain network and performed ego network analysis for posts and comments. Another problem is that a reply often embeds the previous message, and you don't want the names in that quoted text to be counted a second time in the name network. So they removed the embedded previous message from each reply.
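The two steps described above, dropping the quoted message and then pulling names from what is left, could look roughly like the sketch below. This is a simplification and rests on several assumptions of mine: quoted text is marked with a leading `>`, names are found by looking them up in a hand-made alias table rather than by real NLP name recognition, and the function names `strip_quoted` and `build_name_network` are hypothetical.

```python
import re
from collections import Counter


def strip_quoted(body):
    """Drop lines that quote the previous message (assumed to start with '>')."""
    kept = [line for line in body.splitlines() if not line.lstrip().startswith(">")]
    return "\n".join(kept)


def build_name_network(posts, aliases):
    """Build a weighted, directed name network.

    posts:   list of (author, body) pairs.
    aliases: dict mapping a lowercase name or nickname to a canonical name,
             so that different spellings resolve to the same person.
    Returns a Counter mapping (author, mentioned_person) -> number of posts.
    """
    ties = Counter()
    for author, body in posts:
        text = strip_quoted(body).lower()
        mentioned = set()
        for alias, canonical in aliases.items():
            # Whole-word match so "bob" does not fire inside "bobsled".
            if re.search(r"\b" + re.escape(alias) + r"\b", text):
                mentioned.add(canonical)
        mentioned.discard(author)  # ignore self-mentions such as signatures
        for person in mentioned:
            ties[(author, person)] += 1
    return ties
```

Each post contributes at most one tie per mentioned person, so a name repeated several times in the same message is not overcounted. The lookup table does not tell you whether the person is being addressed or only talked about, which is the part that needed hand coding in the talk.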
