This is the collection of keyword pairs that appeared in two clusters of people who Tweeted about “Paul Ryan”, the Republican Congressman from Wisconsin who delivered the GOP rebuttal to the 2011 United States State of the Union Address. This network illustrates the ways that certain word pairs appear only or predominantly in one cluster or the other (colored here Red and Blue). Terms that appeared in both clusters are shown in purple.
Social networks are built from relationships between people. Keyword networks are built from relationships between words and other text strings. When two words appear in the same message, in the same sentence, or alongside one another, ties of different strengths are created. The networks that result can illuminate the relationships among topics of importance in a collection of messages.
Markus Strohmaier from the Technical University Graz (TUG), along with Claudia Wagner, gave us inspiration in a paper:
C. Wagner, M. Strohmaier, The Wisdom in Tweetonomies: Acquiring Latent Conceptual Structures from Social Awareness Streams, Semantic Search 2010 Workshop (SemSearch2010), in conjunction with the 19th International World Wide Web Conference (WWW2010), Raleigh, NC, USA, April 26-30, ACM, 2010. (pdf)
in which they defined a range of ways two words can be associated with one another (technically these are strings, which may not really be words). Words could be linked if they appear in the same tweet, sit next to one another, or occur in sequence, among other ways to link terms.
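To make two of these linking rules concrete, here is a minimal Python sketch. It is not the NodeXL macro and not the method from the paper. The function names (tokenize, same_message_pairs, adjacent_pairs) and the simple whitespace tokenizer are assumptions made for illustration only.

```python
from itertools import combinations


def tokenize(text):
    # Assumption: lowercase and split on whitespace. The results are
    # strings, not necessarily real words (hashtags, URLs, and so on).
    return text.lower().split()


def same_message_pairs(text):
    # Link every pair of distinct strings that appear in the same message.
    tokens = sorted(set(tokenize(text)))
    return list(combinations(tokens, 2))


def adjacent_pairs(text):
    # Link only strings that appear next to one another, in reading order.
    tokens = tokenize(text)
    return list(zip(tokens, tokens[1:]))


tweet = "Paul Ryan budget"
print(same_message_pairs(tweet))
print(adjacent_pairs(tweet))
```

The same-message rule produces many more ties than the adjacency rule, so the choice of rule changes how dense the resulting network is.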
Until now, NodeXL has not had any features for exploring the networks in texts. With the addition of a new macro from Scott Golder, it is fairly simple to extract pairs of keywords from a collection of tweets. NodeXL’s Twitter importer can optionally include the content of each tweet that contained the search term, and this column of text can now be processed into a new network based on the ways words appear together in tweets.
This feature builds on the work of several people. Scott Golder from Cornell started the ball rolling with a simple but effective VBA script that allowed others to build and refine the models of what counts as a tie between two words. Vladimir Barash added several refinements including support for stop word lists to remove common terms. Scott then picked up the code again and added a set of features for selecting the nature of the graph and making it easier to select the options needed.
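The general idea of removing stop words and then counting word pairs can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the VBA macro itself: the function name keyword_edges, the tiny STOP_WORDS set, and the choice of same-message co-occurrence as the tie are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Assumption: a tiny illustrative stop word list; a real list is much longer.
STOP_WORDS = {"the", "a", "to", "of", "and", "in", "is"}


def keyword_edges(messages, stop_words=STOP_WORDS):
    # Count how many messages each pair of strings appears in together.
    # The count serves as the strength of the tie between the two strings.
    edges = Counter()
    for text in messages:
        # Lowercase, split on whitespace, and drop stop words.
        tokens = {t for t in text.lower().split() if t not in stop_words}
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(tokens), 2):
            edges[pair] += 1
    return edges


messages = ["the budget of Ryan", "Ryan and the budget plan"]
for (first, second), weight in sorted(keyword_edges(messages).items()):
    print(first, second, weight)
```

Each row printed is one edge with its weight, which is the shape of data an edge list in a network tool generally expects.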
The code for the Keyword Network macro is below.
The instructions to use it take a few steps to complete: