Imported Twitter networks now have an “in-reply-to tweet ID” column. This is a useful data element for building “paths” that capture how information flows through a network.
When you lay out each of the graph’s groups in its own box, you can now select how the boxes are laid out. Go to NodeXL>Graph>Layout>Layout Options in the Excel ribbon. (Thanks to Cody Dunne for this feature.)
The Check for Updates item has been removed from the Excel ribbon. NodeXL now automatically checks for updates once a day. Once this release is installed, NodeXL will automatically update itself when a new release is available. You will no longer have to manually download and install new releases. This release and those that follow will all be referred to as “NodeXL Excel Template 2014.” New releases will continue to have version numbers, but the numbers will be less important in light of the new auto-update feature.
If you use third-party graph data importers, such as the Social Network Importer for NodeXL, note that the folder where the importers are stored must be specified in the NodeXL>Data>Import>Import Options dialog:
If you use the NodeXL Network Server, an advanced command-line program that downloads a network from Twitter and stores the network on disk in several file formats, note that the program is no longer a part of NodeXL Excel Template. See “Using the NodeXL Network Server command-line program with NodeXL Excel Template 2014” at http://nodexl.codeplex.com/discussions/522830.
When a Twitter network is imported, the hashtags in the “Hashtags in Tweet” (or “Hashtags in Latest Tweet”) column are now all in lower case. Previously, hashtags that differed only in letter case were counted as separate terms. They are now counted together, so terms that had been split are unified. These terms will now have higher counts, and there will be more diversity in the top ten list.
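The effect of lower-casing hashtags can be illustrated with a small Python sketch. This is not NodeXL's own code; the function name `count_hashtags` and the sample data are made up for illustration.

```python
from collections import Counter

def count_hashtags(tweets_hashtags, fold_case=True):
    """Count hashtags across tweets, optionally lower-casing them first."""
    counts = Counter()
    for tags in tweets_hashtags:
        for tag in tags:
            counts[tag.lower() if fold_case else tag] += 1
    return counts

# Hypothetical sample: each inner list holds the hashtags of one tweet.
sample = [["NodeXL", "SNA"], ["nodexl"], ["NODEXL", "sna"], ["dataviz"]]

case_sensitive = count_hashtags(sample, fold_case=False)
case_folded = count_hashtags(sample, fold_case=True)

print(case_sensitive.most_common())
print(case_folded.most_common())
```

With case folding, the three spellings of the same hashtag collapse into one entry with a combined count, which frees up slots in a top ten list for other terms.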
Thanks for using NodeXL and stay tuned for additional updates!
Here is a map of connections among people who recently tweeted the term “peoplebrowsr”.
“But what does that picture mean?”
I hear this reaction frequently when I show people maps I have made of social media connections.
I often point out that the map and the data can reveal people who occupy important locations in the network as well as emergent clusters and groups.
“So why didn’t you just say so?”
I hear this reaction frequently when I explain what is important about a network.
In NodeXL version 203 we have released a new feature called Graph Summary. Our goal is to “just say so”.
In this version we introduce the basics of automatic captioning. In the NodeXL>Graph menu we now have a “Summary” button:
NodeXL will collect information about the creation and configuration of the network. The dialog box looks like this:
Note that “Save Import Details in Graph Summary” must be selected in the NodeXL>Data>Import menu for the “Data Import” field to be populated.
Selecting “Copy to Clipboard” copies these text fields to the clipboard. An example of that caption is here:
The graph represents a network of up to 1000 Twitter
users whose recent tweets contained "peoplebrowsr".
The network was obtained on
Friday, 09 March 2012 at 01:21 UTC.
There is an edge for each follows relationship.
There is an edge for each "replies-to" relationship
in a tweet.
There is an edge for each "mentions"
relationship in a tweet.
There is a self-loop edge for each tweet that is
not a "replies-to" or "mentions".
The earliest tweet in the network was tweeted on
Friday, 02 March 2012 at 02:39 UTC.
The latest tweet in the network was tweeted on
Friday, 09 March 2012 at 00:47 UTC.
The graph is directed.
The graph was laid out using the
Harel-Koren Fast Multiscale layout algorithm.
The edge colors are based on relationship values.
The vertex sizes are based on followers values.
Overall Graph Metrics:
Unique Edges: 172
Edges With Duplicates: 123
Total Edges: 295
Connected Components: 15
Single-Vertex Connected Components: 13
Maximum Vertices in a Connected Component: 58
Maximum Edges in a Connected Component: 276
Maximum Geodesic Distance (Diameter): 4
Average Geodesic Distance: 2.014176
Graph Density: 0.036653091447612
Top 10 Vertices, Ranked by Betweenness Centrality:
The graph's vertices were grouped by cluster using the
Clauset-Newman-Moore cluster algorithm.
More NodeXL network visualizations are here:
A gallery of NodeXL network data sets is available here:
NodeXL is free and open and available from www.codeplex.com/nodexl
NodeXL is developed by the Social Media Research Foundation
(www.smrfoundation.org) - which is dedicated to
open tools, open data, and open scholarship.
Donations to support NodeXL are welcome through PayPal:
The book, Analyzing social media networks with NodeXL:
Insights from a connected world, is available from Morgan Kaufmann and from Amazon.
This is the collection of keyword pairs that appeared in two clusters of people who tweeted about “Paul Ryan”, the Republican Congressman from Wisconsin who delivered the GOP rebuttal to the 2011 United States State of the Union Address. This network illustrates the ways that certain word pairs appear only or predominantly in one cluster (colored here red and blue) or the other. Terms that appeared in both clusters are shown in purple.
Social networks are built from relationships between people. Keyword networks are built from relationships between words and other text strings. When two words appear in the same message, in the same sentence, or alongside one another, ties of different strengths are created. The networks that result can illuminate the relationships among topics of importance in a collection of messages.
Markus Strohmaier from the Technical University Graz (TUG) along with Claudia Wagner gave us inspiration in a paper:
in which they defined a range of ways two words (technically these are strings, which may not really be words) can be associated with one another. Words could be linked if they appear in the same tweet, next to one another, or in sequence, among other ways to link terms.
Until now, NodeXL has not had any features for exploring the networks in texts. With the addition of a new macro from Scott Golder, it is fairly simple to extract pairs of keywords from a collection of tweets. NodeXL’s Twitter importer can optionally include the content of the tweet that contained the search term, and this column of text can now be processed into a new network based on the ways words appear together in tweets.
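To make the idea of word-pair ties concrete, here is a minimal Python sketch of two linking rules: words in the same tweet, and words next to one another. It is not the VBA macro itself. The function names, the simple tokenizer, and the optional `stop_words` parameter are assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    """Split a tweet into lower-case tokens, keeping # and @ prefixes."""
    return re.findall(r"[#@]?\w+", text.lower())

def keyword_pairs(tweets, mode="same_tweet", stop_words=frozenset()):
    """Return a Counter mapping (word_a, word_b) to tie strength.

    mode="same_tweet": link every pair of distinct words in a tweet.
    mode="adjacent":   link words that sit next to each other
                       (after stop words have been removed).
    """
    pairs = Counter()
    for text in tweets:
        words = [w for w in tokenize(text) if w not in stop_words]
        if mode == "same_tweet":
            candidates = combinations(sorted(set(words)), 2)
        elif mode == "adjacent":
            candidates = (tuple(sorted(p))
                          for p in zip(words, words[1:]) if p[0] != p[1])
        else:
            raise ValueError(f"unknown mode: {mode}")
        pairs.update(candidates)
    return pairs

# Hypothetical sample tweets and stop-word list.
tweets = ["Paul Ryan gave the GOP rebuttal", "GOP rebuttal by Paul Ryan"]
stop = {"the", "by", "gave"}

same = keyword_pairs(tweets, "same_tweet", stop)
adjacent = keyword_pairs(tweets, "adjacent", stop)
print(same.most_common(3))
print(adjacent.most_common(3))
```

Each key in the result is an undirected edge between two words, and the count is its weight, so the output can be pasted into an edge list with a weight column.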
This feature builds on the work of several people. Scott Golder from Cornell started the ball rolling with a simple but effective VBA script that allowed others to build and refine the models of what counts as a tie between two words. Vladimir Barash added several refinements including support for stop word lists to remove common terms. Scott then picked up the code again and added a set of features for selecting the nature of the graph and making it easier to select the options needed.
The code for the Keyword Network macro is below.
The instructions to use it take a few steps to complete:
Group Frames: If your graph has groups and you choose to lay out the groups in their own boxes (NodeXL, Graph, Layout, Layout Options), you can now specify the width of the box outlines.
Constant Edges: When you select an edge, its width no longer changes. NodeXL used to use the same width for all selected edges, even if the edges had varying widths when unselected.
Group and Vertex Display Harmony:
When a graph has groups, you now have more control over how the groups are shown. Go to NodeXL, Analysis, Groups, Group Options.
The NodeXL, Show/Hide, Graph Elements, Groups menu item has been replaced with a checkbox in the Group Options dialog box.
Right-Click Group Controls: Menu items for selecting, expanding, collapsing and removing groups are now available in the menu that appears when you right-click the graph pane. (These are just shortcuts for the same menu items that are available in the Ribbon at NodeXL, Analysis, Groups.)
WYSIWYCC: What You See Is What You Can Click:
Hidden edges and vertices (those that have their Visibility cells set to Hide) can no longer be selected in the graph pane.
Edges and vertices that have been filtered (NodeXL, Analysis, Dynamic Filters) can no longer be selected in the graph pane.
Bigger Twitter Lists: When importing a Twitter list network (NodeXL, Import, From Twitter List Network), you can now enter up to 10,000 usernames. The maximum used to be 500.
UCINET / Matrix Compatibility: Bug fix: When exporting the graph to a UCINET file (NodeXL, Data, Export, To UCINET Full Matrix DL File), isolated vertices didn’t get exported. When exporting the graph to a new matrix workbook (NodeXL, Data, Export, To New Matrix Workbook), isolated vertices didn’t get exported. When importing a graph from a matrix workbook (NodeXL, Data, Import, From Open Matrix Workbook), isolated vertices didn’t get imported. Now they do!
NodeXL now (v.166) offers a set of keyboard shortcuts that can speed up your routine network layout tasks.
After you click in the graph pane, a number of keyboard shortcuts are now available for functions that had previously been available in the visualization pane’s right-click menu. Now, you can press:
Ctrl+A to select all vertices and edges
Ctrl+V to select all vertices
Ctrl+E to select all edges
Ctrl+D to deselect everything
Ctrl+P to edit the properties of the selected vertices
Ctrl+C to save the graph image to the Windows clipboard
Ctrl+I to save the graph image to a file
Arrow key to move the selected vertices a small distance
Shift+arrow key to move the selected vertices a large distance.
(If you forget a shortcut, most of them are listed in the graph pane’s right-click menu.)
If you have any suggestions for other frequent tasks that could be accelerated with a keyboard command, please contact us on the NodeXL discussion board or here in the comments.
A single workbook may contain data from a single NodeXL data collection, run on a particular day and collecting data from a few hours or days back from that moment (depending on factors like the volume of activity around the selected keyword and the depth of the Twitter search catalog, which is often not more than a week or two long and much shorter for active topics). An example of a single network slice is this recent map of the connections among people who mentioned “microsoft research” on Twitter on a single day (December 18th, 2010):
This is a single slice of the network, a day out of months of activity. A still frame can tell a rich story: this is a picture of a crowd that has gathered to discuss a topic of common interest, “microsoft research”. It illustrates a structure common to many large discussions of popular topics: a large set of isolates (the rows at the bottom) who were not observed to have a follows, mentions, or replies relationship with anyone else who tweeted the same term. These are casual mentioners of the topic. At the end of these rows are a small number of dyads, triads, and small components of a handful of people who link to one another but not to the largest connected component. These are pairs or small groups discussing the topic among themselves, without a connection to any larger component.
Above these rows is the “giant component”: the blob of people who do have a connection to someone else who also tweeted a message containing the same term, and who in turn has connections that lead to a large number of others. The giant component is itself composed of several sub-components of densely connected groups. At the center of each of these groups are the core users, the people who often hold their cluster together. Between these clusters are the bridges, the people who link otherwise disconnected sub-groups. At the edges are the peripheral people who have just taken the first step up from being an isolate: they have formed a single reply, mention, or follows relationship with someone else who also tweeted the search keyword and who can bridge them back to the core of the giant component.
This is a large and active network with hybrid qualities. There is a “brand” or broadcast element in it: the yellow cluster is a hub-and-spoke structure centered on the Microsoft Research Twitter account. These people retweet what this account publishes but do not connect to one another. Just a few of these people set off second and third waves of retweets.
Elsewhere in the graph other network structures are present: for example, the green and blue clusters feature people centered on their own discussions of the term “microsoft research”.
If you collect many still frames of slices of network activity there is great value in exploring the way the network graph changes over time. In the most recent release NodeXL provides the first step in a series of features related to time and graph comparison. You can now create a workbook that aggregates the overall metrics (edge counts, vertex counts, connected component counts, etc.) for a folder full of NodeXL workbooks. In NodeXL follow the menu path: NodeXL>Analysis>Graph Metrics>Aggregate Overall Metrics to get this:
The result of this feature is a workbook with a row containing the summary data from each of the workbooks in the target folder. Any arbitrary collection of network workbooks can be aggregated but this is particularly useful when the workbooks are sequential time slices.
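The shape of the aggregated result, one row of overall metrics per snapshot, can be sketched in Python. This does not read NodeXL workbooks. It assumes each snapshot is already available as a vertex list and an edge list, and the snapshot names and data are invented for illustration.

```python
from collections import defaultdict, deque

def connected_components(vertices, edges):
    """Return a list of components, ignoring edge direction."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    all_vertices = sorted(set(vertices) | set(adj))
    seen, comps = set(), []
    for v in all_vertices:
        if v in seen:
            continue
        seen.add(v)
        comp, queue = [], deque([v])
        while queue:  # breadth-first walk of one component
            u = queue.popleft()
            comp.append(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        comps.append(comp)
    return comps

def overall_metrics(vertices, edges):
    """Compute a few of the overall metrics for one snapshot."""
    comps = connected_components(vertices, edges)
    sizes = [len(c) for c in comps]
    return {
        "Vertices": sum(sizes),
        "Total Edges": len(edges),
        "Connected Components": len(comps),
        "Single-Vertex Connected Components": sizes.count(1),
        "Maximum Vertices in a Connected Component": max(sizes, default=0),
    }

def aggregate(snapshots):
    """Return one metrics row per snapshot, ordered by snapshot name."""
    return [{"Snapshot": name, **overall_metrics(v, e)}
            for name, (v, e) in sorted(snapshots.items())]

# Hypothetical daily snapshots: (vertex list, edge list).
snapshots = {
    "2010-12-17": (["a", "b", "c", "d"], [("a", "b"), ("b", "c")]),
    "2010-12-18": (["a", "b", "c", "d", "e"],
                   [("a", "b"), ("b", "c"), ("c", "d"), ("e", "e")]),
}
rows = aggregate(snapshots)
for row in rows:
    print(row)
```

A vertex whose only edge is a self-loop still counts as a single-vertex component here, which is how an isolate who tweeted without replying to or mentioning anyone would be treated in this sketch.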
An example is the time series plot below tracking the rise and fall of several Twitter volume and network measures for the “microsoft research” search term over several months:
This chart tracks the number of vertices in each (almost) daily network snapshot; each vertex in this case is a person our data collector saw tweet about the search term “microsoft research”. In addition, the unique edges or connections between these Twitter users are plotted, along with the number of people with no connections (“Single-Vertex Connected Components”). The size of the largest component in the network (“Maximum Vertices in a Connected Component”) is a measure of the changing size of the core community of discussion participants. Measures like the maximum and average “geodesic” distance provide a rough measure of whether a particular network is long and thin (high values) or generally spherical (low values). A “geodesic” is the shortest path between two vertices in the network; the maximum geodesic distance, or diameter, is the longest of these shortest paths. Long skinny networks may indicate the presence of loosely connected smaller groups that have a few people who act as bridges. Low geodesic values suggest dense networks in which people are connected to many others, with few isolates and sub-groups.
I find the ratios between measures of the size of the large network component and the population of isolates to be interesting. As events go on over a period of days, more people connect with others who are talking about the same topic, growing the size of the large connected component. But often the isolate population also grows during this time as people at the periphery of the topic network catch sight of mentions of the event and tweet about it. I could imagine one goal of social media management to be the conversion of isolates to connected component members. Those who follow, reply to, or mention even a single other person also talking about a topic are more likely to return and engage than those who have zero connections. It is not clear whether more connections provide a linear increase in continued engagement. I suspect that the main effect is at the zero/one divide and drops off after the first dozen or so connections. Encouraging cohesion and network density by replying to isolates and encouraging others to do so may help keep a social media population focused and growing.
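The contrast between long, thin networks and compact ones can be checked with a short Python sketch that computes the maximum and average shortest-path length by breadth-first search. It treats edges as undirected and averages over connected pairs of distinct vertices; both choices are assumptions of this sketch and may differ from how NodeXL computes its figures.

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path length from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def geodesic_summary(edges):
    """Return (maximum, average) geodesic distance for an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    lengths = []
    for v in adj:
        for other, d in bfs_distances(adj, v).items():
            if other != v:  # skip the zero distance to itself
                lengths.append(d)
    if not lengths:
        return 0, 0.0
    return max(lengths), sum(lengths) / len(lengths)

# Two hypothetical four-vertex networks: a chain and a hub with spokes.
chain = geodesic_summary([("a", "b"), ("b", "c"), ("c", "d")])
star = geodesic_summary([("hub", "x"), ("hub", "y"), ("hub", "z")])
print("chain:", chain)
print("star:", star)
```

The chain has the larger maximum and average distance even though both networks have the same number of vertices and edges, which is the "long and thin" versus "spherical" distinction the measures are meant to capture.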
In many cases I look at a network graph and apply a series of operations to transform it into a more presentable form. For example, I often calculate graph metrics, use Autofill columns to map data to display attributes like size, color, or shape, create clusters, sub-graph images, and then select the Harel-Koren layout and select the options so that small components get lined up in neat rows at the bottom of the graph. I like the edges to be gray and partially transparent. I often set the font size to a large 24 points because I scale the graph to about 10% of its full size to reduce occlusion.
Carrying out each of these operations once is no problem. Repeat 100 times and there is a problem.
The NodeXL team has completed another phase of our automation feature, which lets users refine a graph once and then apply the same set of configuration options to any number of other networks.
Along with the automated collection system, NodeXL can now generate a regular stream of network graphs from social media sources.