NodeXL has new updates to its importers for Twitter users and lists.
We have released an updated version of NodeXL that simplifies and merges the previously separate User and List importers.
The new, streamlined importer treats an individual user as a list of one.
Lists can be defined by pointing to an existing Twitter List or simply entering a list of delimited user names into the text box.
The updated importer now collects many more tweets per person and parses these messages to generate reply and mention edges.
You can now define a group of Twitter users and find out how much they reply to and mention one another.
You can even pull in the followers of each person, to see if they reply to or mention people they also follow.
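As a rough illustration of how reply and mention edges can be derived from tweet text, here is a minimal Python sketch. It is not NodeXL's own code: the function name `tweet_edges`, the regular expression, the edge labels, and the choice to skip a mention of the person being replied to are all assumptions made for the example.

```python
import re

# Assumed username pattern: letters, digits, underscore, up to 15 characters.
# The lookbehind keeps e-mail addresses from being read as mentions.
MENTION_RE = re.compile(r"(?<![A-Za-z0-9_])@([A-Za-z0-9_]{1,15})")


def tweet_edges(author, text, in_reply_to_user=None):
    """Return (source, target, relationship) edges for one tweet."""
    source = author.lower()
    edges = []
    reply_target = in_reply_to_user.lower() if in_reply_to_user else None
    if reply_target:
        edges.append((source, reply_target, "Replies to"))
    for name in MENTION_RE.findall(text):
        target = name.lower()
        # Assumption: a reply is not also counted as a mention of the same user.
        if target != reply_target:
            edges.append((source, target, "Mentions"))
    return edges


edges = tweet_edges("Alice", "@bob thanks! cc @Carol", in_reply_to_user="bob")
```

To study how much a defined group of users talk to one another, the resulting edges could then be filtered to those whose source and target are both in the list.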
But since June 11, 2013, Twitter has made access to the “follows” edge data very difficult (it’s just very slow). The original NodeXL Twitter list importers were designed and implemented before the update that restricted access to the follower network, and they relied mostly on queries that are now impractically slow for all but the smallest lists of users with small collections of followers.
The update to the User and List importers is partly an adaptation to these changes. The importer shifts away from the follower network to focus on the communication interaction data in the content of Tweets. Since Twitter offers more generous access to Tweets than to information about who follows whom, we are obliged to make better use of what it does offer.
Imported Twitter networks now have an “in-reply-to tweet ID” column. This is a useful data element for building “paths” that capture how information flows through a network.
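One way such paths might be built is to follow each tweet's in-reply-to ID back to the tweet that started the thread. The sketch below assumes the column has been loaded into a plain dictionary mapping each tweet ID to the ID it replies to; the function name `reply_path` and that data layout are hypothetical, not part of NodeXL.

```python
def reply_path(tweet_id, in_reply_to):
    """Follow in-reply-to links from a tweet back toward the start of its thread.

    in_reply_to maps a tweet ID to the ID of the tweet it replies to, or None.
    The walk stops at a tweet with no parent, at a parent missing from the
    data set, or if a cycle is detected.
    """
    path = [tweet_id]
    seen = {tweet_id}
    parent = in_reply_to.get(tweet_id)
    while parent is not None and parent not in seen:
        path.append(parent)
        seen.add(parent)
        parent = in_reply_to.get(parent)
    return path


links = {"3": "2", "2": "1", "1": None}
thread = reply_path("3", links)
```

The returned list runs from the chosen tweet to the earliest tweet that can be reached, so reversing it gives the order in which the conversation unfolded.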
When you lay out each of the graph’s groups in its own box, you can now select how the boxes are laid out. Go to NodeXL>Graph>Layout>Layout Options in the Excel ribbon. (Thanks to Cody Dunne for this feature.)
The Check for Updates item has been removed from the Excel ribbon. NodeXL now automatically checks for updates once a day. Once this release is installed, NodeXL will automatically update itself when a new release is available. You will no longer have to manually download and install new releases. This release and those that follow will all be referred to as “NodeXL Excel Template 2014.” New releases will continue to have version numbers, but the numbers will be less important in light of the new auto-update feature.
If you use third-party graph data importers, such as the Social Network Importer for NodeXL, note that the folder where the importers are stored must be specified in the NodeXL>Data>Import>Import Options dialog:
If you use the NodeXL Network Server, an advanced command-line program that downloads a network from Twitter and stores the network on disk in several file formats, note that the program is no longer a part of NodeXL Excel Template. See “Using the NodeXL Network Server command-line program with NodeXL Excel Template 2014” at http://nodexl.codeplex.com/discussions/522830.
When a Twitter network is imported, the hashtags in the “Hashtags in Tweet” (or “Hashtags in Latest Tweet”) column are now all in lower case. Previously, strings that differed only in letter case were counted separately. This no longer happens, so terms that had been divided are now unified. These terms will now have higher counts, and there will be more diversity in the top ten list.
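The effect of lower-casing on the counts can be seen in a small Python sketch. It assumes, purely for illustration, that each cell holds space-separated hashtags; the function name `count_hashtags` is hypothetical.

```python
from collections import Counter


def count_hashtags(hashtag_cells, fold_case=True):
    """Count hashtags across cells, optionally folding them to lower case."""
    counts = Counter()
    for cell in hashtag_cells:
        # Assumption: hashtags within a cell are separated by spaces.
        for tag in cell.split():
            counts[tag.lower() if fold_case else tag] += 1
    return counts


cells = ["NodeXL sna", "nodexl", "NODEXL SNA"]
before = count_hashtags(cells, fold_case=False)  # five distinct strings
after = count_hashtags(cells)                    # two unified terms
```

Without folding, the three spellings of the same tag each occupy a slot with a count of one; with folding, they merge into a single term with a count of three.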
Thanks for using NodeXL and stay tuned for additional updates!
Social media networks tend to be “clumpy”. Here is the map of connections among people who tweeted the term “global warming”:
NodeXL (v.210 and newer) now supports text analysis of content collected from social media data sources. NodeXL applies social network clustering and then analyzes text that is grouped by social clusters.
Connections among people who tweet about a topic, keyword or hashtag form patterns that can lead to the formation of sub-groups and clusters. Multiple clusters are formed within a network when a sub-population of people link to one another far more than to people in other groups. These regions of dense connections define the boundaries between sub-populations. Clusters often reflect the variation in interest in certain people and topics in the population. Some people and topics are more interesting to one group than others. Within these groups certain people and words get repeated more often than others.
Networks can be partitioned by many methods. NodeXL implements several. A collection of vertices can be grouped by the user by applying labels to the vertex worksheet (“Group by vertex attribute”). Or a group of vertices can be determined by an algorithm that looks for differences in the density of connections and divides by the points of least association (“Group by cluster algorithm”). Networks can also be grouped into separate isolated collections of nodes, called “connected components”.
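Of the three approaches, connected components are the simplest to sketch. The Python below is a generic depth-first traversal, not NodeXL's implementation; the function name and the plain list-of-pairs edge format are assumptions.

```python
from collections import defaultdict


def connected_components(vertices, edges):
    """Group vertices into connected components, treating edges as undirected."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            v = stack.pop()
            component.append(v)
            for n in neighbors[v]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        components.append(sorted(component))
    return components


groups = connected_components(
    ["a", "b", "c", "d", "e", "f"],
    [("a", "b"), ("b", "c"), ("d", "e")],
)
```

A vertex with no edges, such as "f" here, ends up as a component of its own.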
In NodeXL, groups can be visualized in multiple ways. Groups can be collapsed into meta-vertices that stand in for the members of that group (right-click the graph pane and select “Groups>Collapse all groups”). Group members can also be displayed within a “box” with the “group-in-a-box” feature (found in the layout selection menu in the Graph Pane – select “Layout Options”).
Within each group is a population of people along with the tweets they authored in the time period captured by the data set. Each group has a collection of tweets that can be analyzed. The contents of all the tweets in a network can be scanned, and certain types of strings can be counted to measure their frequency of mention. These counts can be repeated for each group, allowing groups to be contrasted based on the relative rates of strings like URLs, hashtags, and @usernames. Here is a sample of the worksheet NodeXL creates to display all the data about people, URLs, and hashtags frequently mentioned in each group:
The worksheet offers top URLs, hashtags, and users across the entire network, and within each sub-group. The details offer insights into the people and topics of greatest interest.
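The per-group counting can be sketched in a few lines of Python. This is only an illustration for hashtags, assuming tweets are available as (author, text) pairs and that a dictionary maps each author to a group; the names `top_hashtags_by_group` and `group_of` are hypothetical.

```python
import re
from collections import Counter, defaultdict

HASHTAG_RE = re.compile(r"#\w+")


def top_hashtags_by_group(tweets, group_of, n=10):
    """Return the n most frequent hashtags for each group.

    tweets is an iterable of (author, text) pairs; group_of maps an author
    to a group name. Tweets by authors without a group are ignored.
    """
    counts = defaultdict(Counter)
    for author, text in tweets:
        group = group_of.get(author)
        if group is None:
            continue
        for tag in HASHTAG_RE.findall(text):
            counts[group][tag.lower()] += 1
    return {group: counter.most_common(n) for group, counter in counts.items()}


tweets = [
    ("a", "#climate is real #Science"),
    ("b", "#Climate news"),
    ("c", "#hoax"),
]
top = top_hashtags_by_group(tweets, {"a": "G1", "b": "G1", "c": "G2"})
```

The same pattern would apply to URLs and @usernames by swapping in a different regular expression.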
Group Frames: If your graph has groups and you choose to lay out the groups in their own boxes (NodeXL, Graph, Layout, Layout Options), you can now specify the width of the box outlines.
Constant Edges: When you select an edge, its width no longer changes. NodeXL used to use the same width for all selected edges, even if the edges had varying widths when unselected.
Group and Vertex Display Harmony:
When a graph has groups, you now have more control over how the groups are shown. Go to NodeXL, Analysis, Groups, Group Options.
The NodeXL, Show/Hide, Graph Elements, Groups menu item has been replaced with a checkbox in the Group Options dialog box.
Right-Click Group Controls: Menu items for selecting, expanding, collapsing and removing groups are now available in the menu that appears when you right-click the graph pane. (These are just shortcuts for the same menu items that are available in the Ribbon at NodeXL, Analysis, Groups.)
WYSIWYCC: What You See Is What You Can Click –
Hidden edges and vertices (those that have their Visibility cells set to Hide) can no longer be selected in the graph pane.
Edges and vertices that have been filtered (NodeXL, Analysis, Dynamic Filters) can no longer be selected in the graph pane.
Bigger Twitter Lists: When importing a Twitter list network (NodeXL, Import, From Twitter List Network), you can now enter up to 10,000 usernames. The maximum used to be 500.
UCINET / Matrix Compatibility: Bug fix: When exporting the graph to a UCINET file (NodeXL, Data, Export, To UCINET Full Matrix DL File), isolated vertices didn’t get exported. When exporting the graph to a new matrix workbook (NodeXL, Data, Export, To New Matrix Workbook), isolated vertices didn’t get exported either. And when importing a graph from a matrix workbook (NodeXL, Data, Import, From Open Matrix Workbook), isolated vertices didn’t get imported. Now they do!
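To see why isolated vertices are easy to lose in a matrix export, consider this Python sketch. It is not the NodeXL exporter; it simply shows that sizing the matrix from the vertex list, rather than from the vertices that happen to appear in edges, keeps isolates as all-zero rows and columns. The function name is hypothetical.

```python
def adjacency_matrix(vertices, edges):
    """Build a full adjacency matrix with one row and column per vertex.

    Sizing the matrix from the vertex list (not from the edge list) means
    a vertex with no edges still gets a row and column of zeros.
    """
    index = {v: i for i, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for source, target in edges:
        matrix[index[source]][index[target]] += 1
    return matrix


# "c" is isolated but still occupies the third row and column.
matrix = adjacency_matrix(["a", "b", "c"], [("a", "b")])
```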
NodeXL (v.166) now offers a set of keyboard shortcuts that can speed up routine network layout tasks.
After you click in the graph pane, a number of keyboard shortcuts are now available for functions that had previously been available in the visualization pane’s right-click menu. Now, you can press:
Ctrl+A to select all vertices and edges
Ctrl+V to select all vertices
Ctrl+E to select all edges
Ctrl+D to deselect everything
Ctrl+P to edit the properties of the selected vertices
Ctrl+C to save the graph image to the Windows clipboard
Ctrl+I to save the graph image to a file
Arrow key to move the selected vertices a small distance
Shift+arrow key to move the selected vertices a large distance.
(If you forget a shortcut, most of them are listed in the graph pane’s right-click menu.)
If you have any suggestions for other frequent tasks that could be accelerated with a keyboard command, please contact us on the NodeXL discussion board or here in the comments.
As mobile devices become a major method for authoring and consuming social media, location data is increasingly a part of many posts, tweets, check-ins, and messages. Many Twitter clients, for example, can add the user’s current latitude and longitude to the metadata associated with a tweet. Other systems like Facebook Places, Google Latitude and Foursquare encourage users to declare where they are to the world, often passing the information to other social media sites.
Using this location data in network analysis opens up a range of new opportunities. Instead of a person-to-person social network, location data allows people to be linked to places and, by extension, places can be linked to other places based on the patterns of connection people create when located in a particular place. A convergence of network analysis and Geographic Information Systems is underway. A great example of this can be found in this wonderful video from the BBC, which demonstrates the idea by mapping the flow of telephone calls, texts, and data around the UK and the wider world.
Now, NodeXL (v.156) has the first of a series of features that will start to approximate the experience displayed in the video by supporting the import of location data about networks and plotting networks onto maps.
For now, we have started importing latitude and longitude data that Twitter makes available. If you check “Add a Tweet column to the Vertices worksheet” in NodeXL, Data, Import, From Twitter Search Network or From Twitter User Network, the Twitter user’s geographical coordinates will be added to the Vertices worksheet when they are available.
Note that not every tweet has a latitude and longitude; in fact, many do not (yet). Further, note that not every latitude and longitude is accurate; many are not.
We need to implement more features for better location data support in a NodeXL workbook, but this is a start. We are interested in exploring geospatial networks and Twitter is a great data source. With this data in place we may look into features that emit KML files for exploration in other packages like Google Earth. A nifty Google Earth/Spreadsheet importer can take small sets (400) of location data points in a spreadsheet and export them to a KML file, something we could implement in the future as well. In addition we may be able to connect directly with services like Bing Maps and Google Maps to display connections between nodes with known locations. Metrics that calculate the distance between nodes seem sensible as well.
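A distance metric between two located nodes could be as simple as the great-circle distance between their coordinates. The Python sketch below uses the standard haversine formula with a mean Earth radius of 6371 km; it is an illustration of the idea, not an existing NodeXL feature, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# One degree of longitude along the equator.
one_degree = haversine_km(0.0, 0.0, 0.0, 1.0)
```

A value like this could be stored as an edge attribute whenever both endpoints have coordinates, and left blank otherwise.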
Location coordinates are the key to a cornucopia of related data about a place. Given a latitude and longitude it is possible to find the name of the city it is located in, look up data about that location in official records as well as resources like Wikipedia. Income, education, property values, weather, photos, and more can be pulled together from just a simple lat/long. All of these attributes could be used to cluster or illustrate the network visualization.
Clusters are now groups in NodeXL. Recently, the NodeXL team has been focused on a set of new features related to grouping sets of vertices together. In the previous version we released a feature that allowed all sorts of groupings to be recorded in the worksheet. What’s new is that the three clustering algorithms we have already provided are just one form of group, components (connected sets of vertices) are another, and user-labeled sets are a third method of creating a group of nodes in NodeXL (this last feature is still pending). This release adds the ability to add vertices to a group and then collapse all of the vertices in that group into a metanode, a composite of all the nodes in that group. It is then possible to expand the collapsed vertices back into the graph.
These features are part of a larger effort to support time, in which “time is but a group”: a set of nodes and edges present in a time slice. We are working on designs in which some groups are sequenced, allowing the user to move back and forth through collections of vertices that may appear or disappear over different time slices/groups.
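Conceptually, collapsing a group means redirecting every edge that touches a group member to a single stand-in vertex. The Python sketch below illustrates that idea only; how NodeXL treats edges internal to the group or parallel edges is not stated here, so dropping internal edges and merging parallel edges into counts are assumptions, as is the function name `collapse_group`.

```python
from collections import Counter


def collapse_group(edges, members, meta_name):
    """Replace all vertices in `members` with one metanode named `meta_name`.

    Returns a dict mapping (source, target) to the number of original edges
    it stands for. Edges with both ends inside the group are dropped.
    """
    members = set(members)
    collapsed = Counter()
    for source, target in edges:
        inside_source = source in members
        inside_target = target in members
        if inside_source and inside_target:
            continue  # assumption: edges internal to the group are hidden
        new_source = meta_name if inside_source else source
        new_target = meta_name if inside_target else target
        collapsed[(new_source, new_target)] += 1
    return dict(collapsed)


edges = [("a", "b"), ("a", "x"), ("b", "x"), ("x", "y")]
collapsed = collapse_group(edges, {"a", "b"}, "G1")
```

Because the original edge list is left untouched, expanding the group again is just a matter of going back to it.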
Here are the most recent features: 220.127.116.11 (2010-09-06)
After you group the graph’s vertices (NodeXL, Analysis, Groups), you can now select all the vertices in a group. Go to the Groups worksheet and click on a group name.
Once a group is selected, you can collapse it into a single vertex. Go to NodeXL, Analysis, Groups, Collapse Group. You can expand it again using Expand Group.
The Groups worksheet now includes a column that tells you how many vertices are in the group.
Bug fix: The NodeXL, Help, Check for Updates feature stopped working in version 18.104.22.168.
Bug fix: If you clicked NodeXL, Graph, Show Graph while editing a worksheet cell, you would get a message that started with “Unable to set the Hidden property of the Range class.”
This version introduces the concept of “vertex groups,” or “groups” for short. A group is a set of related vertices. All vertices in a group are shown with the same shape and color. Clusters are an example of groups.
The worksheets that used to be called “Clusters” and “Cluster Vertices” are now called “Groups” and “Group Vertices.”
The NodeXL, Analysis, Find Clusters button in the ribbon has been moved to a new NodeXL, Analysis, Groups menu.
You can now group vertices by connected components, meaning that each group of interconnected vertices will have the same shape and color. Go to NodeXL, Analysis, Groups, Find Connected Components.
You can now group vertices using the values in a column on the Vertices worksheet — all vertices with degree greater than 100 in one group, all vertices with degree greater than 50 in another, for example.
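The degree example can be read as a set of thresholds, with each vertex placed in the group for the highest threshold it exceeds. The Python sketch below works under that reading; the function name, the group labels, and the catch-all "other" group are assumptions for illustration, not NodeXL behavior.

```python
def group_by_threshold(values, thresholds):
    """Assign each vertex to the group of the highest threshold its value exceeds.

    values maps a vertex to a number (for example its degree). Vertices that
    exceed no threshold are placed in a group labelled "other".
    """
    ordered = sorted(thresholds, reverse=True)
    groups = {}
    for vertex, value in values.items():
        label = "other"
        for threshold in ordered:
            if value > threshold:
                label = f"> {threshold}"
                break
        groups.setdefault(label, []).append(vertex)
    return groups


degrees = {"a": 150, "b": 75, "c": 10, "d": 101}
groups = group_by_threshold(degrees, [50, 100])
```

Checking the thresholds from highest to lowest keeps a vertex with degree 150 out of the "greater than 50" group, so every vertex lands in exactly one group.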
If you open an older NodeXL workbook in this new version of NodeXL, the Clusters and Cluster Vertices worksheets will be automatically renamed.
You cannot open a new NodeXL workbook in an older version of NodeXL. If you attempt to do so, you will get a message that starts with “This document might not function as expected because the following control is missing: Clusters.”
The NodeXL team has just released a new version (v.22.214.171.124) that contains a new “Automation” feature that allows users to define a collection of operations to perform on their network graphs and invoke the complete set in a single button click AND reuse that configuration on other workbook graphs. In fact, the feature will apply the configuration you define to all the files you specify, allowing easy processing of large collections of network data sets.
This week the feature is partially complete. Users can invoke the following operations: merge duplicate edges, calculate graph metrics, auto-fill columns, create sub-graph images, find clusters, and show graph. These operations can require as many as dozens of clicks when performed manually. If you have dozens or hundreds of network data sets, the result is a daunting case of repetitive strain injury and carpal tunnel syndrome. Instead, with automation, these operations can be carried out orders of magnitude more frequently without much pain!
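The general shape of such automation, one configured list of steps applied to many data sets, can be sketched in Python. This is a conceptual illustration only and does not use any NodeXL API: `automate` and `merge_duplicate_edges` are hypothetical functions, and the data sets are plain edge lists rather than workbooks.

```python
from collections import Counter


def merge_duplicate_edges(edges):
    """Collapse repeated (source, target) pairs into one edge with a weight."""
    return [(source, target, weight)
            for (source, target), weight in Counter(edges).items()]


def automate(datasets, steps):
    """Apply the same ordered list of steps to every named data set."""
    results = {}
    for name, data in datasets.items():
        for step in steps:
            data = step(data)
        results[name] = data
    return results


datasets = {
    "monday": [("a", "b"), ("a", "b"), ("b", "c")],
    "tuesday": [("a", "c")],
}
processed = automate(datasets, [merge_duplicate_edges])
```

Defining the steps once and reusing them is what turns dozens of clicks per workbook into a single invocation over the whole collection.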
The next release will feature the complete package which will then include control over the layout and graph options. As a result, automatically generated network visualizations can be produced in a pipeline: users will be able to specify a query using the NodeXL desktop network data collector and then automate the processing of large collections of data sets.
The result should be better analysis of time series data sets that have many “slices”. The feature points the way to additional development work for supporting the comparison between networks to evaluate their evolution.