When a particular node in a network is of special interest, it can be useful to create a network visualization in which it is located at the center of concentric rings of vertices.
NodeXL supports a “Polar” layout in which each vertex has two values that govern its location: distance from center (“Vertex Polar R”) and the angle around the clock (“Vertex Polar Angle”).
Using a random network, we added two columns to the Vertices worksheet that we called Ring (or “Vertex Polar R”) and Rotation (or “Vertex Polar Angle”). We then assigned values for the “Ring” and the “Rotation” for each Vertex:
These values can then be mapped to the location for each Vertex using the NodeXL Autofill columns feature:
When these values are applied to the network visualization and the layout is set to “Polar” the visualization repositions each vertex into a position around a ring. The values are set by mapping Vertex Polar R to “Ring” and Vertex Polar Angle to “Rotation” and then selecting “Autofill”. The result is a single ring plotted around a core:
To see this more clearly, I built a larger random network with 100 vertices and added two more “rings”. The resulting image looks like this:
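The ring-and-rotation assignment described above can also be sketched outside of Excel. The following Python is only an illustration of the idea, not NodeXL's own code: the function names are hypothetical, and it assumes the angle is measured in degrees and that the ring value is a plain distance from the center (NodeXL's actual units and direction of rotation may differ).

```python
import math

def assign_polar(vertices, center, ring_sizes, ring_spacing=1.0):
    """Return {vertex: (ring_radius, angle_degrees)} with `center` at radius 0.

    ring_sizes lists how many vertices go on ring 1, ring 2, and so on;
    any vertices left over are placed on the last ring.
    """
    others = [v for v in vertices if v != center]
    layout = {center: (0.0, 0.0)}
    start = 0
    for ring_index, size in enumerate(ring_sizes, start=1):
        is_last = ring_index == len(ring_sizes)
        members = others[start:] if is_last else others[start:start + size]
        start += len(members)
        for i, v in enumerate(members):
            # Spread the members of this ring evenly around the circle.
            angle = 360.0 * i / len(members)
            layout[v] = (ring_index * ring_spacing, angle)
    return layout

def polar_to_xy(r, angle_degrees):
    """Convert a (distance, angle) pair to x, y coordinates (assumed convention)."""
    theta = math.radians(angle_degrees)
    return (r * math.cos(theta), r * math.sin(theta))

# One focal vertex surrounded by a single ring of nine others.
names = ["v%d" % i for i in range(10)]
layout = assign_polar(names, "v0", [9])
for vertex, (ring, rotation) in layout.items():
    print(vertex, ring, rotation)
```

The two numbers printed for each vertex correspond to the “Ring” and “Rotation” columns; passing more entries in `ring_sizes` adds further concentric rings.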
It can be useful to create networks randomly to use as training data sets. A simple way to create a random network is to use the Excel RAND() function. To create a random network of ten nodes use the following formula in a ten by two array:
=INT(RAND()*10)
The “INT” part removes the decimal places, leaving an “Integer” whole number value.
Then copy the grid of Random Value 1 and Random Value 2 to the Vertex 1 and Vertex 2 columns (starting in A3), using Paste Values to capture just the current values of the random number generator. You can control the number of edges by copying the formula to as many rows as you want. You can control the number of nodes by changing the value the random result is multiplied by. The current formula multiplies by ten, creating up to 10 vertex values (0 through 9, after being reduced to an integer). Increase this value to 100 for up to that many different nodes. You could get much more sophisticated, creating weighted probabilities for each vertex to appear, but this is a first, simple way to “just get a network data set” fast.
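For readers who prefer a script to a spreadsheet, the same recipe can be sketched in Python. This is an illustration only; the function name and the `seed` parameter are my own additions, and `int(rng.random() * n)` is used simply because it mirrors `=INT(RAND()*n)`.

```python
import random

def random_edge_list(num_edges, num_vertices=10, seed=None):
    """Build a random edge list, one (vertex1, vertex2) pair per row.

    Each endpoint is int(random * num_vertices), the same arithmetic as
    the spreadsheet formula, so values run from 0 to num_vertices - 1.
    """
    rng = random.Random(seed)
    return [
        (int(rng.random() * num_vertices), int(rng.random() * num_vertices))
        for _ in range(num_edges)
    ]

# Ten rows by two columns, like the ten by two array in the worksheet.
edges = random_edge_list(10, num_vertices=10, seed=1)
for vertex1, vertex2 in edges:
    print(vertex1, vertex2)
```

As with the spreadsheet version, nothing prevents a row from pairing a vertex with itself or repeating an earlier row, so the result may contain self-loops and duplicate edges.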
When you show the graph, you may get a network visualization that looks somewhat like this:
The event has grown along with the importance of big data, analytics, BI, and data visualization.
I will speak about the ways social media networks can be collected and analyzed to reveal the key people, groups and topics relevant to a topical population.
NodeXL Map of PAWCON Connections in Twitter
Title: Charting Collections of Connections in Social Media: Creating Maps and Measures with NodeXL
Networks are a data structure commonly found across all social media services that allow populations to author collections of connections. The Social Media Research Foundation’s NodeXL project makes analysis of social media networks accessible to most users of the Excel spreadsheet application. With NodeXL, networks become as easy to create as pie charts. Applying the tool to a range of social media networks has already revealed the variations present in online social spaces. A review of the tool and images of Twitter, Flickr, YouTube, and email networks will be presented.
We now live in a sea of tweets, posts, blogs, and updates coming from a significant fraction of the people in the connected world. Our personal and professional relationships are now made up as much of texts, emails, phone calls, photos, videos, documents, slides, and game play as by face-to-face interactions. Social media can be a bewildering stream of comments, a daunting fire hose of content. With better tools and a few key concepts from the social sciences, the social media swarm of favorites, comments, tags, likes, ratings, and links can be brought into clearer focus to reveal key people, topics and sub-communities. As more social interactions move through machine-readable data sets new insights and illustrations of human relationships and organizations become possible. But new forms of data require new tools to collect, analyze, and communicate insights.
“This collection of Science@Microsoft vignettes illustrates some of the progress that has been made in a number of disciplines and describes the technologies that have been deployed to gain these new insights.”
The volume lists tools for scientific research and includes NodeXL:
NodeXL is a powerful and easy-to-use interactive network visualization and analysis tool that uses Microsoft Excel for representing generic graph data, performing advanced network analysis, and visual exploration of networks. NodeXL supports multiple social network data providers that import graph data (nodes and edge lists) into Excel. The import features of NodeXL explore social media by pulling data from personal email indexes on the desktop, Twitter, Flickr, YouTube, Facebook, and web hyperlinks.
NodeXL allows non-programmers to generate useful network statistics and metrics quickly and create visualizations of network graphs. Filtering and display attributes can be used to highlight important structures in the network.
During August 21-24, 2012, the Summer Social Webshop gathered 55 students and 20 speakers for a week of presentations, discussions, and collaboration around the study and application of social media to social good. Sponsored by the U.S. National Science Foundation, the Social Media Research Foundation, and Grand, the Webshop brought doctoral students in computer science, iSchool, sociology, communications, political science, anthropology, psychology, journalism, and related disciplines together for a 4-day intensive workshop on Technology-Mediated Social Participation (TMSP).
Technology-Mediated Social Participation includes social networking tools, blogs and microblogs, user-generated content sites, discussion groups, problem reporting, recommendation systems, and other social media applied to national priorities such as health, energy, education, disaster response, political participation, environmental protection, business innovation, or community safety.
During the 4-day workshop, students attended presentations from an interdisciplinary group of leaders in the field and engaged in other research and community-building activities like working on short-term projects, sharing research plans, developing new research collaborations, learning relevant software, analysis methods and data collection tools, and meeting Federal policy makers.
The Association for Education in Journalism and Mass Communication (AEJMC) is a nonprofit, educational association of journalism and mass communication educators, students and media professionals. The Association’s mission is to promote the highest possible standards for journalism and mass communication education, to cultivate the widest possible range of communication research, to encourage the implementation of a multi-cultural society in the classroom and curriculum, and to defend and maintain freedom of communication in an effort to achieve better professional practice and a better informed public.
Using NodeXL for Social Network Analysis
2 pm to 5 pm, presented by the Communication Theory and Methodology Division. This pre-conference workshop examines social network analysis. Social network analysis can be used to examine message boards, blogs, and friend networks (among many other phenomena). Participants will learn to use the NodeXL program to conduct a network analysis.
I gave a talk that describes ways of analyzing the social and semantic networks found in social media at a workshop at the Web Science conference. The event is a great collection of people interested in the exploration of complex systems and internet applications, particularly social applications. It describes itself as “inherently interdisciplinary, integrating computer and information sciences, communication, linguistics, sociology, psychology, economics, law, political science, and other disciplines.”
Social media networks tend to be “clumpy”. Here is the map of connections among people who tweeted the term “global warming”:
NodeXL versions v.210 and newer now support text analysis of content collected from social media data sources. NodeXL applies social network clustering and then analyzes text that is grouped by social clusters.
Connections among people who tweet about a topic, keyword or hashtag form patterns that can lead to the formation of sub-groups and clusters. Multiple clusters are formed within a network when a sub-population of people link to one another far more than to people in other groups. These regions of dense connections define the boundaries between sub-populations. Clusters often reflect the variation in interest in certain people and topics in the population. Some people and topics are more interesting to one group than others. Within these groups certain people and words get repeated more often than others.
Networks can be partitioned by many methods. NodeXL implements several. A collection of vertices can be grouped by the user by applying labels to the vertex worksheet (“Group by vertex attribute”). Or a group of vertices can be determined by an algorithm that looks for differences in the density of connections and divides by the points of least association (“Group by cluster algorithm”). Networks can also be grouped into separate isolated collections of nodes, called “connected components”.
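Of the three grouping methods, connected components is the simplest to illustrate. The sketch below is a generic traversal written for this post, not NodeXL's implementation; the function name is hypothetical, and it assumes edge direction is ignored when deciding whether two vertices are connected.

```python
from collections import defaultdict

def connected_components(vertices, edges):
    """Group vertices into components, treating every edge as undirected."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        # Walk outward from `start`, collecting everything reachable from it.
        seen.add(start)
        stack = [start]
        component = []
        while stack:
            v = stack.pop()
            component.append(v)
            for n in neighbors[v]:
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        components.append(sorted(component))
    return components

groups = connected_components(
    ["a", "b", "c", "d", "e"],
    [("a", "b"), ("b", "c"), ("d", "d")],
)
print(groups)
```

Vertices with no edges to anyone else (including those with only a self-loop) come back as single-vertex components.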
In NodeXL, groups can be visualized in multiple ways. Groups can be collapsed into meta-vertices that stand in for the members of that group (right-click the graph pane and select “Groups>Collapse all groups”). Group members can also be displayed within a “box” with the “group-in-a-box” feature (found in the layout selection menu in the Graph Pane – select “Layout Options”).
Within each group is a population of people along with the tweets they authored in the time period captured by the data set. Each group has a collection of tweets that can be analyzed. The contents of all the tweets in a network can be scanned and certain types of strings can be counted to measure their frequency of mention. These counts can be repeated for each group, allowing groups to be contrasted based on the relative rates of strings like URLs, hashtags, and @usernames. Here is a sample of the worksheet NodeXL creates to display all the data about people, URLs, and hashtags frequently mentioned in each group:
The worksheet offers top URLs, hashtags, and users across the entire network, and within each sub-group. The details offer insights into the people and topics of greatest interest.
This feature allows the content in sub-groups to be contrasted, thus answering the question: how is this sub-group the same or different from another sub-group?
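The per-group counting described above can be sketched in a few lines of Python. This is a simplified illustration, not how NodeXL does it: the regular expressions are rough approximations of URLs, hashtags, and @usernames, and the function name, the input format (author and text pairs), and the “ungrouped” label are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict

URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#\w+")
MENTION_RE = re.compile(r"@\w+")

def top_strings_by_group(tweets, group_of, top_n=10):
    """Count URLs, hashtags and @usernames separately for each group.

    tweets   : iterable of (author, text) pairs
    group_of : dict mapping each author to a group label
    """
    counts = defaultdict(
        lambda: {"urls": Counter(), "hashtags": Counter(), "mentions": Counter()}
    )
    for author, text in tweets:
        group = group_of.get(author, "ungrouped")
        counts[group]["urls"].update(URL_RE.findall(text))
        # Remove URLs first so a '#' or '@' inside a link is not miscounted.
        rest = URL_RE.sub(" ", text)
        counts[group]["hashtags"].update(h.lower() for h in HASHTAG_RE.findall(rest))
        counts[group]["mentions"].update(m.lower() for m in MENTION_RE.findall(rest))
    return {
        group: {kind: counter.most_common(top_n) for kind, counter in kinds.items()}
        for group, kinds in counts.items()
    }

tweets = [
    ("alice", "Reading about #climate http://example.com/a @bob"),
    ("bob", "#Climate and #energy"),
    ("carol", "#hoax @dave"),
]
group_of = {"alice": "G1", "bob": "G1", "carol": "G2"}
for group, kinds in top_strings_by_group(tweets, group_of).items():
    print(group, kinds)
```

Comparing the lists that come back for each group is the scripted equivalent of reading across the columns of the worksheet.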
On June 4th in Dublin, Ireland, the 2012 International AAAI Conference on Weblogs and Social Media (ICWSM) begins. ICWSM gathers computer scientists, linguists, communications scholars, and social scientists to increase understanding of social media in all its incarnations. Now in its sixth year, ICWSM is a leading venue for cutting-edge research in social media.
ICWSM-12 features a program of workshops, tutorials, contributed technical talks, posters and invited presentations. The main conference features keynote talks from prominent social scientists and technologists.
Andrew Tomkins is an engineering director at Google working on measurement, modelling, and analysis of content, communities, and users on the World Wide Web. Prior to joining Google, he spent four years at Yahoo! as chief scientist of search, and eight years at IBM’s Almaden Research Center, where he co-founded the WebFountain project. Andrew holds Bachelor’s degrees in Math and CS from MIT, and a PhD in CS from Carnegie Mellon University; he has published over a hundred technical papers.
Patrick Meier is a recognized expert and thought leader on the intersection between new technologies, crisis early warning, humanitarian response and human rights. He is the co-founder of the International Network of Crisis Mappers and previously co-directed Harvard University’s Program on Crisis Mapping and Early Warning. Over the past 10 years, Patrick has consulted extensively with several international organizations including the UN, OSCE and OECD in Africa, Asia and Europe. Patrick is also a distinguished scholar completing his PhD at The Fletcher School during which time he was a Doctoral Fellow at Stanford University. In 2010, President Bill Clinton publicly thanked him for his leadership and contributions. He blogs at iRevolution.net.
Lada A. Adamic is an associate professor in the School of Information and the Center for the Study of Complex Systems at the University of Michigan. She is also affiliated with EECS. Her research interests center on information dynamics in networks: how information diffuses, how it can be found, and how it influences the evolution of a network’s structure. Her projects have included identifying expertise in online question and answer forums, studying the dynamics of viral marketing, and characterizing the structure in blogs and other online communities. She has received an NSF CAREER award, and best paper awards from Hypertext ’08, ICWSM-10 and ICWSM-11, and the most influential paper of the decade award from Web Intelligence ’11.
“The goal of the workshop is to bring together researchers and industry practitioners interested in visual and interactive techniques for social media analysis, particularly in social sciences and humanities as well as in industry and to discuss ideas, techniques, and applications to support social media analysis.”
I will present a tutorial on Social Media Network Analysis with NodeXL on June 4th at the event:
Networks are a data structure commonly found across all social media services that allow populations to author collections of connections. The Social Media Research Foundation’s NodeXL project makes analysis of social media networks accessible to most users of the Excel spreadsheet application. With NodeXL, networks become as easy to create as pie charts. Applying the tool to a range of social media networks has already revealed the variations present in online social spaces. A review of the tool and images of Twitter, Flickr, YouTube, and email networks will be presented.
This network graph represents a network of 29 Twitter users whose recent tweets contained “icwsm”. The network was obtained on Saturday, 21 April 2012 at 20:33 UTC. There is an edge for each follows relationship. There is an edge for each “replies-to” relationship in a tweet. There is an edge for each “mentions” relationship in a tweet. There is a self-loop edge for each tweet that is not a “replies-to” or “mentions”. The earliest tweet in the network was tweeted on Saturday, 14 April 2012 at 18:55 UTC. The latest tweet in the network was tweeted on Saturday, 21 April 2012 at 05:48 UTC.
The graph is directed.
The graph’s vertices were grouped by cluster using the Clauset-Newman-Moore cluster algorithm.
The graph was laid out using the Harel-Koren layout algorithm.
The edge colors are based on relationship values. The vertex sizes are based on followers values.
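The edge rules in the description above (a “replies-to” edge, a “mentions” edge, or a self-loop for a tweet that is neither) can be illustrated with a short Python sketch. It is not NodeXL's importer: the function name is hypothetical, and it assumes that a tweet beginning with an @username is a reply to that user while any other @username is a mention. “Follows” edges are left out because they cannot be read from tweet text.

```python
import re

MENTION_RE = re.compile(r"@(\w+)")

def tweet_edges(author, text):
    """Return (source, target, relationship) edges for a single tweet."""
    names = MENTION_RE.findall(text)
    if not names:
        # A tweet that is neither a reply nor a mention becomes a self-loop.
        return [(author, author, "tweet")]
    edges = []
    for i, name in enumerate(names):
        # Assumption: only a leading @username marks the tweet as a reply.
        is_reply = i == 0 and text.lstrip().startswith("@" + name)
        edges.append((author, name, "replies-to" if is_reply else "mentions"))
    return edges

print(tweet_edges("marc", "Looking forward to the conference"))
print(tweet_edges("marc", "@alice thanks, cc @bob"))
```

Running every collected tweet through a function like this yields an edge list that can be pasted into the Vertex 1 and Vertex 2 columns, with the relationship label available for coloring edges.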
Top 10 Vertices, Ranked by Betweenness Centrality:
@icwsm
@johnbreslin
@IBMResearch
@CaptSolo
@marc_smith
@bde
@karenchurch
@imbenzene
@hemant_Pt
@_akisato
Overall Graph Metrics:
Vertices: 29
Unique Edges: 68
Edges With Duplicates: 32
Total Edges: 100
Self-Loops: 18
Connected Components: 5
Single-Vertex Connected Components: 4
Maximum Vertices in a Connected Component: 25
Maximum Edges in a Connected Component: 96
Maximum Geodesic Distance (Diameter): 3
Average Geodesic Distance: 1.866455
Graph Density: 0.082512315270936
Modularity: 0.2488
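The Graph Density figure in the list above can be sanity-checked with the usual density formula for a directed graph: edges divided by the number of possible ordered vertex pairs, n(n-1). That NodeXL computes density exactly this way is an assumption on my part, and the edge count the check produces is back-solved from the reported density rather than taken from the list.

```python
def directed_density(num_vertices, num_edges):
    """Fraction of possible ordered vertex pairs (no self-loops) that are linked."""
    possible = num_vertices * (num_vertices - 1)
    return num_edges / possible if possible else 0.0

vertices = 29                    # "Vertices" from the metrics list
reported = 0.082512315270936     # "Graph Density" from the metrics list

# Back-solve for the edge count this formula would need. This is an
# inference, not a figure that appears in the metrics list.
implied_edges = round(reported * vertices * (vertices - 1))
print(implied_edges, directed_density(vertices, implied_edges))
```

Under that assumption the reported density corresponds to 67 connected vertex pairs, which would be consistent with duplicates being merged and self-loops being left out before the ratio is taken.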
This two-volume encyclopedia provides a thorough introduction to the wide-ranging, fast-developing field of social networking, a much-needed resource at a time when new social networks or “communities” seem to spring up on the internet every day. Social networks, or groupings of individuals tied by one or more specific types of interests or interdependencies ranging from likes and dislikes, or disease transmission to the “old boy” network or overlapping circles of friends, have been in existence for longer than services such as Facebook or YouTube; analysis of these networks emphasizes the relationships within the network. The Encyclopedia of Social Networks offers comprehensive coverage of the theory and research within the social sciences that has sprung from the analysis of such groupings, with accompanying definitions, measures, and research.
Featuring approximately 350 signed entries, along with approximately 40 media clips, organized alphabetically and offering cross-references and suggestions for further readings, this encyclopedia opens with a thematic reader’s guide in the front that groups related entries by topics. A chronology offers the reader historical perspective on the study of social networks. This two-volume reference work is a must-have resource for libraries serving researchers interested in the various fields related to social networks, including sociology, social psychology and communication and media studies.
Know-who is becoming more important than know-how. Networks are a data structure commonly found across all social media services that allow populations to author collections of connections. Innovation networks are created when new connections form among people who have a portion of a solution.
The Social Media Research Foundation’s NodeXL project makes analysis of social media networks accessible to most users of the Excel spreadsheet application. With NodeXL, networks become as easy to create as pie charts. Applying the tool to a range of social media networks has already revealed the variations present in online social spaces. A review of the tool and images of Twitter, Flickr, YouTube, and email networks will be presented. In particular, innovation topics will be mapped to highlight the key people and groups talking about new ideas and opportunities.
I will speak about the results of collecting, analyzing and visualizing the collections of connections that form in political discussions in social media.
For example, this is a map of the connections among the people who recently tweeted about Scott Walker.
The graph represents a network of up to 1000 Twitter users whose recent tweets contained “scott AND walker”. The network was obtained on Friday, 13 April 2012 at 07:40 UTC. There is an edge for each “replies-to” relationship in a tweet. There is an edge for each “mentions” relationship in a tweet. There is a self-loop edge for each tweet that is not a “replies-to” or “mentions”. The earliest tweet in the network was tweeted on Thursday, 12 April 2012 at 03:32 UTC. The latest tweet in the network was tweeted on Friday, 13 April 2012 at 04:12 UTC.