Encyclopedia of Social Network Analysis

My colleague George Barnett has edited the Encyclopedia of Social Network Analysis.

I contributed four entries with co-authors:

WWW Hyperlink Networks

with Robert Ackland, Australian National University

Email networks

with Derek Hansen, Brigham Young University

Blog networks

with John Kelly, Morningside Analytics, Harvard Berkman Center

Facebook networks

with Bernie Hogan, Oxford Internet Institute


This two-volume encyclopedia provides a thorough introduction to the wide-ranging, fast-developing field of social networking, a much-needed resource at a time when new social networks or “communities” seem to spring up on the internet every day. Social networks, or groupings of individuals tied by one or more specific types of interests or interdependencies ranging from likes and dislikes, or disease transmission to the “old boy” network or overlapping circles of friends, have been in existence for longer than services such as Facebook or YouTube; analysis of these networks emphasizes the relationships within the network. The Encyclopedia of Social Networks offers comprehensive coverage of the theory and research within the social sciences that has sprung from the analysis of such groupings, with accompanying definitions, measures, and research.

Featuring approximately 350 signed entries, along with approximately 40 media clips, organized alphabetically and offering cross-references and suggestions for further readings, this encyclopedia opens with a thematic reader’s guide in the front that groups related entries by topics. A chronology offers the reader historical perspective on the study of social networks. This two-volume reference work is a must-have resource for libraries serving researchers interested in the various fields related to social networks, including sociology, social psychology and communication and media studies.

2010 Workshop on Information in Networks, September 24-25 at NYU

The Second Workshop on Information in Networks
September 24-25, 2010, New York City

Sponsored in part by the Initiative on Information in Networks
Organizers: Sinan Aral, Foster Provost, Arun Sundararajan

The second Workshop on Information in Networks (WIN10) will be held this year September 24-25, 2010, again in New York City. From the program description:

“Last year’s workshop brought together a small yet influential community around topics that at their core involve ‘information in networks’: its distribution, its diffusion, its value, and its influence on social and economic outcomes. Scholars from fields as diverse as computer science, economics, information systems, marketing, physics, political science and sociology came together to lay the foundation for ongoing relationships and to build a multidisciplinary research community. This year’s workshop will build on this foundation toward bringing more innovative content and vibrant discussion to the forum. Speakers will share their recent research, which may have been published elsewhere, but which may not be widely known outside of their own disciplines. The workshop will combine invited and contributed talks with poster presentations selected from a pool of submitted abstracts. We hope the energy of New York City will inspire the gathering, and that our participants will leave with new ideas and a renewed sense of community.”

Ben Shneiderman and Jenny Preece will speak about their work on social media applied to national priorities in a talk titled “Promoting National Initiatives for Technology-Mediated Social Participation”. The talk covers their work creating NSF workshops on Technology-Mediated Social Participation (www.tmsp.umd.edu), the paper “Reader-to-Leader Framework: Motivating technology-mediated social participation” (which appeared in the AIS Transactions on Human-Computer Interaction in March 2009), and recent work with the Encyclopedia of Life (www.eol.org) and NodeXL projects. Here is the abstract.

WIN10 speakers include:
Ron Burt, University of Chicago
Nicholas Christakis, Harvard University
Nathan Eagle, MIT
Sanjeev Goyal, Cambridge University
Matthew Jackson, Stanford University
Jenny Preece, University of Maryland
Ben Shneiderman, University of Maryland
Tony Jebara, Columbia University
David Jensen, University of Massachusetts
Michael Kearns, University of Pennsylvania
Rachel Kranton, Duke University
David Lazer, Northeastern University
Mark Newman, University of Michigan (tentative)
Alex “Sandy” Pentland, MIT
Alessandro Vespignani, Indiana University
Stanley Wasserman, Indiana University
Duncan Watts, Yahoo! Research

ICWSM 2010 Liveblog, Day 2

Fourth International AAAI Conference on Weblogs and Social Media (ICWSM-10)

***Microblogging 2***

Predicting Elections with Twitter: What 140 Characters Reveal about Political Sentiment (Tumasjan et al.)

Successful use of social media in the last presidential campaign has established Twitter as an integral part of the political campaign toolbox

Goal: analyze on Twitter: 1. Deliberation, 2. Sentiment, 3. Prediction

Previous work:

Deliberation: Honeycutt and Herring – Twitter is not only used for one-way communication; 31% of all tweets are directed at a specific addressee. Koop and Jansen – political internet discussion boards are dominated by a small number of heavy users

Sentiment: How accurately can Twitter inform us about the electorate’s political sentiment?

Prediction: can Twitter serve as a predictor of the election result?

Data: examined more than 100k tweets and extracted their sentiment using LIWC

Target: German federal election 2009


1. While Twitter is used as a forum for political deliberation on substantive issues, this forum is dominated by heavy users

Two widely accepted indicators of blog-based deliberation:

-The exchange of substantive issues (31% of all messages contain “@”),

-Equality of participation: While the distribution of users across groups is almost identical to the one found on internet message boards, we find even less equality of participation in the political debate on Twitter. Additional analyses have shown that users exhibit a party bias in the volume and sentiment of their messages.

2. The online sentiment in tweets reflects nuanced offline differences between the politicians in our sample.

LIWC profiles:

-Leading candidates: Very similar profiles for all leading candidates; only polarizing political characters, such as the liberal leader and the socialist, deviate, in line with their roles as opposition leaders. Messages mentioning Steinmeier (coalition leader) are the most tentative

3. Similarity of profiles is a plausible reflection of the political proximity between the parties

Key findings: high convergence of leading candidates, more divergence among politicians of the governing grand coalition than among those of a potential right-wing coalition

4. Activity on Twitter prior to election seems to validly reflect the election outcome (MAE 1.65%), and joint party mentions accurately reflect the political ties between parties.
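
The prediction claim in point 4 compares each party's share of Twitter mentions with its share of the vote and summarizes the gap as a mean absolute error (MAE). Here is a minimal sketch of that calculation. The party labels and every number below are made-up placeholders, not the figures from the paper.

```python
def to_shares(counts):
    """Convert raw mention counts into percentage shares."""
    total = sum(counts.values())
    return {party: 100.0 * n / total for party, n in counts.items()}


def mean_absolute_error(predicted, actual):
    """Average absolute gap, in percentage points, across parties."""
    if predicted.keys() != actual.keys():
        raise ValueError("both dicts must cover the same parties")
    return sum(abs(predicted[p] - actual[p]) for p in predicted) / len(predicted)


# Hypothetical inputs: tweet mentions per party and the official vote shares (%).
mention_counts = {"Party A": 300, "Party B": 250, "Party C": 450}
vote_share = {"Party A": 32.0, "Party B": 24.0, "Party C": 44.0}

mention_share = to_shares(mention_counts)
mae = mean_absolute_error(mention_share, vote_share)
print(f"MAE: {mae:.2f} percentage points")
```

A low MAE only says that mention volume tracked the result in this one comparison; it does not by itself show that the relationship would hold in another election.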

From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series (Brendan O’Connor)


Data Bank or Data Pimp: choosing the future of social media repositories

The Key Bank Vault door or http://www.flickr.com/photos/cambodia4kidsorg/2274922356/?

Are social media sites data banks, secure repositories of personal assets, or data pimps, soliciting intimate exposure for profit?

I think these services need to choose. I notice that the settings for who can see what in various systems are in flux. I can set something to private today and may have to reset it to keep it private later.

When I upload content to a site, shouldn’t the expectation be that the deposit is governed by the terms at the time of the contribution? Why should terms change after I upload? At the least, shouldn’t new rules apply only to new content or to content whose permissions have been explicitly altered?

Banks do lend out the money I provide them, but only in an anonymous way.  No one knows my dollars are in their mortgage or car loan.  Only legally authorized entities can see my banking records (or so I hope).

Data pimps seem to want to give away anything I give up.  They sell my data as quickly and for as much as possible.

Banks have now developed a reputation that does not make them a great contrast for data pimps, but they still try to represent values like security, confidentiality, and reliability.

I have personally assumed that all data I upload is public.  Only my pictures of my kids have been made “private” and I would not be surprised if those pictures ultimately become public.

Photo credit: cambodia4kidsorg

Bernie Hogan’s Facebook Social Network Data Provider and Visualization toolkit

My colleague at the Oxford Internet Institute, Bernie Hogan, is working on tools that collect personal Facebook network data and visualize the connections among your friends.  These tools now interoperate with NodeXL through the GraphML XML file format. Here is the new link: http://namegen.oii.ox.ac.uk/fb/downloadNet.php?type=graphml
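
GraphML is plain XML, so a downloaded file can be inspected before it is loaded into NodeXL. Here is a minimal sketch using only Python's standard library. The tiny document below is a hand-written stand-in with invented names, not real output from the downloader, whose exact attributes may differ.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the GraphML format.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Hand-written example document (hypothetical friends and ties).
SAMPLE = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph id="G" edgedefault="undirected">
    <node id="Alice"/>
    <node id="Bob"/>
    <node id="Carol"/>
    <edge source="Alice" target="Bob"/>
    <edge source="Bob" target="Carol"/>
  </graph>
</graphml>
"""


def read_graphml(text):
    """Return (node ids, edge pairs) found in a GraphML string."""
    root = ET.fromstring(text)
    nodes = [n.get("id") for n in root.findall(".//g:node", GRAPHML_NS)]
    edges = [
        (e.get("source"), e.get("target"))
        for e in root.findall(".//g:edge", GRAPHML_NS)
    ]
    return nodes, edges


nodes, edges = read_graphml(SAMPLE)
print(len(nodes), "nodes,", len(edges), "edges")
```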

Here is an example: http://twitpic.com/9rvfq

2009 - September - Bernie Hogan - Facebook Network Visualization

It provides a good illustration of the ways a person’s social network is clumped into clusters built around life phases, workplaces, educational institutions, teams and locations.  As people move through more of these stages of life during the Facebook era (and often before) they accumulate these clusters.
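
One simple way to see such clusters in a personal network is to leave yourself out and check which friends remain connected to each other. Here is a minimal sketch with invented names. Connected components are only the crudest notion of a cluster, and visualization tools typically rely on more refined grouping methods.

```python
from collections import defaultdict


def friend_clusters(friend_ties):
    """Group friends into connected components using friend-to-friend ties only.

    Friends who have no tie to any other friend do not appear in the input
    and therefore do not appear in the result.
    """
    adjacency = defaultdict(set)
    for a, b in friend_ties:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first walk collecting everyone reachable from `start`.
        stack, component = [start], set()
        while stack:
            person = stack.pop()
            if person in component:
                continue
            component.add(person)
            stack.extend(adjacency[person] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


# Hypothetical ties among one person's friends (the person is excluded).
ties = [("Ann", "Bo"), ("Bo", "Cy"), ("Dee", "Eli")]
clusters = friend_clusters(ties)
print(clusters)
```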

Facebook or other contact and friend management systems could leverage this clustering to organize the presentation of contact information streams.

Bernie recently announced on the SOCNET list that he has updated his script for downloading your Facebook network. He writes:

“1. It’s faster. (Presently orders of magnitude faster than Nexus, Touchgraph or ORA).
2. It gives nice feedback during the download.
3. It has fewer bugs!
4. It gives you the output as a file you can right-click and save rather than copy-paste.
5. IDs are names.”

Bernie writes that phase two of his project is underway.

Bernie is planning a demo at the Sunbelt social network analysis conference in Italy in 2010.

Bernie is the author of the Facebook chapter in our forthcoming book Analyzing Social Media Networks with NodeXL: Insights from a connected world available from Morgan-Kaufmann in July 2010.

Liveblogging ICWSM 2009 – Day 1

2009 ICWSM in San Jose

[Vladimir Barash is liveblogging the ICWSM conference]
9-10AM: A Tempest: Or, on the Flood of Interest in Sentiment Analysis, Opinion Mining, and the Computational Treatment of Subjective Language (Lillian Lee)

-Sentiment analysis using discussion structure: classify speeches in US congressional floor debates as supporting or opposing proposed legislation
-Individual document classifier
-Agreement (degree) classifier for pairs of speeches

-Agreement info allows COLLECTIVE CLASSIFICATION – “agreeing speeches should get the same label”
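
The idea can be illustrated with a toy objective: each speech pays a cost for going against its individual classifier score, and each pair judged to agree pays a penalty if the two speeches end up with different labels. The sketch below minimizes that total by brute force over all labelings. The scores, the weights, and the brute-force search are assumptions for illustration, not the method presented in the talk.

```python
from itertools import product


def collective_labels(support_prob, agreement_weight):
    """Pick support/oppose labels that minimize individual plus disagreement cost.

    support_prob: speech -> individual classifier probability of "support".
    agreement_weight: (speech_a, speech_b) -> penalty for labeling them differently.
    Returns a dict speech -> True (support) or False (oppose).
    """
    speeches = sorted(support_prob)
    best, best_cost = None, float("inf")
    # Exhaustive search: fine for a handful of speeches, exponential in general.
    for labels in product([True, False], repeat=len(speeches)):
        assignment = dict(zip(speeches, labels))
        cost = sum(
            (1 - support_prob[s]) if assignment[s] else support_prob[s]
            for s in speeches
        )
        cost += sum(
            weight
            for (a, b), weight in agreement_weight.items()
            if assignment[a] != assignment[b]
        )
        if cost < best_cost:
            best, best_cost = assignment, cost
    return best


# Hypothetical scores: s3 leans slightly toward "oppose" on its own,
# but is judged to agree with s2, which clearly supports.
support_prob = {"s1": 0.9, "s2": 0.8, "s3": 0.45}
agreement = {("s2", "s3"): 0.5}

alone = collective_labels(support_prob, {})
together = collective_labels(support_prob, agreement)
print(alone["s3"], together["s3"])
```

In this example the agreement link is enough to flip the borderline speech to the same label as the speech it agrees with.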

-ECON: debate about effect of sentiment on sales
-comScore (users willing to pay 20-99% more for 5 star item vs. 4 star item)
-Jury is still out

-SOC: What opinions are influential? (Danescu-Niculescu-Mizil et al.)
-Prior work has focused on features of text and has not been in context of sociological aspects of reviews
-look at helpfulness scores


Social Networks in the News at NYT

My colleague Scott Sargent at Telligent notes that two sections of the March 29th Sunday New York Times feature articles illustrated with network graphs. The Business section runs an article, “Is Facebook Growing Up Too Fast?” (http://www.nytimes.com/2009/03/29/technology/internet/29face.html), and the Style section has an article on The Celebrity Twitter Ecosystem.

20090329 NYT Facebook Ego Networks

My colleague Prof. Ben Shneiderman is impressed by these images. He writes, “Notice how the node layout remains stable as edges are removed, so by the 4th figure the edges can all be followed easily….” This is one of the themes he highlights in his paper and presentations about problems and improvements in network graph drawing (see http://www.cs.umd.edu/hcil/nvss/ and in particular http://www.cs.umd.edu/hcil/pubs/presentations/NVSS-3.ppt). Prof. Shneiderman’s 5th edition of Designing the User Interface is now available, with two full chapters on the website and wordles to open each chapter.

A somewhat related article ran the same day in the Style section on The Celebrity Twitter Ecosystem (http://www.nytimes.com/2009/03/29/fashion/29twitter.html). This image focused on the linkages between well known people using Twitter and, by extension, revealing who they follow and who follows them in the social network.

2009 -03- 29 - NYT - Twitter Ecosystem

In the first image no names are associated with the nodes; in the second, the names are the major point of the diagram.

The practice of “anonymization” of network graphs may be moot in light of a recent paper that Mark Round from QinetiQ mentioned on the Social Network Analysis email list (SOCNET):

Deanonymizing Social Networks – Arvind Narayanan & Vitaly Shmatikov

which suggests that just publishing the unique pattern of links around an individual is sufficient to identify them in an otherwise anonymized database.

Operators of online social networks are increasingly sharing
potentially sensitive information about users and their relationships
with advertisers, application developers, and data-mining researchers.
Privacy is typically protected by anonymization, i.e., removing names,
addresses, etc.
We present a framework for analyzing privacy and anonymity in social
networks and develop a new re-identification algorithm targeting
anonymized social-network graphs. To demonstrate its effectiveness on
real-world networks, we show that a third of the users who can be
verified to have accounts on both Twitter, a popular microblogging
service, and Flickr, an online photo-sharing site, can be re-identified
in the anonymous Twitter graph with only a 12% error rate.
Our de-anonymization algorithm is based
purely on the network topology, does not require creation of a large
number of dummy “sybil” nodes, is robust to noise and all existing
defenses, and works even when the overlap between the target network
and the adversary’s auxiliary information is small.
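
The intuition that link patterns alone can single people out can be shown with a deliberately crude fingerprint: a node's own degree plus the sorted degrees of its neighbors. This is a toy illustration on an invented graph, not the re-identification algorithm from the paper, which is considerably more sophisticated.

```python
from collections import Counter, defaultdict


def structural_fingerprints(edges):
    """Map each node to (its degree, sorted degrees of its neighbors)."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    degree = {node: len(adj) for node, adj in neighbors.items()}
    return {
        node: (degree[node], tuple(sorted(degree[m] for m in adj)))
        for node, adj in neighbors.items()
    }


def uniquely_identifiable(edges):
    """Return the nodes whose fingerprint is shared with no other node."""
    prints = structural_fingerprints(edges)
    counts = Counter(prints.values())
    return sorted(node for node, fp in prints.items() if counts[fp] == 1)


# Hypothetical "anonymized" graph: names replaced by arbitrary numbers.
anonymized_edges = [(1, 2), (1, 3), (1, 4), (2, 3), (4, 5)]
exposed = uniquely_identifiable(anonymized_edges)
print(exposed)
```

Even with names removed, three of the five nodes in this small graph have a fingerprint nobody else shares, so anyone who knows those link patterns from another source could pick them out.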