Abstract: Communities in social networks emerge from interactions among individuals and can be analyzed through a combination of clustering and graph layout algorithms. These approaches result in 2D or 3D visualizations of clustered graphs, with groups of vertices representing individuals that form a community. However, in many instances the vertices have attributes that divide individuals into distinct categories such as gender, profession, geographic location, and similar. It is often important to investigate what categories of individuals comprise each community and vice-versa, how the community structures associate the individuals from the same category. Currently, there are no effective methods for analyzing both the community structure and the category-based partitions of social graphs. We propose Group-In-a-Box (GIB), a metalayout for clustered graphs that enables multi-faceted analysis of networks. It uses the treemap space filling technique to display each graph cluster or category group within its own box, sized according to the number of vertices therein. GIB optimizes visualization of the network sub-graphs, providing a semantic substrate for category-based and cluster-based partitions of social graphs. We illustrate the application of GIB to multi-faceted analysis of real social networks and discuss desirable properties of GIB using synthetic datasets.
The paper is authored by:
Eduarda Mendes Rodrigues*, Natasa Milic-Frayling†, Marc Smith‡, Ben Shneiderman§, Derek Hansen¶
* Dept. of Informatics Engineering, Faculty of Engineering, University of Porto, Portugal – eduardamr @ acm.org
† Microsoft Research, Cambridge, UK – natasamf @ microsoft.com
‡ Connected Action Consulting Group, Belmont, California, USA – marc @ connectedaction.net
§ Dept. of Computer Science & Human-Computer Interaction Lab, University of Maryland, College Park, Maryland, USA – ben @ cs.umd.edu
¶ College of Information Studies, University of Maryland, College Park, Maryland – dlhansen @ umd.edu
A map of the connections among the people who recently tweeted #SocialCom2011:
Connections among the Twitter users who had recently tweeted the hashtag #socialcom2011 when queried on October 10, 2011, scaled by number of followers (with outliers thresholded). Connections are created when users reply to, mention, or follow one another.
Layout using the “Group Layout” composed of tiled bounded regions. Clusters calculated by the Clauset-Newman-Moore algorithm are also encoded by color.
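The group-in-a-box idea described above, with each cluster in its own box sized by its vertex count, can be illustrated with a small sketch. This is not the GIB or NodeXL implementation: it uses a simple slice-and-dice treemap split, which may differ from the treemap variant the paper uses, and the function name `group_boxes` and the sample group sizes are hypothetical.

```python
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


def group_boxes(group_sizes: Dict[str, int], width: float, height: float) -> Dict[str, Rect]:
    """Assign each group a box whose area is proportional to its vertex count.

    Illustrative slice-and-dice treemap (an assumption, not the paper's method):
    the remaining rectangle is repeatedly split along its longer side.
    """
    if not group_sizes or any(size <= 0 for size in group_sizes.values()):
        raise ValueError("every group needs a positive vertex count")

    boxes: Dict[str, Rect] = {}
    x, y, w, h = 0.0, 0.0, float(width), float(height)
    remaining = sum(group_sizes.values())

    # Place the largest groups first so they receive the least elongated boxes.
    for name, size in sorted(group_sizes.items(), key=lambda kv: -kv[1]):
        share = size / remaining  # fraction of the remaining area for this group
        if w >= h:
            # Cut a vertical strip off the left of the remaining rectangle.
            box_w = w * share
            boxes[name] = (x, y, box_w, h)
            x += box_w
            w -= box_w
        else:
            # Cut a horizontal strip off the top of the remaining rectangle.
            box_h = h * share
            boxes[name] = (x, y, w, box_h)
            y += box_h
            h -= box_h
        remaining -= size
    return boxes


if __name__ == "__main__":
    # Hypothetical cluster sizes (number of vertices per cluster).
    for name, rect in group_boxes({"A": 50, "B": 30, "C": 20}, 100, 100).items():
        print(name, rect)
```

Each group's vertices would then be laid out inside its own box by whatever graph layout algorithm is preferred, so the clusters stay visually separated while keeping their relative sizes.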
NodeXL is extensible. Third-party developers have been building data providers that plug into NodeXL and connect the network visualization tool to sources of network data. We now have three providers of extensions to NodeXL: VOSON for WWW hyperlink networks, the Exchange Spigot for NodeXL for extracting enterprise email networks, and the Facebook Spigot for NodeXL that extracts your own Facebook network for analysis and visualization!
We welcome additional data provider projects! Have a network? Connect it to NodeXL with the simple directions listed here.
Import hyperlink networks into NodeXL with the VOSON System — a web-based software incorporating web mining, data visualisation, and traditional empirical social science methods (e.g. social network analysis, SNA). http://voson.anu.edu.au/node/13#VOSON-NodeXL
These are the connections among the people who recently tweeted the term “ASA2011” on 18 August 2011.
Several papers and panels related to the sociology of the internet will take place:
Saturday, Aug 20 – 2:30pm – 4:10pm
124. Section on Sociology of Law Paper Session. Privacy in the Digital Age: Law, Culture, and Contention I (co-sponsored with the Section on Collective Behavior and Social Movements and Section on Communication and Information Technology)
Monday, Aug 22 – 8:30-9:30AM Roundtables
338. Section on Communication and Information Technology Roundtable Session
Monday, Aug 22 – 9:30-10:10AM Business and Awards Ceremony
(immediately follows roundtables)
Monday, Aug 22 – 10:30AM – 12:10PM
376. Section on Communication and Information Technology Invited Session. Social Media in Community Action and Social Change
Monday, Aug 22 – 2:30pm – 4:10pm
419. Section on Communication and Information Technology Paper Session. New Media Frontiers: Youth and New Media
Monday, Aug 22 – 5:00-7:00PM, Section Reception, hotel suite, Caesars Palace
Location to be announced at all CITASA sessions and meetings.
Bits exist along a gradient from private to public. But in practice they only move in one direction.
Thus, there are two destinies for information: public or oblivion.
Information wants to be copied.
This is not the same as information wanting to be free (or expensive), or information wanting *you* to be free. Information probably prefers to be free because it may increase the rate at which it is copied, not because it is inherently liberating to the user. In fact, the “free” quality of some information is probably not liberating at all. Copying and liberty are orthogonal.
Information diffuses over time: access rights to information can expand over time, but only rarely (ever?) does data become less available, and once available publicly, information is almost never entirely private again.
With enough copies on enough devices, information becomes essentially public. The state of being public may come in degrees: some things are more public than others. Much information is public in principle but enjoys security by obscurity. Obscurity is eroded by the increasing availability of computing resources that make collection and machine analysis affordable at large scales. The banality of data is no protection. “No one cares what I think/do/say/click” is not a valid assumption. In aggregate the banal is data and fuel to many business models. Maybe no one *cares* what you tweet, click, buy or search for, but many businesses make it their business to aggregate these scattered faint signals and build detailed profiles to drive commerce and customized views of data.
Some information is destroyed, never to be recovered. This is the only way information can avoid eventually (potentially) becoming public. But less and less data now meets this fate. Delete is a declining feature of many systems.
Information that is not public and has not yet been destroyed is just waiting to change to either state.
Despite security systems, many private bits are eventually exposed by people passing material to someone else who then accidentally makes them public, or they do so unintentionally themselves by leaving files in publicly accessible locations that are visited by search engine spiders and other web crawlers. Even professionally managed private data repositories are subject to subsequent distribution, infiltration or error. Data spills are becoming more common. Billions of records are hemorrhaged into the public regularly. If well-funded organizations cannot secure their information, the rest of us should take note.
It may not be possible for big organizations, or any organization, to secure their networks, or even to do so effectively enough to give users a practical period of privacy, however short. Eventually private bits, even when encrypted (no matter how well), become public because the march of computing power makes their encryption increasingly trivial to break and their exchange over networks (no matter how well secured) is subject to leaking, intentional and otherwise. Private bits may only have a “half-life” during which they retain their non-public existence. The length of this half-life may itself be getting shorter. Mary Branscome suggests that there could be a physical law in operation: the natural entropy of access control lists?
All bits that persist are destined to be public, and once public never to be private again. Unless they are destroyed.
I argue that the only bits that you cannot find are the ones you need right now. The only bits you cannot get rid of are the ones that are most embarrassing to you right now. Just because you cannot find the bits you want does not mean that no one else can find those bits.
This issue is getting more important as we are invited to use systems that promise selective sharing of data while other tools generate ever more data to potentially share. Anything that puts your bits into the cloud promises selective sharing. I believe and hope my much beloved Dropbox account is separate from all the others, except for the ones I chose to share with. And I think it is, except for that glitch they had, the details of which elude me (but I think we’re good now, and I so depend on Dropbox I do not know what I would do without it). But all these walls are just made out of a few lines of business logic and an Access Control List. ACLs rule our access to digital objects with an iron fist until they don’t, for the many human and technical reasons mentioned. Like most human infrastructures, these selective sharing mechanisms are subject to failure and attack.
Now new sources of data captured from the details of everyday life by sensors and services are increasingly recorded by external systems and by people themselves, generating new streams of archival material that is richer than all but the most obsessively observed biographies.
Some steps are still in progress: when my phone notices your phone, a new set of mobile social software applications becomes possible as whole populations capture data about other people as they beacon their identities to one another. Additional sensors will collect ever more medical data with the intent of improving our health and safety, as early adopters in the “Quantified Self” movement make clear.
But the consequences of data diffusion are becoming difficult to predict. Social media systems are being linked to one another to enable cascades of events to be triggered from a single message as status updates are passed among Facebook, LinkedIn, Twitter, and blogs. Tools now automatically aggregate the results of searches and post articles that themselves may trigger other events. Taking a photo or updating a status message can now set off a series of unpredictable events.
Add potential improvements in audio and facial recognition and a new world of continuous observation and publication emerges. Some benefits, like those displayed by the Google Flu tracking system, illustrate the potential for insight from aggregated sensor data. More exploitative applications are also likely.
Therefore, all services that promote the idea of “selective sharing” are selling a myth. The more you trust that information you generate can be contained, the more potential there is for an “explosive decompression” as data intended for an individual or a small group becomes suddenly available to a large group or a complete population. Private bits are in a state of high potential energy, always poised to become public.
Engineering is the science, art and practice of containing and directing forces. Information system engineers might be up to the challenge of delivering selective sharing. And when combined with law, regulation and social practices, technology could make selective sharing real the way that engineers manage powerful but dangerous flows of high-pressure steam through power plants. However, recently even high-pressure steam engineers working with nuclear fuels have faced some very bad failure conditions beyond their predicted scope. Information technologists may face analogous issues when managing high-pressure containers of selectively shared information.
My policy is not to give up all forms of privacy. I still keep my email and other data behind passwords that I do not (knowingly) share. I share lots of pictures on Flickr but not all of them are public. I would prefer to keep lots of financial, medical, and personal stuff selectively shared. I’d like these features to work.
But I have started to understand that my data is likely to be open to others, if not now then some day — and probably sooner than I expect. The net/cloud holds a good sized and growing chunk of my digital life and I would like selective sharing features (if I could handle the cognitive tax of managing them). I just do not believe it is a reasonable expectation. In a world of increasing interconnection and unifying name and search spaces, data may not be something you can keep local for long.
Tools that suggest that we can reliably segregate content and limit its diffusion are suggesting that water does not roll down hill. Those who believe that are likely to get wet.
Starting in version .165 of NodeXL we have supported the idea of an options file that can be imported, exported and exchanged among users.
If you have set all the knobs and dials of your copy of NodeXL just right, you can export these adjustments and configurations into a single file. Use the NodeXL>Options>Export feature to create a named file containing your settings. You can now exchange this file with others. If you receive an options file, you can use the NodeXL>Options>Import feature to pick it out from the file system and set your copy of NodeXL to the settings defined in that file.
If you use the related NodeXL>Options>Use Current for New feature you can set the defaults for NodeXL to the settings contained in any imported options file.
Summer Social Webshop
on Technology-Mediated Social Participation
University of Maryland, College Park
August 23-26, 2011
Several years ago a program at the University of Maryland called “Webshop” (Web Workshop) was organized by Professor John Robinson and held for three consecutive summers. I visited and spoke at two of these events, know many people who attended or spoke at one or more, and remember the event enthusiastically. The students who attended include some of the now leading researchers in the field of social science studies of the internet. There is an impressive alumni list.
The last Webshop was held in 2003, and significant changes have occurred in the many years since. Twitter, Facebook, StreetView, iPad, FourSquare, Android, Kinect, EC2, Mechanical Turk, and Arduino were all new or non-existent when the first Webshops were run. Today we have more reason than ever to focus on the details and patterns of computer-mediated human association. Ever more people channel more of their communications with others through more digital media, often of the social kind. A new data resource for the social sciences is growing in scale and promise: from billions of events it is possible to start to build a picture of an aggregate whole, and to start to grasp the terrain and landscape of social media.
The Summer Social Webshop (@Webshop2011) is happening again! With the generous support of the National Science Foundation and additional assistance from Google Research, this August 23-26 at the University of Maryland, College Park, a group of students will hear and engage with more than two dozen leading researchers exploring digital social landscapes from a variety of perspectives. Organized by a collaboration between the University of Maryland’s Human-Computer Interaction Laboratory (HCIL), the College of Information Studies, the Sociology and Computer Science Departments, and the Social Media Research Foundation, the event will gather students from a wide range of disciplines to get a concentrated dose of advanced efforts to gather data from social media and people’s understanding and practices around digital technologies. Doctoral students in computer science, iSchools, sociology, communications, political science, anthropology, psychology, journalism, and related disciplines are invited to apply to attend this summer’s 4-day intensive workshop on Technology-Mediated Social Participation (TMSP). The workshop explores the many ways social media can be applied to national priorities such as health, energy, education, disaster response, political participation, environmental protection, business innovation, or community safety. The workshop should be of interest to graduate students at US universities studying social-networking tools, blogs and microblogs, user-generated content sites, discussion groups, problem reporting, recommendation systems, mobile and location-aware media creation, and other social media.