The Social Media Clarity podcast has just started with an initial 15-minute show devoted to analysis and advice on social media platforms and product design (http://bit.ly/12Edwvx), moderated by Randy Farmer (@frandallfarmer) and with Bryce Glass (@bryceglass) and myself. We plan to talk about a range of topics related to social media, online community, reputation systems, incentive design, analytics, visualization, and collective participation. I hope you will give a listen to the first show. Randy did a great job getting a quality recording and editing a polished package.
In this episode we talk about changing rules for access to the data we all put into the cloud. The discussion is related to the post “Over the edge: Twitter API 1.1 makes ‘Follows’ edges hard to get”, which documents the impact of changes in access to data through social media APIs. The implication from the discussion is that businesses may need to start building their own social network data sets, since they cannot rely on cloud platforms to guarantee access to their own data.
We plan to produce more episodes. Please let us know what topics you’d like to hear discussed.
On October 18, 2013, an NSF-funded workshop called Kredible.Net, to be held at Stanford University, will bring together researchers studying reputation and social roles in social media.
The workshop will help researchers investigate how social media, especially Wikipedia articles and editors, shape public knowledge. The project aims to build a research community and to propose a research agenda for the study of reputation and authority in informal knowledge markets, such as Wikipedia.
I spoke about my concerns with the continued belief in selective sharing. I argue at this TedX Bay Area talk that it is unwise to expect that digital information systems are capable of privacy or selective sharing. In other words, it is a dangerous myth to believe in a feature that in practice fails regularly and by design. In fact, it seems that it is practically impossible to create any digital information system that is secure.
In such a world we may want to reconsider our sharing practices, particularly if they were built on the idea of selective sharing. If any of your digital information is something you would rather not share publicly, you may want to rethink the idea that you can keep your information private.
If you are building an information system, you may want to rethink the idea that you can offer selective sharing in a reliable form.
Thanks to the folks at TedX Bay Area, particularly Tatyana Kanzaveli, for the opportunity to work out these thoughts and share them.
There will be a 3-hour hands-on tutorial on the use of NodeXL and, more broadly, on social media network analysis. We will focus on the process of collecting, storing, analyzing, visualizing, and sharing insights into social media network graphs.
Many thanks to Maksim Tsvetovat (@maksim2042) for arranging the location.
Hello! Each Thursday at 10AM to noon (Pacific Time), I will be taking questions and providing support to NodeXL users in a Google Hangout. Join me for a Q&A about NodeXL, SNA, Social Media, Networks, Mapping, Visualization and Analytics.
July 28 – August 1, 2013
DSST 2013 Digital Societies and Social Technologies Summer Institute: NodeXL Training University of Maryland — College Park, Maryland USA
I will be teaching a workshop on Thursday August 1st on using NodeXL for social media network analysis at the upcoming 2013 Digital Societies and Social Technologies Summer Institute at the University of Maryland. The Institute is devoted to training researchers in methods and theory that can help frame research into the social impacts of information technology:
MOOCs, Education and learning; personal health and well-being; open innovation, eScience, and citizen science; co-production, open source, and new forms of work; cultural heritage and information access; energy management and climate change; civic hacking, engagement and government; disaster response; cybersecurity and privacy – these are just a few problem domains where effective design and robust understanding of complex sociotechnical systems is critical. To meet these challenges a trans-disciplinary community of scholars has come together from fields as wide ranging as CSCW, HCI, social computing, organization studies, information visualization, social informatics, sociology, information systems, medical informatics, computer science, ICT for development, education, learning science, journalism, and political science.
I spend a lot of my time studying social media and the networks that form in them. But I have growing doubts about the time I spend on commercial services. Despite seeming like public spaces, these services are really not public.
Social media is increasingly the space in which public life takes place. News, debates and discussions are more likely to take place now in Facebook, Twitter, and other social media services than in public squares, civic buildings, or community centers. Virtual public spaces fill the void created by the lack of public spaces and places in our cities and towns that allow for public mixing and interaction. But virtual public spaces are just that: virtual. They are not real public spaces, and the “virtual” public space they provide is not “as if” or even better than the real thing. Virtual public space lacks many of the features of real public space and is not an upgrade over the real thing.
Virtual public spaces try to seem like public spaces, but they are like shopping malls: commercial spaces that encourage only a subset of public behaviors. Raised in commercial spaces that have replaced public spaces, many people no longer even imagine behaviors that are not welcome in a mall. Protest, petitions, organizing, and protected speech have no place in a shopping mall. Some property owners allow some forms of speech, but no one but the owners has a “right” to speech in a mall. Shoppers, consumers, guests, customers, and visitors are not citizens while they are in a commercial space.
Virtual public spaces are not public spaces, but as we spend our public time in them, we drain the life from alternative public spaces. Our collective chatter in social media becomes the intellectual property of a company, not a commonly owned public asset. Our history is not our history.
Social media services vary in how openly or restrictively they provide access to the data generated by their users.
Some services, like Wikipedia, are very open, offering many methods to access large and small amounts of data from recent or historical times.
Some services, like LinkedIn, are very closed, offering almost no access to any data from their service.
For many services, the lack of access to data is not an ideological choice; rather, it is a practical issue related to the costs of storing and serving large volumes of data. These companies are well within their rights to do as they like with their data and business plans.
However, their data is actually my data (and your data). We may soon realize that we prefer to commit our bits to repositories that hold and redistribute our content on terms that support civic goals of open access. What we need are credible alternatives to these services, with alternative funding models: perhaps a “Public Bit Service” or “National Public Retweet”?
The long-awaited (and delayed) change to the Twitter API is now here: API 1.1 is now the only service available, and the long-used API 1.0 is gone.
This has an impact on people who have been collecting and analyzing data from Twitter. Twitter has given and taken away with the new 1.1 API. Mostly taken away. More Tweets are sometimes available from the new API, up to 18,000 rather than the old 1,500 tweet limit. This is a big change, but normal users often do not get much benefit from the limit increase if the topic they are interested in has fewer tweets. The length of time tweets are retained and served is not much longer than it was.
The big change is the effective loss of the “Follows” edge. Some users of the 1.0 API were able to run a significant number of queries asking whom each user followed. These queries generated data that allowed a network to be created based on which users followed which other users. The “Follows” network in Twitter has been very informative, pointing to the key people and groups in social media discussions. But now the “Follows” edge will be effectively impossible to use.
Twitter API 1.1 limits the number of queries about who follows whom in Twitter to 60 per hour. In practice, a network may have several hundred or several thousand people in it, making a query for each person’s network of followers impractical. With the “Follows” edge effectively gone, the remaining edges, “reply” and “mention”, become more important. These edges are far less common than the “Follows” edge. Many people follow lots of other people but mention the name of, or directly reply to, very few. With the loss of the “Follows” edge, Twitter networks can become very sparse, with few connections remaining. Dense structures give way to confetti.
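To put the 60-per-hour limit in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes, optimistically, that a single query is enough to fetch one user's full follows list; accounts that follow many people would likely need more. The function name and the sample network sizes are illustrative only.

```python
QUERIES_PER_HOUR = 60  # the follows-query limit described above


def hours_to_collect(num_users, queries_per_user=1,
                     queries_per_hour=QUERIES_PER_HOUR):
    """Hours needed to ask whom each user follows, one user at a time.

    queries_per_user=1 is an assumption: it treats every follows list
    as small enough to come back in a single query.
    """
    total_queries = num_users * queries_per_user
    return total_queries / queries_per_hour


if __name__ == "__main__":
    for n in (300, 1000, 5000):
        print(f"{n} users: about {hours_to_collect(n):.1f} hours")
```

Under these assumptions, a 300-person network takes about 5 hours of querying and a 1,000-person network takes most of a day, before any other data is collected.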
Here is a map of the topic #scaladays with the Followers edges compared to the same map with no Follower edges:
With the “Follows” edges gone, the loss of insight into the nature of the network is profound, but not fatal. The reply and mention network does have some density in many discussions, allowing many kinds of network positions and structures to be observed. Edges can also be synthesized from other evidence; for example, a link could be created when two people use words in common that are not commonly used by others.
The NodeXL project has released a version that connects to the new Twitter API 1.1, and we will be releasing additional edge types that will link people when they share content like hashtags, URLs, words and word pairs with other people. These shared content edges are based on a presumption that when people use similar content that is rarely used by others they are likely to have an underlying connection. The assumption that shared content use is a surrogate for the “follows” relationship requires additional testing (which will be difficult without access to the data that Twitter just removed). For now, these connections do return density to networks that have been shattered by the loss of the visibility of the Follows connection and can indicate common interests among Twitter users.
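The shared content idea can be sketched in a few lines of Python. This is not NodeXL's implementation: the function name, the input format, and the "rare" threshold (`max_users_per_term`) are all assumptions made for illustration. The sketch links two users whenever they both use a term that at most a handful of users in the data set use.

```python
from collections import defaultdict
from itertools import combinations


def shared_content_edges(user_terms, max_users_per_term=2):
    """Link users who share a term that few other users use.

    user_terms: dict of user name -> set of terms (hashtags, URLs, words).
    max_users_per_term: a term counts as "rare" when no more than this
        many users use it (an assumed threshold, chosen for illustration).
    Returns a dict of (user_a, user_b) -> sorted list of shared rare terms.
    """
    # Invert the data: which users used each term?
    users_by_term = defaultdict(set)
    for user, terms in user_terms.items():
        for term in terms:
            users_by_term[term].add(user)

    # Create one edge per pair of users sharing a rare term.
    edges = defaultdict(list)
    for term, users in users_by_term.items():
        if 2 <= len(users) <= max_users_per_term:
            for a, b in combinations(sorted(users), 2):
                edges[(a, b)].append(term)
    return {pair: sorted(terms) for pair, terms in edges.items()}


if __name__ == "__main__":
    sample = {
        "alice": {"#scaladays", "akka", "http://example.com/a"},
        "bob": {"#scaladays", "akka"},
        "carol": {"#scaladays", "http://example.com/a"},
        "dave": {"#scaladays"},
    }
    for pair, terms in shared_content_edges(sample).items():
        print(pair, terms)
```

In the sample, everyone uses the hashtag, so it creates no edges; the rarer word and URL each link one pair of users. Whether such links really stand in for “follows” relationships is exactly the untested assumption noted above.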
Here are recent Twitter social media networks that mention baseball related topics.
Sports teams have several “broadcast” structures in them as well as dense community groups with a small group of isolates – the island users who do not connect to anyone and who often indicate a brand or public topic. The names of baseball teams create networks that have a remarkably high density.
Networks, no matter how complex, are composed of simpler, smaller structures, called motifs. Some of these structures are easy to identify, like the pattern of a “star”, where a single node acts as the sole connection to a connected component for one or more “pendant” nodes with a single tie. Another common pattern is the “parallel bridge”: nodes that share the only two connections they have with two or more other nodes. These common structures can be identified, removed, and replaced with more efficient and comprehensible representations.
The result is a simplification of the network visualization, removing clutter to reveal the core structural properties of interest.
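As a rough illustration of how such motifs might be spotted, here is a minimal Python sketch over a plain undirected edge list. It is not the NodeXL algorithm. It assumes that a pendant is a node with exactly one neighbor, and that a parallel bridge group is two or more nodes whose only two neighbors are the same pair of nodes; the function and variable names are made up for this example.

```python
from collections import defaultdict


def find_motifs(edges):
    """Find pendant fans and parallel bridges in an undirected edge list.

    Returns (fans, bridges):
      fans: hub -> sorted pendant nodes whose only tie is to that hub
      bridges: (anchor_a, anchor_b) -> sorted nodes whose only two ties
               are to that same pair of anchor nodes
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbors[a].add(b)
            neighbors[b].add(a)

    fans = defaultdict(list)
    bridges = defaultdict(list)
    for node, nbrs in neighbors.items():
        if len(nbrs) == 1:
            (hub,) = nbrs
            fans[hub].append(node)
        elif len(nbrs) == 2:
            bridges[tuple(sorted(nbrs))].append(node)

    # Skip isolated pairs (a "hub" that is itself a pendant) and
    # bridge groups with only one member (assumed minimum of two).
    fans = {hub: sorted(nodes) for hub, nodes in fans.items()
            if len(neighbors[hub]) > 1}
    bridges = {pair: sorted(nodes) for pair, nodes in bridges.items()
               if len(nodes) >= 2}
    return fans, bridges


if __name__ == "__main__":
    sample_edges = [
        ("hub", "p1"), ("hub", "p2"), ("hub", "p3"),
        ("hub", "b1"), ("b1", "y"),
        ("hub", "b2"), ("b2", "y"),
    ]
    fans, bridges = find_motifs(sample_edges)
    print("fans:", fans)
    print("bridges:", bridges)
```

Each group found this way could then be drawn as a single glyph in place of its member nodes, which is the kind of simplification described above.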
A complex network of voting relationships in the 2007 United States Senate is reduced to a simplified form
This method for collapsing complex network graphs into simpler forms has been implemented in NodeXL. Look for the feature in the NodeXL Ribbon menu, in the NodeXL > Analysis > Groups > Group by Motif… option.
NodeXL implements network motif simplification
The feature allows users to select the types of motifs that should be recognized and collapsed:
Here are recent graphs of Twitter networks for several news media outlets:
@FT OR @financialtimes
The “broadcast” structure is common to most of these news media outlets; it appears as a “hub and spoke” pattern. The people at the end of these spokes are the “audience”. Some of these news networks have many more “isolates” or “brand” mentioners: these are the grids of individuals with no connections to others. In contrast, some contributors are densely connected in communities of discussion formed around various topics.