If innovations in the citizen media community are shaping the political process, it’s worth looking closely at the structure and architecture of that new space. Two speakers at PDF specialize in visualizing and analyzing massive data sets. Anthony Hamelle of linkfluence builds very pretty maps of the blogosphere, much like the famous Glance and Adamic visualization, or my colleague John Kelly’s work on Iranian blogs.

The graphs are influence graphs, showing who links to whom within “like-minded” communities. The idea here is to look at linking between political blogs in a purely political context, discarding links that fall outside that context. The result is a tight, pretty map that shows a decided red/blue (conservative/liberal) split in the US political blogosphere, plus a small set of common sources used by both sides. The graph is remarkably easy to explore, allowing users to mouse over it and see the media sources referenced.
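To make the filtering idea concrete, here is a minimal Python sketch. It is not linkfluence's method, which wasn't described in detail; the blog names, the `LEANING` labels and the helper functions are all hypothetical, made up for illustration. It keeps only links between known political blogs, measures how many of those cross the red/blue divide, and lists outside sources cited by both sides.

```python
from collections import defaultdict

# Hypothetical toy data: blog -> leaning. All names are invented.
LEANING = {
    "left-a": "liberal",
    "left-b": "liberal",
    "right-a": "conservative",
    "right-b": "conservative",
}

# (source, target) hyperlinks. Targets missing from LEANING are
# media sources or off-topic sites.
LINKS = [
    ("left-a", "left-b"),
    ("left-b", "left-a"),
    ("right-a", "right-b"),
    ("left-a", "right-a"),
    ("left-a", "news-site"),
    ("right-b", "news-site"),
    ("left-b", "cooking-site"),
]


def in_context_links(links, leaning):
    """Keep only links where both ends are known political blogs."""
    return [(s, t) for s, t in links if s in leaning and t in leaning]


def cross_side_share(links, leaning):
    """Fraction of in-context links that cross the liberal/conservative divide."""
    kept = in_context_links(links, leaning)
    if not kept:
        return 0.0
    crossing = sum(1 for s, t in kept if leaning[s] != leaning[t])
    return crossing / len(kept)


def common_sources(links, leaning):
    """Outside targets linked to by blogs from more than one side."""
    sides = defaultdict(set)
    for s, t in links:
        if s in leaning and t not in leaning:
            sides[t].add(leaning[s])
    return sorted(t for t, seen in sides.items() if len(seen) > 1)
```

A low `cross_side_share` alongside a short `common_sources` list is roughly what the red/blue map shows visually: two dense clusters with few bridges and a handful of shared references.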

A new tool (perhaps not yet available online?) tracks the emergence of terms and subjects over time, allowing for trend analysis – Hamelle shows the rise of “FISA” as a key term in discussions last week.
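A rough sketch of how term tracking of this kind might work, in Python. This is an assumption about the general approach, not a description of Hamelle's tool: the `daily_term_counts` function and the `(day, text)` post format are hypothetical. It counts, per day, how many posts mention a term.

```python
from collections import Counter


def daily_term_counts(posts, term):
    """Count posts per day that mention `term` as a whole word, ignoring case.

    `posts` is an iterable of (day, text) pairs; `day` can be any
    sortable, hashable value (an integer index here).
    """
    counts = Counter()
    needle = term.lower()
    for day, text in posts:
        # Split on whitespace and trim common punctuation from each word.
        words = {w.strip('.,;:!?"()') for w in text.lower().split()}
        if needle in words:
            counts[day] += 1
    return dict(sorted(counts.items()))


# Made-up sample posts, purely for illustration.
sample_posts = [
    (1, "FISA vote coming."),
    (1, "Nothing to see here"),
    (2, "The fisa bill explained"),
    (2, "More on FISA, again"),
]
trend = daily_term_counts(sample_posts, "FISA")
```

Plotting a series like `trend` over days or weeks gives the kind of rising curve used for trend analysis.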

Matthew Hurst, who runs the Data Mining blog and is a researcher with Microsoft Live Labs, is the king of these sorts of visualizations. He offers thoughts on a very broad topic – “What can you do with all the social media data – if you’re collecting information from Twitter, Usenet and blogs, simultaneously?”

Hurst points out that, with a bit of creativity, one can extract a great deal of data from blogs. You can often figure out the geographic location and the gender of the poster, and you can nearly always retrieve the complete (public) posting history of the author. One tool Hurst has been developing shows posts, in real time, on a map of the US, giving a sense of how ideas emerge and move across the physical world.

An early Hurst visualization of the English-language blogosphere. The top cluster is technology blogs, and the two bright dots are BoingBoing and Engadget. The lower, larger cluster is the interconnected US political blogosphere.

Hurst graphs virtual communities as well. One gorgeous visualization, not shown here, clusters blogs based on their location on servers. Livejournal blogs tend to cluster closely together, while Blogspot blogs are evenly spread throughout the linksphere.

What can you do with these sorts of tools and the ability to look at citizen media in real time? Well, you can watch ideas emerge, based on tools that track words. Matt offers a graph of bloggers talking about Obama versus those talking about Clinton – the lines crossed in February, allowing him to predict Obama’s rise several weeks before it became a dominant narrative in mainstream media. What’s rising now? Conversations about oil appear to be dominating all political discussions.
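Spotting the moment two such lines cross is simple to automate. The sketch below is a hypothetical illustration in Python, not Hurst's code, and the numbers are invented. It returns the first point at which one series moves from at-or-below the other to strictly above it.

```python
def first_crossover(a, b):
    """Return the first index i where series `a` goes from at-or-below
    series `b` (at i - 1) to strictly above it (at i), or None."""
    for i in range(1, min(len(a), len(b))):
        if a[i - 1] <= b[i - 1] and a[i] > b[i]:
            return i
    return None


# Invented weekly mention counts for two candidates.
candidate_x = [3, 4, 6, 9]
candidate_y = [8, 7, 6, 5]
week = first_crossover(candidate_x, candidate_y)
```

With real data you would probably smooth the series first (a moving average, say), since noisy daily counts can cross back and forth several times.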

  2. galiel says:

    I worry that these gorgeous visualizations can hide biases introduced by a) the data that is chosen to be represented, vs the data that is not, and b) by tweaking the layout of the nodes. The “wow” can be seductive, but what can we do, as producers of the data, to ensure that we make these underlying decisions transparent; and, as consumers, how can we develop the skills, even the intuition, to interpret the data presented without losing sight of the “noise” introduced by the visualization medium (and by subtle choices and manipulations – innocent or not – of the visualizer)?

  6. Dear Ethan,
    You were one of the first to blog about this visualization of the political blogosphere presented at the PDF conference.
    I thought you might be interested in the follow-up study, which now shows the dynamics of information spreading in real-time across the network. Here’s a short video of this experiment (http://vimeo.com/1809757?pg=embed&sec=1809757), using the example of the uber viral video of Paris Hilton’s response to McCain.


