My Heart's in Accra

Ethan Zuckerman's musings on Africa, international development
and hacking the media.

01/26/2012 (12:02 am)

David Weinberger: Too Big To Know

Filed under: Berkman,ideas ::

David Weinberger’s new book “Too Big To Know” (#2B2K – be sure to pick book titles that make good hash tags…) launched last night at Harvard Law School with a talk entitled “Unsettling Knowledge”. If you know David’s work, it’s obvious that the title is a pun. And David’s new book is a wonderfully unsettling piece – it challenges our notion of what knowledge is, and introduces the uncomfortable question of how we navigate this new space.

Knowledge as we know it is coming apart, David tells us. The bastions of knowledge, the physical emblems of knowledge, like encyclopedias, newspapers and libraries are undergoing radical transformation. We know we’re heading into a future that’s deeply different, though we don’t know quite how. The manifestations of knowledge are at risk, and all it took was the touch of a hyperlink.

How did these institutions fall apart so quickly? It’s an impossible question to answer, but he offers one path through the thicket. He starts with a famous quote from Daniel Patrick Moynihan, who tells us “Everyone is entitled to his own opinion, not his own facts.” This is the promise of knowledge: that if we all got together and had an honest conversation, we could eventually come to an agreement. There is knowledge and it can bring us together.

We tend to assume that knowledge gives us an accurate picture of the world, built up bit by bit, fact by fact. In acquiring knowledge, we nail down each piece with certainty. And we see knowledge as a product of filtering and winnowing – we move from perception to true perception, from a mob of opinion to true belief. Knowledge is about finding gold within the flux.

We’ve always had to filter, based on the fact that the world is way bigger than what fits in our skulls. There’s too much to know (quoting Ann Blair’s book “Too Much to Know”) and the world is too big to know.

Traditionally, we’ve handled this by breaking off a brain-sized chunk of the world and getting an expert to understand it. Once we’ve got that expert, we can stop asking questions: we simply ask the expert. Experts, and the credentials that create them, are stopping points. They’re points beyond which we don’t need to look any further.

But that’s how knowledge works on paper. Books, for all their magnificence, are a disconnected medium. They are contained within covers, they are shelved apart, they don’t naturally connect to one another. The author’s job is to put everything she knows on a topic between two covers. The arguments move in sequence, from the beginning to the conclusion. And because the book is an essentially limited medium, good writers ruthlessly cast things aside, deciding what is put in the book and what is excluded. Books are born of long-form arguments, moving us forward step by step, brick by brick.

Links are a new form of punctuation. They give you a means of continuing. In the print world, to follow a footnote in a book, you need to get on a bus and go to the library. That’s why we don’t generally follow footnotes. But now we can jump from one book to the next. It’s a magic map – touch a place on the map and you go there.

The internet is an environment that’s all about connection and our knowledge is picking up properties of the medium. Knowledge in this space is characterized by the fact that it’s “too much, messy, unsettled, and unstructured”.

Clay Shirky suggests that there’s no such thing as information overload, only filter failure. This is a very modern response to an older question. Futurist Alvin Toffler warned us about information overload, popularizing the phrase. It’s an extension of the idea of sensory overload, the idea that too much input could overwhelm and paralyze you. This is based on the faulty assumption that brains are information processing machines, and that we can overwhelm and crash them.

This line of thinking led marketers to conclude that choosing between 16 brands would be overwhelming to an American housewife and that fewer choices needed to be offered. But we’re now headed to a point where there’s an exabyte of genomic information available, and that number doesn’t lead us to paralysis, but to fascination. We’ve redefined the term “information overload” through how we use it.

We’re less overwhelmed because we’re learning different ways to filter. When we filtered in the print world, we did so in a way that prevented us from seeing the dregs. We saw only the books that our local library chose to buy, and only the books the publisher chose to print. The manuscripts filtered out of that process were invisible, impossible to retrieve through ordinary means.

Now, in a digital age, we filter forward, not filter out. All that information – some of it very low quality – is out there somewhere on the internet. We could curate and try to delete the stuff that’s wrong, hurtful, harmful or hateful. But it’s expensive to exclude information and cheaper to include everything. When you curate, you’re making decisions about what is interesting to your users, and no one can accurately predict what might be useful to a researcher in the future. Filter out all the gossip and crap from new media and you harm the scholar who wants to study celebrity behavior. You couldn’t have predicted the high level of interest in notes from a committee meeting in Wasilla, Alaska in 2008 until Sarah Palin became a public figure.

The web has worked by developing tools that include all content and filter when we retrieve it. As recently as a decade ago, information retrieval experts told us that ordinary users would never use tools this complicated. But now we use them every day, because we have to. And we’re seeing much better tools, like Shelflife, the tool Harvard’s Library Lab has created to allow users to browse the vast set of information in Harvard’s library systems.

We don’t just have a lot of information – the information is very messy. We like order – David shows a slide of zoological specimens, beetles mounted on pins – and we’re very good at establishing it. We understand where everything fits in a tree of species, based on similarities and differences. To know where a species fit into this tree was to know how the world works – to not know it was to be adrift.

In the physical world, there’s only one way to sort manifestations of information. You might want to sort your CDs by artist, while your partner might want them sorted by genre. There’s only one possible way they can be stacked on the shelf, because no two things can be in the same place at the same time. In a digital age, we simply make playlists. We end up with a mess of information, but it’s a rich and fertile mess.

Figuring out where things fit in the natural order of things was an essential piece of being human. Human beings saw ourselves as “the knowers.” But there are multiple orders and multiple ways of categorizing, through tags, playlists and other ways to sort information. Messiness is an essential feature of how we scale meaning. But, David warns, we still tend to think of knowledge in the ways we did when books had to sit in a single place on the shelf, when knowledge had a single, possible, right form, rather than multiple forms.

Knowledge is too big, messy and wildly unsettled, just like the internet. “For every fact on the internet, there is an equal and opposite fact.” David warns that there is nothing we all agree on – you can find someone willing to argue that 2+2 is not 4 (and, indeed, a quick Google search shows this to be true.) We don’t agree about anything, and David warns, we never will. “This doesn’t mean there are no facts – but it does mean that people are going to insist on being wrong.”

What this persistence of disagreement means is that the promise of knowledge Moynihan offers – that we can agree on a set of facts and then argue our opinions – is not going to be fulfilled. As it turns out, we don’t even know whether Moynihan was the one who said “everyone is entitled to his own opinion, not his own facts”, or whether that’s exactly what he said.

The good news is that we’re rapidly developing ways of dealing with difference and disagreement. YouTube has a crummy commenting system, as is well documented and well established. David shows us a thread of comments on a recent Batman movie trailer. Somewhere deep in this comment thread is an impassioned argument about circumcision. It would have been great if YouTube supported forking of conversations. Forking is a powerful way to deal with disagreement. It’s very hard to do in the real world without social consequences – if we decide to move away from the dinner party to our own table where we talk about circumcision, it makes people uncomfortable – but it’s very easy to do this on the web.

In the 19th century, it was very challenging to classify the platypus. There was one space in a taxonomy for warm-blooded animals, and another for animals that produce eggs. Scientists thought the platypus must be a hoax, because it didn’t fit within existing categories. Even when presented with a specimen from Tasmania with eggs intact, they fought the platypus “hoax” as something that didn’t work within existing categories.

Now we can solve problems of overly rigid taxonomies by using linked namespaces. We can create a database of names, and a database of taxonomies. We can deal with the platypus and the water mole, and map scientific and colloquial names onto different possible structures. “Pick your name, pick your taxonomy and get on with your life. So what if we disagree? Yay for difference!”

David is actually quite concerned about difference, and just how much difference we can tolerate and still interact and function. He acknowledges that there’s a human tendency towards homophily, flocking together in groups united by race, gender, belief, socioeconomic status, etc. This can lead to a serious challenge to public discourse – echo chambers that can solidify beliefs, making them more extreme and polarized. But David worries that posing issues this way relies on an unquestioned assumption: that conversations are between people who disagree deeply and are looking for solutions and common ground by trying to get to the facts. This analysis misses the social role of conversation. We need so much context and so much agreement to even have a conversation. “To have a good conversation, you need to have 99% similarity and 1% difference.” He suggests that some of the work Yochai Benkler and I have been doing may help us find productive paths towards including difference, but reminds us that the high level of disagreement and the difficulty of finding common ground is likely a core feature of the internet and knowledge in an internet age.

Finally, knowledge in this new paradigm is unstructured. We’re used to the idea that knowledge has a basic structure. We have grown used to long form arguments that take us from A to Z, and we’re particularly fond of arguments that take us from A to Z in an orderly path, where Z is an unexpected place to end up. “This is a magnificent form of thought, but the long form argument is losing its preeminence.”

We might think of Darwin as a leading proponent of the long form argument. And his argument certainly led somewhere unfamiliar. But he wouldn’t have analyzed data for years and released a massive book if he were working today. He would publish online. And even if he didn’t, the conversation about his work would be based online. Whether or not we imagine Darwin tweeting from The Beagle, the web is where the thinking about and reacting to Darwin’s work would take place, and collectively, it would have more value than Darwin’s long form work taken alone. Moving forward, we will not just see these long form works, but the webs that precede and follow them.

Michael Nielsen has recently written about scholarly community reaction to results at CERN that offer evidence for faster than light neutrinos. As these results came in, they were posted to arXiv.org, a journal preprint site. They stirred up a firestorm of interest and reactions. Some of those reactions are brilliant, some are stupid and wrong. But that welter of discussion is where knowledge is – it’s taking place outside of printed peer review journals.

Darwin spent seven years studying and dissecting barnacles before working on The Origin of Species. His two volume work on barnacles includes countless facts, and his hard work to discover and pin them down was an act of nobility. But science doesn’t work quite like that anymore. We work with clouds of data about genetics, astronomy, and other topics. These data clouds are fundamentally different than facts. When data.gov released sets of government information, they didn’t clean or normalize it ahead of time – they released raw data. They concluded that it was better to put the data out there than to constrain themselves to information that was consistent and known, for the simple reason that this constraint would have slowed them down badly. Darwin would not have agreed – he spent seven years on one fact.

There’s value in getting the data out quickly, David argues. It may be the one approach that’s scalable – releasing raw data and letting individuals and groups clean, analyze and share what they find. Peer-reviewed scientific journals don’t scale, but perhaps peer-to-peer peer review might. We’re seeing growth in the Open Access journal field, particularly in repositories where data is released, not peer reviewed.

One way we can start making sense of these new data sets is through the magic of linked data, a format suggested by Tim Berners-Lee, father of the web. We organize information in triples:

the platypus | lives in | Tasmania
Watermoles | lay | eggs

When we link triples to a central reference, we can resolve our platipae to water moles and link our triples together. Facts, which used to look like bricks, now look like links.
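As a rough illustration of that last step, here is a minimal Python sketch of triples whose subjects are resolved against a central reference. The identifier `species:platypus`, the `ALIASES` table and the `resolve` and `facts_about` helpers are hypothetical names invented for this sketch; real linked data uses URIs and RDF rather than Python tuples.

```python
# Hypothetical central reference: maps colloquial names to one shared identifier.
ALIASES = {
    "the platypus": "species:platypus",
    "platypus": "species:platypus",
    "watermoles": "species:platypus",
    "water mole": "species:platypus",
}

# The two triples from the talk, stored as (subject, predicate, object).
TRIPLES = [
    ("the platypus", "lives in", "Tasmania"),
    ("Watermoles", "lay", "eggs"),
]


def resolve(name):
    """Return the shared identifier for a name, or the name itself if unknown."""
    return ALIASES.get(name.strip().lower(), name)


def facts_about(name, triples=TRIPLES):
    """Collect (predicate, object) pairs for every triple whose subject
    resolves to the same identifier as `name`."""
    target = resolve(name)
    return [(p, o) for s, p, o in triples if resolve(s) == target]


print(facts_about("water mole"))
```

Asking about the "water mole" returns both facts, even though each triple was recorded under a different name, because the two subjects resolve to the same identifier.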

David closes by returning to his original question: why were old knowledge systems so fragile? These systems assumed knowledge was bounded, settled, orderly and proceeded step by step. But that’s not what knowledge feels like in the age of the internet. It feels unbounded, overwhelming, unsettled, messy, linked and governed by our interests. And those properties are the properties of what it means to be human in the world.

“Networked knowledge may or may not be truer about the world, but it is truer about knowing… This crazy approach to knowledge feels familiar to us, because it’s how we tend to know.” He closes with an observation that’s both hopeful and unsettling: “What we have in common is a shared world about which we disagree, not a common knowledge we share and can collectively come to.”


I’ve followed David’s work for a long time, and had the pleasure of watching him work through the ideas behind this book – David and I are both part of a group at Berkman that helps colleagues explore book-length projects. While I’m familiar with this line of David’s thought, it was exciting and unsettling to hear him work through these ideas covering the whole arc of the book. I think this may be the most unsettling and radical book David’s put forth. On the one hand, it’s not a surprise that people will disagree on any conceivable fact. But David’s suggestion that we give up on achieving an impossible consensus and proceed with the hard work of getting on with our lives strikes me as challenging and liberating, a very different path than I hear from most activists and advocates. I’m enjoying wrestling with the ideas David puts forth both in this talk and in the paper and hope lots of readers will take up the challenge as well.

01/25/2012 (11:42 am)

Beth Kolko: “Hackademia – Leveraging the conflict between expertise and innovation to create disruptive technologies”

Filed under: Berkman ::

Beth Kolko is the sort of academic who follows her muse from one fascinating topic to another. Colin Maclay traces some of her past work from a doctorate in English through research on use of technology in the developing world, through her current research on human-centered design and engineering at the University of Washington. For the past couple of years, Beth has been focused on research for a book on hackers and makers. This is a project that comes from her daily life, where she’s spent the last six years participating in hacking and making events in the Seattle area – she’s now considering the implications of hacking for academia and larger questions of how the DIY movement could impact civic engagement and educational reform.

There are three major areas her talk – titled “Hackademia” – focuses on. She’s interested in how hackers, makers and students, especially undergrad students, can work as innovators. She’s starting to identify patterns within non-expert communities that allow hackers and makers to innovate. And she’s interested in how we “make more of this ‘stuff’” – as society and as educators, how do we scaffold and maximize these contributions?

The key to understanding hacking and making, she suggests, is imagination: looking at people as creative problem-solvers. While there’s lots of research on how corporate and university researchers solve problems, there’s less research on how people without credentials solve problems. She’s specifically interested in rulebreakers, people who either break the rules of the academy or laws to innovate. Rulebreaking, she argues, is a type of power play: it’s a way of fighting against the cultural and economic power of “being technical”, finding ways to be technical outside of an existing ruleset.

The people Beth studies are functional, rather than accredited, engineers. She confesses, “I don’t really care about formal STEM (science, tech, engineering and math) education – okay, I care a little. But there are lots of studies on getting people to work in those fields. Instead, I’m trying to get people to be STEM literate and facile.”

Beth tells us about an experiment in group learning she participated in. A group is given a task – from three feet away, collaboratively find a way for the group to touch each card in a set of cards in order. While it’s a simple task, the challenge is to execute it collaboratively, and she reports that her group took a long time to discuss what ways would be sufficiently participatory, while another group never completed the task. When we’re faced with new sets of rules, we are forced to think through tacit assumptions that define our behavior, bringing those internalized constraints to the surface.

She tells us about an independent inventor in Detroit, who created a novel flash heating process for steel. It saves energy, and makes steel that’s 7% stronger than through conventional processes. While his research was independent and uncredited, it’s now being analyzed within metallurgy schools to verify the success of the process. One of the people verifying observes that, “Steel is a mature science”. We tend to assume that all that could be done has been done, but that’s not true.

For an example that’s even further from the academic community, she points us to a YouTube video of a fun parlor trick – removing a cork from a wine bottle without harming cork or bottle. The key is to insert a plastic bag, snare the cork, partially inflate the bag and then pull the apparatus out. An auto mechanic – Jorge Odon – was watching YouTube videos in his native Argentina, and thought this was a cool trick. He wondered if it would work for babies. And it does – the Odon device is now in trials as part of birth kits for the developing world.

There’s innovation from hacking as well. She points to wardriving, a technique developed to compromise networks, which now is part of business processes to ensure corporate networks are locked down. And she suggests that password testing tools have emerged almost exclusively from the hacking community. Security techniques designed to compromise networks become part of standard business practices.

Some of Beth’s recent work has focused on non-expert innovation from students, specifically work on a low-cost portable ultrasound kit. A colleague at the University of Washington working in radiology reached out to Beth for help with user interfaces for ultrasound systems used by midwives in Kampala, Uganda. The goal of the project was to train midwives to identify the three conditions that most contribute to maternal mortality and send affected women to hospitals, rather than having them give birth at home.

As Beth and her students worked on the project, they discovered that one major problem was that midwives were trained for 2-6 weeks, while ultrasound readers in the US train for two years before being certified. Even the technicians who train for two years don’t use all the functions of a commercial ultrasound machine – in US ultrasound practice, the complex machines are heavily marked with signs created by the technicians warning not to use certain buttons or to use only certain ranges of frequencies.

Can we make this technology simpler for technicians with less training? This makes sense, as the Ugandan technicians are only trying to diagnose three conditions. The solution Beth and her team found was to move back to an older, cheaper technology and to marry those wands with simple netbooks, then focus on making the user interface as easy as possible.

Through ethnography with midwives and mothers, they discovered that the use of ultrasound is utterly different in Uganda than in US clinical practice. In the US, the technician can pass any ambiguous results to a support structure of doctors. Midwives in Uganda are generally all on their own – they need to give answers to mothers directly. So she and her students built a help system for the ultrasound device that was a learning system about maternal health, not just a manual for the tool.

“Not understanding the boundaries of the problem space allows innovation – including a help and learning system into the product was something my students did not know was prohibited.”

Beth’s insights in this field come from studying creativity around technology in the developing world, as well as US hackerspaces, makerspaces, hacker cons, and makerfaires. Extrapolating from both types of sites, she observes three characteristics:

- The importance of actual space in bringing communities together
- Systems of apprenticeship or scaffolded learning, including workshops that show people what they need to know to join a community
- Contests and other systems for building reputations, like the “black badges” issued to winners of capture the flag contests at Defcon, or the badges people win on instructables.com

She’s interested in the possible overlaps between university research, industry labs and independent researchers. Her goal is not to map the actual Venn diagram of the space, but to understand how independent researchers work in this space. She believes that independent researchers are particularly important for building disruptive technology. Academics have a disincentive to build highly disruptive systems – they’re hard to get academic funding for, and hard for PhD students to pitch dissertations around. It’s hard to disrupt in the corporate community, especially when disruptive tech is cheaper, as those sorts of innovations tend not to fit within existing sales structures. Independent researchers may be immune to these restrictions and especially capable of pushing forward disruptive innovations.

The structural constraints suggest that independent researchers may not be able to do fundamental research – it’s hard to investigate the deep structure of matter without strong funding. What independent researchers excel at is technological remix. She shows photos of makers building a panoramic camera designed to take photos from near space. There’s not much novel tech development involved with the project, but lots of remix of existing photographic technology.

Beth’s “Hackademia” project has attempted to learn from these general observations. She invited six undergraduate students to meet regularly in a physical space, equipped with desks and chairs and salvaged gear to hack with, including Arduino controllers. She asks the students to learn and keep track of how they learn. She offers no formal instruction, but lots of pointers to places her students can find learning materials.

One of the projects the Hackademia team took on was assembling a makerbot, a 3D printer that comes as a kit. Very seasoned engineers have been able to assemble the product in seven hours – her team took it slowly and took weeks. But they got it together, and developed some intense technical skills in the process. One student, who had been worried about touching any pieces of the kit for fear of breaking them, found herself some weeks later slapping Beth’s hand when she tried to assemble something for her. This student had thought of herself as “non-technical”, Beth tells us. “But that notion of technical and non-technical broke down for them.”

Why Hackademia? Because there are few mechanisms at the university to allow non-science students to gain technical skills. It’s very hard for someone not on an engineering track to learn how to solder. But Beth’s work isn’t designed to create more professional engineers – it’s to get people to functional technical literacy. “We’re creating functional engineers one blinky LED at a time.”

Interventions like Hackademia, Beth hopes, can address at least six issues:

- self-efficacy – considering yourself capable of engaging in technical acts
- material technical practice – gaining concrete technical skills
- identity formation – identifying personally and socially as a technically competent person
- conception – understanding the scope and practice of technical knowledge
- motivation – articulating possible future selves
- social capital and sustainable participation – understanding how to seek out expert knowledge when necessary

On this last point, Seattle is a particularly sustainable place to build these sorts of interventions, as it’s filled with hacker spaces and expert communities who can support this form of experimentation.

Beth’s new effort is Shiftlabs, an engineering and manufacturing company that works only with hackers. The company focuses on the engineering of low cost devices in the global health space, using R&D from independent researchers. Why a company and not a book? Beth explains that she’d never intended for this space to be the main locus of her research – it’s the product of taking a close look at something she’s become fascinated with in her personal life that’s turned into an academic and professional focus.

12/23/2011 (10:24 am)

SOPA and our 2010 Circumvention Study

Filed under: Berkman,Human Rights ::

Daniel Castro of The Information Technology & Innovation Foundation recently published a paper supporting the Stop Online Piracy Act (SOPA) currently being debated in Congress. In that report, he claims that research performed by us supports the domain name system (DNS) filtering mechanisms mandated by SOPA. This claim is a distortion of our work. We disagree with the use of our study to make the point that DNS-based Internet filtering works and that we should therefore use it as a means of stopping websites from distributing copyrighted content. The data we collected answer a completely different set of questions in a completely different context.

Among other provisions that seek to control the sharing of copyrighted material on the Internet, SOPA, if enacted, would call upon the U.S. government to require that Internet service providers remove from their DNS servers the names of any sites that either infringe copyright directly or merely “facilitate” copyright infringement. So, for example, the government could require that ISPs remove the name “twitter.com” from their DNS servers if twitter.com was not being sufficiently aggressive in preventing its users from tweeting information about places to download copyrighted materials. This practice is known as DNS filtering. DNS filtering is one of the most common modes of Internet-based censorship. As we and our collaborators in the OpenNet Initiative have shown over the past decade, practices of this sort are used extensively in autocratic countries, including China and Iran, to prevent access to a range of sites offensive to the governments of those countries.
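To make the mechanism concrete, here is a toy Python sketch of a resolver that refuses to answer for blocklisted names. Everything in it is an assumption for illustration: the `RECORDS` table, the `BLOCKLIST` set, the `is_blocked` and `resolve` functions, and the example domains and addresses (drawn from ranges reserved for documentation) are invented, and real DNS servers are far more involved. Treating subdomains of a blocked name as blocked is also a choice made for this sketch.

```python
# Toy name table; a real resolver would query other servers.
# The addresses come from ranges reserved for documentation.
RECORDS = {
    "example.com": "192.0.2.10",
    "blog.example.com": "192.0.2.11",
    "example.org": "198.51.100.7",
}

# Hypothetical list of names an ISP has been ordered to stop resolving.
BLOCKLIST = {"example.com"}


def is_blocked(name):
    """A name is blocked if it, or any parent domain, is on the blocklist."""
    labels = name.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in BLOCKLIST for i in range(len(labels)))


def resolve(name):
    """Return the address for a name, or None if it is filtered or unknown."""
    if is_blocked(name):
        return None
    return RECORDS.get(name.lower().rstrip("."))


for host in ("example.com", "blog.example.com", "example.org"):
    print(host, "->", resolve(host))
```

The filter in this sketch works on whole names, so one blocklist entry hides everything served under that name, whatever the individual pages contain.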

Opponents of SOPA have argued that DNS filtering, even though it will have a number of harmful effects on the technical and political structure of the Internet, will not be effective in preventing users from accessing the blocked sites. Mr. Castro cites our research as evidence that SOPA’s mandate to filter DNS will be effective. He quotes our finding that at most 3% of users in certain countries that substantially filter the Internet use circumvention tools and asserts that “presumably the desire for access to essential political, historical, and cultural information is at least equal to, if not significantly stronger than, the desire to watch a movie without paying for it. Yet only a small fraction of Internet users employ circumvention tools to access blocked information, in part because many users simply lack the skills or desire to find, learn and use these tools.”

In our report, we looked at three sets of censorship circumvention tools: complex, client-based tools like Tor; paid VPNs; and web proxies. We estimated usage of those three classes of tools, drawing on reports from the client tool developers, a survey of VPN operators and Google Analytics data for web proxy tools. Counting all three classes of tools, we estimated as many as 19 million users a month of circumvention tools. Given the large number of users in China, Iran, Saudi Arabia and other states where filtering is endemic, this represents a fairly small percentage of internet users in those countries; 19 million people represents about 3% of the users in countries where internet filtering is pervasive. We actually believe that 3% figure is high, as some of the tools we study are used by users in open societies to evade corporate or university firewalls, not just to evade government censorship.
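For readers who want to check how those two numbers relate, here is a back-of-the-envelope Python sketch. It uses only the rounded figures quoted above; the variable names are invented for the sketch, and the implied user base it prints is a derived approximation, not a number taken from the study.

```python
# Both inputs are the rounded figures quoted in the paragraph above.
circumvention_users = 19_000_000  # upper estimate of monthly circumvention users
share_of_users = 0.03             # "about 3%" of users in heavily filtered countries

# Derived value, not a figure from the study: the user base the two numbers imply.
implied_user_base = circumvention_users / share_of_users

print(f"Implied user base: about {implied_user_base / 1e6:.0f} million")
```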

We stand behind the findings in our study (with reservations that we detail in the paper), but we disagree with the way that Mr. Castro applies our findings to the SOPA debate. His presumption that people will work as hard or harder to access political content than they do to access entertainment content deeply misunderstands how and why most people use the internet. Far more users in open societies use the Internet for entertainment than for political purposes; it is unreasonable to assume different behaviors in closed societies. Our research offers the depressing conclusion that comparatively few users are seeking blocked political information and suggests that the governments most successful in blocking political content ensure that entertainment and social media content is widely available online precisely because users get much more upset about blocking the ability to watch movies than they do about blocking specific pieces of political content.

Rather than extrapolating from the usage of circumvention tools in closed societies, Mr. Castro would do better to consider the massive userbase of tools like BitTorrent clients, which would make for a far cleaner analogy to the problem at hand. Likewise, the long line of very popular peer-to-peer sharing tools that have been incrementally designed to circumvent the technical and political measures used to prevent sharing copyrighted materials is a stronger analogy than our study of users in authoritarian regimes seeking to access political content.

Second, our research has consistently shown that those who really wish to evade Internet filters can do so with relatively little effort. The problem is that these activities can be very dangerous in certain regimes. Even though our research shows that relatively few people in autocratic countries use circumvention tools, this does not mean that circumvention tools are not crucial to the dissident communities in those countries. 19 million people is not large in relation to the population of the Internet, but it is still a lot of people in absolute terms who have freer access to the Internet through the tools. We personally know many people in autocratic countries for whom these tools provide a crucial (though not perfect) layer of security for their activist work. Those people would be at much greater risk than they already are without access to the tools, but in addition to mandating DNS filtering, SOPA would make many circumvention tools illegal. The single biggest funder of circumvention tools has been and remains the U.S. government, precisely because of the role the tools play in online activism. It would be highly counter-productive for the U.S. government to both fund and outlaw the same set of tools.

Finally, our decade-long study of Internet filtering and circumvention has documented the many problems associated with Internet filtering, not its overall effectiveness. DNS filtering is by necessity either overbroad or underbroad; it either blocks too much or too little. Content on the Internet changes its place and nature rapidly, and DNS filtering is ineffective when it comes to keeping up with it. Worse, especially from a First Amendment perspective, DNS filtering ends up blocking access to enormous amounts of perfectly lawful information. We strongly resist the claim that our research, and that of our collaborators, makes the case in favor of DNS-based Internet filtering.

Links:

Mr. Castro’s report may be found here:

http://www.itif.org/publications/pipasopa-responding-critics-and-finding-path-forward

with the reference to our work on p. 8.

The study that is being misused by Mr. Castro is here:

http://cyber.law.harvard.edu/publications/2010/Circumvention_Tool_Usage.

The findings of our decade-long studies are documented in three books, published by MIT Press and available freely online in their entirety at:

http://access.opennet.net/

- Rob Faris, John Palfrey, Hal Roberts, Jill York, and Ethan Zuckerman

11/07/2011 (7:54 pm)

Mapping Media Ecosystems at Center for Civic Media

This summer, Sasha, Lorrie and I started brainstorming the sorts of events we wanted to host at the Center for Civic Media this fall. The first I put on the calendar was a session on “mapping civic media”, a chance to catch up with some of my favorite people who are working to study, understand and visualize how ideas move through the complicated ecosystem of professional and participatory media.

To represent the research being done in the space, we invited Hal Roberts, my collaborator on Media Cloud (and on a wide range of other research), Erhardt Graeff from the Web Ecology project, and Gilad Lotan, VP of R&D for internet analytics firm BetaWorks. On Wednesday night, I asked them to share some of the recent work they’ve been doing, understanding the structure of the US and Russian blogosphere, analyzing the influence networks in Twitter during the early Arab Spring events and understanding the social and political dynamics of hashtags. They didn’t disappoint, and I suspect our video of the session (which we’ll post soon) will be one of the more popular pieces of media we put together this fall. In the meantime, here are my notes, constrained by the fact that I was moderating the panel and so couldn’t lean back and enjoy the presentations the way I otherwise might have.

Hal Roberts is a fellow at the Berkman Center for Internet and Society, where he’s produced great swaths of research on internet filtering, surveillance, threats to freedom of speech, and the basic architecture of the internet. (That he’s written some of these papers with me reflects more on his generosity than on my wisdom.) He’s the lead architect of Media Cloud, the system we’re building at the Berkman Center and at Center for Civic Media to “ask and answer quantitative questions about the mediasphere in more systematic ways.” As Hal explains, media researchers “have been writing one-off scripts and systems to mine data in haphazard ways.” Media Cloud is an attempt to streamline that process, creating a collection of 30,000 blogs and mainstream media sources in English and Russian. “Our goal is to get as much media as possible, so we can ask our own questions and also let others ask questions of our duct tape and bubblegum system.”


Hal’s map of clusters in popular US blogs. An interactive version of this map is available here.

Much of Hal’s work has focused on using the content of media – rather than the structure of its hyperlinks – to map and cluster the mediasphere. He shows us a map of US blogs that cluster into three main areas – news and political blogs, technology blogs and what he calls “the love cluster”. This last cluster is so named because it’s filled with people talking about what they love. Subclusters include knitters, quilters, fans of recipes and photography. The technology cluster breaks down into a Google camp, an iPhone camp and a camp discussing Android Apps. Hal’s visualization shows the words most used in the sources within a cluster, which helps us understand what these clusters are talking about. The Google cluster features words like “SEO, webmaster, facebook, chrome” and others, suggesting the cluster is substantively about Google and its technology projects.

While we might expect the politics and news cluster to divide evenly into left and rightwing camps, it doesn’t. Study the link structure of the left and the right, as Glance and Adamic and later Eszter Hargittai have, and it’s clear that like links to like. But Hal’s research shows that the left and right use very similar language and talk about many of the same topics. This is a novel finding: It’s not that the left and right are talking about entirely different topics – instead they’re arguing over a common agenda, an agenda that’s well represented in mainstream media as well, which suggests the existence of subjects neither the right nor the left is talking about online.

Building on this finding, Hal and colleagues at Berkman looked at the Russian media sphere, to see if there was a similar overlap in coverage focus between mainstream media and blogs. “Newspapers and the television are subject to strong state control in Russia – we wanted to see if our analysis confirmed that, and whether the blogosphere was providing an alternative public sphere.”

The technique he and Bruce Etling used is “the polar map” – put the source you believe is most important at the center, and other sources are mapped at a distance from that source where the distance reflects degree of similarity. The central dot is a summary of verbiage from Russian government ministry websites. Right next to it is the official government newspaper. TV stations cluster close to the center, while blogs cover a wide array of the space, including the edges of the map.
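The notes don’t say which similarity measure the Media Cloud team used, so the following is only a rough sketch of the polar map idea under an assumed one: cosine similarity between word-count vectors, with distance from the center taken as one minus that similarity. The function names and the sample texts are hypothetical.

```python
import math
from collections import Counter


def word_vector(text):
    """Count lowercase word occurrences in a text."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def polar_distances(center_text, sources):
    """Map each source name to its distance from the center text.

    A distance of 0 means identical word usage; 1 means no shared words.
    """
    center = word_vector(center_text)
    return {
        name: 1.0 - cosine_similarity(center, word_vector(text))
        for name, text in sources.items()
    }


# Hypothetical stand-ins for a government site and two other sources.
center_text = "ministry announces budget reform plan"
sources = {
    "official paper": "ministry announces budget reform",
    "knitting blog": "wool yarn pattern",
}
distances = polar_distances(center_text, sources)
```

In this toy example the official paper lands close to the center and the knitting blog at the outer edge, which is the ambiguity the next step of the analysis has to resolve.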

It’s possible that blogs are showing dissimilarities to the Kremlin agenda because they’re talking about knitting, not about politics. So a further analysis (the one mapped above) explicitly identified democratic opposition and ethno-nationalist blogs and looked at their placement on the map. There’s strong evidence of political conversations far from the government talking points in both the democratic opposition and in the far right nationalist blogosphere.

What’s particularly interesting about this finding is that we don’t see the same pattern in the US blogosphere. Make a polar map with the White House, or a similar proxy for a US government news agenda, at the center, and you’ll see a very different pattern. Some right wing American blogs flock quite closely to the White House talking points – mostly to critique them – while the left blogs and mainstream media generally don’t. However, when Hal and crew did an analysis of stories about Egypt, they saw a very different pattern than in looking at all stories published in these sources. They saw a tight cluster of US mainstream media and blogs – left and right – around the White House. The government, the media and bloggers left and right talked about Egypt using very similar language. In the Russian mediasphere, the pattern was utterly different – the democratic opposition was far from the Kremlin agenda, using the Egyptian protests to talk about potential revolution in Russia.

The ultimate goal of Media Cloud, Hal explains, is to both produce analysis like this, and to make it possible for other researchers to conduct this sort of analysis, without a first step of collecting months or years of data.

Erhardt Graeff is a good example of the sort of researcher Media Cloud would like to serve. He’s cofounder of the Web Ecology Project, which he describes as “a ragtag group of casual researchers that has now turned in a peer-reviewed publication”. That publication is the result of mapping part of the Twitter ecosystem during the Tunisian and Egyptian revolutions, and attempting to tackle some of the hard problems of mapping media ecosystems in the process.

The Web Ecology Project began life researching the Iranian elections and resulting protests, focusing on the #iranelection hashtag. With a simple manifesto around “reimagining internet studies”, the project tries to understand the “nature and behavior of actors” in media systems. That means considering not just the top users, or even just the registered users of a system like Twitter, but the audience for the media they create. “Each individual user on Twitter has their personal media ecosystem” of people they follow, influence, are followed by and influenced by.

This sort of research rapidly bumps into three hard problems, Erhardt explains:

- Did someone read a piece of information that was published? Or as he puts it, “Did the State Department actually read our report about #IranElection?” It’s very hard to tell. “We end up using proxies – you followed a link, but that doesn’t mean you read it.”

- Which piece of media influenced someone to access other media? “Which tweet convinced me to follow the new Maru video, Erhardt’s or MC Hammer’s?”

- How does the media ecosystem change day to day? Or, referencing a Web Ecology paper, “How many genitalia were on ChatRoulette today?” The answer can vary sharply day to day, raising tough problems around generating a usable sample.

The paper Erhardt published with Gilad and other Web Ecology Project members looks at the Twitter ecosystem around the protest movements in Tunisia and Egypt. By quantitatively searching for information flows, and qualitatively classifying different types of actors in that ecosystem, the research tries to untangle the puzzle of how (some) individuals used (one type of) social media in the context of a major protest.

To study the space, the team downloaded hundreds of thousands of tweets, representing roughly 40,000 users talking about Tunisia and 62,000 talking about Egypt. They used a “shingling” method of comparison to determine who was retweeting whom and sought out the longest retweet chains. They looked at the top 10% of these chains in terms of length to find the “really massive, complex flows” and grabbed a random 1/6th of that sample. That yielded 774 users talking about Tunisia, 888 talking about Egypt… and only 963 unique users, suggesting a large overlap between those two sets.
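For readers unfamiliar with shingling, here is a minimal sketch of the general technique, not the team’s actual code: break each tweet into overlapping character sequences (“shingles”) and treat two tweets as a probable retweet pair when their shingle sets overlap heavily. The shingle length of 5, the Jaccard measure and the 0.6 threshold are all assumptions chosen for illustration.

```python
def shingles(text, k=5):
    """Return the set of overlapping k-character substrings of a text.

    Whitespace is collapsed and case is ignored before shingling.
    """
    normalized = " ".join(text.lower().split())
    if len(normalized) < k:
        return {normalized}
    return {normalized[i:i + k] for i in range(len(normalized) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_probable_retweet(original, candidate, threshold=0.6):
    """Guess whether candidate repeats original, tolerating small edits."""
    return jaccard(shingles(original), shingles(candidate)) >= threshold


# Hypothetical tweets for illustration.
original = "Protesters are gathering in Tahrir Square tonight"
retweet = "RT @user: Protesters are gathering in Tahrir Square tonight"
unrelated = "The weather in Boston is lovely today"
```

The appeal of this approach is that it catches retweets that were lightly edited or prefixed by hand, which exact string matching would miss.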

Then Erhardt, Gilad and others started manually coding the participants in the chains. Categories included Mainstream Media (@AJEnglish, @nytimes), web news organizations (@HuffingtonPost), non-media organizations (@Wikileaks, @Vodaphone), bloggers, activists, digerati, political actors, celebrities, researchers, bots… and a too-broad unclassified category of “others”. This wasn’t an easy process – Erhardt describes a system in which researchers compared their codings to ensure a level of intercoder reliability, then had broader discussions on harder and harder edge cases. They used a leaderboard to track how many cases they’d each coded, and goaded those slow to participate into action.

The actors they classified are a very influential set of Twitter users. The average organization in their set has 4004 followers, the average individual 2340 (which is WAY more than the average user of the system). To examine influence with more subtlety than simply counting followers, Erhardt and his colleagues use retweets per tweet as an influence metric. What they conclude, in part, is that “mainstream media is a hit machine, as are digerati – what they have to say tends to be highly amplified.”
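The retweets-per-tweet metric is simple enough to sketch in a few lines. The account names and counts below are invented for illustration, and the paper may normalize or filter the data in ways these notes don’t capture.

```python
def retweets_per_tweet(retweet_counts):
    """Average number of retweets earned by each tweet from one account.

    retweet_counts holds one retweet count per original tweet.
    """
    if not retweet_counts:
        return 0.0
    return sum(retweet_counts) / len(retweet_counts)


# Hypothetical accounts: one with many followers but little amplification,
# one with fewer tweets that each travel further.
accounts = {
    "big_but_quiet": [1, 0, 2, 1],
    "hit_machine": [40, 25, 55],
}
ranking = sorted(accounts, key=lambda name: retweets_per_tweet(accounts[name]),
                 reverse=True)
```

Unlike a raw follower count, this measure rewards accounts whose individual messages get amplified, which is what “hit machine” is describing.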

The bulk of the paper traces information flows started by specific people. In the case of Egypt, lots of information flows start from journalists, bloggers and activists, with bots as a lesser, but important, influence. In Tunisia, there were fewer flows started by journalists, more by bots and bloggers, and way fewer from activists. This may reflect the fact that the Tunisian story caught many journalists and activists by surprise – they were late to the story, and less significant as information sources than the bloggers who cover that space over time. By the time Egypt becomes a story, journalists realized the significance and were on the ground, providing original content on Twitter, as well as to their papers.

One of the most interesting aspects of the paper is an analysis of who retweets whom. It’s not surprising to hear that like retweets like – journalists retweet journalists, while bloggers retweet bloggers. Bloggers were much more likely to retweet journalists on the topic of Egypt than on Tunisia, possibly because MSM coverage of Egypt was so much more thorough than the superficial coverage of Tunisia.

While Gilad Lotan worked with Erhardt on the Tunisia and Egypt paper, his comments at Civic Media focused on the larger space of data analysis. “I work primarily on data – heaps and mounds of data,” he explains, for two different masters. Roughly half his work is for clients, media outlets who want to understand how to interact and engage with their audiences. The other half focuses on developing the math and algorithms to understand the social media space.

This work is increasingly important because “attention is the bottleneck in a world where threshold to publishing is near zero.” If you want to be a successful brand or a viable social movement, understanding how people manage their attention is key: “It’s impossible to simply demand attention – you have to understand the dynamics of attention in the face of this bottleneck.”

Gilad references Alex Dragulescu’s work on digital portraits, pictures of people composed of the words they most tweet or share on social media. He’s interested not just in the individuals, but in the networks of people, showing us a visualization of tweets around Occupy Wall Street. Different networks take form in the space of minutes or hours as new news breaks – the network around a threatened shutdown of Zuccotti Park for a cleanup is utterly different than the network in July, when Adbusters was the leading actor in the space.


Lotan’s visualizations of Twitter conversations about Occupy in July and October 2011

Images like this, Lotan suggests, “are like images of earth from the moon. We knew what earth looked like, but we never saw it. We knew we lived in networks, but this is the first time we can envision it and see how it plays out.”

When we analyze huge data sets, we can start approaching answers to very difficult questions, like:
- What’s the audience of the New York Times versus Fox News?
- What type of content gains wider audiences through social media?
- What topics do certain outlets cover? What are their strengths, weaknesses and biases?
- How do audiences differ between different publications? How are they similar?
- How fast does news spread, and how does it break?

Much of media and communications research addresses these questions, though rarely directly – as Erhardt noted, we generally address these questions via proxies. But, Lotan tells us, we can now ask and answer questions like, “How many Twitter users follow Justin Bieber and The Economist?” The answer, to a high degree of precision, is 46,000. It’s just shy of the number who follow The Economist and the New York Times, 54,000.
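With complete follower lists in hand, a question like this reduces to a set intersection. The sketch below uses tiny made-up follower sets; it says nothing about how Lotan’s lab actually stores or queries the data.

```python
def shared_followers(followers_a, followers_b):
    """Return the accounts that follow both A and B."""
    return set(followers_a) & set(followers_b)


# Hypothetical follower lists; real ones would hold millions of account IDs.
bieber = {"ana", "ben", "chloe", "dev"}
economist = {"chloe", "dev", "erin"}
both = shared_followers(bieber, economist)
```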

Lotan is able to research answers like this because his lab has access to the Twitter “firehose” (the stream of all public data posted to Twitter, moment to moment) and to the bit.ly firehose. This second information source allows Lotan to study what people are clicking on, not just what media they’re exposed to. He offers a LOLcat, where the feline in question is dressed in a chicken costume. “We can see the kitty in you, and the chicken you’re hiding behind.” What people share and what they click is very different, and Lotan is able to analyze both.

This data allowed Lotan to compare what audiences for four major news outlets were interested in, by measuring their clickstreams. Al Jazeera and The Economist, he tells us, are pretty much what you’d think. But Fox News watchers are fascinated by crime, murders, kidnappings and other dark news. This sort of insight may help networks understand and optimise for their audiences. Al Jazeera’s audience, he tells us, is very engaged, tweeting and sharing stories, while Fox’s audience reads a lot and shares very little.

Some of Lotan’s recent research is about algorithmic curation, specifically Twitter’s trending topics. Many observers of the Occupy movement have posited that Twitter is censoring tweets featuring the #occupywallstreet hashtag. Lotan acknowledges that the tag has been active, but suggests reasons why it’s never trended globally. Interest in the tag has grown steadily, and has a regular heartbeat, connected to who’s active on the east coast of the US. The tag has spiked at times, but remains invisible in part due to bad timing – a spike on October 1st was tiny in comparison to “#WhatYouShouldKnowAboutMe”, trending at the same time.

At this point, Lotan believes he’s partially reverse engineered the Trending Topics algorithm. The algorithm is very sensitive to the new, not to the slowly building. This raises the question: what does it mean to “get the math right”? Lotan observes, “Twitter doesn’t want to be a media outlet, but they made an algorithmic choice that makes them an editor.” He’s quick to point out that algorithmic curation is often very helpful – the Twitter algorithm is quite good at preventing spam attacks, which have a different signature than organic trends. So we see organic, fast-moving trends, even when they’re quite offensive. He points to #blamethemuslims, which started when a Muslim woman in the UK snarkily observed that Muslims would be blamed for the Norway terror attacks. That tweet died out quickly, but was revived by Americans who used the tag unironically, suggesting that we blame Muslims for lots of different things – that small bump, then massive spike is a fairly common organic pattern… and very different from the spam patterns he’s seen on Twitter.
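To make “sensitive to the new, not to the slowly building” concrete, here is a toy scoring function. It is emphatically not Twitter’s algorithm, which isn’t public, nor Lotan’s reconstruction of it; it’s an assumed ratio of the latest hour’s volume to the recent average, which is one simple way a steady, high-volume tag can lose out to a sudden newcomer.

```python
def novelty_score(hourly_counts, window=24):
    """Ratio of the latest hour's volume to the mean of the preceding hours.

    A tag with steady volume scores near 1 however large it is;
    a tag that appears from nowhere scores very high.
    """
    if len(hourly_counts) < 2:
        return 0.0
    *history, current = hourly_counts
    baseline = history[-window:]
    mean = sum(baseline) / len(baseline)
    # Adding 1 avoids division by zero for brand-new tags.
    return current / (mean + 1.0)


# Hypothetical hourly tweet counts for two tags.
steady_tag = [1000] * 24 + [1100]   # large, slowly growing conversation
sudden_tag = [0] * 24 + [500]       # smaller, but brand new
```

Under this scoring the sudden tag wins easily despite having less than half the volume, which mirrors the pattern Lotan describes.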

When we analyze networks, Lotan suggests, we encounter a paradox that James Gleick addresses in his recent book on information: just because I’m one hop away from you in a social network doesn’t mean I can send you information and expect you to pay attention. In the real world, people who can bridge between conversations are rare, important and powerful. He closes his talk with the map of a Twitter conversation about an event in Israel where settlers were killed. There’s a large conversation in the Israeli twittersphere, a small conversation in the Palestinian community, and two or three bridge figures attempting to connect the conversations. (One is my wife, @velveteenrabbi.) Studying events like this one may help us, ultimately, determine who’s able to build bridges between these conversations.


I can’t wait for the video for this event to be put online – we’ll get it up as soon as possible and I’ll link to it once we do.

10/18/2011 (2:05 pm)

Beth Coleman on “Tweeting the Revolution”

Filed under: Berkman ::

Beth Coleman presents some of her recent research on the protests in Tahrir square, and a broader theory of how social networks and activism in the physical world work together today at the Berkman Center. With her is Mike Ananny, her coauthor and researcher in danah boyd’s lab at Microsoft Research. The presentation, “Tweeting the Revolution”, tries to understand how we read large data sets to understand located action. This is a timely topic because we’re seeing a rise in protest activity that’s been missing from the public sphere for a few decades. Coleman wants to know what we can understand about social media and people’s willingness to take an activist stance. One of the foci of her work is the idea of mediated copresence, which she sees as a major way of understanding the relationship between technology and public action.

Tahrir Square offers an opportunity to think through the relationship between three types of speech:
- Public speech, the broadcast of information to a broad audience
- Civic speech, speech within the networks of your located environment
- Poetic speech, speech about expressing needs and interests

What’s the effect of Twitter, SMS and other technologies in a space like Tahrir? They may be critical in understanding the sustainability of commitments to a movement beyond the initial phase of protest.

In his critiques of online activism in understanding the Arab Spring, Malcolm Gladwell has suggested that activism needs to include bodily presence, risk of harm or arrest, and developed organizational infrastructures. It’s worth asking those questions – does online participation matter? Do we need bodily presence for activism? Coleman and Ananny use the possibility of bodily risk – in this case, the physical presence in Egypt – as a precursor for inclusion for their interview group. She cites Elaine Scarry’s work on body and pain, suggesting that when a body is in pain, there’s a loss of self, a loss of agency, and a loss of language. Pain cannot be articulated, and there’s the failure of “subject as a system”. So physical location in Egypt opens risk of incarceration and torture, and creates a category of potentially affected actor.

There’s lots of analysis of network collective action from at least two points of view: considering social media as an augmentation to traditional organizing tools, and considering network media as a form of command and control. There’s an open space for analysis around strategic and tactical engagement around located network media. We might think of social media as a way of facilitating co-presence, the way of being part of a phenomenon either in physical space or in a complementary virtual space. If we’re continually surrounded by Twitter, Facebook and SMS, which remind us of people’s presence even if we’re not interacting with them, how does this help us understand a move from onlooker to participant in collective action?

To understand copresence, we need to understand quotidian media engagement. 17% of Egyptians were online before the revolution and 72% on mobile phones. Coleman notes that Kate Crawford, studying non-literate women in India, sees SMS use from people you wouldn’t expect to be able to use SMS. It’s worth being open to the notion that SMS could be a powerful tool for sending the sense of presence for a very large swath of an Egyptian audience. Coleman suggests that we need to engage in careful consideration of the oral and the local to understand the cascade of strong and weak ties and their relationship to collective action.

She and Ananny propose a way of thinking through Egyptian positions towards the Tahrir protests. There were people who were present in Tahrir and those who weren’t. There were people engaged with the protests online and those who weren’t. We can create four categories of engagement by considering those categories in terms of binaries. This separates some figures from the discussion – individuals like Alaa Abdel Fatteh, who was deeply engaged online, but in South Africa for much of the protest. But it’s a useful structure in part because it forces you to consider the bottom quadrant, those who didn’t engage physically or online, and are therefore the hardest to study. Eszter Hargittai’s contribution to the work, Coleman notes, is to urge her to take that quadrant of nonparticipation seriously.

Interviews with participants quickly complicate and stretch the boundaries of these categories. An interview with a 20-something woman, upper middle class, who’s been using Ushahidi to map sexual harassment, shows Coleman that “on/off the square” may be too binary a distinction. In the wake of the media blackout on the 28th, she tells Coleman, she was motivated to go to the square because she didn’t want to be alone, she wanted to find other people, and she felt like the movement was moving from online to offline. But as she headed to the square, she felt a sense of risk and turned around. Her story calls into question the idea of whether you needed to be in Tahrir physically to be part of the revolution.

Coleman shows us a graph of Dima Khatib’s Twitter network rendered by Gilad Lotan. Based on the frameworks Coleman is suggesting, can we better understand who connects, who retweets and how information cascades? “How might the data trace of media engagement overlap with the human narrative?”

This matters, ultimately, because it influences how we might develop new tools. This past weekend, Coleman led a workshop with Juliana Rotich of Ushahidi, a platform for crisis mapping and management. “After the crisis, what are the tools for sustaining movements?”

09/12/2011 (8:55 pm)

A Vast Wasteland, Five Decades Later

Filed under: Berkman,CFCM,ideas ::

Fifty years ago, Newton Minow, the 35-year-old FCC chairman, gave a speech that’s still studied today. It’s taught in rhetoric courses, tested on the LSAT reading comprehension test and still is invoked in discussions of how communications technology affects entertainment, news, and democracy. The speech challenged broadcasters to actually watch their programming, and urged them to consider whether they were proud of what they’d see. It read, in part:

“When television is good, nothing — not the theater, not the magazines or newspapers — nothing is better.
But when television is bad, nothing is worse. I invite each of you to sit down in front of your own television set when your station goes on the air and stay there, for a day, without a book, without a magazine, without a newspaper, without a profit and loss sheet or a rating book to distract you. Keep your eyes glued to that set until the station signs off. I can assure you that what you will observe is a vast wasteland.”

Today, Minow’s daughter, Martha Minow, dean of the Harvard Law School, welcomed her father to the stage at her institution as part of an event titled, “News and Entertainment in the Digital Age: A Vast Wasteland Revisited”. Minow (I’ll refer to Newton Minow throughout the rest of this post) starts his talk by noting that we’re a day past the ten-year anniversary of 9/11, a time at which there was no YouTube, no Twitter, none of the social media we discuss today to understand the tragic events of the day. If that shift is difficult to comprehend, it’s much harder to understand the landscape of fifty years ago, when phone calls traveled by wire, when there were no computers, one phone company and two and a half television companies. There was no public television or radio. Audiences, Minow reminds us, were passive – they gathered around the single set in the house and watched in silence.

When Minow came to the FCC, it was a group wracked by scandal – previous commissioners had been fired for corruption. Minow’s relationship was a highly personal one with President Kennedy. He recalls a meeting with Kennedy and Commander Alan Shepard, recently returned from the first American voyage into space. Kennedy was enroute to a speech at the National Association of Broadcasters, and asked Minow what he thought Kennedy should say to the broadcasters. He told him, “Mr. President, tell them that this is the difference between a free and a closed society: when the Soviets send people into space, we don’t know whether they succeed or fail. In the US, we let people see and hear what’s going on.”

Kennedy gave a brief speech to the NAB which used Minow’s talking points and got a standing ovation. Minow’s infamous speech didn’t get quite as warm a reception. Minow reminds us that Sherwood Schwartz, producer of the television show Gilligan’s Island, honored him by naming the sinking ship on his show the S.S. Minow.

Why give such an incendiary speech? Television was the dominant medium of the era. The televised Kennedy/Nixon debate had decided the election. But there was little discussion about public interest and public responsibility on the part of broadcasters. Minow’s contribution as an FCC chairman was to try to expand choice – licensing the UHF spectrum, early cable TV systems and satellite television. When Kennedy invited him to visit the space program, Minow observed that satellites were more important than sending a man into space, because they permitted sending ideas into space, and ideas last longer than people. Minow notes that there’s a strong possibility that the recent events of the Arab Spring were a product, in part, of satellite communication.

Both Minow and Kennedy had lived in cities where there was a strong public television station. They both assumed that public television would spread throughout the country, but there was no public TV in New York, LA or Washington DC. When Minow left the FCC, he went on to serve on the board of governors of the Public Broadcasting Service, and on the Carnegie Foundation, one of the major funders of public broadcasting.

As someone who’s been concerned with public broadcasting for his entire career, Minow tells us that he’s deeply disappointed by the relationship between money and politics. “Politicians need massive amounts of money to buy radio and television ads. They raise money from the public to gain access to something the public owns: the airwaves.” This is an absurdity – the US is one of the few countries in the world that doesn’t provide access to the airwaves to candidates. In the UK and Japan, it’s not possible to buy access to the airwaves. Much of the cost of American campaigning comes from the media.

Minow ends his remarks with praise for his host: “I wish the Berkman Center had existed 50 years ago,” because the issue of the responsibilities of broadcasters was neglected 50 years ago, and is still neglected today.

Anne Marie Lipinski, the new curator of the Niemann Center, is one of the three designated “respondents” to Minow’s remarks. She suggests that the most inspiring aspect of Minow’s remarks is the idea that we can do better – as individuals, as broadcasters. One of the challenges in helping us become better is defining the public interest. “I don’t think we have a shared ethos around the public interest in contemporary society.”

Journalist Jonathan Alter reminds us that Minow is also the father of the televised presidential debate. While we still see this important form of civic programming, most of what passes for civic discourse online is extremely poor. “The news business is the only business recognized by the Constitution and it’s largely dysfunctional.” Talk is cheap and reporting expensive, he argues – “the vast wasteland has a Tower of Babel on top of it.” Much of the news we get is “people like me babbling on MSNBC or Fox”, rather than the sort of expensive newsgathering required to report facts on the ground.

Yochai Benkler calls on a section of Minow’s speech where he challenges broadcasters to challenge their sponsors: “Tell your sponsors to be less concerned with cost per thousands and more concerned with understanding per millions.” This section points to the core tension between an American broadcast model that is anchored in markets, and the challenges of public responsibility. Public funding for media and nonprofit models tend to be foreign to American audiences. Yet there’s evidence that networks like the BBC produce some of the highest quality news content available.

Benkler provokes Alter by suggesting that there’s the possibility of producing key and investigative reporting via radically distributed methods. He suggests that the Neda Agha-Soltan video, which Alter alluded to in his remarks, was an example of the power of citizen production. He (generously) references a talk I gave the week before about the complex interaction of Tunisians on the ground, activists in the diaspora and Al Jazeera – a state-funded media network – to amplify voices in Sidi Bouzid leading to the Tunisian revolution. “Because we all now carry sound, video and text generating and disseminating tools – phones – we’ve got an unprecedented opportunity to close the gap between what costs a great deal of money and what we all need as citizens.”

Lipinski asks whether anyone is prepared to pay for this sort of crowd-sourced media, asking if any of us pay people whose blogs and twitter feeds we read. Minow suggests that this may be the wrong place to ask for support. He notes that the Japanese closely studied media models around the world before starting NHK and based their model on the BBC, including charging a license fee for television sets. “Other countries started building public media before they built commercial. We tacked on public broadcasting after the fact, without a way to pay for it.” This leaves us with a difficult choice: “Do you want the market to decide and provide everything? And if the market is not going to provide everything, do you want to build an alternative system?”

Alter suggests we don’t hold our breath waiting for the rise of a new public media system in the US. What’s happening instead is the fragmentation of what media exists. He points to the evening entertainment market, where big shows like Leno’s and Letterman’s are ceding ground to the Colbert Report. “It’s a move towards greater choice.” But the downside of this move is that we may be seeing a divide between elites who have access to a vast selection of media, and masses who get little critical media. “The political conversation involves a maximum of 10 to 15 million people,” he asserts, “but 130 million vote in Presidential elections.”

Ellen Goodman offers a nutritional analogy. “People don’t want to eat their broccoli, but they still might vote.” She’s suspicious of the idea that public media will produce the broccoli and be able to get people to eat it, because “public broadcasting in the US is weak and designed to be weak.” Proposals for the production and marketing of broccoli, unrealistic but still worth making, might not be directed to our existing public media institutions, she argues, because these institutions may not be capable of innovation. “It’s reasonable to ask these actors to solve our problems, but they are not going to solve them.”

Virginia Heffernan, cultural critic for the New York Times, suggests we consider not just news. When we look at television entertainment, especially HBO and Bravo, we’re no longer facing a vast wasteland. Minow invites us to imagine the forces of art, daring and imagination unleashed on the television screen, and the artistic explosion we’ve seen the last few years suggests that “television both as an art form and a public health hazard makes these things possible.”

She offers a caution to Alter’s skepticism about digital media and direct sources – we quickly found dangerous media online, like Loose Change, a video that offered the conspiracy theory that 9/11 was an inside job. But we also were able to find video of Saddam Hussein’s execution, shot and distributed by an American serviceman. “Our million dollar Baghdad bureau didn’t get the execution story right” because they were working from eyewitness testimony from individuals in the room, and that testimony wasn’t correct. The actual account of Hussein’s final words came from the video, not the reporting.

What’s key in this world of internet video, she offers, is contextualization. As the New York Times invests in international reporting, they need to make a major investment in contextualizing these images and videos. Asked by Jonathan Zittrain, our moderator, how we might take on Minow’s challenge to “do better”, Heffernan asks us to “register as a Wikipedia editor today. Twice, if you’re a woman.”

Zittrain observes that the phenomenon of Doris Kearns Goodwin, sitting next to Heffernan, registering on Wikipedia could lead to some interesting edit battles over Lincoln’s biography. Asked whether she will register as a Wikipedian, Goodwin offers, “I didn’t know I could!” (Note to Jimmy Wales – we still have work to do.)

With three former FCC chairs in the room, Susan Crawford – introduced as a “shadow FCC commissioner” in the Obama administration – is offered the first FCC response to Minow’s provocations with a line about “beauty before age”. She responds to Reed Hundt with a quip about pearls before swine(!), notes that Minow’s speech was given in the service of a “handsome young president, with a beautiful family”, and suggests that such a speech would be unthinkable nowadays. For one thing, Minow would have been speaking to the wrong people. Distribution networks are now so much more powerful than content providers, and players like Comcast now control programming and internet access. “There’s only four actors in America who have any power” around these issues of content of the media, “and they really believe that personal preferences equal good programming.”

Kevin Martin, FCC chair under George W. Bush, focused his observations on a topic dear to my heart – the state of international media. He observed that business network Bloomberg now devotes significantly more resources to overseas coverage than the New York Times. (For the record, so does the Wall Street Journal – business papers cover international more thoroughly than “general interest” sources…) Despite those coverage resources, some Bloomberg channels have had difficulty gaining carriage on some cable systems, where they are perceived as specialist content.

Reed Hundt, who chaired the FCC under President Clinton, calls his moves to force broadcasters to show three hours a week of children’s programming his way of honoring Minow’s legacy. “Mandating children’s programming turns out to be a violation of the first amendment, to my amazement.” Like Minow, Hundt was “honored” by broadcasters’ response to his work – the WB network’s show Animaniacs introduced a clown named Reed Blunt… and offered the show as evidence of their compliance with creating children’s programming.

Minow points out that lawyers end up as chairmen of the FCC because “it’s the only government agency that’s regulating a medium of communication.” Lawyers who understand the first amendment understand how treacherous it is and how complicated regulation in the space can be.

Asked to comment on Minow’s legacy, Nicholas Negroponte offers the observation that photography is a medium where artists have been the technical innovators, while broadcasting is a field where the engineers have worked out the tech while the artists were creative. What the Media Lab tries to do, he tells us, is do for computer media what photographers have done – advance the field by advancing both the tools and the creativity.

Zittrain invites Minow to comment on the rise of Twitter: “threat or menace?” Minow demurs, arguing “the more communication the better.” And he thanks us for considering these issues of public interest fifty years after he raised their importance.

Terry Fisher offers a summation that introduces several new, important ideas. New technologies, and some of the practices that surround them (though are not dictated by them) are eroding some existing, long-standing dichotomies: public/private, professional/amateur, speaker/audience, news/entertainment, university/society. There are huge benefits and costs to this corrosion. We see the collapse of oligarchies, the addressing of systematic biases, the democratization of processes. But we also have fragmentation, loss of a coherent single culture, the rise of a tower of pundit babel, and the superficiality of much programming. This move, he argues, is impossible to stop. Instead, we need to think through the new opportunities the shift presents: the ability to change who contributes to this process. And we need to figure out how to ameliorate the costs we suffer. That means creating distributed models for sifting, curating, organizing, like Wikipedia, Slashdot and academic projects like Jeffrey Schnapp’s Digital Humanities project. In this new world, the FCC may not be the prime mover – the real power is located in intermediaries like Google, and if we were to push for the public interest, that’s where we’d apply leverage.

06/10/2011 (4:48 pm)

Martin Nowak and the mathematics of cooperation

Filed under: Berkman,hyperpublic ::

Mathematical biologist Martin Nowak talks to us about the evolution of cooperation. Cooperation is a puzzle for biologists because it doesn’t make obvious evolutionary sense. In cooperation, the donor pays a cost and the recipient gets a benefit, as measured in terms of reproductive success. That reproduction can be either cultural or biological and the challenge to explain remains.

It may be simplest to consider this in mathematical terms. In game theory, the prisoner’s dilemma makes the problem clear to us. Given a set of outcomes where we’re individually better off defecting, it’s incredibly hard to understand how we get to a cooperative state, where we both benefit more. Biologists see the same problem, even removing rationality from the equation. If you let different populations compete, the defectors win out against the cooperators and eventually extinguish them. Again, it’s hard to understand why people cooperate.

There are five major mechanisms that biologists have proposed to explain the evolution of cooperation:
- kin selection
- direct reciprocity
- indirect reciprocity
- spatial selection
- group selection

Nowak works us through the middle three in some detail.

In direct reciprocity, I help you and you help me. This is what we see in the repeated prisoner’s dilemma. It’s no longer best to defect. As originally discovered by Robert Axelrod in a computerized tournament, the three-line program “Tit for Tat” wins:

At first, cooperate.
If you cooperate, continue to cooperate.
If you defect, defect.

While it’s a powerful strategy, it’s very unforgiving. If there’s a mistake, there’s an endless cycle of retaliation. Nowak wondered what would happen if natural selection designed a strategy. He created an environment to allow this, permitting random errors to make the environment harder. If the other party plays randomly, the best strategy is to defect every time. But when tit for tat is introduced, it doesn’t last for long, but it does lead to rapid evolution. You’ll see “generous tit for tat” – if you cooperate, I will. If you defect, I will still cooperate with a certain probability. Nowak suggests that this is a good strategy for remaining married, and a step towards the evolution of forgiveness.
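The three-line rule and its forgiving variant can be sketched in a few lines of Python. This is a minimal illustration, not Axelrod’s tournament code or Nowak’s simulation: the function names, the `error_rounds` mechanism for injecting a mistake, and the `forgiveness` parameter are all assumptions made for the example.

```python
import random

C, D = "C", "D"  # cooperate, defect


def tit_for_tat(opponent_history, rng):
    """Cooperate first, then copy whatever the opponent did last."""
    return C if not opponent_history else opponent_history[-1]


def make_generous_tit_for_tat(forgiveness):
    """Tit for tat that still cooperates after a defection
    with probability `forgiveness` (a hypothetical parameter)."""
    def strategy(opponent_history, rng):
        if not opponent_history or opponent_history[-1] == C:
            return C
        return C if rng.random() < forgiveness else D
    return strategy


def always_defect(opponent_history, rng):
    """Defect no matter what the opponent does."""
    return D


def play(strategy_a, strategy_b, rounds, error_rounds=(), seed=0):
    """Play a repeated game and return both move histories as strings.

    In any round listed in `error_rounds`, player A's intended move
    is flipped, standing in for a random mistake.
    """
    rng = random.Random(seed)
    hist_a, hist_b = [], []
    for r in range(rounds):
        move_a = strategy_a(hist_b, rng)
        move_b = strategy_b(hist_a, rng)
        if r in error_rounds:
            move_a = D if move_a == C else C
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)
```

Calling `play(tit_for_tat, tit_for_tat, 6, error_rounds={2})` shows the problem described above: after one mistaken defection the two players trade retaliation back and forth indefinitely. Swapping in `make_generous_tit_for_tat(...)` with a nonzero forgiveness gives the pair a chance to return to mutual cooperation.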

In a natural selection system, you’ll eventually reach a state where everyone cooperates, always. A biological trait needs to be under competition to remain – we can lose our ability to defect and become extremely susceptible to a situation where an always defect strategy can come into play. Cooperation is never stable, he tells us – it’s about how long you can hold onto it and how quickly you can rebuild it. Mathematically, direct reciprocity can sustain cooperation if the probability of playing another round exceeds the cost-to-benefit ratio of cooperating.
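One way to make the condition for direct reciprocity concrete is the “donation game” from Nowak’s published work on the rules of cooperation, rather than anything shown in the talk itself. A cooperator pays a cost `c` to give the other player a benefit `b`, and after each round another round follows with probability `w`. The sketch below, with hypothetical function names, compares what a tit-for-tat player earns against another tit-for-tat player with what a defector earns against tit for tat.

```python
def expected_rounds(w):
    """Average number of rounds when each round is followed by
    another one with probability w (0 <= w < 1)."""
    return 1.0 / (1.0 - w)


def payoff_tft_vs_tft(b, c, w):
    """Two reciprocators cooperate in every round, each netting b - c."""
    return (b - c) * expected_rounds(w)


def payoff_defector_vs_tft(b, c, w):
    """A defector receives b in the first round only; after that,
    tit for tat stops helping and neither side gains anything."""
    return b


def reciprocity_resists_defection(b, c, w):
    """True when cooperating with a reciprocator pays better than
    exploiting one, which works out to w > c / b."""
    return payoff_tft_vs_tft(b, c, w) > payoff_defector_vs_tft(b, c, w)
```

Rearranging `(b - c) / (1 - w) > b` gives `w > c / b`: the more likely a future encounter, the smaller the benefit needed to make cooperation worthwhile.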

Indirect reciprocity is a bit more complex. The good samaritan wasn’t thinking about direct repayment. Instead, he was thinking “if I help you, someone will help me.” This only happens when we have reputation. If A helps B, the reputation of A increases. The web is very good at reputation systems, but we’ve got simple offline systems as well. We use gossip to develop reputation systems. “For direct reciprocity, you need a face. And for indirect reciprocity, you need a name and the ability to talk about others.” In indirectly reciprocal systems, cooperation is possible if the probability of knowing someone’s reputation exceeds the cost-to-benefit ratio of cooperating. And this only works if the reputation system – the gossip – is conducted honestly.

In spatial selection, cooperation happens among people who are close to one another, whether geographically or in graph theory terms. Graph selection favors cooperation if there are a few close neighbors – it’s much harder to do with lots of loose collaborators. A graph where you’re loosely connected to a lot of people equally doesn’t tend towards cooperation.
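The conditions for indirect reciprocity and spatial selection can each be written as a one-line inequality. The forms below follow Nowak’s published summary of the rules of cooperation, not slides from this talk, and the graph rule in particular holds only under specific modelling assumptions (weak selection, every individual having the same number of neighbors). The function and parameter names are hypothetical.

```python
def indirect_reciprocity_favored(q, b, c):
    """Cooperation can spread when q, the probability of knowing
    someone's reputation, exceeds the cost-to-benefit ratio c / b."""
    return q > c / b


def spatial_selection_favored(b, c, k):
    """Cooperation can spread on a graph when the benefit-to-cost
    ratio b / c exceeds k, the number of neighbors per individual."""
    return b / c > k
```

With a benefit five times the cost, `spatial_selection_favored(5, 1, 3)` holds for three close neighbors but `spatial_selection_favored(5, 1, 10)` does not, which matches the point that many loose connections make cooperation harder to sustain.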

06/10/2011 (4:28 pm)

Charlie Nesson and a new vision of the public domain

Filed under: Berkman,hyperpublic ::

Charlie Nesson, one of the founders of the Berkman Center, asks us to consider who we are, and what is our public space. The query that informed the early life of the Berkman Center was whether we, on the internet, were capable of governing ourselves. To address this question, we need to ask what our domain as a people is. He offers, “We are the people of the net, and our domain is the public domain.”

If you want an orderly world of real property, you should build a registry. It’s the same in the world of bits. Charlie is now working on a directory of public domain, starting with the Petrucci collection and the IMSLP – the international music score library project. Charlie doesn’t mean public domain in the strict legalistic sense. Instead, he asks us to think of the public domain as the bits you can reach through the net. We can then separate the space into the free and the not-free, as constrained by copyright and by market.

To ensure we can be the people of the public domain, we need to build our domain on a foundation that is solid in law. We’re going to build based on collections organized by registrars. The problem with that strategy is that registries can be the focus of litigation risk. So the goal is to work with a reputable law firm to protect the registrar, the registry and users of the registry. That helps us positively define the public domain and defend it.

How does this relate to privacy? It’s worth thinking about the key actors involved. Which actors appreciate individual privacy? Governments are interested in surveillance. Corporations are interested in data acquisition. Look at the librarians and we’ll find allies. They are connected to powerful institutions that share the values of privacy.

Judith asks Charlie to strengthen the connection to privacy. He responds, “I don’t like privacy. It tends to be too closely associated with fear, and it always seems like a rear-guard action against technology.” Instead, we should work on the architecture of the public space and ensuring we architect for private space.

06/10/2011 (3:47 pm)

Herbert Burkert – moving beyond the metaphor

Filed under: Berkman,hyperpublic ::

Herbert Burkert of the University of St. Gallen in Switzerland teaches internet law and heads a center at St. Gallen that parallels the work we do at Berkman. He suggests we consider the space between beauty and coercion. There are only a few occasions where an audience takes pity on a lawyer, and one is when a lawyer ventures into the sphere of aesthetics. There’s such a thing as legal creativity, but it usually leaves you facing the ethics board and quickly turns from pity to self-pity. So he wants to move from a presentation on “criteria” to one about “comments”.

His comments are structured around two names. One is Johann Peter Willebrand, a German writer about public security who encouraged registration of foreigners in towns. But he also encouraged the pledge to treat citizens and foreigners politely, which you can read as you wait for hours to pass through immigration in Boston. He’s become something of a hero to Burkert, as someone who’s tried to change the relationship between beauty and coercion, coercing people into beauty.

Burkert’s point – design and architecture talk is dangerous talk. Le Corbusier wanted to design not just buildings, but how people live. Totalitarian designers used certain architectures to control people. And today’s contemporary suggestions on public safety, walkability, and security need to be considered in this light. When you consider criteria of design, ask whether you’re designing for people, and whose interest you’re designing for. How much space for opportunities to live are you prepared to leave for others?

This leads us to Lina Bo Bardi, an Italian architect who worked in Brazil. She was asked to turn a factory in São Paulo into a recreation area. The city is a remarkable and challenging place: so crowded that it’s got the highest number of helicopters per capita, because it’s the only way to beat traffic, and has a serious problem with crime. She built a tower and bridges that connected to the factory, suggesting a dialog between work and play. It’s a very striking building – the windows look more like the holes that might be made by grenades than designed openings.

How is this relevant? Bo Bardi was designing to create opportunities for social gatherings, and for cross-generational communication. Burkert suggests that cross-generational communication is quite rare in social media. So is cross-cultural communication. And these spaces encourage opportunity for variety, and opportunity for protected openness.

Perhaps the low walls that appear in her design are metaphors for scaled privacy. Or maybe we need to stop using these kinds of physical metaphors, at least from architecture, in these virtual spaces?

06/10/2011 (2:59 pm)

Data, the city and the public object

Filed under: Berkman,hyperpublic ::

Adam Greenfield is the principal designer of Urbanscale, a design firm that focuses on design for networked cities and citizens. He’s interested in the challenge of building spaces that support civic life, public debates, and the use of public space.

The networked city isn’t a proximate future, it’s now. We’ve got a pervasively, comprehensively instrumented population through mobile phones. We have widespread adoption of locative and declarative media through tools like Foursquare and systems of sentiment analysis. And we’re starting to see “declarative objects”, items in public spaces like the London Bridge, which now tweets in its own voice using data scraped from a website. Objects start having informational shadows, like a building in Tokyo, literally clad in a QR code – you can “click” on the building and read more about it.

We’re starting to see cities that have objects, buildings and spaces that are gathering, processing, displaying, transmitting, and taking action on information. We’re subject to new modes of surveillance which aren’t always visual. Tens of millions of people are already exposed to this, which suggests we may need a new theory and jurisprudence around public objects.

Offering a taxonomy of public objects, Adam starts with the example of the Välkky traffic sensor. This detects the movement of people and bikes in a crosswalk and triggers a bright LED light to warn motorists. This is very important in Finland, which is very dark 20 hours a day, 10 months of the year. He describes this as “prima facie unobjectionable”, because the data is not uploaded, not archived, and because there’s a clear public good.

Another example is an ad in the subway system in Seoul. There’s a red carpet in front of a billboard. Walk on it, and the paparazzi in an animated billboard will swivel and photograph you. It’s mildly disruptive and disrespectful, and there’s no consensus public good. On the plus side, it’s purely commercial – there’s no red herring of benefit. And it probably doesn’t rise to the threshold of harm.

And then there’s the soda machine. Adam shows us the Accure touch screen beverage machine in Tokyo, which uses a high resolution display to show you what beverages are available. Each customer is offered different consumables – an embedded camera guesses at age and gender and delivers beverage options to you based on that model. It’s prescriptive and insidiously normative. And it compares information with other vending machines. If you’re a bit abnormal – a man who likes beverages common in the female model, for instance – these systems leave you out of luck. And while they’re commercially viable, there’s no public good associated with this information gathering. We might put this into the same category as interactive billboards with analytics packages, like the Quividi VidiReports, which detects age, gender, and even gaze. There is no opt out – you’re a data point even if you turn away from the ad.

How do we think about these systems when power resides in a network? Adam gives the example of an access control bollard in Barcelona, a metal pile that rises out of the ground to block access to a street unless you present an RFID that gives you permission to pass. This system relies on an embedded sensor grid, RFID system, signage, and traffic law all interacting together. It’s a complex, networked system that we largely interact with through that bollard. It’s even easier to understand these systems when they exist solely through code.

There’s a class of public objects that we need to define and have a conversation about. Adam proposes that they include any discrete object in the common spatial domain that is intended for general use, located on a public right of way, or de facto accessible to the public. When we build these systems, Adam says, we should design in ways that the data is open and available. That means offering an API, and making data accessible in a way that’s nonrivalrous and nonexcludable.

An open city necessarily has a more open attack surface. It’s more open to griefing and hacking. We need a great deal of affirmative value to run this risk. And we need to develop protocols and procedures to establish precedence and deconfliction around these objects. We’re roughly a century into the motor car in cities and we still don’t handle cars well, never mind these public objects.

Adam advocates a move against the capture of public space by private interest and towards a fabric of freely discoverable, addressable, queryable and scriptable resources. We need to head towards a place where “right to the city” is underwritten by the technology of the space.


Jeffrey Huang of the Berkman Center and EPFL Media x Design Laboratory has been involved with the design of a “hyperpublic” campus in the deserts of Ras Al Khaimah, one of the seven Emirates of the UAE. The Sheik of the state has agreed to fund the joint development of the campus with Huang’s institution in Switzerland, and his design students have been focused on building a university campus that’s deeply public, both in terms of physical and architectural space.

One of the major constraints for the design is lowering water and energy usage. The goal is to make buildings make environmental sense using data. They’ve mapped the building site and located natural low points where water accumulates. The design makes use of these points as “micro-oasises”. The design for the building is large, open spaces around these points, an echo of the EPFL learning center in Lausanne, Switzerland.

Within the building, a network of sensors can greet people by name and offer personal services to them. You can interact with people through data shadows, which physically track people through the building, a shadow cast on the wall that shows someone’s name, identity and interests.

He acknowledges the dangers of this system, making reference to Mark Shepard’s Sentient City Survival Kit and an umbrella whose visual pattern scrubs your data from surveillance. But he notes that there’s less need to design the private if hyperpublicness is adequately designed. We should build systems where everyone and no one owns the data, which are fully transparent.


Betsy Masiello from Google works on public policy issues and offers us a practitioner perspective on the topic of the hyperpublic. She tells us she originally misread the title of our session – “The risks and beauty of the Hyper-public life” – and skipped over the risk part. She worried we might be celebrating a “Paris Hilton-like existence of life streaming,” making your identifiable behavior available to anyone who chooses to watch.

There’s a better way of thinking about data-driven lives and existences. Systems like Google Flu Trends use lots of discrete points of information to make predictions about health issues – this gets quite important when it helps us target outbreaks of diseases like dengue fever. Unlike the pure performance of a public life, we get public good that comes from big data analysis.

She offers a frame for analysis: predictive analytics based on your behavior, which use your data and make it clear how it’s used, versus systems that are predictive based on other people’s behaviors, like Google’s search, Flu Trends, and perhaps the soda machine Adam talks about. Both systems can be very valuable. But the risk is the collapse of contexts that happens in a hyperpublic life – the idea that data can be reidentified and attached to your identity.

She recalls Jonathan Franzen’s essay, “The Imperial Bedroom”, from 1998 about the Monica Lewinsky scandal. Franzen suggests that without shame, there’s no distinction between public and private. The more identifiable you are, the more likely you are to feel that shame.

The current challenge we face is constructing and managing multiple identities. Ideally, we’d have ways to manage an identity that includes a form of anonymity. It’s becoming trivial to reidentify people within sets of data. We may need to have policy interventions that put requirements on the data holders, punishing people who release information that allows people to be reidentified.


There’s an interesting argument that arises around privacy and transparency. Adam offers his frustration that Amazon continues recommending Harry Potter to him despite having 15 years of purchasing behavior data, none of which should indicate his desire to read fantasy. Jef sees this as a problem of too little data, not too much. Jeff Jarvis, moderating, criticizes Adam for asking for too much privacy and tells us he doesn’t want a world in which we can’t customize, and where we’re forced away from targeted data when it’s useful.
