My Heart's in Accra

Ethan Zuckerman's musings on Africa, international development
and hacking the media.

05/06/2011 (1:56 pm)

Shirley Hung on control of the Chinese Internet

Filed under: Human Rights,Media ::

I spent part of Wednesday at MIT’s Communications Futures Program, where the topic at hand was internet openness. Political scientist Shirley Hung opened the afternoon session with a discussion of “a different vision of openness”, looking at the Chinese internet.

Hung suggests that the authorities who control the Chinese Internet believe their internet is open, just not the way we think of it in the west. To explain how this “walled garden” works, she asks us to think through questions of freedom of speech. We think of freedom of speech as a universal right… and it is, as enshrined in the Universal Declaration of Human Rights. And freedom of speech is explicitly addressed in the Chinese constitution, in article 35. It’s in the body of the document, not in an amendment, as in the US constitution… and she suggests that it’s worth remembering that the Chinese constitution is based on the US constitution.

The apparent disparity comes from a disagreement over what freedom of speech means. A 2010 Chinese government white paper declared, “Chinese citizens fully enjoy freedom of speech on the Internet”. A few months later, a party official in charge of China’s internet censorship apparatus, Wang Chen, celebrated the fact that 350 million pieces of “harmful information”, including text, pictures and video, had been removed from the Chinese internet in the previous year. That’s roughly 1 million pieces of content a day. And this doesn’t count content like Facebook or Twitter that’s simply blocked in China – the figure covers only content manually removed from websites.

Why do Chinese authorities believe they have an open internet, despite this heavy, pervasive control of content? When the West – and America, in particular – talks about internet freedom, we talk about a single, open internet that works the same way everywhere. That’s a value-laden conception of the internet, one associated with political ideas about expression and participation. China values the internet in terms of economic development and advancement. When Chinese officials talk about the internet, they talk about national sovereignty and cultural differences, and express the idea that internets might be different in different countries. From their perspective, the US is trying to export its view of the internet, while China is asking for each country to determine its own priorities and future.

To understand the Chinese perspective, it’s helpful to review China’s history. Hung offers us an extremely abbreviated view:
- The top priority for the current government is stability, which means preservation of Communist party rule
- The party’s legitimacy is based on two pillars – economic development and territorial integrity
- China has a history of social unrest leading to revolutions. Almost every shift between dynasties has come through social unrest.

It’s worth remembering that this is the first regime to have solved the warmth and hunger problems – people aren’t generally starving, Hung reminds us, and they have money to buy televisions and mobile phones. This success in meeting people’s basic needs may give the government more latitude to control the internet.

The Chinese approach to control apparently rests on three principles:
- Push responsibility and implementation of control downwards, through the network
- Create multiple levers to enable fine-grained control, so you don’t need to shut off the internet, à la Egypt
- Rely on the panopticon and on deterrence to force users to self-regulate.

She shows us an organizational chart of the various bureaucracies that control the Chinese internet. Her chart – an oversimplification, she suggests – includes more than 20 entities, divided into three general groups. One set is party organs. Another is government agencies. And a third set is “quangos”, quasi-NGOs, which are funded and approved of by the government but are not technically part of the government. This complex system exists at national and local levels – much of the control over the Chinese internet is delegated to the Beijing government because companies like Baidu are based in the capital city.

A first line of regulation is red tape – websites require licenses, registrations and permits. “You need a stamp and a seal to do almost anything.” These regulations look silly and inefficient, but they’ve got a purpose – they allow for multiple, different checks against certain behaviors. If you don’t like what a site is doing, you can deny its owners a stamp or seal and shut it down.

Almost all Chinese sites feature a button that can summon the Cyber110 police. That’s the Beijing police reporting center – click the button and you get a screen showing animated policemen, each of whom is willing to take your report of various online behaviors where a web publisher might be crossing a red line. Those reports can cause publishers to lose points – each publisher has 100 points when they begin, and they can gain more by publishing pro-government stories and lose points for failing to remove content in a timely fashion. Sites that publish content are required to run third-party monitoring systems, separate bureaus that report to the editor in chief and which monitor whether the content the site publishes is appropriate.

Control is delegated down to these quasi-NGOs, and from them to individual citizens. The Beijing Association of Online Media recruits a team of citizens who monitor the internet, each of whom is required to report 50 pieces of “harmful information” each month. The 50 cent party (it’s more like 7 US cents when converted from RMB to US dollars) compensates individuals for posting pro-government information on bulletin board systems and fora. In this sense, the structure of monitoring reflects older structures in Chinese society. Since the emperors, neighbors have been spying on neighbors, and this pattern continues in a digital age.

Hung explains that she’s stronger on policy than on specifics of technology, but offers a brief outline of China’s technological arms race. The “great firewall” – known locally as the “golden shield” – uses IP blocking, port blocking, keyword and URL filtering, packet filtering and other techniques to block content. When users access banned content, they often experience TCP resets, and sometimes longer bans from the internet. Recently, there’s evidence that commercial, subscription VPNs are being blocked. China has also signalled a willingness to filter on the client side – the failed Green Dam project sought to put filtering on individual PCs, a further push towards decentralization. And we’re now seeing a rise of sophisticated attacks, like malware targeted at dissidents and independent media organizations.

She sees this as an indication of a growing vertical integration of control. China has a great deal of influence over equipment providers like Huawei. State ownership of fiber allows another level of control, as does influence over and ownership of telephone companies. By blocking access to non-Chinese Web 2.0 companies, the government has opened a market for domestic companies that compete in the social media space. Some of these systems are pretty amazing, like QQ, which has 500 million accounts, a pretty impressive metric in a country with 400 million internet users. When Chinese companies can’t build their own, they partner – we may see a Facebook/Baidu partnership in the near future. And control extends to devices, through registration of handsets and SIM cards, and the need to use a national ID to log in at a cybercafe.

The control isn’t just technical – it’s often focused on content. Hung tells us that there’s a set of approved media sources one should quote from, suggesting that you can’t write or report your own news. And given the presence of monitors within these platforms, it’s increasingly unlikely that you’d see original news reporting.

Questions for Hung focused on whether this overview of regulations accurately reflects the reality of the Chinese internet. If these regulations are all enforced, why are film and software piracy so rampant within the country? Clearly, there’s some disparity between the legal mechanisms that enable control and the actual practice.


While I thought this was a great overview – one of the very best I’ve heard – of the systems currently deployed to control conversations on the Chinese internet, I worry that looking at these systems may blind us to the richness of the content that gets created in China. My colleague Dong Hao offered a great discussion of some of China’s top sites, and the ways in which they often foster creative, participatory behaviors not seen on the English-language internet. I think many US observers of the Chinese internet hear the complexities of censorship and assume everyone posting on the Chinese internet is reciting party propaganda. It’s a shame we don’t have a site like Yeeyan.org helping translate large swaths of the Chinese internet into English. But even a quick visit to sites like EastSouthWestNorth or chinaSMACK should give a sense of the richness of content that’s available.

This isn’t to say that Chinese censorship isn’t onerous, or doesn’t profoundly shape online conversation. But it would be a mistake to limit an understanding of the Chinese internet to what isn’t permitted at the expense of what is.

05/06/2011 (1:20 pm)

Media Cloud, relaunched

Filed under: Berkman,Media ::

Today, the Berkman Center is relaunching Media Cloud, a platform designed to let scholars, journalists and anyone interested in the world of media ask and answer quantitative questions about media attention. For more than a year, we’ve been collecting roughly 50,000 English-language stories a day from 17,000 media sources, including major mainstream media outlets, left- and right-leaning American political blogs, and 1,000 popular general-interest blogs. (For much more about what Media Cloud does and how it does it, please see this post on the system from our lead architect, Hal Roberts.)

We’ve used what we’ve discovered from this data to analyze the differences in coverage of international crises in professional and citizen media and to study the rapid shifts in media attention that have accompanied the flood of breaking news that’s characterized early 2011. In the next weeks, we’ll be publishing some new research that uses Media Cloud to help us understand the structure of professional and citizen media in Russia and in Egypt.

With our relaunch of the site, many of our most powerful tools are now available for your use. We’re hoping Media Cloud proves useful to anyone interested in asking questions about what bloggers and journalists are paying attention to, ignoring, celebrating or condemning.

We hope the tools we’re providing are a complement to amazing efforts like Project for Excellence in Journalism’s News Coverage and New Media indices – we consider their tools the gold standard for understanding what topics are discussed in American media. PEJ works their magic using talented teams of coders, who sample different corners of the media ecosystem to find out what’s being discussed. We use huge data sets, algorithms and automation to give a different picture, one focused on language instead of topic.

At its most basic, Media Cloud gives a picture of what journalists and bloggers are writing about by counting the words used in recent stories. Above is a cloud of language used in our set of political blogs during the week starting on Monday, May 2nd. We can see language about the US raid on Osama bin Laden’s compound, including obvious words like Abbottabad, Bin Laden and raid, as well as words that suggest particular interests within those stories: helicopter, SEALs, intelligence, interrogation, Pakistan. Even with a major story dominating discussion, we see glimpses of other issues, like the US Congress Caregiver’s Act and speculation that Indiana governor Mitch Daniels will enter the Presidential race. You can click each word in the cloud and see what sentences in different blogs contained the term in question, how often it was used, and how that source compared to others.
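For readers curious what "counting the words used in recent stories" might look like, here is a minimal sketch in Python. It is not Media Cloud's actual code: the function name `word_counts`, the tiny stop-word list and the two sample sentences are all made up for illustration, and a real system would need far more careful tokenizing and filtering.

```python
import re
from collections import Counter

# Hypothetical stop-word list for illustration; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "on", "is", "was", "that", "for"}

def word_counts(stories):
    """Count how often each non-stop word appears across a list of story texts."""
    counts = Counter()
    for story in stories:
        # Lowercase the text and pull out runs of letters and apostrophes.
        words = re.findall(r"[a-z']+", story.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Made-up sample sentences, not real stories from the data set.
stories = [
    "The raid on the compound in Abbottabad used a helicopter.",
    "Intelligence from Pakistan led to the raid.",
]

counts = word_counts(stories)
# The most frequent words would be drawn largest in a word cloud.
print(counts.most_common(5))
```

The counts are the raw material for a cloud: the more often a word shows up, the bigger it is drawn.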

Comparison is where our tool is most powerful. The cloud above shows the differences between words used in left- and right-wing blogs during the same time period. We start to see differences in what aspects of the Bin Laden story bloggers focused on. Bloggers on the left used the words “torture” and “waterboarding” while bloggers on the right used “interrogation” and “terrorist”. Other comparisons are less obvious – we see more discussion of debate about releasing raid photos on the right than on the left, and a discussion about expanding the Hyde Amendment (which affects congressional funding for abortion) on the left.

We’re also able to make general statements about the similarity or difference in word usage in these comparisons. While the left and right may both be focused on the raid in Pakistan, the similarity score (near the bottom of the word cloud, towards the right) suggests a larger disparity in agendas than we saw looking at these two sets of media a year ago, when both sides were talking primarily about Arizona’s tightened immigration laws. I’ve been taking an in-depth look at similarity scores to understand how media attention can shift at moments of international crisis, and how the recent, internationally-focused media cycle may differ from the news we often get in the US.
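To give a rough sense of how a similarity score between two sets of media could be computed, here is a small Python sketch using cosine similarity over word counts. This is an assumption on my part for illustration, not a description of the formula Media Cloud actually uses. The function name `cosine_similarity` and the word counts below are invented, not drawn from our data.

```python
import math
from collections import Counter

def cosine_similarity(counts_a, counts_b):
    """Cosine similarity between two word-count tables: 1.0 means identical
    proportions of words, 0.0 means no words in common."""
    shared = set(counts_a) & set(counts_b)
    dot = sum(counts_a[w] * counts_b[w] for w in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty set of stories is treated as sharing nothing
    return dot / (norm_a * norm_b)

# Invented counts for illustration only.
left = Counter({"raid": 5, "torture": 3, "waterboarding": 2})
right = Counter({"raid": 5, "interrogation": 3, "terrorist": 2})

score = cosine_similarity(left, right)
print(round(score, 3))
```

With a measure like this, two sets of blogs that cover the same story in different language score lower than two sets that use the same vocabulary, which is the kind of gap the score on the word cloud is meant to surface.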

What our tools let you do with Media Cloud is really just the tip of the iceberg. The code behind our system is published under an open source license, so other researchers can build systems to monitor media in other countries and other languages. (We’ve got a system monitoring Russian media and blogs that you’ll hear more about soon.) We are publishing huge sets of data that include information on word frequencies in different stories for researchers who want to analyze American media without collecting their own data. And we’re hoping to collaborate with researchers around the world who’d like to use our tools and data to ask and answer pressing questions about what’s covered and how.

This new release is thanks to the hard work of Hal Roberts, architect of the project; David Larochelle, developer extraordinaire; and Zoe Fraade-Blanar, whose skill at interface design has made our work vastly more usable as well as more attractive. Thanks to them and everyone else involved with the Media Cloud project. Hope you’ll check our work out and let us know what you discover.

05/02/2011 (7:03 pm)

links for 2011-05-02

Filed under: del.icio.us links ::
