My Heart's in Accra

Ethan Zuckerman's musings on Africa, international development
and hacking the media.

07/29/2010 (12:37 pm)

Counting International Connections on Facebook

Filed under: Geekery,xenophilia ::

My friend Onnik Krikorian has become a Facebook evangelist. Onnik, a Brit of Armenian descent, living in Armenia, is the Global Voices editor for the Caucasus, which means he’s responsible for rounding up blogs from Armenia, Georgia and Azerbaijan, as well as parts of Turkey and Russia. This task is seriously complicated by the long-term tensions in the region. Armenia and Azerbaijan are parties to a “frozen” conflict – the Nagorno-Karabakh war, which lasted from 1988 to 1994, and remains largely unresolved.

It’s taken Onnik years to build up relationships with bloggers in Azerbaijan, relationships he needs to accurately cover the region. Azeri bloggers are often suspicious of his motives for connecting and wonder whether he’ll cover their thinking and writing fairly. But Onnik tells me that Facebook has emerged as a key space where Azeris and Armenians can interact. “There are no neutral spaces in the real world where we can get to know each other. Facebook provides that space online, and it’s allowing friendships to form that probably couldn’t happen in the physical world.” (Onnik documents some of the conversations taking place between Azeri and Armenian bloggers in a recent post on Global Voices.)

[Figure: graph from the front page of peace.facebook.com]

Onnik was talking about his love of Facebook at an event hosted by the US Institute of Peace, where I and colleagues at George Washington University and Columbia were presenting research we’d carried out on the use of social media in conflict situations. Onnik’s hopes for Facebook as a platform for peace were echoed by Adam Conner of Facebook, who showed the company’s new site, Peace on Facebook. The site documents friendships formed between people usually separated by geography, religion or politics. Some of the statistics look like clearly good news – 29,651 friendships between Indians and Pakistanis per day. Others are rather dispiriting – 974 Muslim/Jewish connections in the past 24 hours.

I’m a data junkie, and there’s little more frustrating to me than an incomplete data set. Basically, by showing us a very small portion of the nation to nation social graph, Facebook is hinting that the whole graph is available: not just how many friendships Indian Facebook users form with Pakistani users, but how many they form with Americans, Canadians, Chinese, other Indians, etc. Obviously, this is info I’m interested in – I’ve been building a critique that argues that usage of social networking tools to build connections between people in the same country vastly outpaces use of these tools to cross national, cultural and religious borders.

Without the whole data set, it’s hard to know whether these numbers are encouraging or not. Are 29,651 Indian/Pakistani connections a lot? Or very few, in proportion to how many connections Indians and Pakistanis make on Facebook in total? In other words, we’ve got the numerator, but not the denominator – if we had a picture of how many connections Indians and Pakistanis make per day, we might have a better sense for whether this is an encouraging or discouraging number.

I made a first pass at this question this morning, using data I was able to obtain online. Facebook tells us that the average user has 130 friends – a number that might be out of date, as the same statistics page lists “over 400 million users”, not the half billion currently being celebrated in the media. (Ideally, we’d like to know how many new friends are added per day so we can compare apples to apples, but you go to war with the data you have…)

We also need a sense for how many Facebook users there are per country. Here, we turn to Nick Burcher who publishes tables of Facebook users per country on a regular basis. Nick tells readers that the data is from Facebook, and the Guardian appears to trust his accounts enough to feature those stats on their technology blog. They are, alas, incomplete – Burcher published stats for the 30 countries with the largest number of Facebook users, and revealed a few more countries in the comments thread on the post.

Because we don’t have data for Pakistan, we can’t answer the India/Pakistan question. But we can offer some analysis for Israel/Palestine and Greece/Turkey.

Facebook for Peace tells us that there were 15,747 connections between Israelis and Palestinians in the past 24 hours. The term “connection” is not clearly defined on the site – it’s not clear whether a reciprocated friendship is 1 connection or 2. Because I’m going to count the number of Israeli friends and Palestinian friends, it makes sense to count a reciprocal friendship as two connections. (If Facebook is counting differently than I am, my numbers are going to be half what they should be.)

3,006,460 Israelis are Facebook users… a pretty remarkable number, as it represents 39.92% of the total population of the nation and roughly 57% of the country’s 5.3 million internet users. There are very few Palestinian Facebook users – 84,240, or 2.24% of the population… This mostly reflects how few Palestinians are online, as Facebook is used by 21% of Palestine’s 400,000 internet users.

At 3,090,700 Palestinian and Israeli Facebook users, we should see almost 402 million friendships involving an Israeli or a Palestinian. If we extrapolate from 15,747 friendships a day to 5.7 million a year, we’re looking at Israeli/Palestinian friendships representing 1.43% of friendships in the Israeli/Palestinian space… with all sorts of caveats. (The biggest is that the use of a year-long interval to calculate total friendships is totally arbitrary and probably not supportable. If you’ve got better data or a suggestion for a better estimation method, please don’t hesitate to speak up.)
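
The arithmetic behind that estimate can be sketched in a few lines of Python. This is only a back-of-the-envelope check using the figures quoted above; the function and variable names are illustrative assumptions, and the 365-day extrapolation is the same arbitrary interval admitted to in the text.

```python
# Back-of-the-envelope check, using only figures quoted in the post.
AVG_FRIENDS_PER_USER = 130      # Facebook's published average (possibly out of date)
ISRAELI_USERS = 3_006_460
PALESTINIAN_USERS = 84_240
CONNECTIONS_PER_DAY = 15_747    # Israeli/Palestinian connections per day
DAYS = 365                      # arbitrary extrapolation interval (an assumption)


def cross_border_share(users_a, users_b, connections_per_day,
                       avg_friends=AVG_FRIENDS_PER_USER, days=DAYS):
    """Return (friendship pool, yearly cross-border connections, share of pool)."""
    pool = (users_a + users_b) * avg_friends   # every friendship involving either group
    yearly = connections_per_day * days        # daily count stretched over a year
    return pool, yearly, yearly / pool


pool, yearly, share = cross_border_share(
    ISRAELI_USERS, PALESTINIAN_USERS, CONNECTIONS_PER_DAY
)
print(f"{pool:,} friendships in the pool")
print(f"{yearly:,} cross-border connections per year")
print(f"{share:.2%} of the pool")
```

Changing `DAYS` shows how sensitive the percentage is to the choice of interval, which is the biggest caveat in the estimate.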

We get very different results from looking at Greece and Turkey. 2,838,700 Greeks are Facebook members (25.11% of the national population), while 22,552,540 Turks (31.08% of the population) are. That’s roughly 3.3 billion friendships projected, and our year-long approximation finds us just over 4 million Greek/Turkish connections. That suggests that only 0.12% of friendships in the pool are Turkish/Greek friendships.
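
The same rough calculation for Greece and Turkey looks like this. The user counts and the 130-friend average come from the text; the yearly figure is rounded down to exactly 4 million purely for illustration, since the post only says “just over 4 million”.

```python
# Rough check of the Greece/Turkey estimate, using figures quoted in the post.
AVG_FRIENDS_PER_USER = 130
GREEK_USERS = 2_838_700
TURKISH_USERS = 22_552_540
YEARLY_CONNECTIONS = 4_000_000  # assumption: the post says "just over 4 million"

# Every friendship held by a Greek or Turkish user, at the average friend count.
pool = (GREEK_USERS + TURKISH_USERS) * AVG_FRIENDS_PER_USER
share = YEARLY_CONNECTIONS / pool

print(f"{pool:,} friendships in the pool")
print(f"{share:.2%} of the pool are Greek/Turkish")
```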

What explains the disparity between these numbers? While there’s certainly a long history of tension between Greece and Turkey, the last major military confrontation between the nations ended in 1922. Israel and Palestine, on the other hand, are involved in an active conflict, and Israel’s recent incursion into Gaza ended a few months ago. What gives?

It’s possible that the numerous efforts designed to build friendship between Israeli and Palestinian youth are having an impact, much as Onnik’s work in Armenia and Azerbaijan is showing positive results. But there’s another possibility – 20% of the Israeli population are Arab citizens of Israel, and the majority of this set is of Palestinian origin. It’s certainly possible that the high percentage of Israeli/Palestinian friendship includes a large set of friendships between people of Palestinian origin in Israel and Palestinians… indeed, given the difficulty for both populations in meeting in physical space, we’d expect to see increased use of the internet as a meeting space to compensate. This could be a factor in explaining India/Pakistan friendships, as well as Albanian/Serbian friendships, as the emergence of new nations through partition and conflict left groups united by cultures, separated by borders.

My goal in this post isn’t to belittle the power of Facebook for providing a border-transcending space where friendships can be built – Onnik’s story makes it clear that Facebook is a real and powerful tool for good, at least in the Armenian/Azeri space. But I continue to think that we overestimate how many of our online contacts cross borders and underestimate how often these tools are used to reinforce local friendships. I’d invite friends at Facebook to correct my numbers or my math… and mention that we could do a much better job of answering these questions if Facebook would release a data set that shows us all the cross-national connections made on the service.

—-

Ross Perez has created some great interactive maps that visualize the adoption of Facebook around the world, using Burcher’s data – worth your time.

05/22/2010 (2:29 pm)

Democrats, Republicans and Appropriators

Filed under: Geekery ::

I had the good fortune to catch a small part of a conference at Harvard yesterday on text analysis. Good fortune, because I was there long enough to hear Justin Grimmer’s talk on his dissertation, Measuring Reputation Outside Congress. Grimmer is interested in an important – and tough to answer – question: how responsive are the people we elect to their constituents?

We could look for ways to answer this question by studying the voting record of legislators (qualitatively or quantitatively), examining their work in Washington (through Congressional literature) or through examining their communications with constituents at home. This latter set of questions is referred to as the “Home Style” of a politician, following the work of Richard Fenno (1978).

Home style tells us something about politicians that their voting record often doesn’t, Grimmer tells us. He invites us to compare Senators Jeff Sessions (R-Alabama) and Richard Shelby (also R-Alabama). If we consider them simply in terms of their voting behavior, they look nearly identical – they vote together the vast majority of the time and both can be described, in voting terms, as conservative Republicans.

But anyone who knows Alabama politics will tell you that Sessions and Shelby are vastly different guys. Grimmer characterizes Sessions as “an intense policy guy” who will bore you to tears with incredibly long, thorough explanations of issues when all you wanted was a photo with him. Shelby, on the other hand, is all about bringing home the bacon… and there are Shelby Halls at two Alabama universities to prove it.

Evidence suggests that representational style – policy versus pork, heavy versus light communicators – cuts across party lines. And it’s likely that politicians have diverse, stable, nonpartisan home styles. If we can find ways to characterize these differences – Grimmer proposes studying the difference in communications with constituents that claim credit and those that discuss policy – we have the opportunity to compare across senators, and connect these differences to what senators do within the institutions of power.

When Fenno studied the “home style” of politicians in 1978, he engaged in “soaking and poking” – intense participant observation, which involved following 18 members of Congress over 8 years. This method, Grimmer observes, is expensive, underrepresentative (and really hard to replicate as a graduate student). Instead, we might study texts produced by senators. One candidate is newspaper articles… but editorial bias makes it hard to use editorials as representative of senatorial communications. We might use the constituent newsletters produced by Senate offices… but they’re sent using the constitutional Franking privileges and are very hard to get hold of.

Instead, Grimmer has been studying the press releases that senate offices produce – over 64,000 in all. The average senator issues 212 press releases per year, and while the quantity produced has a wide range (some produce only a few dozen, while Hillary Clinton’s senate office produced over a thousand a year), there’s no strong correlation between political party and usage of the tool.

After collecting the releases, Grimmer used machine learning techniques to separate transcripts of floor statements (which are usually released as press releases) from pure press releases, which let him study how a senator chooses to speak to her constituents. Once that sorting has taken place, the task is pretty simple – determine the topic of a press release. This is simplified by the fact that congressional aides try hard to ensure that press releases are on a single topic.

Grimmer’s work clusters senators by the topics discussed in their press releases. His research reveals four basic clusters:

- Senate Statespersons. These folks speak like they’re running for president… and they may well be. Their releases discuss the Iraq war, intelligence issues, international relations and budget issues. John McCain’s office communicates this way.

- Domestic policy. These senators are also policy wonks, but their focus is domestic – the environment, gas prices, DHS, and consumer safety.

- Pork and policy – Communication from these senators includes discussions of water rights grants, but also has serious discussion of health and education policy. Sometimes this is because the office simply issues lots and lots of releases – (former) Senator Clinton’s office fits in this camp.

- Appropriators – These guys communicate about the grants they’ve won – fire grants, airport grants, money for universities, and for police departments.

As well as clustering press releases based on topic, Grimmer’s work considers another metric – how often a press release claims credit for an appropriation. There turns out to be a vast spectrum, ranging from John McCain, who basically only issues statements about policy, to a guy like Mike DeWine, an Ohio Republican, for whom virtually every press release claims credit for an appropriation. There’s a very strong correlation between the topic clusters in releases and the percentage of releases claiming credit. (That’s at least in part because claiming credit is one of the topic clusters – you’re correlating between, in part, the same factor. Interesting nevertheless.)
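
To make the credit-claiming metric concrete, here is a toy sketch in Python. It is not Grimmer’s method (he used machine learning on more than 64,000 real releases): the cue phrases, the placeholder senators and the four sample releases are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical cue phrases; a real study would use a trained classifier instead.
CREDIT_CUES = ("secured", "announced funding", "grant", "awarded")


def claims_credit(text):
    """Crude keyword test: does this release appear to claim credit for money?"""
    lowered = text.lower()
    return any(cue in lowered for cue in CREDIT_CUES)


def credit_share(releases):
    """Map each senator to the fraction of their releases that claim credit.

    `releases` is an iterable of (senator, release_text) pairs.
    """
    totals = defaultdict(int)
    credited = defaultdict(int)
    for senator, text in releases:
        totals[senator] += 1
        if claims_credit(text):
            credited[senator] += 1
    return {senator: credited[senator] / totals[senator] for senator in totals}


# Invented toy data, standing in for a real corpus of press releases.
toy_releases = [
    ("Senator A", "Senator A secured a $2 million fire grant for the county."),
    ("Senator A", "Senator A announced funding for the regional airport."),
    ("Senator B", "Senator B outlines concerns about intelligence oversight."),
    ("Senator B", "Senator B statement on the war in Iraq."),
]

shares = credit_share(toy_releases)
for senator, share in sorted(shares.items()):
    print(f"{senator}: {share:.0%} of releases claim credit")
```

Placing each senator on a 0-to-1 scale like this is what makes it possible to compare the communication style against voting behavior.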

What’s most interesting is that this classification – either by type of politician or by their place on the credit spectrum – is tightly correlated to their voting behavior on a particular issue: votes on appropriations rules, or as Grimmer puts it, “How do legislators self-regulate the porkbarrel”. These votes aren’t partisan – the late Ted Kennedy voted with Richard Shelby on these sorts of votes, which suggests truth to the truism that there are three parties in Congress: Democrats, Republicans and Appropriators. In other words, the way a Senator communicates with constituents is strongly predictive of their legislative behavior, specifically on how they allocate funds.

I thought this was excellent stuff – I hadn’t seen someone take a large database of political communications and subject it to automated analysis, and I thought the demonstration of this “third party” was particularly compelling.

05/03/2010 (4:26 pm)

ROFLCon: From Weird to Wide

An audio version of danah’s and my keynote is now available for download online. I recommend a background of lolcats – preferably multilingual ones – as you listen.


I gave a dozen public talks last month, and it’s possible that ROFLCon was the most intimidating of the bunch. I was asked by Tim Hwang, internet researcher (and Berkman Center affiliate) and co-founder of The Awesome Foundation and of ROFLCon, to kick off the event by co-keynoting with (dear friend) danah boyd. danah actually works in the deep swamps of contemporary internet culture, so ROFLCon – a conference that takes both a loving and scholarly look at the phenomenon of internet memes – is close to home turf for her. I, on the other hand, tend to study things like the impact of cellphones on political organizing in the developing world, and wondered if there was any possible way to connect the sort of issues I work on with a conference that featured Mahir Çağrı (of I Kiss You fame), the owner and videographer of Keyboard Cat and the author of Garfield Minus Garfield.

Turns out I was underestimating ROFLCon. Yes, there were panels where the main question seemed to be, “What’s it like to be a microcelebrity”… which may have included the panel danah and I moderated. And yes, there’s nothing to make you feel old and decrepit like walking into a panel where you don’t know a single one of the internet memes being celebrated. (No, I’d never heard of cornify. No, my world has not been substantially broadened by listening to their founder, wearing a unicorn mask, discuss vampires.) On the other hand, the panel on race – I can haz dream? – was one of the best conference panels I’ve ever attended. (If any network execs are reading this blog, let me just point out that a late night show based around Baratunde Thurston and Christian Lander would kill.) And many of the people at the conference seemed to be deeply engaged in the sorts of issues danah and I were talking about – Who creates internet culture? Whose voices are amplified and whose aren’t? What happens when marginal, weird cultures become mainstream?

Alex Leavitt did an excellent job of liveblogging our talks. I thought I’d post my notes and some of my slides as well – the full slide deck is online, though isn’t real useful without accompanying notes, which follow below.


It’s not easy being an academic at a conference like ROFLCon. The stars are the folks who’ve done something wonderful, weird, unforgettable, or so wonderfully weird it’s unforgettable. Those of us who are trying to make observations about the field feel a little like musicologists studying Bach – we can study his compositions exhaustively, but we’re acutely aware that we’re not going to write a mighty fugue. No matter how much I might study internet memes, I know I’m never going to accomplish something as majestic as keyboard cat… and I have to live with that truth every day of my life.

Unlike danah who can actually tell you something about internet culture, I study information in the developing world. Basically, I’m interested in the question of whether the internet, mobile phones and community radio can make people healthier, wealthier and more free.

If you work in this field for very long, you’ll end up realizing that the basic question behind development economics is “Why are some people rich and other people poor?” There are better and worse answers to this question. Some of the smartest answers focus on which parts of the world had animals and plants that were easily domesticated and which had endemic diseases. Other smart answers look at the ways in which colonialism held back development or look at the problems of bad governance and persistent conflict. Bad answers to the question focus on the idea that some people are inherently, biologically smarter than others. This idea – “scientific racism” – surfaces throughout history, as the basis for eugenics and more recently in pseudo-scientific analyses of IQ scores.

If you’d like to understand just what a stinking heap of bullshit scientific racism theories are, I recommend spending some time in very poor nations. You’ll discover that many of the people you meet display extraordinary creativity as they navigate the challenges of everyday survival. And you’ll start learning about people like William Kamkwamba, whose near death from famine in Malawi didn’t prevent him from building a fiendishly ingenious power-generating windmill from an old bicycle and some recycled PVC pipe.

My time in the developing world suggests to me that intelligence, creativity and humor are evenly distributed throughout the world. People’s ability to express their intelligence, creativity and humor – and our ability to encounter said traits – are heavily geographically constrained, but the basic distribution is near constant.

All of which leads us to the question at hand today: Daddy, where do memes come from? I suspect Drew will be asking me this question any day now, due to Rachel’s and my egregious tendency to misuse Cafe Press and the fact that we gave him the middle name “Wynn” in part so we could title his blog “For the Wynn”. In answering this question, I find that I’m usually referring to Randall Munroe’s brilliant Online Communities map, and to the fertile equatorial regions that extend from the Gulf of YouTube through the Ocean of Subculture. Within this region, there are areas whose soils – turned black with the charring of endless flamewars – are especially fertile for the cultivation of new memes. (sup, /b/?)

I’m interested in mapping memes in a different way. Here’s a quick and dirty map of internet memes extracted from Know Your Meme. Yes, the US and Japan dominate global memetics (or, at least, they do based on the site, which has its own – recognized, now being addressed – cultural biases). But there’s a huge number of memes coming from almost all corners of the globe.

In development economics, we pay special attention to the so-called BRIC countries – Brazil, Russia, India, China – who we expect to become increasingly important over the next few decades due to their large populations, natural resources and rates of economic growth. And so we shouldn’t be surprised to find distinctly regional memes emerging from each of these countries – I offer as a gallery of superheroes Brother Sharp from China, Golimar from India, Glazastik from Russia and the legion that is Tenso from Brazil. You may not know who these viral wonders are, but the people who live in these rapidly developing nations do.

Assume I’m right and that creativity has a near-constant distribution. Assume also that access to the internet continues its explosive spread. The inescapable conclusion is that the next wave of internet memes is going to come from the developing world.

It’s already happening – I just watched the first major Kenyan internet meme come to life. The Nairobi-based band called “Just a Band” released a video for a song called “Ha-He” off their new album. The video’s absurdly good – it’s shot by the guys in the band, and it introduces a new superhero: Makmende.

Actually, “Makmende Amerudi” means “Makmende has returned”… “Makmende” was what you called a kid in the neighborhood in Kenya in the 1990s who wanted to be Bruce Lee. I heard it and assumed that it was a sheng word – “sheng” is the blend of Swahili and English that’s Kenya’s unofficial national language – but it turns out that “Makmende” is what happens when Kenyans say “Go ahead, make my day”.

So Makmende kicks the ass of all comers in this video, gets the girl… who he promptly ignores, and spouts some incomprehensible but pithy aphorisms. This video went crazy in the Kenyan blogosphere – which is an extremely creative space – and we started seeing Makmende magazine covers, a 10,000 shilling note and lots of video remixes.

Above, we see a local television reporter come to a rapid and bad end when he has the misfortune of finding Makmende’s house… in sort of a Nairobi version of the Blair Witch Project. And yes, Hitler’s upset about Makmende as well… But the best stuff actually has pretty low production values – it’s the website aggregating the sort of Makmende one-liners that shot across Twitter for a week or so after the video became popular. Sure, lots of the content here could have appeared on Chuck Norris Facts, but much of what’s there is indigenous to Kenya, and may not make sense if you’re not Kenyan.

Makmende’s so badass that he raises two philosophical questions for me. The first is, “Who gets to decide what’s a meme?”

Brilliant and funny lexicographer Erin McKean tells us that new words enter the language because people love them enough to use them. Lexicographers aren’t the bouncers at the language club; they’re anthropologists, discovering and documenting how language gets used. This is clearly how memes work as well – if people adopt it, love it and transform it, it’s a meme… and what anyone else says doesn’t matter.

But it sure as hell helps if it ends up in Wikipedia. Getting Makmende into Wikipedia was one of the first things Kenyans tried to do… and getting things into Wikipedia is a lot harder than it used to be. The article was deleted a couple of times before the authors realized that they needed to make the case that Makmende was Kenya’s first major internet meme, which made it notable. It hasn’t made it into Know Your Meme yet – it was summarily deadpooled when last submitted.

My hope is that all of us who are interested in internet culture can be anthropologists, not bouncers. Yes, not everything that gets posted online is worthy of our study and amplification… but it’s worth keeping in mind that we sometimes don’t understand the unfamiliar at first and would find it intensely cool if we took a bit more time to try and understand it.

My second question is: “Who gets to play along with an internet meme?” On the one hand, there’s not much preventing you from adding some Makmende facts to the mix. On the other hand, a lot of the funny stuff already posted doesn’t make much sense unless you know the language and the culture. “Makmende hangs his clothes on a Safaricom line” only is funny if you know that Safaricom is Kenya’s largest mobile phone company and doesn’t have any traditional phone lines.

My sense is that most memes don’t cross between cultures because we don’t understand the language, don’t understand the references or weren’t paying attention to that corner of the internet to start with. Those that do tend to be funny in a way that’s independent of language. The Back Dorm Boys are pretty funny, and it’s not hard to figure out how to join in the fun.

This question parallels one that internet scholars are spending a lot of time on: Do we have one internet or many? When a country like China heavily censors their internet and encourages the growth of a parallel internet, do we hit a point where it just doesn’t make sense to talk about “the internet” anymore? Perhaps we’ve got to talk about internets, and how they interconnect. And if 340 million Chinese internet users look mostly at Chinese sites, laugh at Chinese memes, maybe it makes sense that the Chinese internet will eventually run on its own protocols, which might make it easier to censor or control. Go far enough down this road and you can imagine diverging internets, each trying to best meet the needs of their users, and no longer having a world where we readily peer into each other’s internets.

If we care about a single, united internet, it is imperative that we develop, discover and disseminate internet memes that we can laugh at together. When governments censor political sites on the internet, they alienate the small portion of their populations who already identify as politically dissident – and they can make the case that they’re protecting their citizens from terrorism or incitement to violence or pornography. But when they block our access to videos of cats flushing toilets, we see them for the heavy-handed bullies that they are. The cute cats serve as cover traffic for more serious political speech – so long as Chinese users want to laugh at our cat videos, we’re encouraging people to circumvent censorship and potentially encounter all sorts of stuff on YouTube.

The Chinese have developed cute cat technology. Even a cursory glance at Youku shows that the once apparently insurmountable cat gap has been thoroughly bridged. And not just simple cute cats – Youku features cats flushing toilets! And not just western style toilets – squat toilets as well! If we accept my assertion that it’s politically critical for us to LOL together, we need not just to be studying Chinese net memes – we need to develop memes we can LOL at across cultures.

When we cross cultural borders in internet memespace, we’re usually laughing at someone else. Engrish, funny though it is, is basically the act of laughing at someone for failing to speak your (absurdly complex and irregular) mother tongue. I’m deeply impressed with people like Mahir Çağrı who managed to turn the experience of being laughed at by the entire internet into laughing along with the joke. It takes an unusual personality to pull this off – I’m not sure that laughing at and inviting folks to laugh along is always the best way to go.

I’d rather take the example of Matt Harding, the video game developer who spent years travelling the world, dancing badly. After the success of his first video, Matt discovered that the piece of music he’d used – “Sweet Lullaby” by Deep Forest – had a problematic history. The very short version – the French musicians behind Deep Forest used a lullaby from the Solomon Islands to record their hit song, without seeking permission from the woman who sang the song and over the explicit objections of the musicologist who recorded it. Worse, they presented it in such a way that most listeners thought it came from central Africa, not from the south Pacific.

Matt could have dismissed this story as an ugly footnote to his adventures with internet fame. To his great credit, he didn’t. Instead, he went to Auki, a small town in the Solomon Islands, to interview a nephew of Afunakwa, the woman who’d recorded the original song. It was his way of apologizing for the complex past of the song, and his way of using the weirdness of internet fame to make his world – and all those of us who’ve watched the video – a little wider.

My conclusions?
- We can go from weird to wide, as Matt did, using the strange and quirky corners of the internet to prod us into curiosity
- It’s worth asking ourselves if we’re laughing at, or laughing with. And if we don’t like the answer, perhaps we need to change our behavior.
- Anthropologists are cooler than bouncers.
- If we don’t laugh at Chinese internet memes – the first step towards getting Chinese users to laugh at global memes – the censors win.
- “Erinaceous” is a totally awesome word.


Highlights of presenting the talk included:

- Co-presenting with danah, which encouraged significantly sillier behavior than I generally engage in when on stage. I’d like to believe that I would always be willing to crouch behind a podium wearing a fluffy red hat before delivering a keynote… but it’s just not true. Add danah to the mix and it suddenly is.

- Matt Harding jumping up when his name was mentioned and dancing in the audience. I’m thankful that he came on stage after the talk to introduce himself and apologize if I freaked him out by spontaneously hugging him. I just think he’s wicked cool and deserves recognition for using the internet to show us (one facet of) how wide and wonderful the world can be.

- Meeting Mahir, who turns out to be utterly lovely in person. Yes, he immediately started filming our meeting via flip video and digital camera, and yes, he did invite me, my wife and infant son to visit him in Izmir… but I got the sense that it wasn’t in any way an act, just his particular version of friendliness. It felt more wonderful than weird.

- Talking with the guys from Know Your Meme, who are working really hard to ensure that their site is global and inclusive, and who are trying to take some pages from the Global Voices playbook, recruiting local editors who understand memes in their corners of the world. I’ve got high hopes of a Makmende article in development soon, and hope perhaps for a GV/KYM alliance where we source and research global memes.

In other words, I had a blast. Thanks to everyone involved and hope you had as much fun as I did.

04/27/2010 (5:13 pm)

Eric Rescorla – How paranoid should we be?

Filed under: Geekery ::

Eric Rescorla is a man who speaks frankly about internet security. You might have guessed that from the name of his consultancy – RTFM – an acronym politely translated as “read the friendly manual”… except that it’s usually not politely translated. He’s the co-designer of secure HTTP, and now works on security issues with Skype.

In a conversation about cyberwarfare at Princeton, his assertion – “The internet is still too secure” – raises a few eyebrows. But his key message – “things could be a lot worse” – is a useful counterpoint to the rhetoric of cyberwarfare that Dr. Lin explores, and helps show the tensions between the computer security community and the defense community (not to mention with the human rights community.)

Rescorla asserts the following:

We have nearly unbreakable crypto primitives.
AES resists all practical attacks, RSA isn’t quite as strong as we’d like – but there is good new stuff in the pipeline, and there’s been significant progress made on addressing collisions in SHA-1. “There is no serious concern we’re going to run out of crypto any time soon.” The problem instead is getting the new stuff into use – we’re not using the strong stuff that we already have.

We think we know how to build secure protocols.
There’s been no basic change in security design since 2000. We’ve got good protocols for data object security (S/MIME, PGP), for stream security (SSL/TLS, SSH), for packet security (IPsec, DTLS) and for authentication (Kerberos). Our work isn’t on deploying new protocols – it’s on addressing flaws in existing implementations – updating to newer hash functions after MD5/SHA-1 attacks, for instance. And it’s on gluing together existing protocols, none of which have really changed since 2004.

We can’t build systems that are reliably secure.
Good implementations are hard. We recently saw an attack on OpenSSL where a single “record of death” can crash the whole system. Debian managed to break their pseudorandom number generator, which meant that Debian keys were highly predictable and crackable with a table of only 32,000 keys. This probably affected 1% of internet connected machines.
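To make the "table of only 32,000 keys" point concrete, here is a toy sketch in Python. It is not the Debian or OpenSSL code. It assumes a key generator whose only source of randomness is a process-ID-sized seed, and the names `weak_keygen`, `build_table` and `crack` are invented for illustration.

```python
import hashlib
import random
from typing import Dict, Optional

MAX_PID = 32768  # assumption: the seed space is the size of a process-ID range


def weak_keygen(pid: int) -> str:
    """Toy key generator whose only 'entropy' is the process ID."""
    rng = random.Random(pid)        # same seed always yields the same stream
    secret = rng.getrandbits(128)   # stand-in for a private key
    return hashlib.sha256(secret.to_bytes(16, "big")).hexdigest()


def build_table() -> Dict[str, int]:
    """Precompute the fingerprint of every key the generator can produce."""
    return {weak_keygen(pid): pid for pid in range(1, MAX_PID + 1)}


def crack(fingerprint: str, table: Dict[str, int]) -> Optional[int]:
    """Recover the seed behind a fingerprint with a single dictionary lookup."""
    return table.get(fingerprint)


if __name__ == "__main__":
    table = build_table()
    victim = weak_keygen(1234)
    print("recovered seed:", crack(victim, table))
```

However strong the cipher, an attacker who can enumerate every possible seed never has to attack the crypto itself.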

This is just the security critical code. As Steven Bellovin at Columbia has pointed out, “Software has bugs. Security software has security relevant bugs.” And we’re bad at finding these bugs – audits are time consuming and there’s no evidence that they significantly affect the discovery rate for bugs. And affected implementations are slow to go away.

User practice is appalling.

Users are careless. They install random software from untrusted sources. They ignore our carefully constructed messages designed to prevent man in the middle attacks. And crypto doesn’t do much for you if your system’s been compromised.

Things could be a lot worse.

Despite all this, things aren’t so bad. Why is computer security so poor despite twenty years of work on the topic? We’ve been working on personal security for 10,000 years – why aren’t human beings invulnerable? Security people always think about the worst case scenario, but the actual attacks we experience are fairly primitive.

The Debian PRNG bug is about as bad as it gets, but there was no evidence of practical attacks in the field. You can mount a DOS on the whole internet pretty easily – publish bogus BGP routes, as Pakistani ISPs did while trying to block YouTube. As a practical matter, we can almost always get to YouTube. The internet shouldn’t work, but it does. Perhaps the question we need to be asking ourselves is “What’s a rational level of paranoia?”


04/23/2010 (3:49 pm)

Towards Texas Transparency

Filed under: Geekery,ideas ::

I’m at the LBJ School of Public Affairs at the University of Texas at Austin today. There’s a conference on financial transparency in Texas, featuring some excellent student work on making Texas’s finances on the local and state level more open and accessible.

The students presenting work offer an excellent model for how a state might pursue financial transparency. They suggest that:
- Data must be public, which equals being online
- The release of that data must be timely and user-friendly, which means accessible formats
- Data needs to follow the money, to allow citizens to monitor all aspects of allocation and spending
- Transparent information must lead towards public participation

They offer these proposed payoffs to this sort of transparency:
- Efficiency – having data open and online means more efficient inter- and intra-agency cooperation
- Innovation – open data means that independent individuals, organizations and groups can use, remix and reformat in interesting ways
- Increased accountability, as citizens can review
- Increased participation – citizens can become more involved in government decisions through access to this data

The students propose centralizing information that’s scattered across dozens of existing websites into a one-stop shop, analogous to Alabama’s Open.Alabama.Gov site. This site could also make it possible to map spending, like Maryland’s site tracking the stimulus. It would include information in .CSV format, as the Texas comptroller’s office is currently doing.

They recognize the need to move beyond releasing data to energizing the community, and reference Sunlight Foundation‘s model that identifies a wide range of groups who could get energized by access to information. But they believe that there’s a need for educating the public, using something like North Carolina’s Budgeting 101 website, letting citizens understand the basics of how the budget works.

It’s not enough to do this work at the state level, the students argue. It needs to happen at the local level, with local governments publishing budgets, check registers and financial reports. The fight now is about formats – it’s not sufficient to release this as PDFs – it needs to be in sortable, searchable data. And not every group is equally open – while the Texas legislature is quite open, the appropriations process is not – the students recommend opening a set of appropriations documents, including the markup and decision documents, acknowledging the difficulty of releasing documents that are changing in real-time as negotiations take place.

They suggest that governments face four main challenges:
- The difficulty of working with outdated, incompatible software
- Limits to technical, financial and human resources capacity
- The perception that there’s risk from citizens misunderstanding the data released
- The lack of incentives and requirements to force governments to participate in this process

In the hopes of making this easier, the students are working with the UT Computer Science department to build a template that local governments could use to release their information. It’s encouraging to see such in-depth thinking about both the mechanics of and rationale for opening government financial data – here’s hoping the students are able to have an influence on the future of this movement within Texas.


It’s likely that these LBJ school students will have an ally in Victor Gonzales, the CTO for the Comptroller of Public Accounts. Gonzales explains the role of the comptroller’s office – it’s the state’s monitor of revenue and spending, and the state’s purchaser. It’s also the main accountant for the state, and processes over a hundred billion dollars in checks and electronic fund transfers. As such, they’re very well positioned to provide a window into the state’s finances.

Gonzales has been building systems that allow citizens, groups and legislators to track expenditure, drilling down to the check register level by agency, payee and object of expense. Putting this information online has already led to $10 million in savings – he gives the example of discovering how much money the state was spending on copier toner, and deciding to negotiate a new contract to get a better deal from a central supplier.
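This kind of drill-down is what sortable, machine-readable data makes easy. A minimal Python sketch, assuming a check register published as CSV: the column names and every row below are made up for illustration and are not taken from the comptroller's actual files.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Iterable

# Made-up rows for illustration only; a real check register will differ.
SAMPLE_CSV = """agency,payee,object_of_expense,amount
Agency A,Vendor X,Copier toner,1200.00
Agency B,Vendor Y,Copier toner,1850.50
Agency A,Vendor Z,Travel,300.00
Agency C,Vendor X,Copier toner,990.25
"""


def total_by(rows: Iterable[dict], column: str) -> Dict[str, float]:
    """Sum the 'amount' field grouped by the chosen column."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row[column]] += float(row["amount"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_object = total_by(rows, "object_of_expense")
by_agency = total_by(rows, "agency")

if __name__ == "__main__":
    # Largest spending categories first.
    for name, amount in sorted(by_object.items(), key=lambda kv: -kv[1]):
        print(f"{name}: ${amount:,.2f}")
```

The same three lines of grouping work for agency or payee, which is roughly why the format fight (CSV versus PDF) matters so much.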

His principle in building these systems – start small and keep them simple. When he took the position, his first question was “what could we do by the end of the week?” Turned out, they were able to release information from their own shop and set a precedent for the rest of the government. The project is no longer so simple – it’s quite powerful, with a site called “Where the Money Goes“, which allows deep exploration of government spending, and will soon be complemented with a site called “Where the Money Comes From”.

He closes with a great story: looking at accounts published online, the comptroller’s office discovered that a government department had bought a goat. For a little while, they worried that someone was eating cabrito for lunch at government expense. Turns out the goat was for scientific research. Score another victory for transparency.


04/01/2010 (3:06 pm)

Is Vietnam conducting surveillance via malware?

Filed under: Geekery,Human Rights ::

I’ve been following reporting on the discovery of a new botnet in Vietnam with interest. McAfee and Google both posted information on the botnet on Tuesday the 30th, and the Wall Street Journal, Washington Post and New York Times all ran pieces on the phenomenon yesterday. Collectively, they offer an insight into just how difficult it is to report on internet abuse, hacking and “cyberwar”.

George Kurtz, CTO of McAfee, offered the most detailed technical report. McAfee has been investigating “Operation Aurora“, the attack on Google and other US companies that provoked Google to discontinue its google.cn search engine and redirect Chinese users to their uncensored Hong Kong engine. In the course of investigating these attacks, Kurtz reports that they discovered an apparently unrelated and unconnected botnet controlled by computers in Vietnam and apparently spread via a Vietnamese-language keyboard driver.

Vietnamese is a language that uses a complex set of diacritic marks to distinguish between characters and signify tone. To type in Vietnamese, you need a keyboard driver that can associate certain key combinations with the appropriate Unicode characters. Many Vietnamese speakers use VPSKeys, a driver that’s been distributed by the Vietnamese Professional Society, a group dedicated to connecting Vietnamese professionals in the diaspora. According to Kurtz, the Windows driver distributed by VPS has been compromised – if you download and install it, you’ll end up installing a rich set of trojan horse programs that will hijack your machine and enlist it in a botnet that appears to be controlled from within Vietnam.
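For readers who haven't used one, here is a rough Python sketch of what a legitimate Vietnamese input method does: it turns a base letter plus a tone key into a single Unicode character. The key assignments are a toy mapping loosely modeled on Telex-style typing; this is an illustration, not VPSKeys' actual behavior.

```python
import unicodedata

# Hypothetical tone keys, loosely modeled on Telex-style input.
TONE_KEYS = {
    "s": "\u0301",  # combining acute accent
    "f": "\u0300",  # combining grave accent
    "r": "\u0309",  # combining hook above
    "x": "\u0303",  # combining tilde
    "j": "\u0323",  # combining dot below
}


def compose(base: str, tone_key: str) -> str:
    """Attach the tone mark for tone_key to base and return the composed form."""
    mark = TONE_KEYS[tone_key]
    # NFC normalization merges the letter and the mark into one code point
    # when Unicode defines a precomposed character for the pair.
    return unicodedata.normalize("NFC", base + mark)


if __name__ == "__main__":
    print(compose("a", "s"))        # a with acute
    print(compose("\u00ea", "s"))   # e-circumflex with acute
```

Software that sits in this position sees every keystroke before any application does, which is what makes a compromised driver so valuable to an attacker.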

Kurtz is clear that he doesn’t think VPS is intentionally distributing malware. Instead, “We believe that the perpetrators may have political motivations and may have some allegiance to the government of the Socialist Republic of Vietnam.” Writing for Google’s security team, Neel Mehta goes further: “These infected machines have been used both to spy on their owners as well as participate in distributed denial of service (DDoS) attacks against blogs containing messages of political dissent. Specifically, these attacks have tried to squelch opposition to bauxite mining efforts in Vietnam, an important and emotionally charged issue in the country.”

Mehta’s statement helped me figure something out that’s troubled me for a couple of years now. I’ve been fortunate enough to work with Vietnamese activists and dissidents in the US, and am aware of sophisticated attacks on people who’ve attracted the attention of the security forces. Some of these attacks have succeeded in accessing encrypted texts, including a text encrypted with PGP. We suspected that security forces weren’t breaking PGP (duh), but had physically accessed people’s computers (quite common in Vietnam), copied PGP private keyrings and installed keyloggers that captured users’ passphrases. I’m now inclined to think that the attack might have been much simpler – had someone compromised an earlier version of a Vietnamese-language keyboard driver, they could have easily inserted keylogging code and routines that sought out PGP keyrings.

Both the Washington Post and Wall Street Journal connect the Vietnam attacks to silencing political dissent about a Chinese mining project in Vietnam. There’s good circumstantial evidence for this – bauxitevietnam.info has been attacked in the past and blogger Nguyen Ngoc Nhu Quynh – aka Me Nam – was arrested last year in conjunction with her activities in opposition to the mine.

But it seems possible to me that what’s going on is more complex and sinister than just a denial of service attack. There’s no particular reason to harness the computers of Vietnamese-speaking users to launch a DDoS attack – there are existing, robust botnets that can be rented to attack whatever site you’d like. (I suppose a botnet built of Vietnam-based and diasporan users would be particularly effective at hitting targets within Vietnam… but bauxitevietnam.info was registered to a group in Hong Kong and there’s no reason to believe dissidents would be foolish enough to host an anti-government site within Vietnam.) But being able to intercept communications from anyone writing in Vietnamese and search for key phrases like “Khối 8406” would be a dream for a government with a long track record of tracking and harassing dissenters.

Here’s the problem – it’s almost impossible to know what’s actually going on. The ability to log user keystrokes isn’t just helpful for repressive governments – it’s a terrific tool for stealing banking passwords or other sensitive information. The Vietnam trojan could have been ordered by a government department, outsourced to private hackers to build and deploy… or engineered by enterprising criminals who saw an opportunity to infect a set of users through a vulnerability in VPS’s server… or created by a group of nationalist Vietnamese hackers operating independently of the government… and so on.

What’s scary about “cyberwar” – as far as I’m concerned – isn’t nightmare scenarios of nations shutting down each other’s electrical grids as a “force multiplier”. (This excellent oped from Marcus Ranum points out that some of these fears are a function of sloppy reporting and thinking that blurs the lines between hacking as prank, as crime and as military attack.) It’s the difficulty of figuring out whether a particular incident should be thought of as criminal or political activity. An appropriate response to state-led political/military actions (censure, sanction, etc.) is useless if the attack was criminal in nature, and vice versa.

Obviously, governments who decided to engage in cyberattacks would do their best to disguise them as criminal activity. This example suggests to me just how effective this disguise can be – as much as I worry about the government of Vietnam’s human rights record, it’s not hard for me to spin a scenario where this is a criminal attack, not a state-based one.

Hope that McAfee and others will release more information as they learn about the details of the trojan. If this turns out to be explicitly designed to spy on communications, it will be a fascinating development in the world of internet surveillance.


03/24/2010 (11:31 am)

Makmende’s so huge, he can’t fit in Wikipedia

“After platinum, albums go Makmende”

“They once made a makmende toilet paper, but there was a problem: It wouldn’t take shit from anybody!!!”

“Makmende hangs his clothes on a safaricom line and when they dry he stores them in a flashdisk!”

If those simple truths don’t make sense to you, you’re probably not a Kenyan blogger. For the past few days, Kenya’s blogosphere and twitterers have been in thrall to the latest African superhero, and what might be Kenya’s first viral internet meme. An article in a Wall Street Journal blog today confirmed that Makmende is receiving attention beyond East Africa, demonstrating that our Kenyan friends are just as capable as any Moldovan boy band of creating internet buzz.

The video for Just a Band’s single “Ha-He” features a badass protagonist straight out of blaxploitation films. Armed with an array of freeze-frame kung fu moves, Makmende brings justice to the mean streets of a hazy, sun-drenched city that seems caught somewhere between Nairobi and 1970s LA. Tongue is firmly in cheek, as the video credits introduce characters including “Taste of Daynjah”, “Wrong Number” and bad guys “The Askyua Matha Black Militants”.

archer at Mwanamishale fills the rest of us in on the meaning of the term, Makmende:

Makmende was a term used way back in the early to mid 1990s to refer to someone who thinks he’s a superhero. For example, if a boy who’s watched one too many kung-fu movies on TV decides to unleash his newly acquired combat skills, he would be asked “Unajidai Makmende, eh?” (Who do you think you are, Makmende?) Trust me, there was a Makmende in every hood!

Given the high production values of the video, the fact that it accompanies a sweet track from Just a Band, and that the video producers evidently released a set of photoshopped magazine covers featuring Makmende as GQ’s sole “Badass of the Year”, perhaps it’s not surprising that Kenyan netizens have taken the Makmende trend to the next level. He’s got a Facebook page, a Twitter account, and a dedicated website filled with thousands of testimonies to his badassitude: “Makmende uses viagra in his eyedrops, just to look hard.”

The obvious parallel is Chuck Norris Facts, an internet meme that manifested mostly through image macros that attest to the action star’s manliness. (“Chuck Norris counted to infinity. Twice.”) For now, the Makmende phenomenon appears to be largely text-based, with Kenyans around the world connecting the events of the day to Makmende’s movements: “is the massive pour in Nairobi as a result of Makmende’s tear after the WSJ feature?”

What he doesn’t have is a Wikipedia page. I searched this morning on the English-language Wikipedia and got a page telling me that Makmende had been deleted:

* 00:37, 24 March 2010 Flyguy649 (talk | contribs) deleted “Makmende” (CSD G3: Pure Vandalism)
* 22:53, 23 March 2010 Malik Shabazz (talk | contribs) deleted “Makmende” (G12: Unambiguous copyright infringement (CSDH))
* 18:30, 23 March 2010 JoJan (talk | contribs) deleted “Makmende” (G1: Patent nonsense, meaningless, or incomprehensible)

Looks like multiple attempts to establish a Makmende page have been shot down. Fair enough – the inclusionist/deletionist argument that’s gripped Wikipedia centers in part on the documentation of ephemeral culture. Perhaps an English language encyclopedia doesn’t need mention of every internet meme… though pages exist for Numa Numa, the song that inspired the viral video, the guy who performed in the viral video, and so on. Perhaps if Makmende reaches the heights of internet fame that memes like Eduard Khil or Back Dorm Boys have achieved, he’ll no longer be “patent nonsense, meaningless or incomprehensible.”

Here’s an interesting puzzle for Wikipedia. Makmende may never become particularly important to English speaking users outside of Kenya. But the phenomenon’s quite important within the Kenyan internet: it’s the first meme I can remember going truly viral and inspiring a wave of participation from Kenyans around the world. I recall a conversation at 2006 Wikimania in Cambridge where (friend and GV editor) Ndesanjo Macha, a major contributor to the Swahili Wikipedia, explained that the topics covered in that wikipedia were likely to be different from those included in the English wikipedia. (More articles on east African culture, less on Pokemon, perhaps.) Indeed, the Wikipedias in Gaelic, Welsh and Plattdüütsch are cultural projects as much as attempts to make key reference materials available, as most speakers of these languages are fluent in other languages that have much larger Wikipedias.

Most Wikipedians seemed to accept the idea that different languages and cultures might want to include different topics in their encyclopedias. But what happens when we share a language but not a culture? Is there a point where Makmende is sufficiently important to English-speaking Kenyans that he merits a Wikipedia page even if most English-speakers couldn’t care less? Or is there an implicit assumption that an English-language Wikipedia is designed to enshrine landmarks of shared historical and cultural importance to people who share a language?

For me, Makmende’s a reminder that the internet isn’t as small and connected as we tend to believe it is. We occasionally catch glimpses over cultural walls when we use these tools. Sometimes we respond with fascination and seek to learn more. Often, our behavior’s not as admirable. danah boyd closed her talk on Digital Visibility at Supernova this past year with an uncomfortable observation about racism in Twitter:

Think of those who complained when the Trending Topics on Twitter reflected icons of the black community during the Black Entertainment Television awards. Tweets like: “wow!! too many negros in the trending topics for me. I may be done with this whole twitter thing.” and “Did anyone see the new trending topics? I don’t think this is a very good neighborhood. Lock the car doors kids.” and “Why are all the black people on trending topics? Neyo? Beyonce? Tyra? Jamie Foxx? Is it black history month again? LOL”. These tweets should send a shiver down your spine. Perhaps these people assumed that Twitter was a white-dominant space where blacks were welcome only if they were a minority.

danah goes on to point out that not everyone reacts to encountering topics outside of their comfort sphere with shock or surprise. I found it encouraging that the Wall Street Journal saw the emergence of a Kenyan meme as a chance to explore Kenyan internet culture rather than to turn away in ignorance or disinterest. Let’s hope the next time Makmende seeks a place in Wikipedia, he’s met with a bit more curiosity and less dismissal.


Roughly six seconds after I posted this piece, Twitter users reported a new version of the Makmende article on Wikipedia. Here’s hoping this one survives summary deletion…!


03/01/2010 (3:46 pm)

ChatRoulette survey (long bookmark)

Filed under: Geekery,Media,long bookmark ::

ChatRoulette: An Initial Survey

The fine folks at the Web Ecology Project pride themselves on researching web trends that are just starting to catch the attention of the media and other researchers. As such, we can count on them not only to offer insights into online, randomized chat site ChatRoulette, but into derivative works like CatRoulette. (Yes, I have considered surfing the site with Drew in front of the camera. Rachel told me not to.)

The survey – admittedly a first pass – has some big predictions from a fairly small data set. Alex Leavitt, Tim Hwang and friends sampled 201 sessions on the system, taking snapshots and logging off to see their potential correspondents. They also conducted 30 interviews, though were only able to talk to users who didn’t immediately close the connection, which may have skewed their sample set away from people using the system to find explicit content. (Someday, we’ll see a methodology section in a paper that debates the merits of logging onto a system while naked to get a more representative sample…)

The big takeaways: Yes, the folks using CR are male, 18-24. While some of them are looking for online sexual encounters, lots more are simply curious about the system or looking to chat. The authors frame the space as a “probabilistic online community”, with radically different dynamics than a traditional social network as it “mediates the encounters between its users, specifically by eliminating lasting connections in the framework of the platform”. It’s impossible within this framework to maintain traditional “friend” relationships – instead, we’d expect to see people creating online personas by wearing creative costumes/masks and developing that identity outside of the system, on blogs/tumblr/message boards. They further suggest that the fact that the majority of people on the site don’t appear to be seeking sexual imagery will lead towards a decline in explicit content. (That’s unclear to me – it’s quite possible that people not overtly seeking sexual content (i.e., focusing webcams on their genitals) are curious about what sort of explicit content they might come across as they switch cam partners.)

The paper includes my current candidate for “most enjoyable graph in a social science paper, 2010”:

Picture 1


02/22/2010 (7:27 pm)

Internet Freedom: Beyond Circumvention

Filed under: Geekery,Human Rights ::

Secretary Clinton’s recent speech on Internet Freedom has signaled a strong interest from the US State Department in using the internet to promote political reforms in closed societies. It makes sense that the State Department would look to support existing projects to circumvent internet censorship. The New York Times reports that a group of senators is urging the Secretary to apply existing funding to support the development and expansion of censorship circumvention programs, including Tor, Psiphon and Freegate.

I’ve spent a good part of the last couple of years studying internet circumvention systems. My colleagues Hal Roberts, John Palfrey and I released a study last year that compared the strengths and weaknesses of different circumvention tools. Some of my work at Berkman is funded by a US state department grant that focuses on continuing to study and evaluate these sorts of tools and I spend a lot of time trying to coordinate efforts between tool developers and people who need access to circumvention tools to publish sensitive content.

I strongly believe that we need strong, anonymized and usable censorship circumvention tools. But I also believe that we need lots more than censorship circumvention tools, and I fear that both funders and technologists may overfocus on this one particular aspect of internet freedom at the expense of other avenues. I wonder whether we’re looking closely enough at the fundamental limitations of circumvention as a strategy and asking ourselves what we’re hoping internet freedom will do for users in closed societies.

So here’s a provocation: We can’t circumvent our way around internet censorship.

I don’t mean that internet censorship circumvention systems don’t work. They do – our research tested several popular circumvention tools in censored nations and discovered that most can retrieve blocked content from behind the Chinese firewall or a similar system. (There are problems with privacy, data leakage, the rendering of certain types of content, and particularly with usability and performance, but the systems can circumvent censorship.) What I mean is this – we couldn’t afford to scale today’s existing circumvention tools to “liberate” all of China’s internet users even if they all wanted to be liberated.

Circumvention systems share a basic mode of operation – they act as proxies to let you retrieve blocked content. A user is blocked from accessing a website by her ISP or that ISP’s ISP. She wants to read a page from Human Rights Watch’s webserver, which is accessible at IP address 70.32.76.212. But that IP address is on a national blacklist, and she’s prevented from receiving any content from it. So she points her browser to a proxy server at another address – say 123.45.67.89 – and asks a program on that server to retrieve a page from the HRW server. Assuming that 123.45.67.89 isn’t on the national blacklist, she should be able to receive the HRW page via the proxy.
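The scenario above can be modeled in a few lines of Python. This is a toy simulation with no real networking. The two IP addresses come from the paragraph above, while the `PAGES` table and the function names are invented for illustration.

```python
from typing import Optional

# The national blacklist from the example: the HRW server's address is blocked.
BLACKLIST = {"70.32.76.212"}

# Stand-in for remote web servers, keyed by IP address.
PAGES = {"70.32.76.212": "HRW page"}


def direct_fetch(dest_ip: str) -> Optional[str]:
    """The censored user's view: blacklisted destinations return nothing."""
    if dest_ip in BLACKLIST:
        return None
    return PAGES.get(dest_ip)


def proxy_fetch(proxy_ip: str, dest_ip: str) -> Optional[str]:
    """Ask a proxy outside the firewall to retrieve the page for the user."""
    # The filter only sees the address the user contacts, i.e. the proxy.
    if proxy_ip in BLACKLIST:
        return None
    # The proxy sits outside the firewall, so it can reach the destination.
    return PAGES.get(dest_ip)


if __name__ == "__main__":
    print(direct_fetch("70.32.76.212"))                 # blocked
    print(proxy_fetch("123.45.67.89", "70.32.76.212"))  # delivered via proxy
```

The model also shows the weak spot: once the proxy's own address lands on the blacklist, the user is back where she started.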

During the transaction, the proxy is acting like an internet service provider. Its ability to provide reliable service to its users is constrained by bandwidth – bandwidth to access the destination site and to deliver the content to the proxy user. Bandwidth is costly in aggregate, and it costs real money to run a proxy that’s heavily used.

Some systems have tried to reduce these costs by asking volunteers to share them – Psiphon, in its first design, used home computers hosted by volunteers around the world as proxies, and used their consumer bandwidth to access the public internet. Unfortunately, in many countries, consumer internet connections are optimized to download content and are much slower when they are uploading content. These proxies could get the homepage at hrw.org pretty quickly, but they took a very long time to deliver the page to the user behind the firewall. Psiphon is no longer primarily focused on trying to make proxies hosted by volunteers work. Tor is, but Tor nodes are frequently hosted by universities and companies who have access to large pools of bandwidth. Still, available bandwidth is a major constraint to the usability of the Tor system. The most usable circumvention systems today – VPN tools like Relakks or Witopia – charge users significant sums annually to defray bandwidth costs.
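A quick worked example shows why asymmetric consumer connections hurt volunteer proxies. The line speeds and page size here are hypothetical round numbers, not measurements of any real proxy.

```python
def transfer_seconds(size_megabytes: float, rate_mbps: float) -> float:
    """Time to move a payload over a link: megabytes -> megabits, then divide."""
    return size_megabytes * 8 / rate_mbps


# Hypothetical home connection hosting a proxy.
PAGE_MB = 1.0     # size of the page being relayed
DOWN_MBPS = 8.0   # download speed: proxy fetching the page from the website
UP_MBPS = 1.0     # upload speed: proxy relaying the page to the censored user

fetch_time = transfer_seconds(PAGE_MB, DOWN_MBPS)
relay_time = transfer_seconds(PAGE_MB, UP_MBPS)

if __name__ == "__main__":
    print(f"fetch from site: {fetch_time:.1f}s, relay to user: {relay_time:.1f}s")
```

Under these assumptions the volunteer's machine gets the page in a second and then spends eight seconds pushing it back out, and the user only ever experiences the slow half.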

Let’s assume that systems like Tor, Psiphon and Freegate receive additional funding from the State Department. How much would it cost to provide proxy internet access for… well, China? China reports 384 million internet users, meaning we’re talking about running an ISP capable of serving more than 25 times as many users as the largest US ISP. According to CNNIC, China consumes 866,367 Mbps of international internet bandwidth. It’s hard to get estimates for what ISPs pay for bandwidth, though conventional wisdom suggests prices between $0.05 and $0.10 per gigabyte. Using $0.05 as a cost per gigabyte, the cost to serve the Internet to China would be $13,608,000 per month, $163.3 million a year in pure bandwidth charges, not counting the costs of proxy servers, routers, system administrators, customer service. Faced with a bill of that magnitude, the $45 million US senators are asking Clinton to spend quickly looks pretty paltry.
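For anyone who wants to check the arithmetic, here is the back-of-envelope calculation as a Python sketch. It assumes a 30-day month and a pipe running at full capacity around the clock, both simplifications. The monthly figure quoted above appears to correspond to converting megabits to gigabits with a factor of 1024 and rounding down to 105 gigabytes per second. With decimal units the same inputs come out slightly higher, at roughly $14.0 million a month.

```python
INTL_BANDWIDTH_MBPS = 866_367        # CNNIC figure quoted in the text
COST_PER_GB = 0.05                   # low end of the quoted price range, in USD
SECONDS_PER_MONTH = 30 * 24 * 3600   # assumption: a 30-day month


def monthly_cost(gb_per_second: float) -> float:
    """Cost of moving data at a constant rate for a month, in dollars."""
    return gb_per_second * SECONDS_PER_MONTH * COST_PER_GB


# Megabits/s -> gigabits/s (factor of 1024) -> gigabytes/s, rounded down.
binary_rate = INTL_BANDWIDTH_MBPS // 1024 // 8   # 105 GB/s

# The same conversion with decimal units and no rounding.
decimal_rate = INTL_BANDWIDTH_MBPS / 1000 / 8    # about 108.3 GB/s

if __name__ == "__main__":
    for label, rate in (("binary, rounded", binary_rate), ("decimal", decimal_rate)):
        month = monthly_cost(rate)
        print(f"{label}: ${month:,.0f} per month, ${month * 12:,.0f} per year")
```

Either way the conclusion holds: the bandwidth bill alone is several times the $45 million under discussion.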

There’s an additional complication – we’re not just talking about running an ISP – we’re talking about running an ISP that’s very likely to be abused by bad actors. Spammers, fraudsters and other internet criminals use proxy servers to conduct their activities, both to protect their identities and to avoid systems on free webmail providers, for instance, which prevent users from signing up for dozens of accounts by limiting an IP address to a certain number of signups in a limited time period. Wikipedia found that many users used open proxies to deface their system and now reserve the right to block proxy users from editing pages. Proxy operators have a tough balancing act – for their proxies to be useful, people need to be able to use them to access sites like Wikipedia or YouTube… but if people use those proxies to abuse those sites, the proxy will be blocked. As such, proxy operators can find themselves at war with their own users, trying to ban bad actors to keep the tool useful for the rest of the users.
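The per-IP signup limit mentioned above can be sketched as a sliding-window counter. This is a generic illustration in Python, not any webmail provider's actual system. The class name, the limit of three signups and the one-hour window are all assumptions.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SignupLimiter:
    """Allow at most max_signups per IP address within window_seconds."""

    def __init__(self, max_signups: int, window_seconds: float) -> None:
        self.max_signups = max_signups
        self.window_seconds = window_seconds
        self._history: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        recent = self._history[ip]
        # Forget signups that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_signups:
            return False
        recent.append(now)
        return True


if __name__ == "__main__":
    limiter = SignupLimiter(max_signups=3, window_seconds=3600)
    proxy_ip = "123.45.67.89"  # every user of the proxy appears as this address
    for second in range(5):
        print(second, limiter.allow(proxy_ip, now=float(second)))
```

Because everyone behind a proxy shares its address, a handful of abusers can exhaust the quota for every legitimate user, which is exactly the bind proxy operators find themselves in.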

I’m skeptical that the US State Department can or wants to build or fund a free ISP that can be used by millions of simultaneous users, many of whom may be using it to commit clickfraud or send spam. I know – because I’ve talked with many of them – that the people who fund blocking-resistant internet proxies don’t think of what they’re doing in these terms. Instead, they assume that proxies are used by users only in special circumstances, to access blocked content.

Here’s the problem. A nation like China is blocking a lot of content. As Donnie Dong notes in a recent blogpost, five of the ten most popular websites worldwide are blocked in China. Those sites include YouTube and Facebook, sites that eat bandwidth through large downloads and long sessions. Perhaps it would be realistic to act as an ISP to China if we were just providing access to Human Rights Watch – it’s not realistic if we’re providing access to YouTube.

Proxy operators have dealt with this question by putting constraints on the use of their tools. Some proxy operators block access to YouTube because it’s such a bandwidth hog. Others block access to pornography, both because it uses bandwidth and to protect the sensibilities of their sponsors. Others constrain who can use their tools, limiting access to the tools to people coming from Iranian or Chinese IPs, trying to reduce bandwidth use by American high school kids who’ve got YouTube blocked by their school. In deciding who or what to block, proxy operators are offering their personal answers to a complicated question: What parts of the internet are we trying to open up to people in closed societies? As we’ll address in a moment, that’s not such an easy question to answer.

Let’s imagine for a moment that we could afford to proxy China, Iran, Myanmar and others’ international traffic. We figure out how to keep these proxies unblocked and accessible (it’s not easy – the operators of heavily used proxy systems are engaged in a fast-moving cat and mouse game) and we determine how to mitigate the abuse challenges presented by open proxies. We’ve still got problems.

Most internet traffic is domestic. In China, we estimate (Hal’s got a paper coming out shortly) that roughly 95% of total traffic is within the country. Domestic censorship matters a great deal, and perhaps a great deal more than censorship at national borders. As Rebecca MacKinnon documented in “China’s Censorship 2.0“, Chinese companies censor user-generated content in a complex, decentralized way. As a result, a good deal of controversial material is never published in the first place, either because it’s blocked from publication or because authors decline to publish it for fear of having their blog account locked or cancelled. We might assume that if Chinese users had unfettered access to Blogger, they’d publish there. Perhaps not – people use the tools that are easiest to use and that their friends use. A seasoned Chinese dissident might use Blogger, knowing she’s likely to be censored – an average user, posting photos of his cat, would more likely use a domestic platform and not consider the possibility of censorship until he found himself posting controversial content.

In promoting internet freedom, we need to consider strategies to overcome censorship inside closed societies. We also need to address “soft censorship”, the co-opting of online public spaces by authoritarian regimes, who sponsor pro-government bloggers, seed sympathetic message board threads, and pay for sympathetic comments. (Evgeny Morozov offers a thoroughly dark view of authoritarian use of social media in How Dictators Watch Us On The Web.)

We also need to address a growing menace to online speech – attacks on sites that host controversial speech. When Turkey blocks YouTube to prevent Turkish citizens from seeing videos that defame Ataturk, it prevents 20 million Turkish internet users from seeing the content. When someone – the Myanmar government, patriotic Burmese, mischievous hackers – mounts a distributed denial of service attack on Irrawaddy (an online newspaper highly critical of the Myanmar government), they (temporarily) prevent everyone from seeing it.

Circumvention tools help Turks who want to see YouTube get around a government block. But they don’t help Americans, Chinese or Burmese see Irrawaddy if the site has been taken down by DDoS or hacking attacks. Publishers of controversial online content have begun to realize that they’re not just going to face censorship by national filtering systems – they’re going to face a variety of technical and legal attacks that seek to make their servers inaccessible.

There’s quite a bit publishers can do to increase the resilience of their sites to DDoS attack and to make their sites more difficult to filter. To avoid blockage in Turkey, YouTube could increase the number of IP addresses that lead to the webserver and use a technique called “fast-flux DNS” to give the Turkish government more IP addresses to block. They could maintain a mailing list to alert users to unblocked IP addresses where they could access YouTube, or create a custom application which disseminates unblocked IPs to YouTube users who download the app. These are all techniques employed by content sites that are frequently blocked in closed societies.

YouTube doesn’t take these anti-blocking measures for at least two reasons. First, they’ve generally preferred to negotiate with nations that filter the internet to try to make their sites reachable again, rather than to work against them by fighting filtering. (This attitude may be changing now that Google has announced their intention not to cooperate with Chinese censorship.) Second, YouTube doesn’t really have an economic incentive to be unblocked in Turkey. If anything, being blocked in Turkey (and perhaps even in China) may be to their economic advantage.

Sites that enable user-created content are supported by advertising traffic. Advertisers are generally more excited about reaching users in the US (who’ve got credit cards, more disposable income and are inclined to buy online) than users in China or Turkey. Some suspect that the introduction of “lite” versions of services like Facebook is designed to serve users in the developing world at lower cost, since those users rarely create income. In economic terms, it may be hard to convince Facebook, YouTube and others to continue providing services to closed societies, where they have a tough time selling ads. And we may need to ask more of them – to take steps to ensure that they remain accessible and useful in censorious countries.

In short:
- Internet circumvention is hard. It’s expensive. It can make it easier for people to send spam and steal identities.
- Circumventing censorship through proxies just gives people access to international content – it doesn’t address domestic censorship, which likely affects the majority of people’s internet behavior.
- Circumventing censorship doesn’t offer a defense against DDoS or other attacks that target a publisher.

To figure out how to promote internet freedom, I believe we need to start addressing the question: “How do we think the Internet changes closed societies?” In other words, do we have a “theory of change” behind our desire to ensure people in Iran, Burma, China, etc. can access the internet? Why do we believe this is a priority for the State Department or for public diplomacy as a whole?

I think much work on internet censorship isn’t motivated by a theory of change – it’s motivated by a deeply-held conviction (one I share) that the ability to share information is a basic human right. Article 19 of the Universal Declaration of Human Rights states that “Everyone has the right to freedom of opinion and expression; this right includes freedom to hold opinions without interference and to seek, receive and impart information and ideas through any media and regardless of frontiers.” The internet is the most efficient system we’ve ever built to allow people to seek, receive and impart information and ideas, and therefore we need to ensure everyone has unfettered internet access. The problem with the Article 19 approach to censorship circumvention is that it doesn’t help us prioritize. It simply makes it imperative that we solve what may be an unsolvable problem.

If we believe that access to the internet will change closed societies in a particular way, we can prioritize access to those aspects of the internet. Our theory of change helps us figure out what we must provide access to. The three theories I list below are rarely explicitly stated, but I believe they underlie much of the work behind censorship circumvention.

The suppressed information theory: if we can provide certain suppressed information to people in closed societies, they’ll rise up and challenge their leaders and usher in a different government. We might choose to call this the “Hungary ’56 theory” – reports of struggles against communist governments around the world, reported into Hungary via Radio Free Europe, encouraged Hungarians to rebel against their leaders. (Unfortunately, the US didn’t support the revolutionaries militarily – as many in Hungary had expected – and the revolution was brutally quashed by a Soviet invasion.)

I generally term this the “North Korea theory”, because I think a state as closed as North Korea might be a place where un-suppressed information – about the economic success of South Korea, for instance – could provoke revolution. (Barbara Demick’s beautiful piece in the New Yorker, “The Good Cook“, gives a sense for how little information most North Koreans have about the outside world and how different the world looks from Seoul.) But even North Korea is less informationally isolated than we think – Dong-A Ilbo reports an “information belt” along the North Korea/China border where calls on smuggled mobile phones are possible from North to South Korea. Other nations are far more open – my friends in China tend to be extremely well informed about both domestic and international politics, both through using circumvention tools and because Chinese media reports a great deal of domestic and international news.

It’s possible that access to information is a necessary, though not sufficient, condition for political revolution. It’s also possible that we overestimate the power and potency of suppressed information, especially as information is so difficult to suppress in a connected age.

The Twitter revolution theory: if citizens in closed societies can use the powerful communications tools made possible by the Internet, they can unite and overthrow their oppressors. This is the theory that led the State Department to urge Twitter to put off a period of scheduled downtime during the Iran elections protests. While it’s hard to make the case that technologies of connection are going to bring down the Iranian government (see Cameron Abadi’s piece in FP on the limitations of using Facebook to organize in Iran), good counterexamples exist, like the role of the mobile phone in helping to topple President Estrada in the Philippines.

There’s been a great deal of enthusiasm in the popular press for the Twitter revolution theory, but careful analysis reveals some limitations. The communications channels opened online tend to be compromised quickly, used for disinformation and for monitoring activists. And when protests get out of hand, governments of closed societies don’t hesitate to pull the plug on networks – China has blocked internet access in Xinjiang for months, and Ethiopia turned off SMS on mobile phone networks for years after they were used to organize street protests.

The public sphere theory: Communication tools may not lead to revolution immediately, but they provide a new rhetorical space where a new generation of leaders can think and speak freely. In the long run, this ability to create a new public sphere, parallel to the one controlled by the state, will empower a new generation of social actors, though perhaps not for many years.

Marc Lynch made a pretty persuasive case for this theory in a talk last year about online activism in the Middle East. It’s possible to make this case by looking at samizdat (self-published, clandestine media) in the former Soviet Union, which was probably more important as a space for free expression than it was as a channel for disseminating suppressed information. The emergence of leaders like Vaclav Havel, whose authority was rooted in cultural expression as well as political power, makes the case that simply speaking out is powerful. But the long timescale of this theory makes it hard to test.

The theory we accept shapes our policy decisions. If we believe that disseminating suppressed information is critical – either to the public at large or to a small group of influencers – we might focus our efforts on spreading content from Voice of America or Radio Free Europe. Indeed, this is how many government forays into censorship circumvention began – national news services began supporting circumvention tools so their content (painstakingly created in languages like Burmese or Farsi) would be accessible in closed societies. This is a very efficient approach to anticensorship – we can ignore many of the problems associated with abusing proxies and focus on prioritizing news over other high-bandwidth uses, like the video of the cat flushing the toilet. Unfortunately, we’ve got a long track record that shows that this form of anticensorship doesn’t magically open closed regimes, which suggests that increasing our bet on this strategy might be a poor idea.

If we adopt the Twitter Revolution theory, we should focus on systems that allow for rapid communication within trusted networks. This might mean tools like Twitter or Facebook, but probably means tools like LiveJournal and Yahoo! Groups which gain their utility through exclusivity, allowing small groups to organize outside the gaze of the authorities. If we adopt the public sphere approach, we want to open any technologies that allow public communication and debate – blogs, Twitter, YouTube, and virtually anything else that fits under the banner of Web 2.0.

What does all this mean in terms of how the State Department should allocate its money to promote Internet Freedom? My goal was primarily to outline the questions they should be considering, rather than to offer specific prescriptions. But here are some possible implications of these questions:

- We need to continue supporting circumvention efforts, at least in the short term. But we need to disabuse ourselves of the idea that we can “solve” censorship through circumvention. We should support circumvention until we find better technical and policy solutions to censorship, not because we can tear down the Great Firewall by spending more.

- If we want more people using circumvention tools, we need to find ways to make them financially sustainable. Sustainable circumvention is becoming an attractive business for some companies – it needs to be part of a comprehensive internet freedom strategy, and we need to develop strategies that are sustainable and provide low/zero cost access to users in closed societies.

- As we continue to fund circumvention, we need to address usage of these tools to send spam, commit fraud and steal personal data. We might do this by relying less on IP addresses as an extensive, fundamental means of regulating bad behavior… but we’ve got to find a solution that protects networks against abuse while maintaining the possibility of anonymity, a difficult balancing act.

- We need to shift our thinking from helping users in closed societies access blocked content to helping publishers reach all audiences. In doing so, we may gain those publishers as a valuable new set of allies as well as opening a new class of technical solutions.

- If our goal is to allow people in closed societies to access an online public sphere, or to use online tools to organize protests, we need to bring the administrators of these tools into the dialog. Secretary Clinton suggests that we make free speech part of the American brand identity – let’s find ways to challenge companies to build blocking resistance into their platforms and to consider internet freedom to be a central part of their business mission. We need to address the fact that making their platforms unblockable has a cost for content hosts and that their business models currently don’t reward them for providing service to these users.

- The US government should treat internet filtering – and more aggressive hacking and DDoS attacks – as a barrier to trade. The US should strongly pressure governments in open societies like Australia and France to resist the temptation to restrict internet access, as their behavior helps China and Iran make the case that their censorship is in line with international norms. And we need to fix the US Treasury regulations that make it difficult and legally ambiguous for companies like Microsoft and projects like SourceForge to operate in closed societies. If we believe in Internet Freedom, a first step needs to be rethinking these policies so they don’t hurt ordinary internet users.

The danger in heeding Secretary Clinton’s call is that we increase our speed, marching in the wrong direction. As we embrace the goal of Internet Freedom, now is the time to ask what we’re hoping to accomplish and to shape our strategy accordingly.

Thanks to Hal Roberts, Janet Haven and Rebecca MacKinnon for help editing and improving this post. They’re responsible for the good parts – you can blame the rest on me.


01/13/2010 (3:45 pm)

Four possible explanations for Google’s big China move

Yesterday, Google announced a major change in their policy in engaging with China – they will no longer censor search results on Google.cn to comply with Chinese policy. This almost certainly means that Google.cn will be blocked by the Great Firewall and that Google will no longer be able to operate in China.

While this aspect of Google’s announcement is sparking a great deal of conversation online, it comes at the end of a bombshell of an announcement – Google’s decision follows what appears to be a coordinated act of espionage aimed at its servers by Chinese attackers. The attack resulted, Google reports, in a theft of their intellectual property. They also report that a goal of the attack was to access the Gmail accounts of Chinese human rights activists and supporters of Chinese human rights around the world. MacWorld reports that the attack targeted an internal system that Google had built to comply with search warrant requests for information on users. When it became clear that this internal system – evidently set up for the benefit of Chinese authorities – was being attacked and used to compromise Google’s internal networks, Google began discussions about disengaging from the world’s largest internet market.

There are at least four ways to read Google’s decision:

Google decided to stop being evil.
Google has received reams of bad press from their decision to comply with Chinese government regulations and censor search results for Chinese users. It’s never been entirely clear to me why Google’s received more criticism than Microsoft – who admit they censored Chinese bloggers, and whose Chinese-language tools prevent posting of articles about human rights and democracy – or Yahoo, who turned over information on user Shi Tao to Chinese authorities that led to ten years’ imprisonment for “leaking state secrets”. I suspect we want to hold Google to a higher standard because they’ve put forth an informal motto: “Don’t be evil”, and compromising with the Chinese government looks like a violation of that stance.

Google’s taken steps to minimize the exposure of user data in China – services like Gmail, which contain sensitive personal data, or which permit publishing, like Blogger, are hosted in the US, not China. (This has made it harder for these tools to achieve market share against Chinese competitors.) They censored in a more transparent fashion than some of their competitors, displaying a message at the bottom of each page, stating that sites had been removed from the results to comply with regulations. Google is a founding member of the Global Network Initiative, a partnership between industry, academia and the nonprofit community designed to develop best practices for engaging in closed societies like China.

In my opinion – shaped, no doubt, by the fact that I’ve got a lot of friends within Google and have worked closely with the company in a couple of contexts – Google was a lot less evil than some of its competitors. But involvement in China remained a thorn in the side of Google on the PR front, and I know many people within the company questioned whether engaging in China was worth the compromises it entailed. The move to leave the Chinese market may be an example of Google returning to its core values and demonstrating an unwillingness to compromise.

Google retreated from a very tough market.
Google wasn’t doing all that well in the Chinese search market – they were a distant second to Baidu, and faced extreme challenges in gaining market share. Google’s main properties – google.com and related sites – are frequently inaccessible through the Great Firewall, and Google’s Chinese site – google.cn – was subject to a great deal of scrutiny from the Chinese press and from regulators. CCTV ran an “exposé” on Google.cn, demonstrating – horror of horrors! – that the internet includes links to pornography – this story led to increased oversight of Google’s Chinese site. Friends within Google tell me that it was a constant struggle to respond to complaints from Chinese regulators, and that they believed competitors like Baidu were reporting Google’s alleged violations to regulators, increasing scrutiny on the company.

The situation within Google China was already quite complicated. Kai-Fu Lee, Google’s China chief, quit in September, giving no clear reasons for his departure. His departure started speculation that Google might be discovering that they couldn’t be competitive in a Chinese market without making even larger compromises to corporate ideals.

It’s hard to imagine Google walking away from a market as potentially lucrative as China, even if they were in a tough battle for second place. And they certainly didn’t walk away quietly. By (obliquely) accusing the Chinese government of involvement in corporate espionage and challenging the government to shut the company down for providing uncensored search, “Google has taken the China corporate communications playbook, wrapped it in oily rags, doused it in gasoline and dropped a lit match on it.” (Those evocative words are from top Chinablogger Imagethief.) This isn’t a temporary strategic retreat – this is a retreat where you detonate the bridges behind you.

Google abandoned Chinese users.
Despite its second place in the market behind Baidu, there are millions of dedicated Google users in China, and many of them are deeply disappointed today and worried about losing access to services they’ve grown to depend on. Reading their comments in translation on Global Voices, thanks to Bob Chen, it’s clear the frustration is less with Google than with the Chinese authorities. One translated tweet is especially poignant:

The sin of facebook is that it helps people know who they wanna know. The sin of Twitter is that it allows people to say what they wanna say. The sin of Google is that it lets people find what they wanna find, and Youtube let us see what we wanna see. So, they are all kicked away.

Bob also shares a joke about China in the years after Google’s departure:

People born in 90s: Today I stepped out of the Great Firewall and saw a foreign website named Google. Shit, it is all but a copy of Baidu.
Born in 00s: What do you mean by stepping out of Great Firewall?
Born in 10s: What do you mean by website?
Born in 20s: What is “foreign”?

Perhaps most striking is a campaign to lay flowers in front of Google’s headquarters in Beijing. Rebecca MacKinnon reports that Tsinghua University’s security department has banned students from taking flowers to Google headquarters without permission.

(Here’s a sympathetic view of Google’s decision to pull out from Chinese activist Michael Anti, who’s been censored in the past by Microsoft.)

Google is about to join the front lines of the anticensorship wars.
Hal Roberts, John Palfrey and I published a study of tools designed to subvert and circumvent internet censorship a few months back, based on research we conducted over the course of three years. In the course of that research, we ended up with a simple realization about the design of censorship circumvention software:

A robust anti-censorship system has, at minimum, three components:
- Lots of non-contiguous IP addresses, making it difficult for censors to block the entry points into the system
- Huge amounts of bandwidth that can access the public internet, as a censorship circumvention system is basically an ISP
- Multiple methods to feed fresh IP addresses to your users

This isn’t a complete definition, of course – good anticensorship systems use SSL encryption to prevent keyword blocking, but that’s a solved problem. The three components above tend to be very hard for small circumvention projects to provide. It’s very hard to obtain lots and lots of IP addresses, and very expensive to provision sufficient bandwidth… unless you’re Google, in which case, these obstacles should be trivial. There’s still lots of work that needs to be done ensuring that users of circumvention systems get fresh IP addresses, but a Google-backed anticensorship system (perhaps operated in conjunction with some of the smart activists and engineers who’ve targeted censorship in Iran and China?) would be massively more powerful (and threatening!) than the systems we know about today.
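The third component, feeding fresh IP addresses to users, is the one that's easiest to picture in code. What follows is a rough Python sketch under my own assumptions, not a description of how any existing circumvention tool works: each user gets a small, stable slice of the pool chosen by hashing, so a censor who signs up as one user learns only a couple of entry points. The function names, the pool of documentation-range addresses and the two-addresses-per-user setting are all hypothetical.

```python
import hashlib


def entry_points_for(user_id, pool, per_user=2):
    """Pick a small, stable subset of the pool for one user (hypothetical scheme)."""
    # Rank every address by a hash of (user, address) and keep the top few.
    ranked = sorted(
        pool,
        key=lambda ip: hashlib.sha256(f"{user_id}|{ip}".encode()).hexdigest(),
    )
    return ranked[:per_user]


def refresh(user_id, pool, blocked, per_user=2):
    """Re-issue addresses once some entry points have been blocked."""
    usable = [ip for ip in pool if ip not in blocked]
    return entry_points_for(user_id, usable, per_user)


# Eight placeholder entry points from a documentation-only address range.
pool = [f"198.51.100.{n}" for n in range(1, 9)]
first = entry_points_for("alice@example.org", pool)
# Suppose the censor blocks everything this user was given.
second = refresh("alice@example.org", pool, blocked=set(first))
print(first, second)
```

With only eight addresses, a handful of fake signups would expose the whole pool. That's exactly why the first component, lots of non-contiguous addresses, matters so much.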

These tools would have a built-in market – the millions of users who were enjoying Google’s tools from within China – and could radically change the landscape of the internet freedom field. An emphasis on internet freedom tools would allow Google to engage with a smaller Chinese market, but would allow them to maintain a toe in the waters while maintaining a stance of disengagement with the Chinese government.

Is Google going to do this? I have no idea. I hope so. They could have done so previously, but it would have been viewed as a shot across China’s bow. Now that they’ve launched a torpedo, that shot across the bow seems more likely.

At Global Voices, we were thrilled that Google chose to partner with us and Thomson Reuters in offering the Breaking Borders Award “to honor outstanding web projects initiated by individuals or groups that demonstrate courage, energy and resourcefulness in using the Internet to promote freedom of expression.” It would be very exciting to see Google becoming one of those groups using their energy, resourcefulness and resources to combat censorship online… and it would certainly take some corporate courage on their part.

We’ll know a lot more about what Google’s doing in the next few days. Responses are already piling up online. Evgeny thinks Google is bluffing, or simply retreating from an unsuccessful market position. Jonathan Zittrain sees this as a masterstroke, aligning Google’s business with its values, and shares my hope that Google will dedicate major resources to censorship circumvention. Dharmishta Rood links to a bevy of reactions from around the web. I’m anxiously awaiting Rebecca’s analysis, which she promises when she finishes two other articles that are due. (Man, I know that feeling.)
