It’s not obvious from looking at me, but while I’m American, I’m deeply partisan towards the nation of Ghana. I moved to Accra, Ghana in 1993 to study xylophone music, and I’ve traveled back to the country almost every year since 2000. I ran a nonprofit organization in Ghana from 1999-2004 and I now work closely with a Ghanaian journalism nonprofit. This dual allegiance is a good thing: I have two teams to root for in the upcoming World Cup (unfortunately, they’ll see each other in the first round), and I take disproportionate pride in Ghana’s economic and political success over the past two decades.
Ghana has a lot to be proud of, in political terms. After almost twenty years of rule by a man who took power through a coup, Ghana democratically elected a President from the opposition NPP party in 2000. After eight years of his rule, they elected a President from the NDC, which had ruled for the previous decades. Political scientists call this a “double alternation”, and it’s considered the gold standard for stability in a democracy, evidence that an electoral system is free and fair enough that either of two major parties can win an election. Due to its clean elections and history of stability, demonstrated when the death of President Atta Mills in office led to a seamless transition to his vice-president John Mahama, Ghana has become the exemplar for democratic transition in West Africa. Ghanaian politicians and NGOs are now working to export models and best practices from Ghana to the region and the continent.
But there’s something uncomfortable about Ghana’s elections. Many of the politicians from the NPP party come from a single ethnic group, the Akan or Ashanti, and their close allies. The NDC has a broader ethnic base of support, but the Ewe are particularly powerful within the party. You can see these alliances in a map of electoral results – the NPP candidate won in the Ashanti and Eastern regions, the home of the Akan, while the NDC won elsewhere, but dominated in the Volta region, where the Ewe hail from. Some critics worry that Ghana’s free and fair elections may be masking elections that are less about political issues and more about ethnic allegiances.
Economist Paul Collier warns of this problem in his book “Wars, Guns and Votes”. He argues that we may be seeing a lot of elections in the developing world that are free, fair and bad. They are free and fair because we’ve gotten very good at monitoring elections for obvious signs of rigging and fraud, but they’re bad because they are decided for reasons other than political issues. In bad elections, Collier argues, people vote for a candidate because they expect some personal financial gain (a job, a handout) or because they see an electoral victory as a victory for their tribe or group. A good election is one in which people vote for a candidate because they expect he or she will make positive policy changes, benefiting a broader community and the nation as a whole.
Free, fair and bad elections happen because it’s hard to hold politicians accountable. We elect politicians because we share their aspirations and visions, but we also elect them because we hope they will ensure that tax dollars are distributed fairly and ensure that our communities benefit from those investments in schools, hospitals, roads and other essential infrastructures. But in many countries, it’s very hard to find out whether our politicians are doing a good or poor job.
Sometimes politicians don’t do a good job because they are corrupt, more interested in their personal gain than serving their communities. In most cases, politicians work hard and their shortcomings are the result of being constrained by finances, thwarted by bureaucracy or otherwise held in check. If we had better ways of tracking what governments do in their communities and documenting the progress of taxpayer-funded projects, we would have far more information we could use to hold our politicians accountable, to re-elect the best and oust the worst. This means a strong, free press is important, as are efforts at government transparency, and systems to ensure access to government information, like freedom of information laws.
In other words, if we want strong, responsive democracies, we can’t just fix electoral systems – we have to fix monitorial systems. And we can’t just establish a culture of clean elections, as Ghana has done – we need a culture of monitorial citizenship.
The idea of monitorial citizenship is one I’ve borrowed from journalism scholar Michael Schudson. Schudson argues that we often understand democracy in terms of “informed citizenship” – our job as citizens is to be informed about the issues and to vote, then let our elected representatives do their jobs. This model became popular in the United States during the progressive era of the early 20th century, and Schudson worries that the model may be out of date, not accurately representing how most people participate in democracies today. One of the models Schudson suggests to describe our current reality is monitorial democracy, where a responsibility as citizens is to monitor what powerful institutions do (governments, corporations, universities and other large organizations) and demand change when they misbehave. The press is a powerful actor in monitorial democracies, as demonstrated during the Watergate scandal and the end of the Nixon presidency in the US. And new media may broaden the potential for monitorial democracy, allowing vastly more citizens to watch, document and share their reports.
This year, my students and I have been experimenting with projects that connect monitorial democracy with the mobile phone. We’ve conducted small experiments locally, monitoring the on-time performance of subway trains and wait times in post offices, and examined what sorts of infrastructures in our local community are built and maintained by different government and private sector actors. And now we’re heading to Belo Horizonte and São Paulo, Brazil for the next round of our experiments.
We’ll work with community organizations in neighborhoods in both cities to identify promises local governments have made that citizens see as highly important. We’ll work with these volunteers to map a few carefully chosen infrastructures in their communities and to track the status of those infrastructures over time. And we’ll work with the community to figure out how we should reward governments that live up to their promises and challenge governments that fall short… all within the course of two three-day, student-led workshops. (!?!)
Our core insight – that citizens can use mobile phones to document infrastructure and monitor government performance – is not a new one. We are inspired by a number of exciting projects that have demonstrated the potential and pitfalls of citizen monitoring and documentation, notably:
- Map Kibera, which has demonstrated the importance of mapping squatter cities and informal settlements to show both the deficiencies and the vitality of infrastructure in those communities
- Ushahidi, which shows that mobile phones combined with mapping can help individuals work together to map crises and opportunities with little central planning
- Fix My Street and related projects, which have helped citizens see governments as service providers, responsible for maintaining infrastructures, and capable of providing customer service to citizens
- Safecast, which has encouraged Japanese citizens to monitor radiation levels in the wake of the Fukushima disaster, helping create data sets citizens can use to lobby the government for better cleanup plans and responses
- The Earth Institute’s collaboration with the Government of Nigeria, to use citizen enumerators, armed with mobile phones, to monitor schools, hospitals and other government-procured infrastructure to establish the country’s progress towards meeting the Millennium Development Goals
We hope to learn from these projects and push our work in a slightly different direction. Our system, Promise Tracker, starts from promises government officials (local, state and federal) have made to a community, and then helps communities track progress made on those promises by monitoring infrastructures like power grids, roads, schools and hospitals. The use case for Promise Tracker is simple: if the mayor of a city makes an electoral promise that roads in a neighborhood will be paved during her time in office, Promise Tracker helps the local community collect data on the condition of the roads and monitor progress made on the promise over time. If the mayor meets her goal, Promise Tracker offers proof generated by the community that’s benefitted. If the government is in danger of falling short, Promise Tracker offers an open, freely shared data set that citizens and officials can use to consult on solving the problem.
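To make the paved-roads use case concrete, here is a minimal sketch in Python of how community reports might be rolled up into a progress figure. This is not Promise Tracker’s actual data model: the `Observation` record, the segment identifiers and the `progress_on` function are all hypothetical names invented for illustration, and the sketch assumes the neighborhood’s roads have already been divided into a known number of segments.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One community report about one stretch of road (hypothetical schema)."""
    segment_id: str    # illustrative identifier for a road segment
    observed_on: date  # day the report was made
    paved: bool        # what the reporter saw


def progress_on(observations, total_segments, as_of):
    """Share of all segments whose latest report on or before `as_of` says paved.

    Segments nobody has reported on yet count as unpaved.
    """
    latest = {}
    # Walk reports oldest to newest so later reports overwrite earlier ones.
    for obs in sorted(observations, key=lambda o: o.observed_on):
        if obs.observed_on <= as_of:
            latest[obs.segment_id] = obs.paved
    return sum(latest.values()) / total_segments
```

Calling `progress_on` with a series of dates gives the kind of over-time picture described above: a number that should climb towards 1.0 if the promise is being kept, and a shared data set to discuss if it stalls.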
It’s this idea of tracking promises that has led us to Brazil. I spoke about the Promise Tracker idea at the Media Lab’s fall sponsors meeting and had two transformative conversations with Brazilians who heard me speak. One conversation was with Oded Grajew, a celebrated Brazilian social entrepreneur and innovator, one of the founders of the World Social Forum, and founder of Rede Nossa São Paulo, “Our São Paulo Network”, a network of community organizations dedicated to transforming and improving that remarkable city. One of Grajew’s many achievements is a successful campaign to get the city of São Paulo to change its constitution and require the mayor to publish campaign promises, allowing citizens to monitor the government’s progress. Grajew invited my students to São Paulo to meet with his organization and see whether the tools we’re building could help his organization keep a close eye on the government’s performance.
The second conversation was more surprising: it was with the government of the state of Minas Gerais, specifically with Andre Barrence, CEO at the Office for Strategic Priorities, who is in charge of innovation in government and the private sector. Minas Gerais is a sponsor of the Media Lab and has been looking for partnerships where Media Lab students and faculty can work with residents of Belo Horizonte and other Minas Gerais communities. It’s not easy for a government to volunteer to open itself to citizen monitoring, and it’s a great credit to the innovative leaders in the Minas Gerais government that they’ve been working hard to find community organizations we can partner with to monitor the government’s progress and enter into a partnership to celebrate successes and work to fix potential failures.
In our workshops in Belo Horizonte and São Paulo, my students – Jude Mwenda, Alexis Hope, Chelsea Barbaras, Heather Craig and Alex Gonçalves – will work with community leaders to understand what promises politicians have made to the community, to identify promises the community is most concerned about, and to identify promises we can evaluate by monitoring infrastructure over time. We’re using codesign methods promoted by our friend and colleague Sasha Costanza-Chock, trying to ensure that what we monitor is what the community cares about, and that we build the tools with the community, who will be responsible for using them over the next few months or years. Our short-term goal is to collect data on a couple of infrastructures in a community, leverage some of Rahul Bhargava’s work on community data visualization to help our partners present data, and to open a conversation with local authorities about tracking an infrastructure over time.
Our long-term ambitions are broader. We hope to build a tool that communities can customize to their own needs and campaigns, but which centers on the idea that mobile phones can collect photographic data, cryptographically stamp it with location information and a timestamp, and release it to public repositories under a CC0 license. We hope we’ll see groups around the world use the tool to track everything from road and power grid condition to air and water quality, integrating low-cost sensors into the system and asking citizens to engage in environmental data collection as well as civic monitoring.
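For readers wondering what “cryptographically stamp” could mean in practice, here is a minimal sketch in Python using only the standard library. It is an illustration, not the tool we are building: the function names, the record fields and the idea of a per-device secret key are all assumptions made for the example. It uses an HMAC, which depends on a shared secret; a real deployment might well prefer public-key signatures so that anyone can verify a stamp without holding the key.

```python
import hashlib
import hmac
import json


def _canonical(record):
    """Serialize a record deterministically so the same data always yields the same bytes."""
    return json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")


def stamp_photo(photo_bytes, lat, lon, timestamp, device_key):
    """Bind a photo's hash to a location and time with an HMAC (hypothetical scheme)."""
    record = {
        "photo_sha256": hashlib.sha256(photo_bytes).hexdigest(),
        "lat": lat,
        "lon": lon,
        "timestamp": timestamp,   # e.g. an ISO 8601 string supplied by the caller
        "license": "CC0-1.0",     # the public-domain dedication mentioned above
    }
    # device_key is an assumed per-device secret, not something from the original post.
    record["stamp"] = hmac.new(device_key, _canonical(record), hashlib.sha256).hexdigest()
    return record


def verify_stamp(photo_bytes, record, device_key):
    """Return True only if the photo, location and time all match the stamp."""
    unsigned = {k: v for k, v in record.items() if k != "stamp"}
    if unsigned.get("photo_sha256") != hashlib.sha256(photo_bytes).hexdigest():
        return False
    expected = hmac.new(device_key, _canonical(unsigned), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(record.get("stamp", ""), expected)
```

The point of the design is that the stamp covers the photo hash, the coordinates and the time together, so changing any one of them after the fact causes verification to fail.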
The key idea behind the project is a simple one: civic engagement is too important to be something we do only at elections.
I’ve been writing and speaking about the recognition that many people feel alienated from existing political processes and like there’s no good path for them to engage in decisionmaking about their communities. This alienation leads to disengagement, and can lead to more dramatic forms of dissent, including public protest. The work I’m trying to do on effective citizenship focuses on the idea that we need to engage in citizenship more than once every four years… and also more often than we take to the streets in protest. It’s my hope that helping people monitor powerful institutions and evaluate the successes and setbacks of their elected representatives will be a way people can engage in citizenship every day.
I’m writing this post while en route to Belo Horizonte, and I’ll share a report on what happened in our workshops and how this idea has changed as I fly home. I’ll also add more links once I have better connectivity. The really good stuff will likely come from the trip report my students put together – I’ll share that as soon as they share it with me.
As I’ve mentioned in years past, Microsoft Research’s Social Computing Symposium is my favorite conference to attend, mostly because it’s a chance to catch up with dozens of people I love and don’t get to see every day. I wasn’t able to blog the whole conference, in part because I was moderating a session, but I wanted to post my notes on the event to share these conversations more widely. I’ve added some of my thoughts at the end as well. Many thanks to Microsoft Research for running this event and to all participants in the panel.
The session is titled “Data and its Discontents”, and it was curated by RIT’s Liz Lawley and MSR/NYU’s danah boyd. They decided not to focus on “big data” – the theme of virtually every conference these days – but on data seen through different lenses: art and creative practice, an ethical perspective, a rights perspective and a speculative perspective.
The opening speaker is professor and artist Golan Levin (@golan), who’s based at CMU. He’s spent the last year working on an open hardware project, so he’s exploring other work, not his own. His exploration is motivated by a tweet from @danmcquillan: “in the longer term, the snowden revelations will counter the breathless enthusiasm for #bigdata in gov, academia, humanitarian NGOs by showing that massive, passive data collection inevitably feeds the predictive algorithms of cybernetic social control”
Levin offers the idea of “the quantified selfie” and suggests we consider it as a form of post-Snowden portraiture. In a new landscape defined by drones, data centers and secret rendition, can these portraits jolt us into new understanding, or give us some comfort by letting us laugh at the situation we are encountering? He shows us John Lennon’s FBI file, and a self-portrait Lennon drew and argues that they are the same thing, “two different GUIs for a single database query.”
Artist Nick Felton is blurring the line between data portrait and portrait by offering data-driven annual reports of his life, analyzing his personal data for the year: every street he walked down in NYC, every plant killed. In honor of the Snowden revelations, he is preparing a 2014 edition that examines the uneasy relationship between data and metadata.
A more confrontational artwork comes from Julian Oliver and Danja Vasilev, called The Man in Grey. Two figures in grey suits carry mirrored briefcases. These are “man in the middle” briefcases, sniffing packets from local wireless networks and displaying what they find on the briefcase monitors. The artwork makes visible a form of surveillance that’s possible (and, as Kate Crawford will later explain, commercializable).
If the ethical issues associated with street-based surveillance don’t give you some pause, consider Kyle McDonald, a Brooklyn-based new media artist who pushes the legal questions around these issues even further. He became interested in the inadvertent expressions he made when he used the computer. Seeking more imagery, he installed monitoring software on all computers in the Apple stores he could reach in New York City, and captured a single frame each minute (only when someone was staring at the screen), uploading it to Tumblr. The images reveal some of the stress and anxiety many of us face when we stare into the screen of a computer – McDonald’s photos reveal expressions from empty to confused, unhappy and unsure.
Apple was pretty unhappy with McDonald’s project, and he was forced to de-install the software, and is not able to show the photos he captured – instead, he shows watercolor versions of the images. But Levin notes that such surveillance isn’t hard to accomplish, and that the project “pushed the legal boundaries of public photography”.
A piece that pushes those boundaries even further is Heather Dewey-Hagborg’s “Stranger Visions”. The artist collects detritus from public places that could contain traces of DNA – cigarette stubs, chewing gum, pubic hairs from the seats of public toilets – and scans the DNA to measure 50 markers associated with physical appearance. Based on these markers, she constructs 3D models of the people she’s “encountered” this way. The portraits are less literal than McDonald’s, but transgressive in their own way, built from information inadvertently left behind.
And that’s the point, Levin argues – “These inadvertent, careless biometric traces and our constructed identities are creating entries in a database whose scope is breathtaking.” None of the art Levin features in his talk was made post-Snowden – surveillance is a theme many artists engage with – but the works take on an especially sinister character when we consider the mass surveillance that’s become routine in America, as revealed by Edward Snowden.
Kate Crawford (@katecrawford) is a professor based at Microsoft Research and MIT’s Center for Civic Media. She’s a media theorist who’s written provocatively about changing notions of adulthood, about gender and mobile technologies, about media and social change, and she’s now working on an examination of the promises, problems and ethics around “big data”. She notes that danah asked speakers on the panel to be provocative, so she offers a barnburner of a talk, titled “Big Data and the City: Ethics, Resistance and Desire”.
Her tour of big data starts in the nation of Andorra, a tiny nation in the Pyrenees that’s been facing hard times in the European economic crisis. The government decided to try a novel approach to economic recovery: they decided to gather and sell the data of their citizens, including bus and taxi data, credit card usage data and anonymized telephony metadata. The package of data and the opportunity to study Andorrans is being marketed as a “real-world, living lab”, opening the possibility of a “smart nation” that’s even more ambitious than plans for smart cities.
These labs, Kate tells us, are being established around the world, and according to their marketing brochures, they look remarkably similar no matter where they are located. “There’s always a glowing city skyline, then shots of attractive urbanites making coffee and riding bikes.” But behind the scenes, there’s a different image: a dashboard, usually a map, that’s a metaphor for the central controller – a government agency? a retailer? – to examine the data. You leave a data trail, and someone else gathers and analyzes it. What we’re seeing, Kate offers, is the wholesale selling of data-based city management.
This form of pervasive data collection raises questions about the line between stalking and marketing. Turnstile, a corporation that has set up hundreds of sensors in Toronto, gathers the wifi signals of passing devices, mostly laptops and phones. If you have wifi enabled on your phone, you are traceable as a unique identifier, and if you sign onto Turnstile’s free wifi access points, the system will link your device to your real-world ID via social media, if possible. You don’t agree to this release of data – Turnstile simply collects it. They’re using it to provide behavioral data to customers – an Asian restaurant discovers that many of its customers like to go to the gym, so it creates a workout t-shirt to market to them. This leads Kate to offer a slide of a man wearing a t-shirt that reads “My life is tracked 24/7 by marketers and all I got was this lousy t-shirt.”
Often this pervasive tracking is justified in terms of predictive policing, improving traffic flow, and generally improving life in cities. But she wonders what kind of ethical framework comes with these designs. What happens if we can be tracked offline as easily as we are online? How do we choose to opt out of this pervasive tracking? She notes that the shift towards pervasive tracking is happening out of sight of the less-privileged – some of the people affected by these shifts may be wholly unaware they are taking place.
Behind these systems is the belief that more data leads us to more control. She notes that Adam Greenfield, author of “Against the Smart City”, argues that the idea of the smart city is a manifestation of nervousness about the unpredictability of urbanity itself. The big data city is, ultimately, afraid of risk and afraid of cities.
When people react to these shifts by arguing for rights to privacy, Kate warns that we need to move beyond an analysis that’s so individualistic. The effects are systemic and societal, not just personal, and we need to consider implications for the broader systems. Not only do these systems violate reasonable expectations of privacy and control of personal data – “this would never get past an IRB – human data is taken without consent, with no sense of how long it will be held and no information on how to control your data” – they have a deeper, more corrosive effect on societies.
She quotes James Bridle, creator of the site-specific artwork “Under the Shadow of the Drone”, who notes one difficulty of combatting surveillance: “Those who cannot perceive the network cannot act effectively within it and are powerless to change it”. Quoting De Certeau’s “Walking in the City”, she sees the “transparency” of big data as “an implacable light that produces this urban text without obscurities…”
Faced with this implacable light, we can design technologies to minimize our exposure. We can use pervasive, strong cryptography; we can design geolocation blockers. We can opt out or, as Evgeny Morozov suggests, participate in “information boycotts”. But while this is fine for certain elites, Kate postulates, it’s not possible for everyone, all the time. In the smart city, you are still being tracked and observed unless you are taking extraordinary measures.
What does resistance look like to these systems when opt-in and opt-out blur? Citing Bruce Schneier, Kate suggests that we need to analyze these systems not in terms of individual technologies, but in terms of their synergistic effects. It’s not Facebook ad targeting or facial recognition or drones we need to worry about – it’s the behaviors that emerge when those technologies can work together.
What do we lose when we lose a space without surveillance? Hannah Arendt warned of the danger to the human condition from the illumination of private space, noting “there are a great many things which cannot withstand the implacable, bright light of the constant presence of others on the public scene.”
Kate offers desire lines, the unpredictable shortcuts that emerge in public spaces, as a challenge to the smart city. We need a reflective urban unplanning, an understanding of the organic ways in which cities should work, the anarchy of the everyday. This is a vision of cities that values improvisation versus rigidity, communities versus institutions. In the process, we need to imagine a different ethical model of the urban, a model that allows us to change our minds and opt for something different altogether. We need a model that allows us to reshape, to make shortcuts and desire lines. We need a city that lets us choose, or we will be forever followed by whoever is most powerful.
Mark Latonero of USC Annenberg offers a possible counterweight and challenge to Kate’s concerns about big data. Latonero works at the intersection of data, tech and human rights, focusing on human trafficking. Human trafficking is common, and in severe cases, is a gross violation of human rights, sometimes involving indentured servitude or forced sex. It doesn’t have to involve transportation – he reminds us that human trafficking happens if someone is held against their will in Manhattan – and involves men, women, girls and boys.
His work has focused on the trafficking of girls and boys under 18 in the sex trade, a space where intervention is especially important as victims often experience severe psychological and physical trauma. (The children involved are also below the age of consent, which makes it easier in ethical terms – there are no considerations of whether a victim voluntarily chose to become a sex worker.)
Both victims and exploiters are using digital media, Mark tells us, if only mobile phones to stay in touch with family members. As a result, there are digital traces of trafficking behavior. Mark and colleagues are working to collect and analyze this data, including facial recognition as well as algorithmic pattern identification that could indicate situations of abuse. “It’s hard not to feel optimistic that this work could save a human life.”
But this work forces us to consider not only the promises of data and human rights, but the quagmires. This sort of work draws upon a kind of surveillance, and even watching that’s intended for a social good raises concerns about trust and control. “Gathering data in aggregate helps us monitor for human rights abuses, but intervention involves identifying and locating someone – a victim, or a perpetrator,” he explains. “Inevitably, there is a point where someone’s identity is revealed.” The question the human rights community has to constantly ask is “Is this worth it?”
Human rights work always involves data: data about humans, both about individual humans and aggregate data and statistics about groups of humans. At best, it’s a careful process relying on judgement calls made by human rights professionals. It’s worth asking whether it’s a process big data companies could help with. As we ask about the involvement of big data companies, we should ask about the balance between civil liberties risks and human rights benefits.
Despite those questions, the human rights community is moving head first into these spaces. Google Ideas, Palantir and Salesforce are assisting international human trafficking hotlines, analyzing massive data sets for patterns of behavior, hot spots where trafficking may be common. But all the questions we wrestle with when we consider big data – what are the biases in the data set? Whose privacy are we compromising and what are the consequences? – need to be considered in this space as well.
“Big data can provide answers, but not always the right ones,” Mark offers. One of the major issues for the collaboration between data scientists and human rights professionals is the need to work through issues of false positives and false negatives. Until we have a clearer sense of how we navigate these practical and ethical issues, it’s hard to know how to value initiatives like “data philanthropy”, where the private sector offers to share data for development or for protection of human rights.
There’s a growing community of data researchers who are able to bear witness to human rights violations. He shares Kate’s desire for an ethical framework, a way of balancing the risks and benefits. Is the appropriate model adopted from corporate social responsibility, which is primarily self-regulatory? Is it a more traditionally regulated model, based on pressure from NGOs, consumers and others? He references the “Necessary and Proportionate” document drafted by activists to demand limits to surveillance. If we could move towards an aspirational set of international principles on the use of big data to help human rights, we’d find ourselves in a proactive space, not playing catch up.
The session’s final speaker is Ramez Naam, a former Microsoft engineer who’s become a science fiction author. His talk, “Big Data 7000”, offers two predictions: big data will be big, and will cause big problems. The net effect is about the who, not the what, he offers. It’s about who has access to these technologies, who sets the policies for their use.
Ramez shows a snippet of DNA base pairs, a string of ATCGs on a screen. “This is someone’s genome, probably Craig Venter’s, and as promised, once we sequenced the genome, we ended all health problems, cracked ageing and conquered disease.” It turns out that genes are absurdly complex – they turn each other on and off in complex and unpredictable ways. “We can barely grok the behavior of half a dozen genes as a network.” To really understand the linkages between genes and disease, we’d need to collect lots more genetic data. Fortunately, the cost of gene sequencing is dropping much faster than Moore’s law, and there’s now the long-promised $1000 gene sequencer. But to really understand genes and disease, we’d need to collect behavioral and trait data about people whose genomes were sequenced – what was the person like, what diseases did they suffer, did they have high blood pressure, what was their IQ?
Personal monitoring tools like Fitbit generate lots of individual value, and potentially lots of societal value, by helping us understand what behavioral and diet interventions are most helpful. Will you get fitter on the paleo diet? Or will red meat kill you? Our data about behavior and health is so sparse that we don’t know which is true, despite one third of health spending going to weight loss and fitness programs and tools.
Is Nest a $3 billion distraction for Google? Or the first step towards the Google-powered smart electrical grid? Enormous financial and environmental benefits could come from a smart grid – if we could manipulate electrical usage we might be able to take thousands of “peaker” plants, plants that run for only a few hours a day, offline.
Pick almost any field, and we can imagine situations where more data would be helpful. Education? Sure – if we had more rigorous understandings of what teaching techniques work and fail, what makes a good teacher and a poor one, could we potentially transform that critical field?
Ramez pivots to the problems. There will be accidental disclosures of data. He suggests we look at two stories about Target, one where they accidentally revealed a daughter’s pregnancy to a distraught father by sending her coupons for baby supplies, and the recent leak where Target lost 70 million credit card numbers (including mine). It could have been worse, and it probably will be, Ramez argues. It could have been data about where you go, your SMS messages, your email – they will inevitably be released.
Anonymization of data sets doesn’t really work, he reminds us. Chevron recently lost a massive lawsuit in Ecuador and has sued to identify the activists who sought charges against the company. It’s very scary to consider what might happen to those activists, Ramez tells us.
“The NSA is not the worst abuse of surveillance we’ve seen,” he points out. J. Edgar Hoover bugged Martin Luther King Jr’s hotel rooms with the approval of JFK and RFK, who were worried that MLK was a communist sympathizer. In the process, Hoover discovered that MLK was having an affair, and sent threatening letters to him promising to reveal the secret if MLK didn’t commit suicide. This is heinous abuse, on a scale that’s not been revealed in recent revelations. But if the current abuses are significantly more minor, the scale is massive, with millions of individuals potentially at risk of blackmail.
Still, what’s critical to consider is not the what, but the who. There are checks and balances between we the people, corporations, government. There are conflicts between all of these. We vote within a democracy, Ramez argues, and we can vote with our feet and with our dollars. Sometimes corporations and governments are in collusion – sometimes they’re in conflict. Sometimes government does the right thing, as with the Church Committee, which investigated intelligence activities and helped curb abuses. We may need to consider the legacy of the Committee closely as we examine the current situation with the NSA.
There’s some hope. Ramez reminds us that “leaking is asymmetric.” As a result, conspiracies are hard, because it’s hard to keep secrets. “If you’re doing something heinous, it’s going to get out,” he says, and that’s a check.
His talk is called Big Data 7000 and he closes by imagining big data 7 millennia ago, showing an image of a clay tablet covered with cuneiform. “When the Sumerians began writing in linear A – that was a dystopian period of big data.” Writing wasn’t empowering to the little people, Ramez tells us – the use of written language created top-heavy, oppressive civilizations. It’s the model Orwell had in mind when he wrote 1984. That image of the control of technology in one mighty hand, not distributed, is at the root of our technological fears.
But technology can be liberating – the rise of the printing press put technology into many hands, allowing for the spread of subversive ideas including civil rights. The future of the net, he hopes, is in moving from big data as something in the hands of the very few to data in the hands of the very many.
Hi, Ethan here again.
What I really appreciated about this panel was a move beyond rhetoric about big data that is purely at the extremes: Big data is the solution to all of life’s mysteries! Big data is an inevitable path to totalitarianism! What’s complicated about big data is that there’s both hype and hope, reasons to fear and reasons to celebrate.
The tensions Mark Latonero identifies between wanting surveillance to protect against human rights abuses, and wanting to protect human rights from surveillance are ones that every responsible big data scientist needs to be exploring. I was surprised to find, both at this event and in a recent series of conversations at Open Society Foundation, that these are tensions the human rights community is addressing head on, in part due to enthusiasm for the idea that better documentation of human rights abuses could lead to better interventions and prosecutions.
The smartest phrase I’ve heard about big data and ethics comes from my friend Sunil Abraham of the Bangalore Center of Internet and Society, who was involved with those conversations at OSF. He offers this formulation: “The more powerful you are, the more surveillance you should be subject to. The less powerful you are, the more surveillance you should be protected from.” In other words, it’s reasonable to both demand transparency from elected officials and financial institutions, while working to protect ordinary consumers or, especially, the vulnerable poor. Kate Crawford echoed this concern, tweeting a story by Virginia Eubanks that makes the case that surveillance is currently separate and unequal, more focused on welfare recipients and the working poor than on more privileged Americans.
There’s no shortcut to the hard conversation we need to have about big data and ethics, but the insights of these four scholars and those they cite are a great first step towards a richer, more nuanced and smarter conversation.
The other day, I had coffee with a friend who works for the New York Times. Early in the conversation, I admitted to him that I’ve developed a love/hate relationship with the Times. I love much of the paper’s content (though I share Greenwald’s wish that the Times would call torture “torture”) and find that many of the most interesting stories I read in a week come from the Times. But I am getting really sick of the Times’s efforts to nickel-and-dime me as a digital subscriber. Despite paying for access to the paper’s excellent content, they somehow make me feel like a piker if I’m not a subscriber to the print edition at nearly a thousand dollars a year.
I can access the Times through MIT, but decided that I read the paper often enough on other devices and outside of MIT’s network that I should become a digital subscriber. For a couple of weeks, I was a satisfied customer, reading far more than my allotment of ten free stories in my browser, and flipping through the paper on my phone when in transit or waiting on lines. But then the Times implemented its new “three articles a day” plan for mobile readers of the paper. My digital subscription – which costs $240 a year – includes a tablet and web version of the newspaper, but getting unlimited access via my phone costs an additional $180 a year.
Because the Times evidently takes its business cues from the widely despised cable TV industry, they like to bundle their content. As a result, the best way to get digital access is to purchase it bundled with the paper edition of the newspaper… which the Times won’t deliver to my rural address. I could also downgrade my bundle from web and iPad to web and phone, but it seems bizarre to me that digital data paid for in one place can’t be used in another.
And so I’ve found myself in the space of Times hacking, looking for ways to get content I want to read for a less exorbitant price than the Times wants to charge. (My current strategies: I am using my web subscription to dump articles to Instapaper, which I then read on my phone. Take that, Big Media!) Here, I join a large cadre of people who proudly post their tips for defrauding the Times so they can continue reading for free.
Let’s compare this situation to another media organization many New York Times readers rely on: public radio. No one writes articles bragging about how they avoided donating to NPR or how they get podcasts for free. In part, that’s because we don’t have to – public radio, for technical and historical reasons associated with the challenge of monetizing broadcast radio, is free by default, supported by voluntary donations. But there’s another reason: people love public radio, want to support it and feel guilty when they don’t.
I don’t intend to argue that the New York Times should become member supported. But I do want to make the case that they would benefit from thinking about the relationship public radio stations and shows have built with their members.
There’s a diagram that often gets drawn on napkins or whiteboards when media people get together: a Pareto, or long-tail curve, where the Y axis represents how engaged with content your readers are, and the X axis represents your reader population. Near the origin of the graph, the curve is very high – those are the small set of users who are deeply engaged with your content. Farther from the origin, as the curve flattens out, we have the majority of readers, who engage with your content occasionally. For the New York Times, it’s key to turn the folks on the left of the graph into subscribers and to make money from the right of the graph through ads. And as we head towards Peak Ad, it’s increasingly important to move readers across the paywall that separates the left and right side of the graph.
Public radio stations, producers and podcasters face a similar graph. In their case, it’s critical to get the left side of the graph to become members or make donations. But instead of dropping a paywall, they use a combination of gratitude and guilt to persuade their most engaged listeners to support their programming. When they do it well, their listeners feel terrific: Ira Glass urges listeners to defray WBEZ’s bandwidth costs for delivering This American Life online, telling us that if we could give more than $5, we’d pay not only our costs but those of listeners who don’t donate. And if we don’t donate? We feel guilty, but not criminal. The New York Times, which reminds me how many of my three free stories I’ve read on my phone, makes me feel like there’s a security guard trailing me to make sure I don’t stuff an extra New York Times article down my pants.
I suspect the business folks at the Times are operating under the assumption that there are only two places to be on their subscriber/revenue curve – you can be a subscriber and pay $300-800 a year, or you can be an outsider and cover a tiny fraction of your free riding with ad views. But there’s another option: the Times could start thinking of its readers in terms of subscribers, fans and passers-by. The Times won’t monetize passers-by, except through ads – these are folks who stumble onto the site occasionally and may not even realize they are reading Times content. That’s frustrating, but that’s how the web works. And the Times should certainly cultivate subscribers and encourage more fans to become subscribers. But they might do a better job of that by courting their fans, instead of locking them out.
Fans could be encouraged to support content on the Times not through a threat of locking them out, but by encouraging them to support the paper, and especially, the parts of the paper they value the most. When I donate to WNYC, I always take the opportunity to tell WNYC that I’m not a customer of the station as a whole, but of On The Media, my favorite outlet for smart media criticism. I have to think that some Times readers would love the opportunity to give to the paper and say, “Please don’t give this to Maureen Dowd. I’m giving in the hope of more Ta-Nehisi Coates op-eds.”
A New York Times that courted its fans would help fans track how much content they access from the Times, and perhaps, from other sources as well. It would take a suggestion from Doc Searls’s ideas about tracking usage of public radio and allowing users to donate to stations or programs that they listen to often. It might recognize that fans of the Times are fans of other publications, like The Guardian, the Christian Science Monitor or Planet Money, and band together with some of those other outlets to build a common tracking, membership and recommendations platform. (It would be very interesting for New York Times fans to discover they’re deeply dependent on the site’s content… or that they actually read other sources more than the Times.) It could start treating fans who choose to subscribe as members, thanking them for making media accessible to others rather than making it clear that their content is only for those who pay.
Making news accessible for non-readers as well as readers is critical. News organizations have two bottom lines: they need to make enough money to keep the presses running, and they need to have a civic impact, holding the powerful responsible and giving citizens the information they need to participate in a democracy. As ad revenues decline, there’s a tendency for paywalled news sites to provide information only to the small group of people who subscribe to the paper. In the process, it’s possible that newspapers will lose their broader civic impact. If sites could find a way to get support from non-subscribers as fans, they could open their content to a broader audience and have more civic influence.
This would require some serious rethinking for the Times, and it’s quite possible they can support their reporting without making this change in the short term. But if we’re moving to a world where people are less dependent on a single media source, like the Times, and more inclined to pick and choose news from multiple sites, the Times will need to realize that fans can’t pay $300 to each content provider they want to support. Perhaps it’s time for the Times to start embracing and celebrating those fans, instead of alienating them.
It’s the 75th anniversary of the Nieman Foundation, and the Harvard-based program is bringing back generations of its fellows, mid-career journalists brought to Cambridge to study for a year, to honor and celebrate the institution. One of the events on the program is the “Ninety Minute Nieman”, where organizers have invited a set of Cambridge-based professors to offer a taste of what happens in their classes in 10 minutes each. (7 professors, 10 minute lectures + the inevitable shuffling of papers = a 90 minute Nieman.)
I’m lucky enough to be one of those presenters. My friends at Nieman encouraged me to be provocative, as that’s the role I seem to have every year when I come and talk to new Nieman fellows. As someone with no experience in conventional newsrooms (save the summer I was the sports reporter for the Lewisboro Ledger at age 16) and as the co-founder of a global citizen media project, I tend to embody the anxieties and fears some mid-career journalists hold when they spend a year considering the future of their profession.
So why fight it? I decided to use my time on stage to make an argument I passionately believe: that journalism needs to help people be effective and engaged civic actors, and if it doesn’t, it shouldn’t expect to survive financially or in terms of influence. In the event that I’m struck by a flying shoe thrown by a journalist or editor in the audience and laid low, I’ve posted the text of what I hope to say.
I teach a class at MIT’s Media Lab called “News and Participatory Media” that’s become popular with Nieman scholars. I designed it as a class for engineers and software developers – the sorts of folks we expect to find at the Media Lab – with the goal of exposing them to different reporting problems so they understand some of the challenges journalists face, before working to build new tools for use in traditional and non-traditional newsrooms. Nieman fellows find it interesting, I think, because it exposes them both to different ways to think about reporting, and to students who think about news very differently. This leads to some interesting collaborations: a business reporter for one of Nigeria’s most prominent newspapers sought out one of my doctoral students for help scraping UK property databases to identify assets owned by kleptocratic Nigerian governors. We have a pretty good time.
Because the class includes reporters, who tend to be very passionate about the future of journalism, and geeks, who tend to be very passionate about social media and pretty skeptical about the current state of journalism, we have some interesting arguments over the course of a semester. I enjoy stoking these arguments, so I often bring in provocations to get us started. Which led me to bring in a remarkable column from Swiss novelist Rolf Dobelli.
Dobelli was pitching a new book, “The Art of Thinking Clearly” – which purports to use neuroscientific and cognitive science research to explain why it’s so hard to think clearly, and thinking as clearly as I can muster, I can’t recommend the book. But there’s one section, which was excerpted in The Guardian, titled “News is Bad for You” that’s a very worthwhile read. Dobelli claims that he gave up reading news four years ago and is a happier man for it: he describes news as a drug, a time-wasting habit that gives us the brief sense that we’re doing something productive and positive, but actually breaks our focus and distracts us while failing to explain the world in deep and meaningful ways or give us anything we, as readers, can do about what we read.
I figured this would spark a great conversation in our class: who would rise to defend the importance of staying informed in order to be an effective civic actor? To my great surprise, most of the class – the hacks and the hackers – were in agreement that Dobelli was more right than wrong. In part, this is because we all agreed that there’s a lot of badly written news out there – news that provides little background or context – and several of Dobelli’s critiques call out decontextualized, shallow reporting. But the idea that had the most resonance for the class, and for me, was this: news is seldom connected to decisions we have to make as individuals, and consuming news about situations we can’t influence will ultimately instill a sense of learned helplessness.
This is a particularly tricky argument for me, as my schtick for the past decade has been to argue that Americans need more information about the rest of the world. I just wrote a book that makes the case that we should rewire both news and social media to help us get a more cosmopolitan view of the world, so we can find connection and inspiration and solve global problems. But Dobelli has a point: the stories I’ve been trying to get Americans to pay attention to through Global Voices – repression and rebellion in Sudan, revolution in Tunisia, the rise of an African middle class – aren’t stories where readers have much agency. And part of the reason it’s so damned hard to get people to pay attention to events and voices that are geographically far away is that people rightly ask, “Why does this matter to me? Is this going to change how I work? How I shop? Even how I vote?” And the answer is, “probably not”.
I thought of Dobelli’s questions this summer, when I read Michael Schudson’s book, The Good Citizen. Schudson argues that the expectation for what a good citizen of America does has changed as our country has changed. In the post-independence period, the good citizen voted to elect the worthiest members of society to represent them – it was democracy by assent, largely noncompetitive. In the 19th century, good citizens were members of political parties not because of ideology, but largely because of personal and professional ties, and those parties, while competitive, competed on personality and organization, not on issues.
Citizenship as many of us think about it is a product of the progressive era, Schudson argues. To overcome the party machines, progressives promoted the model of the informed citizen, where muckraking newspapers uncovered malfeasance, where newspapers and magazines informed citizens on the issues of the day, where informed voters didn’t just elect representatives but voted directly on legislation through the ballot initiative process.
Schudson has at least two issues with the model of the informed citizen – he thinks it’s aspirational at best and farcical at worst, and he thinks its time passed somewhere around the 1960s. It’s impossible for a citizen to be informed on the range of issues that affect society – here he’s echoing some of Walter Lippmann’s concerns from “Public Opinion” – and that’s not how the vast majority of us vote. And, he argues, since the 1960s, civic engagement that’s focused on making lasting change has focused on the courts, not on the ballot box – we’ve got a model of citizenship where lawsuits to establish rights and regulatory agencies that protect them are where much of the work of citizenship gets done.
What I find helpful about Schudson’s argument is not his vision of rights-based citizenship, but the idea that the shape of citizenship can change over time. I think we’re experiencing one of those changes – I think we’re seeing a new form of civics that focuses on agency, on participation, and on trying to make an impact even at a very small scale.
It’s a version of citizenship that’s suspicious of opaque systems and institutions and is highly focused on seeing where effort and money goes – think of Kiva, which encourages people to loan money directly to developing world entrepreneurs, or Donors Choose, where people give to specific schools in need. Think of crowdfunding, where people support individual pieces of art they want to see made, rather than supporting arts institutions. Think of people who are politically engaged in campaigns on single issues – to arrest George Zimmerman or Joseph Kony – rather than to elect a party or a person. This is a version of citizenship that’s highly personal, highly decentralized, pointillist rather than sweeping in scope. It’s a vision of citizenship consistent with the changes we’ve seen with media, where everyone is creating media, whether it’s a Facebook update for friends or a blogpost that acts as an oped.
And just as we’ve discovered how difficult it is to navigate news and media in this space – how do we triangulate reports on Twitter and Facebook and government statements in a crisis like the Westgate mall attacks, especially when it turns out that official government sources are often less accurate than citizen sources – we’re discovering that it’s really hard to navigate this civic space. When tens of millions of American teens suddenly start demanding that the US put military forces in Uganda to arrest a warlord in the Central African Republic, do we treat this as a teen fad or as a serious policy concern? Do we use this as a chance to bring Ugandan voices into the dialog, or do we focus on the personal story of Jason Russell and his nervous breakdown? The KONY 2012 campaign gained an enormous amount of attention and, for a few weeks, shifted public debate – we need help figuring out whether it had impact, a question we should be asking both of campaigns that seek change by marshaling attention, and of journalism as a whole: what’s the civic impact?
This is a place where the news can help. We’re seeing a generation that’s not apathetic – they’re desperate to have impact. When we see them shying away from party politics, it’s not because they’re selfish or self-obsessed, it’s because they have a very hard time seeing how writing to their congressperson is going to change anything when congress lurches from shutdown to shutdown and passes historically few laws. People want to have impact, and the news can help.
We can help people understand where and how they can have impact. We can build on what David Bornstein is calling solutions journalism, featuring not just problems but the people and organizations trying to solve them – and we can do this in a way that probes at whether the solutions are as good as promised. We can link news stories to groups and campaigns trying to address the issues in those stories, as the Christian Science Monitor is doing in partnership with Shoutabout on their DC Decoder section. When we report on a crisis like Superstorm Sandy, we’re unafraid to drive readers to the Red Cross to help out – when my publication Global Voices reports on Kenya, we can point to ways people can help in the wake of the Westgate shooting, whether that’s to groups providing assistance to families, or to civil rights organizations now organizing to protect Kenya’s Somali population against an inevitable backlash.
What we cannot do is keep reporting news that keeps our readers informed but ineffective. There’s just too much else for them to pay attention to, whether it’s entertainment content or self-reinforcing, comfortable, partisan opinion. We’re losing the news not just because the financial models have changed, but because the civic models have changed. I doubt there’s a person in this room who got into the business for the money – everyone I know is motivated by a vision of public service. I worry that we’re failing to do public service because the way the public participates has changed. If we’re stuck in a paradigm where we inform citizens, then declare our work done, we’re failing in our public service duties.
By now, there’s any number of people in the audience waiting to ask the question, “Isn’t this advocacy journalism?” Since our forum doesn’t let you ask questions, let me go ahead and answer that for you: hell yes. And we should get used to it, because we’re all already doing advocacy journalism. Now that it’s incredibly easy to produce and disseminate information, what’s scarce is attention. Whenever we make a news judgement to put a story on our front page or deep inside our papers or sites, when we decide to cover a story in Malawi or in Mattapan, we’re doing advocacy journalism. We’re a part of a complicated ecosystem where everyone – activists promoting a cause, companies promoting a product, reporters delivering news – is competing for attention, and news organizations are very powerful actors within that system.
We’re advocating for the idea that what we’re covering is worth someone’s attention, and is worth more attention than something we’ve chosen not to cover. What Laura and Chris Amico have done with Homicide Watch is advocacy journalism at its very best – they advocate for the idea that everyone killed in DC or Chicago deserves to be reported on, whether they were black or white, rich or poor. When Godwin Nnanna reports on Nigerian governors buying mansions in Mayfair with money looted from Nigeria’s treasury, he’s committing advocacy journalism, just as he should, demanding that kleptocrats be held responsible for their crimes. When the Guardian puts Glenn Greenwald on the front page, asking hard questions about government surveillance gone mad, it’s most certainly advocacy journalism and it’s advocacy journalism that we need if we want journalism to survive in the face of unconstitutional actions that changed forever our ability to assure sources that their identities will remain confidential and that they can talk to the press without fear of losing their jobs.
The problem isn’t journalism that advocates – it’s journalism that advocates a sadly limited set of options: vote for this guy or for that guy. We need journalism that helps us understand how we can participate and be effective, whether it’s through an election, a petition, a boycott, a new business model or technology. We need to ask whether our stories are teaching our readers to be helpless, or helping them become effective citizens.
My goal in teaching is to help my students see things from a different perspective, whether or not they end up agreeing with me – I aim to provoke more than persuade and hope I’ve done the former if not the latter. Please keep sending me Niemans.
More than a billion people a month visit YouTube to watch videos.
Sometimes, those billion people watch the same video. More often, they don’t.
YouTube shares information about what videos are popular in different cities and different countries, and for the US, offers a tool to see what videos are popular with different age groups and genders.
We were interested in seeing what videos were popular in different countries, and especially, what videos were popular in more than one country. For the past six months, we’ve gathered data from YouTube to understand What We Watch. The videos we feature are videos that appear on YouTube’s Trends dashboard. These are the videos trending in any of 61 countries – they are not necessarily the most popular of all time, or even most popular that month, but they are receiving a lot of attention in a short period of time. (Gilad Lotan’s explanation of trending topics on Twitter is useful for understanding that distinction.)
What We Watch is a browser for popular YouTube videos, built by Ed Platt, Rahul Bhargava and Ethan Zuckerman at MIT’s Center for Civic Media. (Rahul did data acquisition, Ed did visualization and Ethan waved his hands and requested features inappropriately late in the design process.)
Click on a country, and you’ll get a list of videos that have trended in that country, and a map that shows other countries that watch the same videos. Click a tab, and you can see videos popular just in that country, and not in other countries. Click on a second country, and you’ll see what top videos the countries have in common. Click a video itself, and you’ll get the video itself and a map of the countries where it was popular.
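For readers who think in code, the comparisons behind those clicks can be sketched as simple set operations. This is an illustration only, not the code that runs What We Watch: the country codes, video IDs and function names below are made-up placeholders, and the real system works from trending data gathered over many dates.

```python
# Hypothetical data: each country maps to the set of video IDs that
# have trended there. These IDs are placeholders, not real videos.
trending = {
    "US": {"vid_roar", "vid_punjabi", "vid_local_us"},
    "DE": {"vid_roar", "vid_punjabi", "vid_hangouts"},
    "IN": {"vid_punjabi", "vid_hangouts", "vid_local_in"},
}


def shared_videos(data, country_a, country_b):
    """Videos that have trended in both countries."""
    return data[country_a] & data[country_b]


def unique_videos(data, country):
    """Videos that have trended in this country and in no other."""
    elsewhere = set()
    for other, videos in data.items():
        if other != country:
            elsewhere |= videos
    return data[country] - elsewhere


def countries_for_video(data, video):
    """Countries where a given video has trended, in sorted order."""
    return sorted(c for c, videos in data.items() if video in videos)


print(shared_videos(trending, "US", "DE"))
print(unique_videos(trending, "US"))
print(countries_for_video(trending, "vid_hangouts"))
```

Each function corresponds to one of the clicks described above: two countries, one country's exclusive videos, and a single video's map.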
The results are often surprising. The US has more trending videos in common with Germany and the Netherlands than with near neighbors Canada and Mexico. One of the US’s top videos is a Punjabi music video that’s also got an audience in India and Germany. And a 90 second ad for Google Hangouts is surprisingly popular around the world… though it hasn’t trended in the US, its apparent target market.
While What We Watch is a fun way to navigate the wealth of content available on YouTube, there are serious research questions behind the project as well. In Rewire, I argue that a network that connects computers throughout the globe doesn’t guarantee that content – like videos – will spread across borders of language, culture and nation. Some of what we’re finding on What We Watch supports that contention, and some challenges it.
The music video for “Roar” by Katy Perry offers evidence that some videos find truly global audiences – the video has trended from Peru to the Philippines, and is one of the top videos in Turkey and Saudi Arabia. Other videos find regional, but not global audiences – take P-Square’s “Personally”, which was in the top 10 in Nigeria for 17% of dates we tracked, and is popular in Ghana, Uganda, Kenya, and Senegal… but nowhere outside of sub-Saharan Africa. And some videos never leave home: Brazil’s top trending video, a humorous ad for a phone company that requires no translation, doesn’t show up on the top charts for any other country.
I’ve been deeply influenced by Pippa Norris’s work on the spread of culture and values across national borders, specifically her book “Cosmopolitan Communications” with Ronald Inglehart. They argue that people tend to overestimate the Katy Perry effect in which US culture sweeps the globe, leveling everything in its path. In some cases, people encounter another culture and reject it violently (the Taliban model), shape it and incorporate it into a new hybrid (the curry model) or simply decide it’s not for them (the firewall theory.) We see evidence for three of the four in our data – it’s hard to see the Taliban model because violent rejection would likely mean banning YouTube, which gives us no data to measure.
We also get some hints on what countries have videos in common. Language matters: countries in Latin America tend to have videos in common with other Spanish-speaking countries. But Brazil and Portugal don’t share much content (and Brazil’s viewing habits have little overlap with anyone, offering another theory: if you have a big enough domestic internet, you may develop your own, insular internet culture, as in Japan as well.)
We got very interested in countries that share content with lots of other countries. To identify these countries, we used a metric called “betweenness centrality”. Imagine the countries as nodes on a graph, connected by links that represent videos in common. If you calculate the shortest paths from each node of the graph to every other node, nodes that many of those paths move through have high betweenness centrality – they are bridges through the network.
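Here is a minimal sketch of that calculation, for the curious. It is not the code behind What We Watch: it assumes an unweighted, undirected graph (the real links presumably carry weights based on how many videos two countries share), it uses Brandes' algorithm as one standard way to count shortest paths, and the four-country graph is invented purely for illustration.

```python
from collections import deque


def build_graph(edges):
    """Turn a list of (a, b) links into an undirected adjacency map."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def betweenness_centrality(graph):
    """Unnormalized betweenness for an unweighted, undirected graph."""
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search from source, counting shortest paths.
        order = []
        predecessors = {node: [] for node in graph}
        num_paths = {node: 0 for node in graph}
        num_paths[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    num_paths[w] += num_paths[v]
                    predecessors[w].append(v)
        # Walk back from the farthest nodes, crediting each node with
        # its share of the shortest paths that pass through it.
        dependency = {node: 0.0 for node in graph}
        while order:
            w = order.pop()
            for v in predecessors[w]:
                share = num_paths[v] / num_paths[w]
                dependency[v] += share * (1 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Every pair was counted from both ends, so halve the totals.
    return {node: score / 2 for node, score in centrality.items()}


# Invented example: links stand for "has trending videos in common".
edges = [("UAE", "India"), ("UAE", "Yemen"),
         ("UAE", "Pakistan"), ("India", "Pakistan")]
scores = betweenness_centrality(build_graph(edges))
print(scores)
```

In this toy graph the only shortest routes from Yemen to India or Pakistan run through the UAE, so the UAE is the one node with a nonzero score.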
The countries with highest betweenness centrality are United Arab Emirates and Singapore. Both have lots of weak ties to other countries, which means they may act as cultural bridges between unconnected countries – we can imagine a video popular in India making its way to Yemen through the United Arab Emirates. It’s interesting to note that Singapore and UAE both have massive populations of expatriates and “guest workers” (over 90% of the population in UAE and over 40% in Singapore). Culture travels with people, and it’s no surprise that Indians in the UAE would want to watch videos from home, or that Poles living in the UK mean there are Polish-language videos in the UK’s top ten.
What we don’t know yet is whether videos spread through the networks: i.e., does a video made in India spread to Yemen through UAE, for example? To test that, we’ll need to watch how a popular video spreads over time, and, ideally, we’d want to know where a video originates. That’s harder than you might think. We’ve looked at the possibility of hand-coding the videos as to their nation of origin, so we can see whether a UK video might appear on the charts first in Australia or Poland. But we’re flummoxed by the fact that many of the popular videos aren’t easily pinned down to one nation or another – take this ad, popular in both Russia and Ukraine. It’s a Nike ad about street soccer, which suggests we should attribute it to the US, where the company is based… but the ad’s in Russian, clearly aimed at urban audiences in Eastern Europe and not for a US market. Do we code it as US, Russian or global?
And then, of course, there’s this ad for Google Hangouts. It’s a sweet and sappy 90 second story about a girl who moves to the big city and stays in touch with her dad via Hangouts. The accents are American and it appears to be an ad designed for the US market, but it has trended around the world, including in many countries with high rates of emigration for work or education. Google may have wanted to encourage American twenty-somethings to connect with their parents, but the message seems to resonate for people around the world.
Please experiment with What We Watch and let us know what you think – you can post comments here about anything interesting you discover, or research questions you think we should ask. The code and data behind the system are available on GitHub should you wish to build your own, or to see what we did. One caution for researchers – we are not showing videos that have been taken down by Google, for copyright or other reasons. In some cases, this means we’re removing many videos from top lists. We hope, in the long run, to show the metadata of those videos, but for now, they’re just not in the set, which means the data is not entirely representative of what we’ve collected.