The Polyglot Internet
Prepared for the World Economic Forum Global Agenda Council on the Future of the Internet by Ethan Zuckerman, October 30, 2008
The first wave of the Internet revolution changed expectations about the availability of information. Information that was stored in libraries, locked in government vaults or available only to subscribers was suddenly accessible to anyone with an internet connection. A second wave has changed expectations about who creates information online. Tens of millions of people are contributing content to the modern Internet, publishing photos, videos and blogposts to a global audience.
The globalization of the Internet has brought connectivity to almost 1.3 billion people. The Internet that results from globalization and user-authorship is profoundly polyglot. Wikipedia is now available in more than 210 languages, which implies that there are communities capable of authoring content in those tongues. Weblog search engine Technorati sees at least as many blogposts in Japanese as in English, and some scholars speculate that there may be as much Chinese content created on sites like Sina and QQ as on all English-language blogs combined.
A user who joins the Internet today is far more likely to encounter content in her own language than she would have been had she joined ten years ago. But each internet user can participate in a smaller percentage of the total interactions and conversations than an English-speaking internet user could in 1997, when English was the dominant language of the net.
There’s a danger of linguistic isolation in today’s internet. In an earlier, English-dominated internet, users were often forced to cross linguistic barriers and interact in a common language to share ideas with a wider audience. In today’s internet, there’s more opportunity for Portuguese, Chinese, or Arabic speakers to interact with one another, and perhaps less incentive to interact with speakers of other languages. This in turn may fulfill some of the predictions put forth by those who see the Internet acting as an echo-chamber for like-minded voices, not as a powerful tool to encourage interaction and understanding across barriers of nation, language and culture.
For the Internet to fulfill its most ambitious promises, we need to recognize translation as one of the core challenges to an open, shared and collectively governed internet. Many of us share a vision of the Internet as a place where the good ideas of any person in any country can influence thought and opinion around the world. This vision can only be realized if we accept the challenge of a polyglot internet and build tools and systems to bridge and translate between the hundreds of languages represented online.
Machine translation will not solve all our problems. While machine translation systems continue to improve, they are well below the quality threshold necessary to enable readers to participate in conversations and debates with speakers of other languages. The best machine translation systems still have difficulty with colloquial and informal language, and are most reliable in translating between Romance languages. The dream of a system that creates fully-automated, high-quality translations in important language pairs like English/Hindi still appears a long way off.
While there is a profound need to continue improving machine translation, we also need to focus on enabling and empowering human translators. Professional translation continues to be the gold standard for the translation of critical documents. But professional translation is too expensive for web surfers who simply want to understand what peers in China or Colombia are discussing and to participate in those discussions.
The polyglot internet demands that we explore the possibility and power of distributed human translation. Hundreds of millions of internet users speak multiple languages; some percentage of these users are capable of translating between these. These users could be the backbone of a powerful, distributed peer production system able to tackle the audacious task of translating the internet.
We are at the very early stages of the emergence of a new model for translation of online content - “peer production” models of translation. Yochai Benkler uses the term “peer production” to describe new ways of organizing collaborative projects beyond conventional arrangements like corporate firms. Individuals have a variety of motives for participation in translation projects, sometimes motivated by an explicit interest in building intercultural bridges, sometimes by fiscal reward or personal pride. In the same way that open source software is built by programmers fueled both by personal passion and by support from multinational corporations, we need a model for peer production translation that enables multiple actors and motivations.
To translate the internet, we need both tools and communities. Open source translation memories will allow translators to share work with collaborators around the world; translation marketplaces will let translators and readers find each other through a system like Mechanical Turk enhanced with reputation metrics; browser tools will let readers seamlessly translate pages into the highest-quality version available and request future human translations. Making these tools useful requires building large, passionate communities committed to bridging in a polyglot web, to preserving smaller languages and to making tools and knowledge accessible to a global audience.
If we do not address the problems of the polyglot internet, we introduce another possible way our shared internet can fragment. There are competing - and likely incompatible - visions for future governance of the internet. As the internet becomes less of a global, shared space and more of a Chinese or Arabic or English space, we lose incentives to work together on common, compatible frameworks and protocols. We face the real possibility of the internet becoming multiple internets, divided first by languages, but later by values, norms and protocols.
The internet is the most powerful tool created by humans to allow connection, collaboration and understanding between people of different nations, races and cultures. For the internet to reach its potential in bridging human differences, we need to make the problems of language and translation central to our conversations about the future of the internet.

November 1st, 2008 at 2:30 pm
[...] all those languages are well represented on the Internet. Wonder what you’re missing? Me too. So I’ll link to an English version as well, in case you’re [...]
November 1st, 2008 at 10:30 pm
You nailed it, again.
Many thanks, Ethan.
November 2nd, 2008 at 1:49 pm
Great essay, and it sparked me to consider one unavailable opportunity — taking advantage of the Google Books Search settlement, at http://snurl.com/4xw20 [blogs_lib_berkeley_edu] . The blog discusses the possibility of creating multiple rough machine translations of the Google Book Search content for search and discovery (as opposed to presuming the ability to create new authorized translations without the rightsholders’ permissions).
November 10th, 2008 at 7:01 pm
[...] Fractures are slightly more subtle. They’re issues that if left unchecked might cause the single, unified internet we know and love to split into multiple internets. These include incompatibilities between the mobile and wired web, the immobility of content trapped in the “walled gardens” of companies like Facebook which make it challenging to migrate content, as well as more social issues, like the fragmentation of public space online (the possibility of echo chambers à la Cass Sunstein) and the danger of fragmentation by language, culture and local laws, my current obsession. [...]
November 12th, 2008 at 5:45 pm
[...] of the World Economic Forum. He notes that the Internet first made knowledge accessible and then changed who authors information, because publishing tools became [...]
January 2nd, 2009 at 5:58 pm
[...] Zuckerman’s culture- and language-driven splintered Internet, or [...]
January 24th, 2009 at 5:33 am
do you think that maybe images, i mean photographs & videos, will gain then more and more place?
February 4th, 2009 at 11:24 am
[...] across an essay by Ethan Zuckerman of the Berkman Center for Internet and Society at Harvard called The Polyglot Internet in which he discusses the problems of machine translation (see previous post) and the danger of [...]
February 6th, 2009 at 3:34 pm
[...] Of course, TED now reaches far more than the people who can come to the conferences. TED Talks on video have reached millions of viewers, and they’re going to reach even more, as June Cohen announces that TED Talks will now have subtitles in 25 languages, including Hindi, Swahili and Tamil. The exciting next step is allowing open translation, which will let anyone translate talks into any language - a wonderful approach to building bridges in the polyglot internet. [...]
March 17th, 2009 at 3:21 pm
[...] are others who have eloquently stated the need for this, e.g. here and here. These thoughts are being echoed across the globe and it is likely, that the change agents [...]
May 1st, 2009 at 3:31 pm
[...] I am looking forward to seeing how far we’ve come since 2007. Global Voices Lingua translation has expanded tremendously under the leadership of Leonard and Portnoy. Meedan is starting to really get going. dotSUB has partnered with TED to make their talks available in multiple languages. And perhaps most exciting of all is this news from Chris Salzberg about a new tool called Minna no Honyaku which will soon be released as open source code with the aim of translating as much online content as possible. Why translate as much online content as possible? Ethan Zuckerman makes a good argument. [...]
May 8th, 2009 at 12:46 pm
[...] been thinking a lot about social translation over the past few years. Language, after all, is the most difficult and most time-consuming barrier standing in the way of global conversation. Chris Salzberg, who just recently published his Ph.D. [...]
May 11th, 2009 at 12:41 pm
[...] The idea for a translation exchange as a parallel and complementary project to Lingua began in response to the larger challenge of the polyglot internet: that, with over 1.3 billion Internet users, any one of us is only seeing a small slice of existing content, based on our language capacities. Ethan Zuckerman captures the phenomenon in this post - and in its English translation. [...]
May 11th, 2009 at 12:41 pm
[...] of news and information across multiple languages to support the polyglot Internet - read this post by Global Voices co-founder Ethan Zuckerman for background. The specific objectives are to research [...]
May 11th, 2009 at 4:15 pm
[...] The Lingua initiative, driven almost entirely by the enthusiasm, creativity, and efforts of volunteer translators, demonstrates the power of a community of like-minded translators and writers to bridge language barriers to share stories and information, based on a simple, non-technical platform. Lingua points to the value of human effort and the importance of society and community in choosing what to translate. It has also demonstrated the value of distributed human translation as a means of quickly translating a large amount of current and topical information. The idea for a translation exchange as a parallel and complementary project to Lingua began in response to the larger challenge of the polyglot internet: that, with over 1.3 billion Internet users, any one of us is only seeing a small slice of existing content, based on our language capacities. The issue was addressed during our 2008 Summit in Budapest, both in presentations by members of the Lingua team (see http://globalvoices.blip.tv/file/1070249) and in numerous conversations on the side. Ethan Zuckerman captures the phenomenon in this post - and in its English translation. [...]
May 13th, 2009 at 1:11 pm
[...] I think there are a lot of lessons in the tool and thinking behind it for anyone who hopes to make the polyglot internet more comprehensible, and for anyone thinking about online [...]
May 15th, 2009 at 5:24 pm
[...] in order to support the multilingual internet - see this post written by Ethan Zuckerman, one of the founders of Global Voices [...]
May 17th, 2009 at 5:02 pm
[...] phrase “polyglot internet” comes from an essay I wrote late last year as a thought piece for a discussion in Dubai hosted by the World Economic [...]
June 1st, 2009 at 8:53 pm
[...] of us arguing that we’re entering a world where massive social translation is necessary, the polyglot internet, it would be awfully helpful to have a sense for whether English is still the majority language on [...]
June 22nd, 2009 at 12:06 am
[...] http://www.ethanzuckerman.com/blog/the-polyglot-internet/ * [...]
June 26th, 2009 at 4:54 pm
[...] those of us who think the Internet is a powerful tool for international understanding, language is a challenge we need to confront, a complex set of problems we need to address. I just had the chance to join a small band of people [...]