The survival of languages in a digital age

I’m heading to Tanzania in a few weeks for the TED global conference, and I’d like to improve my Swahili before I go. (This wouldn’t be hard, as I only know half a dozen words.) Search for Swahili resources online and you’re bound to find the Kamusi Project, a remarkable online Swahili-English dictionary that’s been built by paid staff and volunteer contributors over the past dozen years. Dr. Martin Benjamin, the chief editor of the project, sees Kamusi as a possible model for “living dictionary” projects to document all African languages. Kamusi is open to contributions from Swahili speakers and scholars throughout the world, but these contributions are compiled and edited into a fact-checked, well-indexed resource that’s become indispensable for Swahili scholars.

There’s one major problem: Kamusi is broke, and development of the project has slowed as a result. Benjamin and his compatriots are trying to raise money through text ads on the site and sales of a “Swahili clock”, which tells the time in terms of hours after dawn, rather than hours after midnight. But to serve as a comprehensive Swahili resource, and to expand to document thousands of other African languages – or even twenty, which is their intermediate target – Kamusi would require substantial foundation, corporate or academic funding.
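To make the “Swahili clock” idea concrete, here is a minimal Python sketch of the conversion. It assumes dawn and dusk fall at 6:00 and 18:00, so hours are counted from six o’clock; the function name `swahili_hour` is hypothetical and is not taken from Kamusi.

```python
def swahili_hour(hour_24):
    """Return the hour shown on a Swahili clock face (1-12).

    Assumption: hours are counted from 06:00 (dawn) and 18:00 (dusk)
    instead of from midnight and noon.
    """
    if not 0 <= hour_24 <= 23:
        raise ValueError("hour_24 must be in the range 0..23")
    shifted = (hour_24 - 6) % 12
    # Six o'clock itself is the twelfth hour, not hour zero.
    return 12 if shifted == 0 else shifted


if __name__ == "__main__":
    for h in (6, 7, 12, 19, 0):
        print(f"{h:02d}:00 -> hour {swahili_hour(h)}")
```

Under that assumption, 7:00 in the morning is the first hour of the day and noon is the sixth.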

It’s an uphill battle to bring African languages onto the Internet. While there are lively communities on Wikipedia preserving European languages like Welsh or Frisian, most of the speakers of minority African languages, like Ewe or Bambara, have little net access and less net expertise. There’s the very real concern that some of these languages may die out before their native speakers start writing online.

Duane Bailey’s work on Translate.org.za helps explain why it’s important to bring languages online. In its post-Apartheid constitution, the Republic of South Africa enshrined 11 official languages. Duane has been working to ensure that South Africans have software, including applications and operating systems, in their native languages.

Why? Imagine learning how to use a computer in your second or third language. A native Setswana speaker, learning to use Microsoft Office, has the challenge of learning new software compounded by having to read dialogs and menus in a less familiar language. Educators believe that people learn to read more quickly when learning in their native language – it’s reasonable to believe that new users learning computers would benefit from computers with interfaces in their native tongues. Bailey has had great success localizing Open Office and other open source products into many South African languages, and is now approaching the larger question of building a framework to localize software for as many African languages as possible.

Rich dictionaries are a critical ingredient in building localized software. To write a spellchecker, you need a word list for a language with definitive, proper spelling. To localize the interface and dialogs of a program, translators may need to create new words for concepts that don’t otherwise exist in the language. (It’s certainly possible that concepts like “menu” or “icon” won’t translate neatly in Wolof, for instance.) Creating this new vocabulary requires close study of the existing language to create terms that are sensible, pronounceable and not confusing within the language – a rich dictionary goes a long way towards making that work possible.
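As a rough illustration of why the word list matters, here is a minimal Python sketch of the simplest possible spellchecker: it flags any word that is missing from the list. The tiny `SAMPLE_WORDS` set and the function name `find_unknown_words` are made up for this example; a real checker would load a full, vetted word list and would also need to handle inflected forms, which this sketch ignores.

```python
import re

# Toy word list for illustration only; not drawn from any real dictionary project.
SAMPLE_WORDS = {"habari", "karibu", "asante", "sana"}


def find_unknown_words(text, word_list):
    """Return the words in `text` that are not in `word_list`.

    Words are lowercased, and each unknown word is reported once,
    in the order it first appears.
    """
    # [^\W\d_]+ matches runs of letters, including non-ASCII letters.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    unknown = []
    for token in tokens:
        if token not in word_list and token not in unknown:
            unknown.append(token)
    return unknown


if __name__ == "__main__":
    print(find_unknown_words("Karibu sana, asente!", SAMPLE_WORDS))
```

The checker is only as good as its list: a correctly spelled word that is missing from the list gets flagged, and a misspelling that happens to match another entry slips through.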

There’s a tendency, I think, to believe that the spread of the Internet and the desktop computer is inherently connected to the global spread of the English language. (That was certainly my assumption fifteen years ago as I played with early internet systems.) But we’re starting to discover that this is a fallacy. There are now more blog posts per day in Japanese than in English, and there may be even more Chinese bloggers. (While Technorati does a great job of counting blogs that contact pingservers to let them know about updated blogs, many Chinese blogs don’t use these services and tend to get undercounted.) As I wrote about last week, when a large number of users who speak a particular language come online, they seem to start talking to each other in their native tongue, rather than in a second tongue.

But the slow spread of the Internet in many African nations suggests that it may be a while before Wolof speakers are writing in that language instead of in French. And the smaller the language, the longer it takes to establish a community online… and, generally speaking, the higher the chance that most speakers of the language don’t have regular internet access. Some African languages will not survive in a digital era.

E.O. Wilson’s Encyclopedia of Life project invites the world to help in documenting the rich variety of species in the natural world. The idea behind Benjamin’s work is a bit less audacious, but still incredibly ambitious – document every language on the African continent before it dies out. Species can be lost forever, and with them possible cures for disease, insights into the history of evolution, and critical members of ecosystems. But something is lost when languages die as well: the knowledge held by that community of speakers, much of which may not exist, and sometimes cannot even be expressed, in other languages.

Maybe it’s too much to ask for a global, participatory encyclopedia of language. But a good start would be helping find people and funders to support projects like Kamusi, which are working hard to make sure that Swahili is a language of the future as well as of the past and present.


David Sasaki’s got a great post, in part in response to this post, which points out that Swahili will almost certainly survive in the digital age, given the rich community of Swahili authors online, and that Swahili is likely squashing other smaller languages…


14 Responses to The survival of languages in a digital age

  1. “It’s an uphill battle to bring African languages onto the Internet.”

    Yes, but the bigger problem, as I understand it, is that few indigenous African languages are commonly written. I know that in the case of Wolof, an extremely small number of Senegalese are literate in Wolof, and an even smaller number use it in everyday writing. There are no Wolof-language newspapers. If you read online Senegalese forums, most of them are in French with bits of Wolof mixed in. I’m sure the same is true for many other languages.

    So the question becomes, not how do we write software or web sites in these languages, but rather, how do we get these people who already speak them, to become literate in their native language?

  2. Pingback: mawazo na mawaidha » lugha na intaneti

  3. you forgot kasahorow.com, bound to be the most comprehensive resource for African languages. thanks for the post.

  4. Pingback: El Oso, El Moreno, and El Abogado » Blog Archive » Language and the Internet

  5. Ian says:

    One of the major problems is that so many of these projects are working in isolation, and that their data is not available in any kind of standard format. I’m working on helping the smaller languages get going in Wikipedia by translating basic info, such as country templates (http://greenman.co.za/translate). My scope is quite limited – I’m not trying to translate text, just to help populate smaller wikipedias so as to give their own communities a boost. So much duplication, so many incompatible formats, so much closed data!

    Anyone wanting to help with Wikipedia translation, please contact me at wiki AT greenman {{Dot}} co {{DOt}} za. I am particularly looking for people to help translate specific template strings in smaller languages. With very limited translation work, basic country info can be rolled out into Wikipedia quite quickly.

  6. Matt Berg says:

    We’ve been working hard to release a new version of moulin, our effort to make Wikipedia accessible offline. While we realize this only addresses one small aspect of the problem – accessibility – this new version will support the ability to offer multiple language packs. We hope to eventually offer a version (or language pack) of moulin for all the major African language Wikipedia projects that are currently serving as more than a placeholder.

    Obviously, as your post points out, the major problem is lack of content. As Cyrus points out, many educated Africans (in West Africa at least) have never formally been taught how to read, much less write, in their maternal language. How many different ways have you seen “nanga def” or “in’ce” spelled, for example? It’s a shame, since Senegal has not only a rich oral history but also a rich written history in Wolof, thanks to early Muslim scholars who wrote in Wolof using Arabic script. I could be wrong, but I am quite sure that there is a Wolof newspaper in Senegal though.

    Ethan, please make sure to meet up with Moussa Keita (who came up through Geekcorps Mali) who will be attending TED as an African Fellow.

  7. Carl Hinton says:

    Hi,

    I have a good knowledge of Nyakusa (spoken mainly in Tanzania) and would willingly supply words, phrases and their equivalents in Kiswahili and English. Anyone interested?

    Regards

  8. kisiki says:

    Congratulations, and we are waiting for you here in Tanzania. Welcome.

  9. Said Hassan says:

    Mr. Ethan,
    You are very welcome to Tanzania. You won’t have much trouble learning Swahili vocabulary if you aren’t lazy about speaking and asking.

  10. Richard says:

    For your information, Swahili is a very civilized language, and if you are coming to Tanzania, here you will find the authentic kind.

    You are very welcome,

  11. Alto Raymond Mtewele says:

    I left Africa 20 years ago. Whenever I go back to meet my relatives, I notice that so many things are disappearing in language and in culture as a whole. Take, for example, my mother language in the south of Tanzania, in Njobe, where we speak Kibena. Only in the villages can we say that people still really speak it. What I think is that there should be an awareness in us and in our countries of the need to save these cultures, by launching projects for PhDs etc. That is the only way left before it is too late.

    Alto

  12. I am glad to say that there are more Malagasy bloggers writing in unadulterated Malagasy now. And we are hoping that the newly created GVO amin’ny teny malagasy will encourage more to use Malagasy instead of English or French. Malagasy is one of those rare African languages that is both written and read by the whole Malagasy nation.
    It will be interesting to see how well we fare in the digital age and if Malagasy language can survive it.
    In the words of the Malagasy Academy of Sciences “Andrianiko ny teniko, ny an’ny hafa koa feheziko”, translated as “I respect my language while I speak foreign ones”.

  13. Pingback: Global Voices Online » Blogging in Neo Patwa

  14. anita macwilliam says:

    I lived and worked in Tanzania for nearly 40 years. I am a linguist and anthropologist. During my time there I began a language school for expatriates coming to Tanzania – the teachers were all Tanzanians. We taught four languages: Kiswahili, Kisukuma, Luo and Kikuria. We (i.e. the teachers and others under my guidance) wrote the lessons.
    I found the work fascinating and really quite fun – because we all went out walking after formal classes and met the people in their villages. Later I was hired by the Univ. of Dar es Salaam in the Institute of Kiswahili Research where we developed an English-Swahili dictionary and a Swahili-English dictionary. We also compiled three scientific dictionaries for Biology, Chemistry and Physics.
