Author

I am Joannes Vermorel, founder at Lokad. I am also an engineer from the Corps des Mines who initially graduated from the ENS.

I have been passionate about computer science, software matters and data mining for almost two decades.

Monday, Nov 30, 2009

## Continuous Localization or l10n 2.0

Nothing is easier to sell globally than software. Yet it still surprises me how little effort small software companies put, on average, into localization.

Disclaimer: I am not saying that selling software anywhere is easy. Some places are really tough. I am just saying that selling just about anything else at a worldwide level is 10x harder.

### Translation is (relatively) cheap

Localization (l10n for short) is easier and cheaper than you think. In a past project of mine, a few years ago, I managed to translate a web app (a freelance marketplace) into 13 languages for less than $2000. Yes, that's right, it was roughly $150 per language.

The first thing to understand here is that freelance translation is cheaper than usually thought. Generic freelance considerations apply, but compared to the massive effort needed to actually develop and maintain any small piece of software, translation is just dirt cheap.

### ... but management is not ...

Yet, if translation is inexpensive, managing translators is not. Each time I took over localization work, I realized that managing half a dozen remote translators on an ongoing basis required a nearly full-time commitment on my side.

I think most companies realize this effect intuitively. The bottom-line result is that localization is typically performed in big batches.

Big batches seem to be the archetype of the non-agile process. Every two years, package all documents and hand them over to some translation agency. Wait for two months. Publish the (already outdated) documents, and wait more. Two years later, once the documents are desperately outdated, repeat.

I can't blame the community for doing it that way, as I was doing no better. Yet this process felt wrong. Since localization is such a big, time-consuming mess, we do it only once in a while, and in the meantime prospects and customers suffer from outdated materials on an ongoing basis, which is somehow even worse than poor-quality translation.

### ... hence Continuous Localization

Among all the good practices in software development, I have found continuous integration to be one of the few breakthroughs that have significantly improved agility in project management. The core idea behind continuous integration is that integration becomes part of your daily process.

Instead of updating the deployment logic once every 18 months, you do it on an ongoing basis, so that the software is always ready to ship. Yet, continuous integration comes with a gotcha: you can't do it by hand. It takes another layer of automation: the integration server.

Thinking about it, localization is similar to integration.

Incremental localization without some automation seemed doomed, as it would require an insane communication effort between manager and translators.

Let's see how the localization process could be automated:

1. (automated) Get all source documents and incremental updates.
2. (automated) Map updates to every target language.
3. (manual) Apply corresponding incremental updates to target documents.
4. (automated) Keep track of the amount of work done by each translator.
5. (automated) Keep track of work batches to get translators paid.

Obviously, the one step that cannot be automated is the translation operation itself; all other steps can be largely automated.
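As a rough illustration of the automated steps listed above, here is a minimal Python sketch. It is not the Lokad.Translate code (which is ASP.NET MVC); the names `Update`, `Task`, `map_updates`, `complete` and `words_per_translator` are hypothetical, and counting words is only one assumed way of measuring a translator's work.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Update:
    """One incremental change to a source document (hypothetical model)."""
    page: str      # identifier of the source document
    revision: int  # revision of the source that introduced the change
    words: int     # rough size of the change, assumed here as the billing unit


@dataclass
class Task:
    """One unit of translation work: a given update for a given language."""
    update: Update
    language: str
    translator: Optional[str] = None
    done: bool = False


def map_updates(updates, languages):
    """Step 2: create one pending task per (update, target language) pair."""
    return [Task(u, lang) for u in updates for lang in languages]


def complete(task, translator):
    """Step 3 is manual; this only records who did the work."""
    task.translator = translator
    task.done = True


def words_per_translator(tasks):
    """Steps 4 and 5: total translated words per translator, as a payment basis."""
    totals = defaultdict(int)
    for task in tasks:
        if task.done:
            totals[task.translator] += task.update.words
    return dict(totals)


updates = [Update("pricing", 12, 120), Update("faq", 7, 40)]
tasks = map_updates(updates, ["fr", "de"])  # pricing/fr, pricing/de, faq/fr, faq/de
complete(tasks[0], "alice")
complete(tasks[2], "alice")
complete(tasks[1], "bob")
print(words_per_translator(tasks))  # {'alice': 160, 'bob': 120}
```

The only human input in this sketch is the call to `complete`; everything else is bookkeeping that a server can do on its own.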

This idea has been the starting point of a project codenamed Lokad.Translate. This project is nothing more than a webapp playing the role of a localization server and providing all the automation we can get to speed up the localization process, on the management side and on the translator side as well.

Tech note: Lokad.Translate is ASP.NET MVC + NHibernate on top of Azure.

Since we did not want to reinvent the wheel, we decided to leverage the capabilities of the wiki powering our own company website. In particular, in order to retrieve the list of incremental changes, we use nothing but the web feed generated by the wiki (RSS in the present case, although it does not matter much) that lists recent changes. The nice thing about web feeds is that most webapps already provide one (think blogs).
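To make the feed idea concrete, here is a small Python sketch that extracts the pages changed since a given date from an RSS 2.0 feed. The sample feed, the `wiki.example.com` URLs and the function name `changed_pages` are all made up for illustration; a real wiki feed may use different elements or date formats.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

# Hypothetical "recent changes" feed; a real one would be fetched over HTTP.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Recent changes</title>
    <item>
      <title>Pricing</title>
      <link>http://wiki.example.com/en/Pricing</link>
      <pubDate>Mon, 30 Nov 2009 10:00:00 GMT</pubDate>
    </item>
    <item>
      <title>FAQ</title>
      <link>http://wiki.example.com/en/FAQ</link>
      <pubDate>Sat, 28 Nov 2009 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Pricing</title>
      <link>http://wiki.example.com/en/Pricing</link>
      <pubDate>Mon, 30 Nov 2009 08:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def changed_pages(feed_xml, since):
    """Return the URLs of pages changed after `since`, without duplicates."""
    root = ET.fromstring(feed_xml)
    seen, pages = set(), []
    for item in root.iter("item"):
        link = item.findtext("link")
        pub_date = item.findtext("pubDate")
        if not link or not pub_date:
            continue  # skip entries that lack the fields this sketch relies on
        # RSS 2.0 dates follow the RFC 822 format, which the email module parses.
        if parsedate_to_datetime(pub_date) > since and link not in seen:
            seen.add(link)
            pages.append(link)
    return pages


last_sync = datetime(2009, 11, 29, tzinfo=timezone.utc)
print(changed_pages(SAMPLE_FEED, last_sync))
```

A page edited several times since the last synchronization shows up only once, which is what a translator needs: the page is out of date, however many edits caused it.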

Then, concerning document management (both originals and translations), there is a trick: there is no need to manage the documents themselves, as managing the URLs pointing to them is enough. Once the URL is known, and provided the wiki exposes a REST API, all other commands (view/diff/edit/...) can be inferred with simple regular expressions.
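Here is a sketch of what inferring commands from a URL could look like in Python. The URL scheme (`/<language>/<page>` with `?action=...` query strings) is purely an assumption for the sake of the example, as are the names `PAGE_URL` and `commands`; an actual wiki will have its own conventions.

```python
import re

# Assumed URL layout: http://host/<two-letter language>/<page name>
PAGE_URL = re.compile(
    r"^(?P<base>https?://[^/]+)/(?P<lang>[a-z]{2})/(?P<page>[^/?#]+)$"
)


def commands(source_url, target_lang):
    """Derive view/diff/edit URLs for a source page and its translation."""
    match = PAGE_URL.match(source_url)
    if match is None:
        raise ValueError(f"unrecognized page URL: {source_url}")
    target_url = f"{match['base']}/{target_lang}/{match['page']}"
    return {
        "view_source": source_url,
        "diff_source": f"{source_url}?action=diff",  # hypothetical query string
        "view_target": target_url,
        "edit_target": f"{target_url}?action=edit",  # hypothetical query string
    }


for name, url in commands("http://wiki.example.com/en/Pricing", "fr").items():
    print(name, url)
```

The localization server then stores nothing but the source URL and the list of target languages; every link shown to a translator is computed on the fly.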

My objective would be to achieve near-continuous localization of the content posted on our website, with, say, no more than 2 weeks of delay between the initial post and its localization, and with minimal overhead for both the manager and the translators. We will soon start deploying Lokad.Translate for our internal needs; we will see how it goes.

Then, depending on community interest, we will probably release Lokad.Translate one way or another. Stay tuned for more (and don't hesitate to contact me if you're interested).

Continuous l10n is a great idea indeed, and I wrote about it a few months ago at http://mrooney.blogspot.com/2009/07/launchpad-is-now-automatic-magical.html . I'm using Launchpad.net to achieve a fairly ridiculous level of automation; all I have to do is commit a translation template and Launchpad commits translations back on a daily basis, all for $0 a language! Not to mention Launchpad itself is free and open-source, and handles many other aspects of projects as well.

November 30, 2009 | Mike Rooney

Well, I guess you can't get $0 translation unless your app is open source :-) . But, then thanks for the links. I got an email from Cedric Savarese (Veer West) who also had the exact same idea months ago. Still weird to see there has been no convergence in the software industry so far. One could have expected such a problem to be solved a decade ago.