I18n sanity

From Creative Commons
Revision as of 22:20, 15 June 2011 by Christopher Webber (talk | contribs) (Adding the rest of the plan.)

The problem

We use logical keys for our translations for historical reasons: years ago, when our tools were written in Tcl, that's how you did it, and Zope allows for the option.

Logical keys aren't bad on their own... the problem is that in general the rest of the world doesn't seem to use them. As a result we've had to write our own tools to move our translation files around, sync things, etc. These tools require a lot of specialized knowledge to maintain and tend to be fairly fragile. It would be significantly easier on ourselves if we could just move away from logical keys.

To do that, we need to move from logical to English keys. Given the complexity of our existing system this won't be easy... one attempt at moving was made before and was later given up on. Here's to success on the second try! A relevant plan is laid out below.

The plan!

The general plan is that we'll have a script that can go and extract everything automatically from our sets of tools that use cc.i18n.

Switch back cc.i18n to Babel and just wrap writing in polib

I've half fixed some recent issues we've found with polib, but not completely. It might not be worth finishing those fixes if we're going to tear this all down anyway. It might be better to switch things back to Babel and wrap only the writing calls in polib. That would handle the formatting pains we were having with Transifex and git without our having to worry about whether polib works properly otherwise.

Moving cc.engine templates to Jinja2... why?

Why Jinja2? Why move templating systems at all?

Actually I like Zope Page Templates, and it's actually possible to not use logical keys with them. However, certain types of translations (basically ones with special XML attributes) require *manually editing* the extracted strings, which simply won't work in a glorious future of automatic extraction.

Jinja2 seems to be the most popular templating system right now and is almost exactly like Django's templating system with a few improvements. (You can pass in arguments to functions! Etc.) There are a few other templating systems perhaps worthy of consideration (Mako, etc) but they don't look like they have anything better to offer, particularly in the area of translation stuff.

The {{ cctrans() }} intermediary step

Jinja2 has a kind of translation system we can use with {% trans %} tags, but jumping straight into Jinja2 with {% trans %} tags probably won't work because of several things that end up in a kind of chicken-and-egg problem:

  • Jinja2's variables-in-translations setup is different, using the %(foo)s variable style rather than the ${foo} style we currently use.
  • Moving templates from one system to another is non-trivial. We're going to need to make sure our templates actually *work* at the same time that we're trying to migrate to the new format. Doing both at once is not easy.

To fix this, we should use an interim cctrans() function inside the templates, kind of like:

 {{ cctrans("logical key name", var=foo.bar(), blah=baz) }}

Later on these can be auto-translated into *real* trans tags like:

 {% trans var=foo.bar(), blah=baz %}
   text content of trans goes here!  Figured out by reading from the
   logical keys.
 {% endtrans %}
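
As a rough illustration of the interim step, cctrans() could be little more than a lookup by logical key followed by ${foo} substitution. The sketch below is not the cc.engine implementation: the CATALOG dictionary, the key name in it and the fall-back-to-the-key behaviour are all assumptions made for the example.

```python
from string import Template

# Hypothetical catalog: logical key -> translated text with ${foo} variables.
CATALOG = {
    "util.greeting": "Hello, ${name}! You have ${count} new works.",
}


def cctrans(logical_key, **variables):
    """Look up a logical key and fill in its ${foo}-style variables."""
    # Assumption: fall back to the key itself, so a missing translation
    # shows up in the rendered page instead of raising an error.
    text = CATALOG.get(logical_key, logical_key)
    # safe_substitute leaves unknown ${placeholders} untouched.
    return Template(text).safe_substitute(
        {name: str(value) for name, value in variables.items()})
```

Because the function takes keyword arguments, a later script can read the variable names straight out of the template source when it rewrites each call.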

Metamorphosis tools

./gregor.py ? :)

template metamorphosis

We need to move our templates from {{ cctrans() }} -> {% trans %}, as described above. The tools should be smart enough to make the ${foo} to %(foo)s transformation as described below also.
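
The rewrite from cctrans() calls to trans tags could be sketched with a regular expression, assuming each call sits on a single line and has the form {{ cctrans("key", name=value, ...) }}. The ENGLISH table and the key in it are made up for the example. Note that inside a {% trans %} block Jinja2 writes variables as {{ foo }}; they turn into %(foo)s only in the extracted message.

```python
import re

# Hypothetical English text for each logical key, still in ${foo} style.
ENGLISH = {"deed.by": "Licensed by ${author} under ${license}."}

# Matches {{ cctrans("key") }} or {{ cctrans("key", a=b, c=d) }} on one line.
CCTRANS_CALL = re.compile(
    r'\{\{\s*cctrans\(\s*"(?P<key>[^"]+)"\s*(?:,\s*(?P<args>.*?))?\s*\)\s*\}\}')
VARIABLE = re.compile(r"\$\{(\w+)\}")


def to_trans_tag(match):
    """Build the {% trans %} block that replaces one cctrans() call."""
    key, args = match.group("key"), match.group("args")
    if key not in ENGLISH:
        # Leave unknown keys alone so a human can review them.
        return match.group(0)
    # ${foo} in the English text becomes {{ foo }} in the template body.
    body = VARIABLE.sub(r"{{ \1 }}", ENGLISH[key])
    opening = "{% trans " + args + " %}" if args else "{% trans %}"
    return opening + body + "{% endtrans %}"


def convert_template(source):
    """Rewrite every cctrans() call found in a template string."""
    return CCTRANS_CALL.sub(to_trans_tag, source)
```

A regular expression is enough only while the calls stay this simple; calls spanning several lines, or string arguments containing quotes, would need a real parser.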

${foo} to %(foo)s metamorphosis

We'll also need to convert all ${foo} to %(foo)s in the .po files and alert translators that this is the new format. (Also, will translators be okay with the "s" after %(foo) in %(foo)s, or will that confuse them?)
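
The string conversion itself is small. The sketch below assumes the converted strings end up going through Python's % formatting, which is why literal percent signs get doubled; the function names are invented for the example. The second helper is one possible guard against the trailing-"s" worry: it flags a translation whose placeholders no longer match the original.

```python
import re

DOLLAR_VARIABLE = re.compile(r"\$\{(\w+)\}")
PERCENT_VARIABLE = re.compile(r"%\((\w+)\)s")


def to_percent_style(text):
    """Convert ${foo} placeholders in one string to %(foo)s."""
    # Double literal percent signs first, since "%" becomes special.
    escaped = text.replace("%", "%%")
    return DOLLAR_VARIABLE.sub(r"%(\1)s", escaped)


def same_placeholders(msgid, msgstr):
    """True if both strings use exactly the same %(foo)s names."""
    return (set(PERCENT_VARIABLE.findall(msgid))
            == set(PERCENT_VARIABLE.findall(msgstr)))
```

Running same_placeholders() over every msgid/msgstr pair after an import would catch a dropped "s" before it reaches a rendered page.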

Move our other programs over.

cc.license, cc.api, cc.i18n and cc.deedscraper will probably all need some (comparatively small) adjustments for this format.

Write script to auto-extract everything

Not only do we need to auto-extract things from all projects, we need to combine them into one big file. We should run a check to make sure we aren't "missing" anything that is in the current files, and if something is missing, find out whether it's still a string in use.
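
The combine-and-check step could look something like the following, assuming each project's extraction has already been reduced to a plain collection of message strings. The function names and the data layout are hypothetical; a real script would read and write .po files instead.

```python
def combine(extracted_by_project):
    """Merge per-project messages into {message: sorted list of projects}."""
    combined = {}
    for project, messages in extracted_by_project.items():
        for message in messages:
            combined.setdefault(message, []).append(project)
    return {message: sorted(projects)
            for message, projects in combined.items()}


def missing_strings(old_messages, combined):
    """Messages the old catalog had that extraction no longer finds."""
    return sorted(set(old_messages) - set(combined))
```

Anything missing_strings() reports is either dead (safe to drop) or still used somewhere the extractor doesn't look, and that second case is the one worth chasing down by hand.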

Moving files and Transifex considerations?

We'll also be moving the structure of files around, so maybe we'll need to change our layout? Or maybe we won't, maybe we can keep the old layout and just change our programs to expect the new one with the po-style files in the same place.

Anyway, does this affect Transifex? We'll need to think carefully about this.