Translation tooling

From Creative Commons

Latest revision as of 14:12, 8 December 2016

Adding or changing strings

For details on this, read the section "Structure of our translation toolchain".

Extracting translations

Instead of providing a master.po file, the same information is pulled automagically from the contents of the trans tags in cc.engine's templates.

  1. Make modifications to cc.engine templates, commit, push, etc.
  2. In cc.i18n (using either buildout or virtualenv):
  3. git pull origin master
  4. run ./runcheckout.sh && ./extract.sh
  5. git add cc/i18n/po/en/cc_org.po
  6. git commit -m "Extracting new strings for translation"
  7. git push origin master

...done!

Push source file up to Transifex

  1. ssh a7.creativecommons.org
  2. sudo su cronuser
  3. cd /home/cronuser/transifex.net_i18n_checkout/
  4. git pull
  5. tx push -s

That last command will push the source file (the English .po file you committed) up to Transifex.

Structure of our translation toolchain

Where are our translations?

First of all, we maintain our translations on Transifex - https://www.transifex.com/nkinkade/CC/deeds-choosers/. Our affiliates mostly handle our translations.

What tools do we use?

Translations are in gettext format.

We used to use Zope's i18n toolchain for translations. Things used to be in "logical key" format, where there was a symbolic representation of each translation (almost like a variable that mapped to the string). We switched to English keys because that's what most of the world does, and doing otherwise required an insanely complex and fragile system that we spent a ton of time maintaining.

These days it's pretty simple... just mark a string for translation by wrapping it in gettext() or _() or whatever. Then we can auto-extract things.
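As a minimal sketch of what "marking a string" looks like in Python, the snippet below uses only the standard library's gettext module. The function name and the message text are made up for illustration and are not taken from cc.engine; NullTranslations simply returns each message unchanged, standing in for a real catalog.

```python
import gettext

# NullTranslations is the stdlib fallback catalog: it returns every
# message unchanged, which is enough to show the marking convention.
translations = gettext.NullTranslations()
_ = translations.gettext


def license_heading(name):
    # Hypothetical example string. The English text itself is the key,
    # and an extractor can find it because it is a literal passed to _().
    return _("This work is licensed under %(name)s.") % {"name": name}


heading = license_heading("CC BY 4.0")
```

Note that the placeholder is filled in after the call to _(), so the extractor and the translators see the string with %(name)s still in it.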

If you read the "extracting translations" section above, or even ran the commands, you may have wondered, "Whoa, that ran like magic! All of these translations just got pulled out! How the hell did that work?"

The answer is pretty simple! We use Babel to extract strings.

When you run ./runcheckouts.sh, it checks out all the packages we extract translations from. And ./extract.sh extracts all the translations from them by reading babel.ini to find out all the stuff it should extract.

Most of the extractors are pretty standard (the jinja2 and python extractors are bundled with Jinja2 and Babel respectively), but we've defined our own for extracting from RDF in cc/i18n/tools/extractors.py (registered as an entry point in setup.py).
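For a feel of what a custom extractor involves, here is a rough sketch. It is not the code in cc/i18n/tools/extractors.py. It only follows Babel's documented extractor interface: a callable taking (fileobj, keywords, comment_tags, options) and yielding (lineno, funcname, message, comments) tuples. Which RDF elements count as translatable is an assumption made up for this example.

```python
import io
import xml.parsers.expat

# Assumption for illustration only: the text of these elements is translatable.
TRANSLATABLE_TAGS = {"dc:title", "dc:description"}


def extract_rdf(fileobj, keywords, comment_tags, options):
    """Babel-style extractor: yield (lineno, funcname, message, comments)."""
    results = []
    state = {"tag": None, "lineno": None, "text": []}
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        # Remember where a translatable element starts.
        if name in TRANSLATABLE_TAGS:
            state["tag"] = name
            state["lineno"] = parser.CurrentLineNumber
            state["text"] = []

    def chars(data):
        # Character data can arrive in several chunks, so collect them.
        if state["tag"] is not None:
            state["text"].append(data)

    def end(name):
        if name == state["tag"]:
            message = "".join(state["text"]).strip()
            if message:
                # funcname is None because no gettext call is involved.
                results.append((state["lineno"], None, message, []))
            state["tag"] = None

    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.ParseFile(fileobj)  # Babel hands extractors a binary file object
    return iter(results)


SAMPLE = b"""<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description>
    <dc:title>Attribution</dc:title>
  </rdf:Description>
</rdf:RDF>
"""

found = list(extract_rdf(io.BytesIO(SAMPLE), [], [], {}))
```

A function like this becomes usable from babel.ini once it is registered under Babel's extractor entry point group in setup.py, which is the mechanism the paragraph above refers to.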

Transifex has a command-line client, tx, that we use to push the new source strings up. See the section on pushing the source file above for the commands.

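There are two kinds of translation-related cron jobs: sync jobs that pull new translations down from Transifex and push them to our repos, and a job that builds a new translation tarball. In the crontab below, each sync job runs once an hour and the tarball is rebuilt every 15 minutes. (They're currently separate scripts, but maybe they could be merged?)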

These commands can be found in the cronuser crontab.

 # Pull changes from Tx.net and push them to our repos
 5 * * * * ~/bin/sync_i18n_with_transifex.sh > /dev/null
 10 * * * * ~/bin/sync_i18n-ccsearch_with_transifex.sh > /dev/null
 
 # Update cc.i18n tarball
 */15 * * * * /usr/bin/ionicer && nice -n 19 bash /var/www/staging.creativecommons.org/make_i18n_sdist.sh > /dev/null

One more thing to note: we have a translation statistics tool that's run every time the sdist is built. It writes out a CSV file that keeps several bits of information, including percentages. We have a translation threshold at the top of cc/i18n/util.py; translations have to be above this level to show up in the "available languages" box on various pages of cc.engine!
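The threshold check can be pictured roughly like this. This is a sketch under assumptions, not the code in cc/i18n/util.py: the column names (lang, percent_trans), the threshold value, and the function name are all invented for the example, and the comparison is strict only because the text says "above".

```python
import csv
import io

# Hypothetical value; the real threshold lives at the top of cc/i18n/util.py.
THRESHOLD = 80


def languages_above_threshold(csv_file, threshold=THRESHOLD):
    """Return the language codes whose translated percentage exceeds threshold."""
    reader = csv.DictReader(csv_file)
    # Column names are assumptions; the real stats file may use different ones.
    return [
        row["lang"]
        for row in reader
        if float(row["percent_trans"]) > threshold
    ]


sample_stats = io.StringIO("lang,percent_trans\nfr,95\neo,40\n")
available = languages_above_threshold(sample_stats)
```

Keeping the threshold as a single module-level constant, as the real code reportedly does, means the "available languages" cutoff can be changed in one place.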