New kitchen release coming soon

[EDIT] For those who are curious, kitchen is a python module for miscellaneous code snippets: things that people end up reimplementing via cut and paste because they seem too small to write a module for, yet are so useful that they are needed in many places. Currently, it has code for i18n, compatibility modules for different python2 stdlib versions, a few iterator helpers, and a whole lot of code for unicode-byte string conversions.

Over the recent vacation I put the finishing (code) touches on a new kitchen release. I’ve scheduled the release of this code for next week, on January 10, 2012. The delay is mainly because I just added the kitchen module on transifex.net and I’d like to see if any translations show up before then. If anyone finds any bugs in the code on python-2.3.1 through python-2.7.x, please bring them up on the mailing list, on irc.freenode.net (I hang out in #fedora-admin and #fedora-python), or in the kitchen bug tracker so that I can address them before the release date.

The beta code is available from fedorahosted.org at: https://fedorahosted.org/releases/k/i/kitchen/kitchen-1.1.0b1.tar.gz

or from the bzr repository:

  bzr branch bzr://bzr.fedorahosted.org/bzr/kitchen/devel

The major changes are in the kitchen.i18n module. Previously, kitchen.i18n.*Translations objects guaranteed that they would return byte str when requested (via the gettext(), ngettext(), lgettext(), and lngettext() methods) and unicode strings when requested (via ugettext() and ungettext()). The new code makes the additional guarantee that any byte str that is returned is valid in the requested output charset.

Here's an example of the old behaviour vs new behaviour:

   >>> from kitchen.i18n import get_translation_object
   >>> translations = get_translation_object('kitchen')
   >>> b_ = translations.lgettext
   >>> translations.set_output_charset('utf-8')
   >>> translations.input_charset = 'latin-1'
   >>> # This would be: 'Café does not exist in the message catalog'
   >>> print repr(b_('Caf\xe9 does not exist in the message catalog'))
   # Old behaviour =>
   'Caf\xe9 does not exist in the message catalog'
   # New behaviour =>
   'Caf\xc3\xa9 does not exist in the message catalog'

   # Example 2: with wrong input_charset =>
   >>> translations.input_charset = 'utf-8'
   >>> print repr(b_('Caf\xe9 does not exist in the message catalog'))
   # New behaviour yields valid utf-8 bytes even when input_charset is wrong =>
   'Caf\xef\xbf\xbd does not exist in the message catalog'

Notice that this is not a magical panacea. The second example shows that if input_charset does not match the byte encoding of the strings that are given, the output string will be mangled (replacement characters or garbage characters). However, all the bytes in the output string will be valid in the chosen output charset, so you won't have to worry about exceptions if you attempt to transform the string again.
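To make the trade-off concrete, here is a rough sketch of the general idea using only the stdlib codecs. This is not kitchen's actual implementation; reencode() is a hypothetical helper named just for this illustration, and I'm assuming a decode-with-replace followed by an encode-with-replace is a fair approximation of the behaviour described above.

```python
# Illustrative sketch only; not kitchen's actual implementation.

def reencode(b_msg, input_charset, output_charset='utf-8'):
    """Return b_msg as bytes that are always valid in output_charset.

    reencode() is a hypothetical helper named for this example.
    """
    # Undecodable input bytes become U+FFFD instead of raising.
    u_msg = b_msg.decode(input_charset, 'replace')
    # Characters the output charset cannot represent become '?'.
    return u_msg.encode(output_charset, 'replace')


b_msg = b'Caf\xe9 does not exist in the message catalog'

# Correct input charset: the e-acute is converted cleanly to utf-8.
good = reencode(b_msg, 'latin-1')

# Wrong input charset: the stray byte is replaced, but the result is
# still valid utf-8, so decoding it again cannot raise.
mangled = reencode(b_msg, 'utf-8')
mangled.decode('utf-8')

print(repr(good))
print(repr(mangled))
```

The point of the sketch is the second call: the text is damaged, but the damage is contained, because every byte that comes out is valid in the output charset.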

The other major change is that the kitchen.i18n.get_translation_object() function has been rewritten to be a drop-in replacement for the stdlib's gettext.translation(). One behaviour change that follows from this is that the code now attempts to discover translations in every message catalog that it finds in the paths given to it. Additionally, those code changes led to bugs in the *Translations classes' fallback code being discovered and squashed.
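For reference, this is roughly what the stdlib side of that looks like. The sketch only uses stdlib gettext calls; the 'myapp' domain and the empty temporary directory are placeholders I made up for the example, and whether kitchen accepts exactly these arguments should be checked against its documentation rather than taken from here.

```python
import gettext
import tempfile

# 'myapp' and the empty temporary directory are placeholders for this sketch.
localedir = tempfile.mkdtemp()

# all=True asks for every matching catalog rather than only the first one.
catalogs = gettext.find('myapp', localedir, languages=['en'], all=True)

# fallback=True returns a NullTranslations object when no catalog is found,
# instead of raising an error.
translations = gettext.translation('myapp', localedir, languages=['en'],
                                   fallback=True)

# With no catalog available, the message comes back untranslated.
message = translations.gettext('Hello')
```

Since the directory is empty, no catalogs are found and the fallback object simply hands the message back unchanged.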

See the NEWS file for other changes.