kitchen 0.2.4 released

I realize I didn’t announce 0.2.3 so here’s the NEWS entries for both of 0.2.3 and 0.2.4:

0.2.4

  • Have easy_gettext_setup() return lgettext functions instead of gettext
    functions when use_unicode=False
  • Correct docstring for kitchen.text.converters.exception_to_bytes() — we’re
    transforming into a byte str, not into unicode.
  • Correct some examples in the unicode frustrations documentation
  • Correct some cross-references in the documentation

0.2.3

  • Expose MAXFD, list2cmdline(), and mswindows in kitchen.pycompat27.subprocess.
    These are undocumented, and not in upstream’s __all__ but google (and bug
    reports against kitchen) show that some people are using them. Note that
    upstream is leaning towards these being private so they may be deprecated in
    the python3 subprocess.

So what do these changes mean for you? Hopefully it’ll just be bugfixes for everyone. The subprocess changes in 0.2.3 make more of the subprocess interface public because some code uses those functions and variables. People using them are advised to stop using them as this upstream bug report shows that the python maintainers don’t intend them to be public and will be deprecating them in the future. Since I had to dig into the code to look into this, I’ll also note that if your code is using list2cmdline() it it’s likely that it’s buggy in corner cases. From thesubprocess documentation: “list2cmdline() is designed for applications using the same rules as the MS C runtime.” That means that it’s not intended for dealing with Unix shells or even the MS-DOS command prompt. It’s only intended for the MS C runtime itself.

The 0.2.4 changes to easy_gettext_setup() changes behaviour so there is a potential to break code although I still classify it as a bugfix. easy_gettext_setup() is intended to return the gettext functions needed to translate an application. Since python has both byte str and unicode string types that can be used, there are gettext functions that return one or the other of those. easy_gettext_setup() takes a parameter, use_unicode to know whether to return a set of functions that works with byte str or a set of functions that work with unicode strings. There’s only one set of functions that return unicode so when unicode is requested the code returns the ugettext() and ungettext() functions as expected. When byte str is requested, however, things are a little messier as there’s two sets of function to choose from: gettext()/ngettext() or lgettext()/lngettext().

Prior to 0.2.4, easy_gettext_setup() returned gettext() and ngettext(). The gettext functions do return byte strings. However, the byte strings they return are in the encoding that was saved in the message catalogs on the filesystem. So, if the translators used utf-8 to encode their strings, you’d get utf-8 output; if they used latin-1, you’d get latin-1 output and so forth. This works fine as long as you’re using the same encoding as the translators were. However, when the translator uses a different encoding than you, you get mojibake.

In 0.2.4, we’ve switched to returning the lgettext functions to address this. lgettext and lngettext take the byte strings and the encoding information from the message catalog that the translator provided and use that to re-encode the strings in the desired encoding. That way if you have a locale setting of ja_JP.EUC_JP you get text encoded in EUC_JP and if you have a locale setting of ja_JP.UTF8 your text is encoded in UTF8.

Results from a program before and after updating the kitchen.i18n.easy_gettext_setup() function to use lgettext. In both terminals, the terminal is set to display characters using the EUC_JP encoding. The terminal on the right displays mojibake because the earlier version of easy_gettext_setup() uses the gettext() function which returns the characters in utf8 (the encoding that the translator used). The terminal on the right displays correctly because lgettext reencodes the strings as EUC_JP.

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.