I realize I didn’t announce 0.2.3, so here are the NEWS entries for both 0.2.3 and 0.2.4:
0.2.4
- Have easy_gettext_setup() return lgettext functions instead of gettext functions when use_unicode=False
- Correct docstring for kitchen.text.converters.exception_to_bytes(): we’re transforming into a byte str, not into unicode.
- Correct some examples in the unicode frustrations documentation
- Correct some cross-references in the documentation
0.2.3
- Expose MAXFD, list2cmdline(), and mswindows in kitchen.pycompat27.subprocess. These are undocumented, and not in upstream’s __all__, but google (and bug reports against kitchen) show that some people are using them. Note that upstream is leaning towards these being private so they may be deprecated in the python3 subprocess.
So what do these changes mean for you? Hopefully it’ll just be bugfixes for everyone. The subprocess changes in 0.2.3 make more of the subprocess interface public because some code uses those functions and variables. People using them are advised to stop, as this upstream bug report shows that the python maintainers don’t intend them to be public and will be deprecating them in the future. Since I had to dig into the code to look into this, I’ll also note that if your code is using list2cmdline(), it’s likely that it’s buggy in corner cases. From the subprocess documentation: “list2cmdline() is designed for applications using the same rules as the MS C runtime.” That means that it’s not intended for dealing with Unix shells or even the MS-DOS command prompt. It’s only intended for the MS C runtime itself.
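To make the corner cases concrete, here is a minimal sketch comparing the two quoting styles. It assumes a current Python where the standard library's subprocess.list2cmdline() is still present (it remains undocumented) and uses it as a stand-in for the copy in kitchen.pycompat27.subprocess; the argument list is made up for illustration.

```python
import shlex
import subprocess

# Hypothetical argument list, chosen to include a shell metacharacter.
args = ["echo", "$HOME", "two words"]

# list2cmdline() follows MS C runtime rules: it double-quotes arguments
# that contain whitespace and leaves "$" alone.
ms_style = subprocess.list2cmdline(args)

# shlex.quote() applies POSIX shell quoting to each argument instead.
posix_style = " ".join(shlex.quote(arg) for arg in args)

print(ms_style)
print(posix_style)
```

Handed to a Unix shell, the first string would have $HOME expanded by the shell, while the second passes it through literally. Passing the list directly to subprocess without a shell avoids the quoting question entirely.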
The 0.2.4 change to easy_gettext_setup() alters behaviour, so it has the potential to break code, although I still classify it as a bugfix. easy_gettext_setup() is intended to return the gettext functions needed to translate an application. Since python has both byte str and unicode string types, there are gettext functions that return one or the other. easy_gettext_setup() takes a parameter, use_unicode, to know whether to return a set of functions that works with byte str or a set that works with unicode strings. There’s only one set of functions that returns unicode, so when unicode is requested the code returns the ugettext() and ungettext() functions as expected. When byte str is requested, however, things are a little messier as there are two sets of functions to choose from: gettext()/ngettext() or lgettext()/lngettext().
Prior to 0.2.4, easy_gettext_setup() returned gettext() and ngettext(). The gettext functions do return byte strings. However, the byte strings they return are in the encoding that was saved in the message catalogs on the filesystem. So, if the translators used utf-8 to encode their strings, you’d get utf-8 output; if they used latin-1, you’d get latin-1 output and so forth. This works fine as long as you’re using the same encoding as the translators were. However, when the translator uses a different encoding than you, you get mojibake.
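The mismatch can be reproduced without any message catalog. The following is a rough sketch in Python 3 syntax, where bytes plays the role of byte str; the sample string and variable names are made up for illustration and nothing here calls kitchen or gettext.

```python
# Sample translated text, as a translator might have written it.
text = "日本語"

# What gettext() would hand back if the catalog was saved as UTF-8.
catalog_bytes = text.encode("utf-8")

# A terminal set to EUC-JP interprets those same bytes with the wrong codec.
# errors="replace" stands in for the garbage characters a terminal shows.
shown = catalog_bytes.decode("euc_jp", errors="replace")

print(repr(shown))
```

The bytes themselves are never wrong here; they are simply valid UTF-8 being read as EUC-JP, which is why the problem only appears when the two encodings differ.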
In 0.2.4, we’ve switched to returning the lgettext functions to address this. lgettext and lngettext take the byte strings and the encoding information from the message catalog that the translator provided and use that to re-encode the strings in the desired encoding. That way if you have a locale setting of ja_JP.EUC_JP you get text encoded in EUC_JP and if you have a locale setting of ja_JP.UTF8 your text is encoded in UTF8.
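Conceptually, the re-encoding step amounts to a decode followed by an encode. This is only a sketch of the idea in Python 3 syntax, not the actual lgettext implementation: reencode() is a hypothetical helper, and the output charset is passed explicitly here, whereas lgettext normally takes it from the locale.

```python
def reencode(message, catalog_charset, output_charset):
    """Decode catalog bytes with the translator's charset, then encode
    them in the charset the user's locale asks for (hypothetical helper)."""
    return message.decode(catalog_charset).encode(output_charset)


# Bytes as stored by a translator who saved the catalog in UTF-8.
catalog_bytes = "日本語".encode("utf-8")

# What a user with a ja_JP.EUC_JP locale should receive.
for_euc_jp = reencode(catalog_bytes, "utf-8", "euc_jp")

# What a user with a ja_JP.UTF8 locale should receive (unchanged bytes).
for_utf8 = reencode(catalog_bytes, "utf-8", "utf-8")
```

Both results decode back to the same text under their own encodings, which is the property the plain gettext() functions could not guarantee.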

Results from a program before and after updating the kitchen.i18n.easy_gettext_setup() function to use lgettext. Both terminals are set to display characters using the EUC_JP encoding. The terminal on the left displays mojibake because the earlier version of easy_gettext_setup() uses the gettext() function, which returns the characters in utf8 (the encoding that the translator used). The terminal on the right displays correctly because lgettext re-encodes the strings as EUC_JP.