Dear Lazyweb, how would you nicely bundle python code?

I’ve been looking into bundling the python six library into ansible because it’s getting painful to maintain compatibility with the old versions on some distros. However, the distribution developer in me wanted to make it easy for distro packagers to make sure the system copy was used rather than the bundled copy if needed and also make it easy for other ansible developers to make use of it. It seemed like the way to achieve that was to make an import in our namespace that would transparently decide which version of six was needed and use that. I figured out three ways of doing this but haven’t figured out which is better. So throwing the three ways out there in the hopes that some python gurus can help me understand the pros and cons of each (and perhaps improve on what I have so far).

To be both transparent to our developers and use system packages if the system had a recent enough six, I created a six package in our namespace. Inside of this package I included the real six library as _six.py. Then I created an __init__.py with code to decide whether to use the system six or the bundled _six. So the directory layout is like this:

+ ansible/
  + compat/
    + six/
      + __init__.py
      + _six.py

__init__.py has two tasks. It has to determine whether we want the system six library or the bundled one. And then it has to make sure that choice is what other code gets when it does import ansible.compat.six. Here's the basic boilerplate:

# Does the system have a six library installed?
try:
    import six as _system_six
except ImportError:
    _system_six = None

if _system_six:
    # Various checks that system six library is current enough
    if not hasattr(_system_six.moves, 'shlex_quote'):
        _system_six = None

if _system_six:
    pass # Here's where we have to load up the system six library
else:
    pass # Alternatively, we load up the bundled library

Loading using standard import
Now things start to get interesting. We know which version of the six library we want. We just have to make it available to people who are now going to use it. In the past, I’d used the standard import mechanism so that was the first thing I tried here:

if _system_six:
    from six import *
else:
    from ._six import *

As a general way of doing this, it has some caveats. It only pulls in the symbols that the module considers public. If a module has any functions or variables that are supposed to be public but are marked with a leading underscore then they won't be pulled in. Or if a module has an __all__ = [...] that doesn't contain all of the public symbols then those won't get pulled in either. You can pull those additions in by specifying them explicitly if you have to.
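To make the caveat concrete, here's a tiny hypothetical module (mylib.py is a made-up name) showing what import * skips:

# mylib.py
__all__ = ['greet']

def greet():
    return 'hello'

def shout():       # public by intent, but missing from __all__
    return 'HELLO'

def _helper():     # leading underscore: skipped by import * regardless
    return 'psst'

>>> from mylib import *
>>> greet()
'hello'
>>> shout()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'shout' is not defined
>>> from mylib import shout   # pulling it in explicitly still works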

For this case, we don’t have any issues with those as six doesn’t use __all__ and none of the public symbols are marked with a leading underscore. However, when I then started porting the ansible code to use ansible.compat.six I encountered an interesting problem:

# Simple things like this work
>>> from ansible.compat.six import moves
>>> moves.urllib.parse.urlsplit('https://example.com/')
SplitResult(scheme='https', netloc='example.com', path='/', query='', fragment='')

# this throws an error:
>>> from ansible.compat.six.moves import urllib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named moves

Hmm… I’m not quite sure what’s happening but I zero in on the word “module”. Maybe there’s something special about modules such that import * doesn’t let me import subpackages or submodules of the package. Time to look for answers on the Internet…
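In hindsight, a small stand-in example (a hypothetical pkg wrapping the stdlib's json) shows the distinction: import * copies names into a namespace, but the from pkg.submodule import ... form asks the import system for a real submodule:

# pkg/__init__.py
from json import *    # copies json's public names (loads, dumps, ...) into pkg

>>> import pkg
>>> pkg.loads('{}')   # the copied name works
{}
>>> from pkg.decoder import JSONDecoder
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named decoder   # pkg has no decoder.py of its own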

The Sorta, Kinda, Hush-Hush, Semi-Official Way

Googling for a means to replace a module from itself eventually leads to a strategy that seems to have both some people who like it and some who don’t. It seems to be supported officially but people don’t want to encourage people to use it. It involves a module replacing its own entry in sys.modules. Going back to our example, it looks like this:

import sys
if _system_six:
    six = _system_six
else:
    from . import _six as six

sys.modules['ansible.compat.six'] = six

When I ran this with a simple test case of a python package with several nested modules, that seemed to clear up the problem. I was able to import submodules of the real module from my fake module just fine. So I was hopeful that everything would be fine when I implemented it for six.
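For reference, here's a minimal reconstruction of that simple test case (all names are made up) showing the technique working when the real package has genuine submodules:

# realpkg/__init__.py is empty; realpkg/submod.py contains:
value = 42

# fakepkg/__init__.py replaces its own sys.modules entry:
import sys
import realpkg
sys.modules['fakepkg'] = realpkg

>>> from fakepkg.submod import value   # submodule imports resolve fine
>>> value
42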


>>> from ansible.compat.six.moves import urllib
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named moves

Hmm… same error. So I take a look inside of six.py to see if there’s any clue as to why my simple test case with multiple files and directories worked but six’s single file is giving us headaches. Inside I find that six is doing its own magic with a custom importer to make moves work. I spend a little while trying to figure out if there’s something specifically conflicting between my code and six’s code and then throw my hands up. There’s a lot of stuff that I’ve never used before here… it’ll take me a while to wrap my head around it and there’s no assurance that I’ll be able to make my code work with what six is doing even after I understand it. Is there anything else I could try to just tell my code to run everything that six would normally do when it is imported but do it in my ansible.compat.six namespace?

You tell me: Am I beating my code with the ugly stick?

As a matter of fact, python does provide us with a keyword in python2 and a function in python3 that might do exactly that. So here’s strategy number three:

import os.path

if _system_six:
    import six
else:
    from . import _six as six

six_py_file = '{0}.py'.format(os.path.splitext(six.__file__)[0])
exec(open(six_py_file, 'r').read())

Yep, exec will take the contents of a python module (in python2, even an open file handle to it) and execute it in the current namespace. So this seems like it will do what we want. Let’s test it:

>>> from ansible.compat.six.moves import urllib
>>> from ansible.compat.six.moves.urllib.parse import urlsplit
>>> urlsplit('https://example.com/')
SplitResult(scheme='https', netloc='example.com', path='/', query='', fragment='')

So dear readers, you tell me — I now have some code that works but it relies on exec. And moreover, it relies on exec to overwrite the current namespace. Is this a good idea or a bad idea? Let’s contemplate a little further — is this an idea that should only be applied sparingly (using sys.modules instead when the module isn’t messing around with a custom importer of its own) or is it a general purpose strategy that should be applied to other libraries that I might bundle as well? Are there caveats to doing things this way? For instance, is it bypassing the standard import caching and so might be slower? Is there a better way to do this that in my ignorance I just don’t know about?

Why sys.setdefaultencoding() will break code

I know wiser and more experienced Python coders have written to python-dev about this before but every time I’ve needed to reference one of those messages for someone else I have trouble finding one. This time when I did my google search the most relevant entry was a post from myself to the yum-devel mailing list in 2011. Since I know I’ll need to prove why setdefaultencoding() is to be avoided in the future I figured I should post the reasoning here so that I don’t have to search the web next time.

Some Background

15 years ago: Creating a Unicode Aware Python

In Python 2 it is possible to mix byte strings (str type) and text strings (unicode type) together to a limited extent. For instance:

>>> u'Toshio' == 'Toshio'
True
>>> print(u'Toshio' + ' Kuratomi')
Toshio Kuratomi

When you perform these operations Python sees that you have a unicode type on one side and a str type on the other. It takes the str value and decodes it to a unicode type and then performs the operation. The encoding it uses to interpret the bytes is what we’re going to call Python’s defaultencoding (named after sys.getdefaultencoding() which allows you to see what this value is set to.)
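A quick python2 session makes the default visible; bytes that can't be interpreted as ascii are what trigger the familiar error:

>>> import sys
>>> sys.getdefaultencoding()
'ascii'
>>> u'deux ' + u'Café'.encode('utf-8')   # non-ascii bytes meet a unicode string
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3: ordinal not in range(128)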

When the Python developers were first experimenting with a unicode-aware text type that was distinct from byte strings it was unclear what the value of defaultencoding should be. So they created a function to set the defaultencoding when Python started in order to experiment with different settings. The function they created was sys.setdefaultencoding() and the Python authors would modify their individual site.py files to gain experience with how different encodings would change the experience of coding in Python.

Eventually, in October of 2000 (fourteen and a half years prior to me writing this) that experimental version of Python became Python-2.0 and the Python authors had decided that the sensible setting for defaultencoding should be ascii.

I know it’s easy to second guess the ascii decision today but remember 14 years ago the encoding landscape was a lot more cluttered. New programming languages and new APIs were emerging that optimized for fixed-width 2-byte encodings of unicode. 1-byte, non-unicode encodings for specific natural languages were even more popular then than they are now. Many pieces of data (even more than today!) could include non-ascii text without specifying what encoding to interpret that data as. In that environment anyone venturing outside of the ascii realm needed to be warned that they were entering a world where encoding dragons roamed freely. The ascii encoding helps to warn people that they were entering a land where their code had to take special precautions by throwing an error in many of the cases where the boundary was crossed.

However, there was one oversight about the unicode functionality that went into Python-2.0 that the Python authors grew to realize was a bad idea. That oversight was not removing the setdefaultencoding() function. They had taken some steps to prevent it being used outside of initialization (in the site.py file) by deleting the reference to it from the sys module after Python initialized but it still existed for people to modify the defaultencoding in site.py.

The rise of the sys.setdefaultencoding() hack

As time went on, the utf-8 encoding emerged as the dominant encoding of both Unix-like systems and the Internet. Many people who only had to deal with utf-8 encoded text were tired of getting errors when they mixed byte strings and text strings together. Seeing that there was a function called setdefaultencoding(), people started trying to use it to get rid of the errors they were seeing.

At first, those with the ability to, tried modifying their Python installation’s global site.py to make sys.setdefaultencoding do its thing. This is what the Python documentation suggests is the proper way to use it and it seemed to work on the users’ own machines. Unfortunately, the users often turned out to be coders. And it turned out that what these coders were doing was writing programs that had to work on machines run by other people: the IT department, customers, and users all over the Internet. That meant that applying the change in their site.py often left them in a worse position than before: They would code something which would appear to work on their machines but which would fail for the people who were actually using their software.

Since the coders’ concern was confined to whether people would be able to run their software the coders figured if their software could set the defaultencoding as part of its initialization that would take care of things. They wouldn’t have to force other people to modify their Python install; their software could make that decision for them when the software was invoked. So they took another look at sys.setdefaultencoding(). Although the Python authors had done their best to make the function unavailable after python started up these coders hit upon a recipe to get at the functionality anyway:

import sys
reload(sys)
sys.setdefaultencoding('utf-8')

Once this was run in the coders’ software, the default encoding for coercing byte strings to text strings was utf-8. This meant that when utf-8 encoded byte strings were mixed with unicode text strings, Python would successfully convert the str type data to unicode type and combine the two into one unicode string. This is what this new generation of coders were expecting from the majority of their data so the idea of solving their problem with just these few lines of (admittedly very hacky) code was very attractive to them. Unfortunately, there are non-obvious drawbacks to doing this….

Why sys.setdefaultencoding() will break your code

(1) Write once, change everything

The first problem with sys.setdefaultencoding() is not obviously a problem at first glance. When you call sys.setdefaultencoding() you are telling Python to change the defaultencoding for all of the code that it is going to run. Your software’s code, the code in the stdlib, and third-party library code all end up running with your setting for defaultencoding. That means that code which you weren’t responsible for that relied on the behaviour of having the defaultencoding be ascii would now stop throwing errors and potentially start creating garbage values. For instance, let’s say one of the libraries you rely on does this:

def welcome_message(byte_string):
    try:
        return u"%s runs your business" % byte_string
    except UnicodeError:
        return u"%s runs your business" % unicode(byte_string,
                encoding='latin-1')

print(welcome_message(u"Angstrom (Å®)".encode("latin-1")))

Previous to setting defaultencoding this code would be unable to decode the “Å” in the ascii encoding and then would enter the exception handler to guess the encoding and properly turn it into unicode. Printing: Angstrom (Å®) runs your business. Once you’ve set the defaultencoding to utf-8 the code will find that the byte_string can be interpreted as utf-8 and so it will mangle the data and return this instead: Angstrom (Ů) runs your business.
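You can verify the mangling by hand in a python2 shell: the two latin-1 bytes for “Å®” happen to form a single valid utf-8 sequence:

>>> u'Å®'.encode('latin-1')
'\xc5\xae'
>>> '\xc5\xae'.decode('utf-8')
u'\u016e'    # U+016E is Ů: one wrong character where there should be two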

Naturally, if this was your code, in your piece of software, you’d be able to fix it to deal with the defaultencoding being set to utf-8. But if it’s in a third party library that luxury may not exist for you.

(2) Let’s break dictionaries!

The most important problem with setting defaultencoding to the utf-8 encoding is that it will break certain assumptions about dictionaries. Let’s write a little code to show this:

def key_in_dict(key, dictionary):
    if key in dictionary:
        return True
    return False

def key_found_in_dict(key, dictionary):
    for dict_key in dictionary:
        if dict_key == key:
            return True
    return False

Would you assume that given the same inputs the output of both functions will be the same? In Python, if you don’t hack around with sys.setdefaultencoding(), your assumption would be correct:

>>> # Note: the following is the same as d = {'Café': 'test'} on
>>> #       systems with a utf-8 locale
>>> d = { u'Café'.encode('utf-8'): 'test' }
>>> key_in_dict('Café', d)
True
>>> key_found_in_dict('Café', d)
True
>>> key_in_dict(u'Café', d)
False
>>> key_found_in_dict(u'Café', d)
__main__:1: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
False

But what happens if you call sys.setdefaultencoding('utf-8')? Answer: the assumption breaks:

>>> import sys
>>> reload(sys)
>>> sys.setdefaultencoding('utf-8')
>>> d = { u'Café'.encode('utf-8'): 'test' }
>>> key_in_dict('Café', d)
True
>>> key_found_in_dict('Café', d)
True
>>> key_in_dict(u'Café', d)
False
>>> key_found_in_dict(u'Café', d)
True

This happens because the in operator hashes the keys and then compares the hashes to determine if they are equal. In the utf-8 encoding, only the characters represented by ascii hash to the same values whether in a byte string or a unicode type text string. For all other characters the hash for the byte string and the unicode text string will be different values. The comparison operator (==), on the other hand, converts the byte string to a unicode type and then compares the results. When you call setdefaultencoding('utf-8') you allow the byte string to be transformed into a unicode type. Then the two text strings will be compared and found to be equal. The ramifications of this are that containment tests with in now yield different values than equality testing individual entries via ==. This is a pretty big difference in behaviour to get used to and for most people would count as having broken a fundamental assumption of the language.
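A couple of lines in the same hacked interpreter make the divergence visible (the exact hash values vary, but the inequality is the point):

>>> hash(u'Café') == hash(u'Café'.encode('utf-8'))
False    # different hashes, so `in` never finds the key
>>> u'Café' == u'Café'.encode('utf-8')
True     # but == coerces via the utf-8 defaultencoding and matches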

So how does Python 3 fix this?

You may have heard that in Python 3 the default encoding has been switched from ascii to utf-8. How does it get away with that without encountering the equality versus containment problem? The answer is that python3 does not perform implicit conversions between byte strings (python3 bytes type) and text strings (python3 str type). Since the two objects are now entirely separate comparing them via both equality and containment will always yield False:

$ python3
>>> a = {'A': 1}
>>> b'A' in a
False
>>> b'A' == list(a.keys())[0]
False

At first, coming from python2 where ascii values were the same this might look a little funny. But just remember that bytes are really a type of number and you wouldn’t expect this to work either:

>>> a = {'1': 'one'}
>>> 1 in a
False
>>> 1 == list(a.keys())[0]
False

Pattern or Antipattern? Splitting up initialization with asyncio

“O brave new world, That has such people in’t!” – William Shakespeare, The Tempest

Edit: Jean-Paul Calderone (exarkun) has a very good response to this detailing why it should be considered an antipattern. He has some great thoughts on the implicit contract that a programmer is signing when they write an __init__() method and the maintenance cost that is incurred if a programmer breaks those expectations. Definitely worth reading!

Instead of spending the Thanksgiving weekend fighting crowds of shoppers I indulged my inner geek by staying at home on my computer. And not to shop online either — I was taking a look at Python-3.4’s asyncio library to see whether it would be useful in general, run-of-the-mill code. After quite a bit of experimenting I do think every programmer will have a legitimate use for it from time to time. It’s also quite sexy. I think I’ll be a bit prone to overusing it for a little while ;-)

Something I discovered, though — there’s a great deal of good documentation and blog posts about the underlying theory of asyncio and how to implement some broader concepts using asyncio’s API. There’s quite a few tutorials that skim the surface of what you can theoretically do with the library that don’t go into much depth. And there’s a definite lack of examples showing how people are taking asyncio’s API and applying them to real-world problems.

That lack is both exciting and hazardous. Exciting because it means there’s plenty of neat new ways to use the API that no one’s made into a wide-spread and oft-repeated pattern yet. Hazardous because there’s plenty of neat new ways to abuse the API that no one’s thought to write a post explaining why not to do things that way before. My joke about overusing it earlier has a large kernel of truth in it… there’s not a lot of information saying whether a particular means of using asyncio is good or bad.

So let me mention one way of using it that I thought about this weekend — maybe some more experienced tulip or twisted programmers will pop up and tell me whether this is a good use or bad use of the APIs.

Let’s say you’re writing some code that talks to a microblogging service. You have one class that handles both posting to the service and reading from it. As you write the code you realize that there’s some time consuming tasks (for instance, setting up an on-disk cache for posts) that you have to do in order to read from the service that you do not have to wait for if your first actions are going to be making new posts. After a bit of thought, you realize you can split up your initialization into two steps. Initialization needed for posting will be done immediately in the class’s constructor and initialization needed for reading will be set up in a future so that reading code will know when it can begin to process. Here’s a rough sketch of what an implementation might look like:

import os
import sqlite3
import sys
import asyncio

import aiohttp

class Microblog:
    def __init__(self, url, username, token, cachedir):
        self.token = token
        self.username = username
        self.url = url
        loop = asyncio.get_event_loop()
        self.init_future = loop.run_in_executor(None, self._reading_init, cachedir)

    def _reading_init(self, cachedir):
        # Mainly setup our cache
        self.cachedir = cachedir
        os.makedirs(cachedir, mode=0o755, exist_ok=True)
        self.db = sqlite3.connect(os.path.join(cachedir, 'cache.sqlite'))
        # Create tables, fill in some initial data, you get the picture [....]

    @asyncio.coroutine
    def post(self, msg):
        data = dict(payload=msg)
        headers = dict(Authorization=self.token)
        reply = yield from aiohttp.request('post', self.url, data=data, headers=headers)
        # Manipulate reply a bit [...]
        return reply

    @asyncio.coroutine
    def sync_latest(self):
        # Synchronize with the initialization we need before we can read
        yield from self.init_future
        data = dict(per_page=100, page=1)
        headers = dict(Authorization=self.token)
        reply = yield from aiohttp.request('get', self.url, data=data, headers=headers)
        # Stuff the reply in our cache

if __name__ == '__main__':
    chirpchirp = Microblog('https://example.com/api', 'a.badger', TOKEN, '/home/badger/cache/')
    loop = asyncio.get_event_loop()
    # Contrived -- real code would probably have a coroutine to take user input
    # and then submit that while interleaving with displaying new posts
    asyncio.async(chirpchirp.post(' '.join(sys.argv[1:])))
    loop.run_until_complete(chirpchirp.sync_latest())

Some of this code is just there to give an idea of how this could be used. The real questions revolve around splitting up initialization into two steps:

  • Is yield from the proper way for sync_latest() to signal that it needs self.init_future to finish before it can continue?
  • Is it good form to potentially start using the object for one task before __init__ has finished all tasks?
  • Would it be better style to set up posting and reading separately? Maybe a reading class and a posting class or the old standby of invoking _reading_init() the first time sync_latest() is called (sketched below)?
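For comparison, here's a minimal sketch of that last alternative, deferring the cache setup until the first reader needs it (reusing the hypothetical names from the Microblog class above):

import asyncio

class LazyMicroblog:
    def __init__(self, url, username, token, cachedir):
        self.url, self.username, self.token = url, username, token
        self.cachedir = cachedir
        self.init_future = None   # nothing scheduled until the first read

    def _reading_init(self, cachedir):
        pass   # same cache setup as in Microblog above

    @asyncio.coroutine
    def sync_latest(self):
        # The first reader schedules the initialization; later readers
        # just wait on (or sail past) the already-created future
        if self.init_future is None:
            loop = asyncio.get_event_loop()
            self.init_future = loop.run_in_executor(
                None, self._reading_init, self.cachedir)
        yield from self.init_future
        # ... fetch and cache the latest posts as before [...]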

git commit doesn’t commit? (GitPython bug)

Mostly posting this to remind myself of the fix the next time I run into this but this might help some other people as well.

Every once in a while I’ll be working on a git repo in the fedora packages repository and when I git commit -a it, I’ll end up with an empty commit and the files with changes aren’t actually committed. Other intuitive variations of this like git add FILE && git commit have the same buggy behaviour.

The reason this is occurring has something to do with the GitPython library which is used by fedpkg to add some changes to your clone of the git repo when you add new source files. It’s somehow changing the index in a way that causes this behaviour. To get out of this there’s a few simple but non-intuitive things you can try:

git reset FILE && git add FILE

git stash && git stash pop

After running one of those pairs of commands you should once more be able to git commit -a.

Details in this GitPython bug report

Porting Kitchen to Python3: Part 1 — Detecting string types

I’ve spent a good part of the last week working on the python3 port of kitchen. It’s now to the point where I’ve reviewed all of the code and got the unittests passing. I still need to add some deprecation warnings and a gettext object that mirrors the python3 API instead of the python2 API. Then it’ll be ready for an alpha release. Still a lot of work to do before a final release. Most of the documentation will need to be updated to change from unicode + str to str + bytes and the best practices sections will need a major overhaul since a lot of the problems with python2 and unicode have either been fixed, mitigated, or moved to a different level.

It was both an easy and hard undertaking. The easy part was that kitchen is largely a collection of small, mostly independent functions. So it’s reasonably easy to pick a set of functions, figure out that they don’t depend on anything else in kitchen, and then port them one by one.

The hard part is that a lot of those functions deal with things that are explicitly unicode and things that are explicitly byte strings; an area that has both changed dramatically in python3 and that 2to3 doesn’t handle very well. Here’s a couple of things I ended up doing to help out:

Detecting String Types

Kitchen has several places that need to know whether an object it’s been given is a byte string, unicode string, or a generic string. The python2 idioms for this are:

if isinstance(obj, basestring):
    pass # object is any of the string types
    if isinstance(obj, str):
        pass # object is a byte string
    elif isinstance(obj, unicode):
        pass # object is a unicode string
else:
    pass # object was not a string type

In python3, a couple things have changed.

  • There’s no longer a basestring type as byte strings and unicode strings are no longer meant to be related types.
  • Byte strings now have an immutable (bytes) and mutable (bytearray) type.

With these changes, the python3 idioms equivalent to the python2 ones look something like this:

if isinstance(obj, str) or isinstance(obj, bytes) or isinstance(obj, bytearray):
    pass # any string type
    if isinstance(obj, bytes) or isinstance(obj, bytearray):
        pass # byte string
    elif isinstance(obj, str):
        pass # unicode string
else:
    pass # object was not a string type

There’s two issues with these changes:

  • code that needs to do this needs to be manually ported when moving from python2 to python3. 2to3 can correctly change all occurrences of isinstance(obj, unicode) to isinstance(obj, str) but occurrences of isinstance(obj, basestring) and isinstance(obj, str) will also be rendered as isinstance(obj, str) in the 2to3 output. This is correct for a lot of normal python2 code that is trying to separate strings from ints, floats, or other types but it is incorrect for code that’s trying to explicitly separate bytes from unicode. So you’ll need to hand-audit and fix your code wherever these idioms are being used.
  • This is more prolix and tedious to write than the python2 version and if your code has to do this sort of differentiation in many places you’ll soon get bored of it.

For kitchen, I added a few helper functions into kitchen.text.misc that encapsulate the python2 and python3 idioms. For instance:

def isbasestring(obj):
    if isinstance(obj, str) or isinstance(obj, bytes) or isinstance(obj, bytearray):
        return True
    return False

and similar for isunicodestring() and isbytestring(). [In case you’re curious, I broke with PEP8 style for these function names because of the long history of is* functions and methods in python and other programming languages.] By pushing these into functions, I can use if isbasestring(obj): on both python2 and python3. I only have to change the implementation of the is*string() functions in a single place when porting from python2 to python3.
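For completeness, here's roughly what the python3 versions of the other two helpers look like (kitchen's actual implementations may differ in detail):

def isunicodestring(obj):
    if isinstance(obj, str):
        return True
    return False

def isbytestring(obj):
    if isinstance(obj, (bytes, bytearray)):
        return True
    return False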

Now let’s mention some of the caveats to using this:

  • In python, calling a function (isbasestring()) is somewhat expensive. So if you use this in any hot inner loops, you may want to benchmark with the function and with the expanded version to see whether you take a noticeable loss of speed (see the timeit sketch after this list).
  • Not every piece of code is going to want to define “string” in the same way. For instance, bytearrays are mutable so maybe your code shouldn’t include those with the “normal” string types.
  • Maybe your code can be changed to only deal with unicode strings (str). In python3 byte strings are not as ubiquitous as they were in python2 so maybe your code can be changed to stop checking for the type of the object altogether or reduced to a single isinstance(obj, str). The language has evolved so when possible, evolve your code to adapt as well.
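Here's the sort of quick, stdlib-only benchmark I mean for the first bullet (numbers will vary from machine to machine):

import timeit

setup = '''
def isbasestring(obj):
    if isinstance(obj, (str, bytes, bytearray)):
        return True
    return False
obj = "some text"
'''

# the function call vs. the same check written inline
print(timeit.timeit('isbasestring(obj)', setup=setup))
print(timeit.timeit('isinstance(obj, (str, bytes, bytearray))',
                    setup='obj = "some text"'))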

Next time: Literals

My first python3 script

I’ve been hacking on other people’s python3 code for a while doing porting and bugfixes but so far my own code has been tied to python2 because of dependencies. Yesterday I ported my first personal script from python2 to python3. This was just a simple, one file script that hacks together a way to track how long my kids are using the computer and log them off after they’ve hit a quota. The kind of thing that many a home sysadmin has probably hacked together to automate just a little bit of their routine. For that use, it seemed very straightforward to make the switch. There were only four changes in the language that I encountered when making the transition:

  • octal values. I use octal for setting file permissions. The syntax for octal values has changed from 0755 to 0o755
  • exception catching. No longer can you do: except Exception, exc. The new syntax is: except Exception as exc.
  • print function. In python2, print is a keyword so you do this: print 'hello world'. In python3, it’s a function so you write it this way: print('hello world')
  • The strict separation of bytes and string types. This required me to specify that one subprocess function should return strings instead of bytes to me (see the sketch after this list)
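Here's a small python3 sketch touching all four changes; the paths and commands are made up, and universal_newlines=True is just one way to ask subprocess for strings (the post doesn't say which call the script actually used):

import os
import subprocess

# octal literals: python2's 0755 must now be written 0o755
os.makedirs('/tmp/example', mode=0o755, exist_ok=True)

# exception catching: "except OSError, exc" becomes "except OSError as exc"
try:
    os.rmdir('/tmp/does-not-exist')
except OSError as exc:
    # print is a function now, not a statement
    print('cleanup skipped:', exc)

# bytes/str separation: subprocess returns bytes unless told otherwise
who = subprocess.check_output(['who'], universal_newlines=True)
print(who)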

When I’ve worked on porting libraries that needed to maintain some form of compat between python2 (older versions… no nice shiny python-2.7 for you!) and python3 these concerns were harder to address as there needed to be two versions of the code (usually, maintained via automatic build-time invocation of 2to3). With this application/script, throwing out python2 compatibility was possible so switching over was just a matter of getting an error when the code executed and switching the syntax over.

This script also didn’t use any modules that had either not been ported, been dropped, or been restructured in the switch from python2 to python3. Unlike my day job where urllib’s restructuring would affect many of the things that we’ve written and lack of ported third-party libraries would prevent even more things from being ported, this script (and many other of my simple-home-use scripts) didn’t require any changes due to library changes.

Verdict? Within these constraints, porting to python3 was as painless as porting between some python2.x releases has been. I don’t see any reason I won’t use python3 for new programming tasks like this. I’ll probably port other existing scripts as I need to enhance them.

Kitchen 1.1.0 release

As mentioned last week a new kitchen release went out today. Since last week some small changes were made to the documentation and a few changes to make building kitchen easier were implemented but nothing has changed in the code. Here’s the text of the release announcement:

== Kitchen 1.1.0 has been released ==

Kitchen is a python library that brings together small snippets of code that you might otherwise find yourself reimplementing via cut and paste. Each little bit is useful and important but they usually feel too small and too trivial to create a whole module just for that one little function. However, experience has shown that any code implemented by copying will inevitably be shown to have bugs. And when you fix those bugs, you’ll wish you had created the module so you could fix the bug in one place rather than two (or five.. or ten…). Kitchen aims to be that one place.

Kitchen currently has code for easily setting up gettext so it won’t throw UnicodeErrors in corner cases, compatibility modules for different python2 stdlib versions (2.4, 2.5, 2.7), a little bit of iterators, and a whole lot of code for unicode-byte string conversion. In addition to the code, kitchen contains documentation that explains some of the common problems that arise when dealing with unicode in python2 and how to fix them through changes in coding practices and/or making use of functions from kitchen.

The 1.1.0 release enhances the gettext portion of kitchen. The major enhancements are:

  • get_translation_object can now be used as a drop in replacement for the stdlib’s gettext.translation() function (see the sketch after this list).
  • If get_translation_object finds multiple message catalogs for the domain, it will setup the additional catalogs as fallbacks in case the message isn’t found in the first one.
  • The gettext and lgettext functions were reworked so that they guarantee that the string they return is both a byte str (this was present in previous kitchen releases) and is a valid sequence of bytes in the selected output_charset. This should prevent tracebacks if your code decodes and reencodes a value returned from the gettext and lgettext family of functions.
  • Several fixes to the way fallback message catalogs interacted with input and output charsets.
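As a rough illustration of the first bullet, usage looks much like the stdlib's gettext.translation() (the argument details here are from memory and may not match kitchen's documentation exactly):

from kitchen.i18n import get_translation_object

# drop-in style: domain plus a list of directories to search for catalogs
translations = get_translation_object('mydomain',
                                      localedirs=('/usr/share/locale',))
_ = translations.gettext
print(_('Hello world'))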

For the complete set of changes, see the NEWS file.