Flock Ansible Talk

Hey Fedorans, I’m trying to come up with an Ansible Talk Proposal for FLOCK in Rochester. What would you like to hear about?

Some ideas:

* An intro to using ansible
* An intro to ansible-playbook
* Managing docker containers via ansible
* Using Ansible to do X (help me choose a value for X ;-)
* How to write your own ansible module
* How ansible transforms a playbook task into a python script on the remote system

Let me know what you’re interested in :-)

Hacky, hacky playbook to upgrade several Fedora machines

Normally I like to polish my code a bit before publishing but seeing as Fedora 21 is relatively new and a vacation is coming up which people might use as an opportunity to upgrade their home networks, I thought I’d throw this extremely *unpolished and kludgey* ansible playbook out there for others to experiment with:

https://bitbucket.org/toshio/local-playbooks/src/3d3ae76a56034784ab60fcfa1129221c59a40f3b/provisioning/fedora-upgrade.yml?at=master

When I recently looked at updating all of my systems to Fedora 21 I decided to try to be a little lighter on network bandwidth than I usually am (I’m on slow DSL and I have three kids all trying to stream video at the same time as I’m downloading packages). So I decided that I’d use a squid caching proxy to cache the packages that I was going to be installing since many of the packages would end up on all of my machines. I found a page on caching packages for use with mock and saw that there were a bunch of steps that I probably wouldn’t remember the next time I wanted to do this. So I opened up vim, translated the steps into an ansible playbook, and tried to run it.

The first several times, it failed because there were unresolved dependencies in my package set (packages I’d built locally with outdated dependencies, packages that were no longer available in Fedora 21, etc.). Eventually I set the fedup steps to ignore errors so that the playbook would clean up all the configuration; that let me fix the package dependency problems and then re-run the playbook immediately afterwards. I’ve now got it to the point where it will successfully run fedup in my environment and will cache many packages (something is still sometimes using mirrors instead of the baseurl I specified, but I haven’t tracked that down yet; those packages are getting cached more than once).
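
In case you just want the gist without reading the playbook, the heart of the caching trick is two yum-side settings that the playbook manages. Here’s a rough sketch; the squid host/port and the exact repo file contents are illustrative assumptions, not lifted from the playbook:

# /etc/yum.conf -- route yum downloads through the local squid cache
# (squid.local:3128 is a made-up example host/port)
proxy=http://squid.local:3128

# /etc/yum.repos.d/fedora.repo -- pin a single baseurl instead of the
# metalink/mirrorlist so every machine requests identical URLs for squid to cache
#metalink=https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch
baseurl=http://dl.fedoraproject.org/pub/fedora/linux/releases/$releasever/Everything/$basearch/os/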

Anyhow, feel free to take a look, modify it to suit your needs, and let me know of any “bugs” that you find :-)

There are a few things I’ll probably do to it before I update to F22.

Quick question for Fedora Planet

Hey Fedora Planet, quick survey: I’ve been doing some blogging on general python programming and on learning to use Ansible. Are either of these topics you’d be interested in seeing show up on the planet? Right now the feed I’m sending to the planet only includes Fedora-specific, Linux-specific, or “free software political” posts, but I could easily change the feed to include the ansible or python programming posts if either of those are interesting to people.

Leave me a comment or touch base with me on IRC: abadger1999 on freenode.

Pattern or Antipattern? Splitting up initialization with asyncio

“O brave new world, That has such people in’t!” – William Shakespeare, The Tempest

Edit: Jean-Paul Calderone (exarkun) has a very good response to this detailing why it should be considered an antipattern. He has some great thoughts on the implicit contract that a programmer is signing when they write an __init__() method and the maintenance cost that is incurred if a programmer breaks those expectations. Definitely worth reading!

Instead of spending the Thanksgiving weekend fighting crowds of shoppers I indulged my inner geek by staying at home on my computer. And not to shop online either — I was taking a look at Python-3.4’s asyncio library to see whether it would be useful in general, run-of-the-mill code. After quite a bit of experimenting I do think every programmer will have a legitimate use for it from time to time. It’s also quite sexy. I think I’ll be a bit prone to overusing it for a little while ;-)

Something I discovered, though — there’s a great deal of good documentation and blog posts about the underlying theory of asyncio and how to implement some broader concepts using asyncio’s API. There are quite a few tutorials that skim the surface of what you can theoretically do with the library without going into much depth. And there’s a definite lack of examples showing how people are taking asyncio’s API and applying it to real-world problems.

That lack is both exciting and hazardous. Exciting because it means there’s plenty of neat new ways to use the API that no one’s made into a widespread and oft-repeated pattern yet. Hazardous because there’s plenty of neat new ways to abuse the API that no one’s thought to write a post explaining why not to do things that way before. My joke about overusing it earlier has a large kernel of truth in it… there’s not a lot of information saying whether a particular means of using asyncio is good or bad.

So let me mention one way of using it that I thought about this weekend — maybe some more experienced tulip or twisted programmers will pop up and tell me whether this is a good use or bad use of the APIs.

Let’s say you’re writing some code that talks to a microblogging service. You have one class that handles both posting to the service and reading from it. As you write the code you realize that there are some time-consuming tasks (for instance, setting up an on-disk cache for posts) that you have to do in order to read from the service but that you do not have to wait for if your first actions are going to be making new posts. After a bit of thought, you realize you can split up your initialization into two steps. Initialization needed for posting will be done immediately in the class’s constructor, and initialization needed for reading will be set up in a future so that reading code will know when it can begin to process. Here’s a rough sketch of what an implementation might look like:

import os
import sqlite3
import sys
import asyncio

import aiohttp

class Microblog:
    def __init__(self, url, username, token, cachedir):
        self.token = token
        self.username = username
        self.url = url
        loop = asyncio.get_event_loop()
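        # Kick off the slow read-side setup in a thread so __init__ returns quickly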
        self.init_future = loop.run_in_executor(None, self._reading_init, cachedir)

    def _reading_init(self, cachedir):
        # Mainly setup our cache
        self.cachedir = cachedir
        os.makedirs(cachedir, mode=0o755, exist_ok=True)
        self.db = sqlite3.connect(os.path.join(cachedir, 'cache.sqlite'))
        # Create tables, fill in some initial data, you get the picture [....]

    @asyncio.coroutine
    def post(self, msg):
        data = dict(payload=msg)
        headers = dict(Authorization=self.token)
        reply = yield from aiohttp.request('post', self.url, data=data, headers=headers)
        # Manipulate reply a bit [...]
        return reply

    @asyncio.coroutine
    def sync_latest(self):
        # Synchronize with the initialization we need before we can read
        yield from self.init_future
        data = dict(per_page=100, page=1)
        headers = dict(Authorization=self.token)
        reply = yield from aiohttp.request('get', self.url, data=data, headers=headers)
        # Stuff the reply in our cache

if __name__ == '__main__':
    chirpchirp = Microblog('http://chirpchirp.com', 'a.badger', TOKEN, '/home/badger/cache/')
    loop = asyncio.get_event_loop()
    # Contrived -- real code would probably have a coroutine to take user input
    # and then submit that while interleaving with displaying new posts
    asyncio.async(chirpchirp.post(' '.join(sys.argv[1:])))
    loop.run_until_complete(chirpchirp.sync_latest())
    

Some of this code is just there to give an idea of how this could be used. The real questions revolve around splitting up initialization into two steps:

  • Is yield from the proper way for sync_latest() to signal that it needs self.init_future to finish before it can continue?
  • Is it good form to potentially start using the object for one task before __init__ has finished all tasks?
  • Would it be better style to set up posting and reading separately? Maybe a reading class and a posting class, or the old standby of invoking _reading_init() the first time sync_latest() is called (sketched below)?
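
For concreteness, here’s a minimal sketch of that last alternative, deferring the read-side setup until the first call to sync_latest(). It makes the same assumptions as the code above; the MicroblogLazy name and the elided method bodies are mine, purely for illustration:

import asyncio

class MicroblogLazy:
    def __init__(self, url, username, token, cachedir):
        self.token = token
        self.username = username
        self.url = url
        self.cachedir = cachedir
        # Nothing scheduled yet; the future is created on first use
        self.init_future = None

    def _reading_init(self):
        # Same blocking cache setup as in the version above [...]
        pass

    @asyncio.coroutine
    def sync_latest(self):
        if self.init_future is None:
            # The first reader kicks off the setup; later callers reuse the future
            loop = asyncio.get_event_loop()
            self.init_future = loop.run_in_executor(None, self._reading_init)
        yield from self.init_future
        # Fetch and cache posts as in the version above [...]

The appeal is that __init__ goes back to doing only cheap, synchronous work; the cost is that the first sync_latest() caller pays for the setup.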

Tales of an Ansible Newb: Ch.1 – Modules

In my last post I used the ansible command line to run an arbitrary shell command on several remote systems. Just being able to do that makes Ansible a useful tool in a sysadmin’s toolchest. However, that’s just the tip of what Ansible can do. Let’s go one step further today and explore some Ansible modules.

The command from the last post looked like this:

ansible '*' -i  'host1,host2' -a 'yum -y install tmux' -K --sudo

That let me run an ad hoc command on a set of hosts. You can run any command you want by specifying a different command line to -a:

$ ansible '*' -i 'host1,' -a 'hostname'
host1 | success | rc=0 >>
host1.lan

$ ansible '*' -i 'fedora20-test,' -a 'grep fedora /etc/os-release'
fedora20-test | success | rc=0 >>
ID=fedora
CPE_NAME="cpe:/o:fedoraproject:fedora:20"
HOME_URL="https://fedoraproject.org/"

Definitely useful. But what if you need to combine several commands together? For instance, let’s say we want to run one command if we’re talking to Fedora and a different one if we’re talking to Ubuntu. Well, we know how to use if-then-else in a shell script; maybe that will work here? Let’s give it a try!

$ ansible '*' -i 'fedora20-test,ubuntu14' -a \
> 'if test -f /etc/fedora-release ; then echo "Fedora!" ; else echo "Ubuntu!" ; fi'
ubuntu14 | FAILED | rc=2 >>
[Errno 2] No such file or directory

fedora20-test | FAILED | rc=2 >>
[Errno 2] No such file or directory

Nope, that didn’t work. For a moment we might be puzzled about why we get the error “No such file or directory”, but then we realize that if is a shell builtin. If ansible isn’t creating a shell on the remote system then if won’t be found. So how can we do this?

Until now, every command I’ve given Ansible has been a single program and its command line parameters, specified as a string to ansible’s -a parameter. But just what is the -a? Ansible’s man page has this information for us:

-a 'ARGUMENTS', --args='ARGUMENTS'
The ARGUMENTS to pass to the module.

So what’s this module it’s talking about? Modules are small pieces of code that Ansible is able to run on a remote machine. By default, if no module is specified, Ansible runs the command module. The command module takes a single argument, a command line string, which it then runs on the remote machine. The important note is that it invokes the program directly rather than passing it through a shell, so we don’t have to worry about a remote shell interpreting any special characters. But if we do need to use shell constructs (shell variables, redirection, builtins, etc.) we can use the shell module instead. Woo hoo! Let’s try that:

$ ansible '*' -i 'fedora20-test,ubuntu14' -m shell -a \
> 'if test -f /etc/fedora-release ; then echo "Fedora!" ; else echo "Ubuntu!" ; fi'
ubuntu14 | success | rc=0 >>
Ubuntu!

fedora20-test | success | rc=0 >>
Fedora!

Excellent. So now we know how to use the full range of shell features if we need to. But I wonder what other modules Ansible ships with.

To find the answer to that, you can run:

$ ansible-doc -l

This prints out a list of 240 or so modules, which range from niche applications like managing specific brands of network devices to installing OS packages on various types of Linux distribution and copying files between the local system and remote hosts. You can find information about any one of these by running ansible-doc MODULE-NAME.

So let’s try out one of these modules. Let’s take our example of installing a package on several remote computers that we used last time and make it use the yum module instead of invoking yum via the command module. ansible-doc yum tells us that the yum module takes a few arguments, of which name is mandatory and specifies a package name. state lets us tell the module what state we want the package to be in when ansible exits. Let’s give it a try:

$ ansible '*' -i 'fedora20-test,' -m yum -a 'name=tmux,zsh state=latest' --sudo -K
sudo password: 
fedora20-test | success >> {
    "changed": true, 
    "msg": "", 
    "rc": 0, 
    "results": [
        "All packages providing tmux are up to date",.
        "Resolving Dependencies\nRunning transaction check\n
Package zsh.x86_64 0:5.0.7-1.fc20 will be updated\n
Package zsh.x86_64 0:5.0.7-4.fc20 will be an update\n
Finished Dependency Resolution\n\nDependencies Resolved\n\n
================================================================================\n
Package       Arch             Version                 Repository         Size\n
================================================================================\n
Updating:\n zsh           x86_64           5.0.7-4.fc20            updates           2.5 M\n\n
      Transaction Summary\n     
================================================================================\n
Upgrade  1 Package\n\nTotal download size: 2.5 M\nDownloading packages:\n
Not downloading deltainfo for updates, MD is 2.9 M and rpms are 2.5 M\n
Running transaction check\nRunning transaction test\nTransaction test succeeded\n
Running transaction (shutdown inhibited)\n
Updating   : zsh-5.0.7-4.fc20.x86_64                                      1/2 \n
Cleanup    : zsh-5.0.7-1.fc20.x86_64                                      2/2 \n
Verifying  : zsh-5.0.7-4.fc20.x86_64                                      1/2 \n  
Verifying  : zsh-5.0.7-1.fc20.x86_64                                      2/2 \n\n
Updated:\n  zsh.x86_64 0:5.0.7-4.fc20                                                     \n\n
Complete!\n"
    ]
}

Yep, as we expected: telling ansible to use the yum module and specifying that the arguments to the module are name=tmux,zsh state=latest makes sure that the tmux and zsh packages are installed and at their latest available versions. But has using the yum module gained us anything? It’s about as long to type -m yum -a 'name=tmux,zsh state=latest' as it is to type -a 'yum install -y tmux zsh' and we already know the syntax for the latter. Is there a difference? For the yum module there isn’t much difference; the module and the single command do about the same thing. But there are other modules that do more. For instance, the git module.

Let’s say that you have a web application in a git repository and you want to deploy it directly from git. You need to clone the repository remotely and check out a specific revision to the working tree because that’s what should be running in production, not the current HEAD of the master branch. You also need to check out a few git submodules for projects that are hosted in different repositories. If you did this via the command and shell modules, you’d have to use several different calls to git:

$ ansible '*' -i 'host1,' -a \
  'git clone http://git.example.com/project /var/www/webapp'
$ ansible '*' -i 'host1,' -m shell -a \
  'cd /var/www/webapp && git checkout fad30ad8'
$ ansible '*' -i 'host1,' -m shell -a \
  'cd /var/www/webapp && git submodule update --init'

The git module encapsulates git’s features in such a way that you can do all of this in one ansible call:

$ ansible '*' -i 'host1,' -m git -a \
  'repo=http://git.example.com/project state=present version=fad30ad8 recursive=yes dest=/var/www/webapp'

Kinda handy, right? Where modules really start to shine, though, is when we stop trying to run everything in a single ansible command line and start using playbooks to perform multiple steps in a single run. More on playbooks next time.

Tales of an Ansible Newb: Ch.0.1 – What are these blog posts?

Last time I wrote about my first baby steps with Ansible, using it for ad hoc commands on multiple machines. When I first started working on Ansible three months ago, that was what 90% of my Ansible knowledge consisted of.

At work we were using Ansible to replace puppet as our configuration management system in Fedora Infrastructure, but somehow I never understood the bigger picture of how it all fit together. Host_vars and group_vars and inventory and playbooks that were included in other playbooks and playbooks that worked in conjunction with RHEL kickstart files to create a new host and playbooks that configured the hosts once they were created. And roles… where did roles fit into all of this?

Basically, the system at work was large and I had a hard time finding small enough pieces to master one at a time.

Now that I’m working on Ansible, I have an even larger need to understand how to use it. Code doesn’t exist in a vacuum; I need to know how people might use the code in order to better write the behaviours that they may want. So, on the theory that the best way to learn is by doing and having learned from the mistake of trying to understand too much all at once, I’ve started writing playbooks for my home network. Now, I’m not yet ambitious enough to try to replicate the production-quality or production goals that we had in an infrastructure like Fedora. I’m not (yet?) interested in tearing down my home network and provisioning it from scratch with Ansible. At the moment my ambition is simply to automate some repetitive tasks. As time goes on we’ll see what else happens :-)

Tales of an Ansible Newb: Ch.0 – The Hook

The main thing I like about Ansible is that it has a very easy onramp. If you’ve done even a little bit of administration of a *nix system you’ve been exposed to the shell to interact with the system and ssh to connect you to a remote computer where you can interact with the remote system’s shell. At its most basic level, Ansible is just building on those pieces of knowledge.

For instance, let’s say you have a few computers at your house for you and your wife and your kids and the neighbors’ kids and the cat to use as a warm bed when the sun isn’t shining. You want to install a new piece of software on all of them. How do you do that?

Pre-Ansible it might look like this:

$ ssh host1
$ sudo yum -y install tmux
$ exit
$ ssh host2
$ sudo yum -y install tmux
$ exit
[...]

If you were feeling luc^Wconfident in your shell scripting you might try to automate some of that:

$ for host in host1 host2 [...] ; do
for> ssh -t $host sudo yum -y install tmux
for> done
[sudo] password for badger: [types password]
[waits while stuff gets done]
Connection to host1 closed.
[sudo] password for badger: [types password again]
[waits while stuff gets done]
Connection to host2 closed.
[...]

A little better, but you still have to type your sudo password once for every host. You still have to wait between typing in your password. And the tasks still run one-by-one on the remote hosts even though they could be done in parallel.

At this point, depending on what type of person you are you’ll be thinking one of three things:

  • Yay! Automation!
  • I bet I can do better… let me just crack open the expect manual, open up
    my text editor, and maybe rewrite this in perl….
  • You know, someone else must have written an application for this…

I’m a programmer by nature, so once upon a time I probably would have found myself running down path 2 with nary a backwards glance. But if time, grey hairs, and the prodding of your more experienced sysadmin friends teach you anything, it’s that you shouldn’t invent your own security-sensitive wheel when someone else’s is perfectly serviceable. So let’s reach for Ansible and see what it can do here:

$ ansible '*' -i 'host1,host2,[...]' -a 'yum -y install tmux' -K --sudo
sudo password:
fedora20 | success | rc=0 >>
Resolving Dependencies
--> Running transaction check
[...]
Installed:
tmux.x86_64 0:1.9a-2.fc20

Complete!

katahdin | success | rc=0 >>
Loaded plugins: langpacks
Resolving Dependencies
--> Running transaction check
[...]
Installed:
tmux.x86_64 0:1.9a-2.fc20

Complete!

Nice! So now I can run ad hoc commands on multiple ad hoc hosts, using sudo if I want to. I type my sudo password just once at the very beginning and then go do other things while Ansible runs my task on all the hosts I specified! That certainly makes managing my home network easier. I think I’ll be using this tool more often….