I’ve been using Ansible for configuration management for a few years now. I have a couple of roles (in debian-ansible) that are reasonably re-usable: I use them from a variety of playbooks. I wrote them after getting tired of doing the same thing in pretty much every playbook.

Changing the roles has always been error-prone, in the way that badly designed software without automated tests always is. I’d make a change in a role for one server, and later notice that the change broke things for another server. I’d fix the role for that server, and then find out the fix broke the role for a third one. Repeat until frustrated.

I recently set up some automated testing for my roles. I wrote some acceptance tests using Subplot, and I now have test scenarios like this:

given a host running Debian
when I use role sane_debian_system
and I use variables from sane1.yml
and I run the playbook
then the host has the sudo package installed
and the host has the apt-transport-https package installed
and the host has the locales package installed
and the host has the ntp package installed
and the host has an empty /etc/apt/sources.list.d directory
and the host has hostname saneone
and the host has saneone in /etc/hosts for 127.0.1.1
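
Each "then" step has to turn into a concrete check against the host. As a rough illustration only (this is not Subplot's API and not the actual step code; the function name hosts_has_name is made up), the last step could boil down to something like this, given the contents of /etc/hosts as fetched from the host:

```python
def hosts_has_name(hosts_text: str, address: str, name: str) -> bool:
    """Return True if `name` is listed for `address` in /etc/hosts content."""
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        # The first field is the address; the rest are names for that address.
        if len(fields) >= 2 and fields[0] == address and name in fields[1:]:
            return True
    return False
```

A step implementation would then only need to fetch the file from the host and assert on hosts_has_name(text, "127.0.1.1", "saneone").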

Subplot generates a test program based on the scenario. The generated program does the following:

  • Create a new VM, which runs Debian.
  • Create a playbook that uses my sane_debian_system role, with specific values for the variables the role uses.
  • Run the playbook against the VM.
  • Verify that the VM looks as expected.
  • Destroy the VM.
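
The steps above can be sketched in Python roughly as follows. This is an assumption-laden outline, not the code Subplot generates: the vm object and its create, inventory and destroy methods are hypothetical, as are the function names. Only the ansible-playbook command and the playbook keys (hosts, vars_files, roles) are real Ansible.

```python
import subprocess
import tempfile
from pathlib import Path


def playbook_text(role: str, vars_file: str) -> str:
    """Build a minimal playbook that applies one role with one variables file."""
    return (
        "- hosts: all\n"
        "  vars_files:\n"
        f"    - {vars_file}\n"  # should be an absolute path
        "  roles:\n"
        f"    - {role}\n"
    )


def run_scenario(vm, role, vars_file, checks, run=subprocess.run):
    """Create a VM, run the playbook against it, verify it, destroy the VM."""
    vm.create()  # hypothetical: boots a fresh Debian VM
    try:
        with tempfile.TemporaryDirectory() as tmp:
            playbook = Path(tmp) / "playbook.yml"
            playbook.write_text(playbook_text(role, vars_file))
            # vm.inventory() is hypothetical: an inventory path or "host," list.
            run(["ansible-playbook", "-i", vm.inventory(), str(playbook)],
                check=True)
        for check in checks:
            check(vm)  # each check raises if the VM does not look as expected
    finally:
        vm.destroy()  # always clean up, even if the playbook or a check fails
```

The try/finally is the important design choice here: a failing playbook run or a failing check should still result in the VM being destroyed.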

The VM is run using qemu-system, as this can be done locally. I didn’t want to write something that uses a variety of cloud or VPS providers to manage the VMs.
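
For illustration, a local VM of this kind could be started with a command line built like the one below. The specific choices are assumptions, not taken from the actual test setup: the x86_64 architecture, the qcow2 image, the memory size, and forwarding a local port to the guest's SSH port so Ansible can reach it. The function name qemu_argv is made up.

```python
def qemu_argv(image: str, ssh_port: int, memory_mib: int = 1024) -> list[str]:
    """Build a qemu-system command line for a throwaway local VM."""
    return [
        "qemu-system-x86_64",  # assumed architecture
        "-m", str(memory_mib),
        "-drive", f"file={image},format=qcow2,if=virtio",
        "-snapshot",  # write changes to a temporary file, not to the image
        "-display", "none",
        # User-mode networking; forward a local port to SSH in the guest.
        "-netdev", f"user,id=net0,hostfwd=tcp:127.0.0.1:{ssh_port}-:22",
        "-device", "virtio-net-pci,netdev=net0",
    ]
```

With -snapshot the base image is left unmodified, so every test run can start from the same clean Debian installation.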

It’s not perfect, and it’s not fast. But it works much better than not having any tests at all.