I write a fair number of Unix command line programs, and I want to have test suites for them. A year and a half ago I wrote a tool to make that easier: cmdtest. However, it turned out to not be a particularly good one, primarily because it doesn’t make it easy to write good test code. Tests and test code should be obviously correct, so you don’t need a test suite for your test suite.

Earlier this year, Daniel and I started a slow but long discussion about this. We’d talk it over during many a lunch, figuring out what we actually wanted. We eventually converged on an approach based on BDD (behavior-driven development): conceptually similar to Cucumber, the Ruby testing tool, but for Unix command line programs, and based on Markdown documents.

We then wrote that up as a README, and then took another month refining the README before writing any code. The naming discussion only took a week.

  • We borrowed the given/when/then structure from BDD, and the idea of using that structure so that tests are specified using lightly structured text in a way that allows non-developers to participate.

  • This borrowed structure gives a way to express sequences of things that happen, including setup and checks. We call these sequences test scenarios, and thus this approach to testing is called scenario testing.

  • We embed the test scenarios, and the code that implements them, as code blocks in Markdown documents. This makes it easy to explain what is going on in the tests, when that’s needed, right next to the tests. The Markdown files can easily be formatted as web pages or PDFs (among other formats), making it easy to share and discuss them with people who might not be comfortable accessing a git repository.

As a concrete example, here is a real test scenario, from Obnam, for testing its new FUSE plugin:

SCENARIO Browsing backups with FUSE plugin
ASSUMING user is in group fuse
GIVEN a live data directory
AND a 0 byte file called empty
AND a 1 byte file called one
AND a 4096 byte file called 4k
AND a 10485760 byte file called 10meg
AND a hardlink to 4k called 4k-hardlink
AND a symlink to 4k called 4k-symlink
AND a directory called some-dir
AND a symlink to ../4k called some-dir/other-symlink
WHEN live data is backed up
AND repository is fuse mounted
THEN latest generation can be copied correctly from fuse mount
FINALLY unmount repository

Some explanations:

  • SCENARIO is the name of the scenario.
  • ASSUMING is a condition for the entire scenario: the FUSE plugin can only be tested when the user can use FUSE.
  • GIVEN is a setup step, WHEN is the step that does the thing that is being tested, and THEN checks that the right thing happened during WHEN.
  • AND is an alias for the keyword on the previous line; it makes the test scenario read more like normal English text.
  • FINALLY is for cleaning up afterwards, whether the scenario passed or not.
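The AND rule is simple enough to sketch in a few lines of bash. This is only an illustration of the idea, not yarn’s own code; the function name resolve_and is made up for the example.

```shell
#!/bin/bash
# Hypothetical helper: rewrite each AND line so that it carries the
# keyword of the step before it. Illustration only, not yarn's code.
resolve_and() {
    local prev="" keyword rest
    while read -r keyword rest; do
        if [ "$keyword" = "AND" ]; then
            keyword="$prev"
        fi
        prev="$keyword"
        printf '%s %s\n' "$keyword" "$rest"
    done
}

resolved=$(resolve_and <<'EOF'
GIVEN a live data directory
AND a 0 byte file called empty
WHEN live data is backed up
AND repository is fuse mounted
EOF
)
printf '%s\n' "$resolved"
```

With this input, the second line comes out as a GIVEN step and the fourth as a WHEN step.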

Test scenarios like this are obviously not directly executable. We need to provide the code that actually executes each step. For this, yarn provides an IMPLEMENTS section, of which I will show one example: the creation of the files.

IMPLEMENTS GIVEN a (\d+) byte file called (\S+)
dd if=/dev/zero of="$DATADIR/live/$MATCH_2" bs=1 count="$MATCH_1"

(The other IMPLEMENTS sections for that Obnam test scenario are also only about a line of shell. The longest one is four lines, because it has an if statement that I like to format in a fairly verbose way. I’m skipping the others to avoid needless repetition.)

The text after the keyword (GIVEN, WHEN, THEN, etc.) in the test scenario is a single line of free-form text, which is matched by a regular expression in the IMPLEMENTS section. There must be exactly one matching IMPLEMENTS for each line in a test scenario.
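To make the "exactly one match" rule concrete, here is a minimal bash sketch. It is not how yarn is implemented. The patterns array and the functions count_matches and check_step are hypothetical, and the sketch assumes each pattern must match the whole step text. Bash’s =~ operator uses POSIX extended regular expressions, so \d and \S are spelled [0-9] and [^[:space:]] here.

```shell
#!/bin/bash
# Hypothetical step patterns, written as POSIX extended regular
# expressions because bash's =~ does not understand \d or \S.
patterns=(
    '^a ([0-9]+) byte file called ([^[:space:]]+)$'
    '^a directory called ([^[:space:]]+)$'
    '^a live data directory$'
)

# Print how many of the patterns match the given step text.
count_matches() {
    local step="$1" pattern count=0
    for pattern in "${patterns[@]}"; do
        if [[ "$step" =~ $pattern ]]; then
            count=$((count + 1))
        fi
    done
    printf '%s\n' "$count"
}

# Succeed only when exactly one pattern matches the step.
check_step() {
    [ "$(count_matches "$1")" -eq 1 ]
}

check_step 'a 4096 byte file called 4k' && echo "ok: one implementation"
check_step 'a fifo called pipe' || echo "error: no unique implementation"
```

A step with no match and a step with two matches are both treated as errors, which keeps it unambiguous which code runs for each line of a scenario.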

The body of IMPLEMENTS is a shell script snippet. Yarn provides a temporary directory for each scenario, and the path to that is given in the DATADIR environment variable. Yarn also cleans up the directory afterwards, unless the user running yarn requests otherwise, so tests that only change things in the temporary directory don’t need any FINALLY steps. The above example uses FINALLY to unmount the FUSE filesystem at the end.
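The temporary directory handling can be imitated by hand, which is handy for trying out a step body outside yarn. The following is a sketch under assumptions: run_scenario and KEEP_DATADIR are names invented for this example, and only the DATADIR variable comes from yarn.

```shell
#!/bin/bash
# Sketch: give one scenario its own scratch directory and remove it
# afterwards. run_scenario and KEEP_DATADIR are hypothetical names.
run_scenario() {
    local status=0
    DATADIR=$(mktemp -d) || return 1
    export DATADIR

    # Run the step body (passed as a string) in a child shell.
    bash -c "$1" || status=$?

    # Clean up unless the caller asked to keep the directory.
    if [ -z "${KEEP_DATADIR:-}" ]; then
        rm -rf -- "$DATADIR"
    fi
    return "$status"
}

run_scenario 'mkdir "$DATADIR/live" && test -d "$DATADIR/live"' && echo "scenario passed"
```

Anything the step creates outside the scratch directory, such as a mounted filesystem, is not covered by this cleanup, which is the case FINALLY steps are for.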

The regular expressions can capture parts of the text in the scenario step, and these are provided to the shell script using environment variables: MATCH_1 etc. This is a simple but powerful mechanism for implementing very flexible steps, which also means the number of steps can be kept down.
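Here is a rough bash imitation of how captured groups could end up in MATCH_1, MATCH_2 and so on. It is a sketch of the mechanism, not yarn’s implementation, and it again uses POSIX extended regular expressions in place of \d and \S.

```shell
#!/bin/bash
# Assumed stand-in for yarn's matching: bash's =~ with a POSIX extended
# regular expression instead of \d and \S.
step='a 4096 byte file called 4k'
pattern='^a ([0-9]+) byte file called ([^[:space:]]+)$'

if [[ "$step" =~ $pattern ]]; then
    # BASH_REMATCH[0] is the whole match; captured groups start at 1.
    for ((i = 1; i < ${#BASH_REMATCH[@]}; i++)); do
        export "MATCH_$i=${BASH_REMATCH[i]}"
    done
fi

# A step body run in a child shell sees the captures as ordinary
# environment variables.
bash -c 'echo "create $MATCH_2 with $MATCH_1 bytes"'
```

The same pattern serves the steps for empty, one, 4k and 10meg in the scenario above, which is how one implementation covers four lines.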

The IMPLEMENTS sections are shared between all test scenarios. This makes it very easy to write new scenarios using already-existing steps, by choosing which ones to use, in which order, and making use of any variability in the steps that their implementations provide.

Yarn is a new tool, and I expect it to mature in the coming months. It’s included in the cmdtest Debian package, and is already in Debian jessie and unstable, for those who want to try it. (My personal apt repository has a .deb for wheezy as well; backports are welcome.) The yarn scenario testing language itself is not set in stone, but Daniel and I both want to be careful about making changes to it. It’s all meant to be very simple: an obviously correct testing tool is better than an intricate one.

I’ve used yarn in several of my own little programs now, as has Daniel, and we’ve used it at work as well. Undoubtedly we’ll find bugs in the implementation, and we’re still feeling our way to figuring out how to write really excellent scenarios, and how to structure steps with the best possible taste. We’ll also start designing a shell library that yarn provides to the IMPLEMENTS sections, for doing things that we find ourselves doing in many projects.

It is already clear to me, however, that yarn is oh so much better than cmdtest could ever be. I’m even starting to express ideas for new tools as yarn test documents, so I’m clear what they should do. If and when I ever get to actually implementing the tools, I’ll have a test suite for them already. See, for example, my first attempts at specifying a fast file transfer tool, fling, which is to use Linux-only system calls to reach wire speeds over wired LAN, and a tool to recompress compressed files with another compression tool.

We hope yarn will prove to be useful for others as well. If you try it, and have suggestions or comments, please e-mail us.