Welcome to my web log. See the first post for an introduction. See the archive page for all posts, and comments for a feed of comments only. (There is an English-language feed if you don't want to see Finnish.)
All content outside of comments is copyrighted by Lars Wirzenius, and licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. Comments are copyrighted by their authors.
Russ Allbery has written up summaries of the research he's done in preparation for a Debian Technical Committee vote on the default init system in Debian's next release. The mails are long, and well worth reading completely if the topic interests you at all, but I'd like to quote separately a few paragraphs that I found thoughtful, and inspiring, in the greater Debian context and not just for the init system discussion.
Here again, I think we have an opportunity for Debian to be more innovative and forward-looking in what we attempt to accomplish in the archive by adopting frameworks that let us incorporate the principles of least privilege and defense in depth into our standard daemon configurations.
And later, from the same mail:
I think we should make wise decisions about which areas we want to invest project effort. I dislike investing significant project effort in catch-up efforts that, when complete, merely get us back to where we would have been if we'd chosen a different solution. I don't think that's wise stewardship of project resources. I want to see Debian focus its efforts on places where we can make a real difference, where we can be leaders. That means adopting the best-of-breed existing solutions and building on top of them, not reinventing wheels and thereby starting from a trailing position.
It is not, in general, necessary to justify what you want to do in Debian. It doesn't matter if it's going to be used by thousands of people or two people. If you can do your work to the standards that we expect to be maintained across the archive, and without negative impact on Debian's other contributors, you can work on whatever you love. And, furthermore, we all support each other in our passions. Debian is built on a culture of deference to other people's work, and reasonable accommodation to the projects that other people want to work on.
Now, there is a fine balance here. Deference to other people's work does not mean a requirement to join their work. Reasonable accommodation does not mean that every Debian developer is required to test their software on non-Linux ports. The goal is that all of us should be empowered to work on the things we are passionate about, which implicitly includes being empowered to not work on the things that are not of interest to us. Therefore, for some efforts, and specifically for the non-Linux port efforts, the work is mostly borne by the porters. They're expected to test, diagnose problems, and submit patches. The deference and reasonable accommodation that we expect of Debian contributors is to merge those patches when they are reasonable, to not undermine the porting effort when there is a reasonable approach that preserves it, and to be aware of the implications of their packaging for those ports in the broad sense (such as qualifying build-dependencies with [linux-any] where appropriate).
We do not expect Debian contributors to reject significant new upstream versions or significant new integrations because they will affect the non-Linux ports, or, for that matter, other projects in Debian. We do expect those changes to follow the standards that we expect of Debian as a whole, and that porting efforts will be treated with deference and reasonable accommodation.
I won't offer a comment on these quotes—I prefer to let them speak for themselves—but I will say I find these to be among the wisest things said within Debian all year.
Backups are fun and exciting! Restores are exciting, in a terrifying and stressful way. Fixing serious bugs is gratifying, in a depressing way.
I am glad to announce two software releases: Obnam version 1.6.1 and larch version 1.20131130. Obnam is my backup application, larch is a Python copy-on-write B-tree implementation that Obnam uses. I've uploaded the new versions to Debian unstable and to my own apt repository at [code.liw.fi] (also for wheezy). (I don't have the energy to upload to Debian backports: help is welcome there.)
These are mainly bug fix releases, with minor improvements. The larch release fixes serious problems, and everyone should upgrade.
NEWS for larch:
Serious bug fixed: the "KeyError" crash for reference counts. This was a false memory use optimisation, which triggered a rare bug in related code. Repeatable test case by Rob Kendrick, and helpful analysis by Itamar Turner-Trauring.
Serious bug fixed: another "node missing" bug. This crash was caused by a bug that overwrote on-disk reference count groups with zeroes. Repeatable test case by Rob Kendrick.
Fixes to fsck from Antoine Brenner.
NEWS for Obnam (combining versions 1.6 and 1.6.1, both released today):
Fix Debian package dependencies correctly.
Stop logging paramiko exceptions that get converted into another type of exception by the SFTP plugin in Obnam.
obnam-benchmark can now use an installed version of larch. Patch by Lars Kruse.
Obnam has been ported to FreeBSD by Itamar Turner-Trauring of HybridCluster.
Backup progress reporting now reports scanned file data, not just backed up file data. This will hopefully be less confusing to people.
list-toplevels commands now obey a new option, --key-details, to show the usernames attached to each public key. Patch by Lars Kruse.
New option --ssh-command to set the command Obnam runs when invoking ssh. Patch by Lars Kruse.
obnam clients can now be used without being an existing client. Patch by Itamar Turner-Trauring.
New option --ssh-host-keys-check to better specify how SSH host keys should be checked. Patch by Itamar Turner-Trauring.
Fix "obnam list-toplevels" so it doesn't give an error when it's unable to read the per-client directory of another client, when encryption is used. Fix by Lars Kruse.
Fix the encryption plugin to give a better error message when it looks for client directories but fails to find them. Fix by Lars Kruse.
obnam list-toplevels got confused when the repository contained extra files, such as "lock" (left there by a previous, crashed Obnam run). It no longer does. Fix by Lars Kruse.
The SFTP plugin now handles another error code (EACCES) when writing a file and the directory it should go into does not exist. Patch by Armin Größlinger.
Obnam's manual page now explains about breaking long logical lines into multiple physical ones.
The /~/ path prefix in SFTP URLs works again, at least with sufficiently new versions of Paramiko (22.214.171.124 in Debian wheezy is OK). Reported by Lars Kruse.
Fix the Nagios plugin to report errors in a way Nagios expects. Patch by Martijn Grendelman.
The Nagios plugin for Obnam now correctly handles the case where a backup repository for a client exists, but does not have a backup yet. Patch by Lars Kruse.
obnam ls now handles trailing slashes in filename arguments. Reported by Biltong.
When restoring a backup, Obnam will now continue past errors, instead of aborting with the first one. Patch by Itamar Turner-Trauring.
This is an essay form of a talk I gave today at the Cambridge Mini-debconf. The talk was videoed, so it will presumably show up in the Debconf video archive eventually.
Debian has a long and illustrious history. However, some of the things we do perhaps no longer make as much sense as they used to do. It's a new millennium, and we might find better ways of doing things. Things that used to be difficult might now be easy, if we dare look at things from a fresh perspective. I have been doing that at work, for the Baserock project, and this talk is a compilation of observations based on what I've learnt, concentrating on things that affect the development workflow of package maintainers.
Introduction and background
I have been a Debian developer since August, 1996 (modulo a couple of
retirements), and have used it a little bit longer. I have done a
variety of things for Debian, from maintaining PGP 2 packages to
piuparts to blathering excessively on the mailing lists.
My day job is to develop the Baserock system at Codethink. Baserock is a set of tools and workflows for developing embedded and appliance Linux systems. If you squint, it looks a bit like a source-based Linux distribution. I have worked on Baserock since September, 2011.
Some of Baserock's design has been influenced by my experience with Debian. With Baserock I have the chance to fix all the things that are wrong in Debian, and this talk is me giving back to Debian by pointing out some of the things I feel should be fixed. I don't have solutions for these problems: this is a bug report, not a patch. It's also perhaps a bit of a rant.
I am specifically concentrating here on technical and tooling issues that affect the development process of Debian. I am excluding social issues.
I am also not trying to get Debian to switch to Baserock. Baserock is targeting embedded and appliance systems, and makes simplifying assumptions based on those targets, which Debian does not get to do. I am pointing out problems, and I am outlining solutions as implemented in Baserock, when I think the concept carries over well to Debian.
Build tools should be intelligent, packaging should be dumb
In Debian, the fundamental tool for compiling the upstream code and
assembling a binary package is dpkg-buildpackage. It uses
debian/rules, which is a Makefile with targets with specific names
and semantics. By executing the right targets in the right order,
dpkg-buildpackage tells the source package to build the binary
package.
On the one hand, this is a nice design, because it abstracts away the
large variety of upstream build systems into one API for
dpkg-buildpackage to use. On the other hand, it puts all the
intelligence for how packages are built into the packages.
Making packaging intelligent, rather than the build tool, means packagers need to do more work, and there's more work to do when, say, the Debian policy changes, or when there are other changes that affect a large number of packages. If packaging is intelligent, then every package needs changes. If the build tool is intelligent, then you change the build tool and re-build everything.
In Baserock we put as much intelligence as we can into the Morph tool, which drives the build process. It turns out that today, unlike in 1995, most upstream projects use one of a handful of well-known build systems: autotools, cpan, cmake, Python distutils, etc. With just a little extra logic in Morph we avoid having any delta against upstream in Baserock. This doesn't work for quite all upstream projects, of course, but we've spent the 2% (two, not twenty) of effort that solves 80% of the problem.
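To make the idea of an intelligent build tool concrete, here is a minimal sketch of how a tool might guess the upstream build system from the files in a source tree. This is not how Morph is actually implemented: the function name, the marker file lists, and the build system labels are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical marker files for each build system. A real tool would
# need a longer and more carefully ordered list than this.
BUILD_SYSTEM_MARKERS = [
    ("autotools", ["configure", "configure.ac", "configure.in"]),
    ("cmake", ["CMakeLists.txt"]),
    ("cpan", ["Makefile.PL", "Build.PL"]),
    ("python-distutils", ["setup.py"]),
]


def detect_build_system(source_tree):
    """Return the first build system whose marker file exists, else None."""
    root = Path(source_tree)
    for name, markers in BUILD_SYSTEM_MARKERS:
        if any((root / marker).exists() for marker in markers):
            return name
    # Unknown build system: the package has to carry explicit instructions.
    return None
```

With something like this in the build tool, only the packages for which detection fails, or guesses wrong, would need any build instructions of their own.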
In recent years, the dh approach to packaging has let a lot of
packages get by with only a minimal, 3-line debian/rules file.
This is excellent. Wouldn't it be nice if even that wasn't needed?
It'd save thousands of files in source packages across the archive, at
least. It would be easy to do: if the file is missing, have
dpkg-buildpackage assume the 3-line version by default.
Getting rid of a single file is, of course, not a particularly big
win. The big win is the change in mindset: rather than dealing with
all new issues in development by adding yet more fields to
debian/control and more optional, competing tooling outside the core
toolset, if you improve the tools everyone uses, then everyone's
packages get better.
The goal should, in my opinion, be that for the large number of
packages where upstream uses a well-known, well-behaved build system,
and uses it in a reasonably sensible way, the Debian source package
should not require anything added to make the package build. There
will still be a need to add some stuff, such as the
file, to make a good Debian package, but just getting the package to
build should require nothing extra. (Side note: wouldn't it be nice if
there was a well-known, widely used way to mark up copyright
information so that
debian/copyright could be constructed and
Configuration file handling on upgrades: add ucf to dpkg already
In the 1990s,
dpkg had excellent handling of configuration files and
merging local changes with changes from the new package version, but
it was excellent only because it tried to do that at all, and mostly
nothing else did. It hasn't changed much, since, and it's not
excellent on any absolute scale.
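As a rough illustration of the decision involved, here is a simplified sketch of a three-way comparison between the configuration file as shipped by the old package version, the file currently on disk, and the file shipped by the new version. It is not dpkg's actual code: the function names and the returned action labels are invented for this example.

```python
import hashlib


def digest(text):
    """Checksum used only to compare file contents for equality."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def conffile_action(old_packaged, installed, new_packaged):
    """Decide what to do with one configuration file on upgrade."""
    old, cur, new = digest(old_packaged), digest(installed), digest(new_packaged)
    if cur == new:
        return "keep"          # on-disk file already matches the new version
    if cur == old:
        return "install-new"   # no local changes, so take the new version
    if new == old:
        return "keep"          # package did not change it; keep local edits
    return "ask-or-merge"      # both sides changed: the hard case
```

Only the last case needs a human, or a smarter merge tool, and that is the case where better tooling would pay off.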
We have the ucf tool, which can do a better job, but needs to be
added to each package wanting to use it. Why don't we make dpkg
smarter, instead? If ucf is not good enough to be merged into
dpkg, let's write something better.
Making every package maintainer use ucf manually is just wasteful.
This is not the kind of thing that should be done for each package
separately: the package manager should be smart so that packaging can
be dumb.
The goal should be that dpkg is smart enough in its configuration
file handling that having the package do it is a very rare special
case.
Clean building shouldn't be hard
The basic tool for building a package is dpkg-buildpackage. It is
ever so slightly cumbersome to use, so there are some wrappers, most
notably debuild. However, if you're making a build intended to
be uploaded to the Debian archive, you should be doing a clean build.
This means having to learn, configure, and use yet more tools.
A clean build is important: security updates, development, debugging, quality control, porting, and user support become more difficult if we don't know how a package was built, and can't reproduce the build. It gets harder to keep build dependencies correct, making it harder for everyone to build things.
Luckily, Debian has solved the clean build problem. Unluckily, it has solved it multiple times, creating the problem of having to choose the clean building approach you want to use. The default way of building is not clean, so then you have to remember to use the non-standard way. You also get to spend time maintaining the clean build environments yourself, since that doesn't seem to be fully automated. None of this is hard, as such, but it's extra friction in the development workflow.
The primary approaches for clean building in Debian seem to be
pbuilder and sbuild. I happen to use pbuilder
myself, because it's what's been around the longest, but I make no
claim of having made an informed choice.
That is part of the problem here: why should I have to spend the effort to become informed to make a choice well? Why is the default way of building not clean? Don't say performance: Morph sets up a clean staging area in seconds, and does not offer you a choice of not doing so. It's a chroot, with everything but the build tree hardlinked from cached, unpacked build dependencies, and protected using read-only bind mounts.
What's more, this approach avoids having to maintain the pbuilder
base tarballs or sbuild chroots manually. It's all automatic, and up
to date, for every build. Furthermore, the staging area contains only
the specified build dependencies, and not anything else, meaning a
build fails if a build dependency is missing, rather than succeeding
because it happens to be in the default set of packages the build tool
installs.
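The hardlinking part of such a staging area can be sketched in a few lines. This is only an illustration of the idea, not Morph's code: the function name and directory layout are assumptions, the chroot and the read-only bind mounts are left out, and symlinks and special files are not handled. It also assumes the cache and the staging area are on the same filesystem, since hardlinks cannot cross filesystems.

```python
import os


def populate_staging_area(cached_tree, staging_dir):
    """Recreate cached_tree under staging_dir, hardlinking files, not copying."""
    for dirpath, dirnames, filenames in os.walk(cached_tree):
        rel = os.path.relpath(dirpath, cached_tree)
        target_dir = os.path.normpath(os.path.join(staging_dir, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # A hardlink shares the file's data with the cache, so no
            # file contents are copied and setup stays fast.
            os.link(os.path.join(dirpath, name), os.path.join(target_dir, name))
```

Because no file data is copied, the cost is proportional to the number of files rather than their size, which is why setting up a fresh staging area for every build can be cheap.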
Mass building shouldn't be hard
Suppose you want to try out a large-scale change in Debian. It might
be trying out a new version of GCC, or using llvm's
clang as the
default C compiler, or updating glibc, or doing a large library
transition such as a new version of GTK+, or trying a new
version, or even something more exploratory such as changing the
default optimisation flags for all packages to see what breaks.
All of these will require you to at least rebuild all affected packages. Ideally you'd test the built packages as well, but let's concentrate on the building for now.
Here's what you do. You make the changes you want to try out, and build those packages. You create a dedicated APT repository, and upload your packages there. You configure your build environment to favour that APT repository. You make a list of all reverse build dependencies of the packages you changed. You write a script to build all of those, preferably so that if you change A, and rebuild B, then you also rebuild C which build depends on B. Each rebuilt package you also upload to your APT repository. You keep track of the build log, and success or failure, of each build.
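The bookkeeping in those steps, finding every package that needs rebuilding and ordering the rebuilds, can be sketched as below. The input format (a dictionary mapping each source package to the packages it build-depends on) and the function name are assumptions for illustration; a real tool would get this information from the archive's Sources files and would have to cope with build-dependency cycles rather than just reporting them.

```python
from collections import deque


def rebuild_order(build_depends, changed):
    """Return the packages to rebuild, dependencies before dependents."""
    # Invert the map: package -> packages that build-depend on it.
    reverse = {}
    for pkg, deps in build_depends.items():
        for dep in deps:
            reverse.setdefault(dep, set()).add(pkg)

    # Transitive closure of reverse build dependencies of the changed set.
    affected = set()
    queue = deque(changed)
    while queue:
        pkg = queue.popleft()
        for rdep in reverse.get(pkg, ()):
            if rdep not in affected and rdep not in changed:
                affected.add(rdep)
                queue.append(rdep)

    # Topological sort, considering only dependencies that are rebuilt too.
    remaining = {
        pkg: {d for d in build_depends.get(pkg, ()) if d in affected}
        for pkg in affected
    }
    order = []
    while remaining:
        ready = sorted(p for p, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("build-dependency cycle among %s" % sorted(remaining))
        for pkg in ready:
            order.append(pkg)
            del remaining[pkg]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order
```

For example, with B build-depending on A and C build-depending on B, changing A yields the order B, then C, which is the "if you change A, and rebuild B, then you also rebuild C" case.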
The people in Debian who do this kind of stuff regularly presumably have tools for doing it. It shouldn't be a rare, special thing, though. If my package has reverse build dependencies, I should at least consider testing building when I'm making changes. Otherwise, it might take years until the reverse build dependencies are rebuilt, and the problem is only found then, making it harder to fix.
To be fair, building a lot of packages takes a lot of resources. It's not feasible to rebuild everything in Debian every time there's any change to, say, eglibc. However, it's feasible to do it, for large subsets of the archive, without huge hardware investments.
One VCS to rule them all
In 1995 there was really only one relevant version control system:
CVS. It was not a great tool. In 2000, another contender existed:
Subversion. It fixed some problems in CVS, but still wasn't a great
tool. In 2005, there was a great upheaval and distributed version
control systems started to become mainstream. There were a large
handful of them. In 2010, it was becoming pretty clear that git had
won. It's ugly, but it's powerful.
I'm not going to debate the relative merits of different version control systems. Until recently, I was a Bazaar boy, and all of my personal projects were kept in Bazaar. (I have recently switched everything to git.)
There are, however, strong benefits from everyone using the same system. Developers don't need to learn a dozen version control systems. Tools that operate on many repositories are easier to write and maintain. Workflows become simpler if one system can be assumed.
Debian has a strong historical tendency to choose every option. This is sometimes a good thing, and sometimes a bad thing. For keeping source packages in version control I believe it to be a bad thing. The status quo is that a Debian source package may not be in version control at all, or it might be in any version control system.
This is acceptable when everyone only ever needs to maintain their own packages. However, in a distribution the size of Debian, that is not the case. NMUs, security support, archive wide transitions, and other situations arise when I might need to change yours, or you might need to change mine.
We try to work around this by having a complicated source package
format that uses quilt to maintain patches to upstream semi-manually in
the debian/patches directory. This is an awkward workflow. It's a
workflow that trips up those that are not used to it. (I know quilt
is a patch management system, not a version control system. I think
git does it much better anyway.)
It would be oh so much easier if everyone kept their source packages in the same, real version control system. I don't even care what it is, as long as it is powerful enough to handle the use cases we have.
Imagine a world where every Debian source package is kept, for
example, in git, and everyone also uses the same layout and
roughly the same workflow to maintain it. What would this mean?
It would mean that if you want to inspect the history of your package, you know how to do that. If you want to merge in some bugfix from upstream code, you know how to do that, without having to figure out which of the several source package formats are in use.
It would make feasible the development of more powerful, higher-level
tooling. For example, it would allow Debian to have what we call
system branches in Baserock. In Debian we have stable, testing,
unstable, and experimental. We may get something like Ubuntu's PPAs,
or perhaps an improved version of those. These are very poor versions
of system branches, just like quilt is a poor way to manage patches
and deltas against upstream. For example, you can upload an
experimental version of
experimental, but then nobody else
can upload another experimental version. You can set up your own PPA
for this, but you'll still be affected by all the uploads to
unstable while you're working.
A Baserock system branch is a branch of the entire system, or the entire distribution in the case of Debian. It is isolated from other system branches. A branch in an individual repository is a well-known concept. A system branch is conceptually like branching every repository in the distribution at once. The actual implementation is more efficient, of course.
This would be possible to implement without standardising on one version control system, but it would be much harder to implement, and would have to live with the lowest common denominator for features. CVS and Subversion, for example, don't really do merges, whereas Bazaar, Mercurial, and git do. Possible does not mean feasible.
Any work you do in a system branch is isolated. Your work doesn't affect others, and theirs doesn't affect yours, until a merge happens. This is a simple, but very powerful tool.
Cheap system branches and powerful merging make it possible to do experiments safely, with little fuss. Combine that with being able to build everything cleanly and quickly, and we get into a situation where there's no need to make technical decisions based on arguments on mailing lists; instead, they can be made by looking at working code.
I don't know how this could be implemented in Debian, but think about it. If Debian could have this, it might make many archive-scale changes easier.
debian/rules clean: really?
One of the silliest things we require of packages is that they have a
debian/rules clean rule that cleans up after a build perfectly so
that we can do repeated builds in the same source tree.
Let's just use
git clean -fdx instead.
This is a problem that is superbly well suited for automation. There is no point whatsoever in making packagers do any manual work for this.
Large scale semi-mechanical changes require too much effort
About a decade ago, we decided to follow a new version of the
Filesystem Hierarchy Standard and transition from /usr/doc to
/usr/share/doc. This was an almost entirely mechanical change: in
many cases, a mere rebuild would fix it, and in almost every other
case it was just a minor tweak to the packaging. A one-line change.
It took us seven years to do this.
Seven years. Think about it.
In a recent discussion about building R binary data files from source at package build time it was suggested that we take 2-3 release cycles to get this done. That's four to six years. Think about it.
These are not isolated cases. Every time we need to make a change that affects more than a small handful of packages, it becomes a major undertaking. Most of the time all the people involved in this are agreeable to the change, and welcome it. The change takes long because it requires co-ordinating a large number of people. Sometimes people are just busy. Sometimes they've left the project, but haven't properly orphaned their packages. Waiting for a timeout for answers about packages drags the process longer.
Mechanical changes, or semi-mechanical ones, which are very easy and very quick to do should not take years. They should take an evening, no more.
There's no end of changes we might want to do like this. In 2005 and
2006 I filed a few hundred bugs from failed piuparts runs. They're
still being fixed, even when the fix is simple, such as adding a
postrm script to remove, when the package is purged, a configuration
file created by postinst, or starting the service with invoke-rc.d
rather than running an init.d script directly. Mostly one-line
changes.
There's more mechanical changes that might happen. For example,
changing the value of the
VCS-Browser field when the Debian version
control server changes domain names.
It's not just about simple, mechanical changes, either. Transitions of important library packages, for example, which require changes to reverse dependencies due to a changed API are another source of pain. What should be a few evenings of build fixes can drag out to months of co-ordination.
This is caused partly by technical issues, and partly by social issues. The main social issue is that we have quite a strong ownership of packages, and NMUs are to be done carefully, and only in some cases. This is both good and bad, and I won't discuss that aspect now. The technical issues are that our tools are primarily intended for maintaining individual packages, rather than all of them together, making it harder to make even simple changes in a large number of packages.
In addition to easy mass building and system branches, as outlined above, large-scale changes would require testing tools so that you don't just build the new package versions, but also test things automatically. Essentially, CI at the distribution level.
I've listed above a small variety of problems I see in the Debian development processes and tools. They're not the important part of this talk. The important part is that we, the Debian developers, should look at our tools, and workflows, critically, and improve them when we can. Even small improvements are very useful, when they affect each of our twenty thousand source packages. The important change I argue for here is one in mindset, rather than any specific problem in any tool. We need a mindset of constant, incremental improvement for our tools.
I recently set up a Gitano instance as http://git.liw.fi/. Gitano is a very nice git server, which adds user and group management and access control in front of git itself, and keeps all configuration in git, where it's traceable and auditable nicely. It's also a command line based thing, rather than a slow, resource-hungry web application, and so much more to my liking.
Daniel, the Gitano upstream, has a "gitano-all" source tree for creating an unofficial Gitano Debian package, which includes cgit, a fast git web interface. This is not ever going to be accepted in Debian, of course, but it makes it easier to install Gitano on your server. This hanky-panky is needed because Gitano and cgit both use libgit2, and that's a library without a stable API at this time. This makes it difficult to package for Debian. cgit seems to embed other such projects as well.
Anyway, once you've installed the Gitano software (and cgit, if you want that), there's the matter of setting up a Gitano instance.
Each Gitano instance is its own Unix user, accessed over ssh. Thus, one machine can host any number of Gitano instances, and they'll be nicely isolated from each other by normal Unix setup. Each instance manages its own set of Gitano users and groups, which only exist within that instance. Users are identified by ssh public keys: there are no passwords.
Here's my slightly edited checklist for setting up a Gitano instance. It assumes Gitano and cgit and their dependencies are installed.
Set up Gitano itself:
- create the Unix user
    adduser --disabled-password git
  - I chose git as the username, so that firstname.lastname@example.org is my Gitano instance
- copy your ssh public key to the system; you'll need it for gitano-setup
  - the key file needs to be readable by the Gitano instance Unix user
- run gitano-setup as the Gitano instance user
    su - git
  - answer questions: I chose defaults for most things
  - if you screw this up, you can start over by deleting everything in the home directory
- from your own ssh account:
    ssh git@host whoami
  - this should produce some output telling you you're in the
  - if that works, Gitano is correctly set up
Set up a git daemon for public git repositories:
- edit /etc/inetd.conf to add (though all on one line, this is broken on several lines for display purposes):
git stream tcp nowait nobody /usr/bin/git git daemon --inetd --interpolated-path=/home/git/repos/%D /home/git/repos
Set up cgit and Apache:
    # CGIT stuff
    DirectoryIndex /cgi-bin/cgit/cgit.cgi
    Alias /cgit.png /usr/share/cgit/htdocs/cgit.png
    Alias /cgit.css /usr/share/cgit/htdocs/cgit.css
    <Directory "/home/git/repos">
        Options FollowSymlinks Indexes
        Allow from all
        AllowOverride none
        Order allow,deny
    </Directory>
    # Enable caching of up to 1000 output entries
    cache-size=1000
    # Specify some default clone prefixes
    clone-prefix=git://testgit
    # Specify the css url
    css=/cgit.css
    # Specify the logo url
    logo=/cgit.png
    # Show extra links for each repository on the index page
    enable-index-links=1
    # Show number of affected files per commit on the log pages
    enable-log-filecount=1
    # Show number of added/removed lines per commit on the log pages
    enable-log-linecount=1
    # Set the title and heading of the repository index page
    root-title=testgit
    root-desc=Lars's test git repositories
    # Allow download of tar.gz, tar.bz2 and zip-files
    snapshots=tar.gz
    #source-filter=/usr/lib/cgit/filters/syntax-highlighting.sh
    remove-suffix=1
    enable-git-config=1
    strict-export=git-daemon-export-ok
    scan-path=/home/git/repos
    ##
    ## List of common mimetypes
    ##
    mimetype.git=image/git
    mimetype.html=text/html
    mimetype.jpg=image/jpeg
    mimetype.pdf=application/pdf
    mimetype.png=image/png
    mimetype.svg=image/svg+xml
Finally, you should review, and possibly alter, Gitano access control rules.
git clone ssh://git@YOURHOST/gitano-admin
- to allow anonymous access to all repositories, edit rules/core.lace, uncomment the last line: "allow "Anonymous access is okay" op_read !is_admin_repo"
- commit and push
Some Gitano commands:
ssh git@YOURHOST create foo
ssh git@YOURHOST ls
PS. I wrote a yarn test suite for my Gitano ACL, which may be interesting if you're new to Gitano.
I wrote a very short book on Getting Things Done for hackers, which I also put up for sale on Amazon as a Kindle book, in addition to allowing people to get it for free from my website. As expected, it didn't sell much, perhaps a few dozen copies in total, but I got a small kick out of having a book for sale.
However, Amazon now wants me to set up some kind of US tax account, which is all fine and good and legal and appropriate, but it's not worth my time to do that. I've removed the book from Amazon, to keep my life simple.
It continues to be available on its own website, including in a Kindle format. For free. Under a CC-BY-SA 3.0 license. Since I see the download statistics for it, I know how many people download it. The PDF form is most popular, as it happens. All good.
I have upgraded all of my own computers from Debian 6.0 (squeeze) now. I no longer use it myself at all. Some of my software already doesn't work on squeeze (particularly, yarn, which is part of cmdtest). I do not want to spend my time finding ways to support, say, the python-markdown library version that is in squeeze: I would rather make functional improvements than support ancient software versions.
I am hereby ending my support of squeeze in my own software. This means I will be removing squeeze from my apt repository on code.liw.fi, and will no longer test that my software works in squeeze, and will not be making packages for squeeze. (I am removing squeeze from the apt repository so that people don't wrongly assume it is being kept up to date.)
This change affects users of Obnam. Obnam was not part of the squeeze release of Debian, but I have maintained packages of it for squeeze on code.liw.fi.
If any of my software that is packaged in squeeze has security bugs, or bugs that would be considered release-critical by Debian, I will fix those, but only by making the minimal necessary change to the version in squeeze rather than backporting the current version. Debian is committed to support squeeze for several more months, and I will help Debian do that for my packages.
I write a fair number of Unix command line programs, and I want to have test suites for them. A year and a half ago I wrote a tool to make that easier: cmdtest. However, it turned out to not be a particularly good one, primarily because it doesn't make it easy to write good test code. Tests and test code should be obviously correct, so you don't need a test suite for your test suite.
Earlier this year, Daniel and I started a slow but long discussion about this. We'd talk it over during many a lunch, figuring out what we actually want. We eventually converged on an approach aimed at Unix command line programs and based on Markdown documents.
We then wrote that up as a README, and then took another month refining the README before writing any code. The naming discussion only took a week.
We borrowed the given/when/then structure from BDD, along with the idea of specifying tests in lightly structured text, in a way that allows non-developers to participate.
This borrowed structure gives a way to express sequences of things that happen, including setup and checks. We call these sequences test scenarios, and thus this approach to testing is called scenario testing.
We embed the test scenarios, and the code that implements them, as code blocks in Markdown documents. This makes it easy to explain what is going on in the tests, when that's needed, right next to the tests. The Markdown files can easily be formatted as web pages or PDFs (among other formats), making it easy to share and discuss them with people who might not be comfortable accessing a git repository.
As a concrete example, here is a real test scenario, from Obnam, for testing its new FUSE plugin:
    SCENARIO Browsing backups with FUSE plugin
    ASSUMING user is in group fuse
    GIVEN a live data directory
    AND a 0 byte file called empty
    AND a 1 byte file called one
    AND a 4096 byte file called 4k
    AND a 10485760 byte file called 10meg
    AND a hardlink to 4k called 4k-hardlink
    AND a symlink to 4k called 4k-symlink
    AND a directory called some-dir
    AND a symlink to ../4k called some-dir/other-symlink
    WHEN live data is backed up
    AND repository is fuse mounted
    THEN latest generation can be copied correctly from fuse mount
    FINALLY unmount repository
- SCENARIO gives the name of the scenario.
- ASSUMING is a condition for the entire scenario: the FUSE plugin can only be tested when the user can use FUSE.
- GIVEN is a setup step.
- WHEN is the step that does the thing that is being tested.
- THEN checks that the right thing happened during the WHEN step.
- AND is an alias for the keyword on the previous line; it makes the test scenario seem more like normal English text.
- FINALLY is for cleaning up afterwards, whether the scenario passed or not.
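To make the AND rule concrete, here is a minimal Python sketch of how a tool could resolve AND into the keyword of the previous step. This is only an illustration, not yarn's actual code; the function name `resolve_and` and the list-of-tuples representation are made up for the example.

```python
def resolve_and(steps):
    """Replace each AND keyword with the keyword of the preceding step."""
    resolved = []
    previous = None
    for keyword, text in steps:
        if keyword == "AND":
            if previous is None:
                raise ValueError("AND cannot be the first step of a scenario")
            keyword = previous
        resolved.append((keyword, text))
        previous = keyword
    return resolved


# A few steps from the Obnam scenario, as (keyword, text) pairs.
steps = [
    ("GIVEN", "a live data directory"),
    ("AND", "a 0 byte file called empty"),
    ("WHEN", "live data is backed up"),
    ("AND", "repository is fuse mounted"),
]
for keyword, text in resolve_and(steps):
    print(keyword, text)
```

After this pass, every step carries a real keyword, so the rest of a tool built this way would never need to know about AND at all.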
Test scenarios like this are obviously not directly executable. We need to provide the code that actually executes each step. For this, yarn provides an IMPLEMENTS section, of which I will show one example: the creation of the files.
    IMPLEMENTS GIVEN a (\d+) byte file called (\S+)
    dd if=/dev/zero of="$DATADIR/live/$MATCH_2" bs=1 count="$MATCH_1"
(The other IMPLEMENTS sections for that Obnam test scenario are also only about a line of shell. The longest one is four lines, because it has an if statement that I like to format in a fairly verbose way. I'm skipping the others to avoid needless repetition.)
The text after the keyword (GIVEN, WHEN, THEN, etc) in the test scenario is a single line of free-form text, which is matched by a regular expression in the IMPLEMENTS section. There must be exactly one matching IMPLEMENTS for each line in a test scenario.
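The matching rule can be sketched in a few lines of Python. This is an illustration of the idea only, not how yarn is implemented: the table `IMPLEMENTATIONS`, the function `find_implementation`, and the choice to require the pattern to match the whole step text are all assumptions made for the example.

```python
import re

# Hypothetical table of (keyword, pattern) pairs, one per IMPLEMENTS section.
IMPLEMENTATIONS = [
    ("GIVEN", r"a (\d+) byte file called (\S+)"),
    ("GIVEN", r"a live data directory"),
    ("WHEN", r"live data is backed up"),
]


def find_implementation(keyword, text, implementations=IMPLEMENTATIONS):
    """Return (pattern, captured groups) for the one matching implementation."""
    matches = []
    for impl_keyword, pattern in implementations:
        if impl_keyword != keyword:
            continue
        # Assumption: the pattern has to match the entire step text.
        m = re.fullmatch(pattern, text)
        if m:
            matches.append((pattern, m.groups()))
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one IMPLEMENTS for {keyword} {text!r}, "
            f"found {len(matches)}"
        )
    return matches[0]


pattern, groups = find_implementation("GIVEN", "a 4096 byte file called 4k")
print(groups)  # ('4096', '4k')
```

Treating both zero matches and several matches as errors keeps the scenario unambiguous: a step always means exactly one piece of code.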
The body of IMPLEMENTS is a shell script snippet. Yarn provides a temporary directory for each scenario, and the path to that is given in the DATADIR environment variable. Yarn also cleans up the directory afterwards, unless the user running yarn requests otherwise, so tests that only change things in the temporary directory don't need any FINALLY steps. The above example uses FINALLY to unmount the FUSE filesystem at the end.
The regular expressions can capture parts of the text in the scenario step, and these are provided to the shell script using the environment variables MATCH_1, MATCH_2, etc. This is a simple but powerful mechanism for implementing very flexible steps, which also means the number of steps can be kept down.
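As a rough illustration of how the captured text and the temporary directory could reach a shell snippet, here is a Python sketch. It is not yarn's code: `run_step` is a made-up helper, it assumes a POSIX `sh`, `dd` and `wc` are available, and for brevity it creates a fresh directory per call rather than one per scenario, writing straight into DATADIR instead of a `live` subdirectory.

```python
import os
import subprocess
import tempfile


def run_step(snippet, groups):
    """Run a shell snippet with DATADIR and MATCH_n set; return its stdout."""
    # The temporary directory stands in for DATADIR and is removed
    # automatically when the with-block exits.
    with tempfile.TemporaryDirectory() as datadir:
        env = dict(os.environ)
        env["DATADIR"] = datadir
        # Captured groups become MATCH_1, MATCH_2, and so on.
        for i, value in enumerate(groups, start=1):
            env[f"MATCH_{i}"] = value
        result = subprocess.run(
            ["sh", "-c", snippet],
            env=env,
            check=True,  # a non-zero exit status means the step failed
            capture_output=True,
            text=True,
        )
        return result.stdout


# Create the file as in the IMPLEMENTS example, then report its size.
snippet = """
dd if=/dev/zero of="$DATADIR/$MATCH_2" bs=1 count="$MATCH_1" 2>/dev/null
wc -c < "$DATADIR/$MATCH_2"
"""
print(run_step(snippet, ("4096", "4k")).strip())
```

Passing everything through the environment means the snippet itself stays plain shell, with nothing to learn beyond a couple of variable names.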
The IMPLEMENTS sections are shared between all test scenarios. This makes it very easy to write new scenarios using already-existing steps, by choosing which ones to use, in which order, and making use of any variability in the steps that their implementations provide.
Yarn is a new tool, and I expect it to mature in the coming months. It's included in the cmdtest Debian package, and is already in Debian jessie and unstable, for those who want to try it. (My personal apt repository has a .deb for wheezy as well; backports are welcome). The yarn scenario testing language itself is not set in stone, but Daniel and I both want to be careful about making changes to it. It's all meant to be very simple: an obviously correct testing tool is better than an intricate one.
I've used yarn in several of my own little programs now, as has Daniel, and we've used it at work as well. Undoubtedly we'll find bugs in the implementation, and we're still feeling our way to figuring out how to write really excellent scenarios, and how to structure steps with the best possible taste. We'll also start designing a shell library that yarn provides to the IMPLEMENTS sections, for doing things that we find ourselves doing in many projects.
It is already clear to me, however, that yarn is oh so much better than cmdtest could ever be. I'm even starting to express ideas for new tools as yarn test documents, so I'm clear what they should do. If and when I ever get to actually implementing the tools, I'll have a test suite for them already. See, for example, my first attempts at specifying a fast file transfer tool, fling, which is to use Linux-only system calls to reach wire speeds over wired LAN, and a tool to recompress compressed files with another compression tool.
We hope yarn will prove to be useful for others as well. If you try it, and have suggestions or comments, please e-mail us.
I've just pushed out the release files for Obnam version 1.5, my backup application, and Larch, my B-tree library, which Obnam uses. They are available via my home page (http://liw.fi/). These versions have also been uploaded to Debian unstable.
NEWS for Obnam:
- Terminal progress reporting now updated only every 0.1 seconds, instead of 0.01 seconds, to reduce terminal emulator CPU usage. Reported by Neal Becker.
- Empty exclude patterns are ignored. Previously, a configuration file line such as "exclude = foo, bar," (note trailing comma) would result in an empty pattern, which would match everything, and therefore nothing would be backed up. Reported by Sharon Kimble.
- A FUSE plugin to access (read-only) data from the backup repository has been added. Written by Valery Yundin.
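The empty exclude pattern problem in the list above is easy to reproduce in a few lines of Python. This is a sketch of the general pitfall, not Obnam's actual code: `parse_patterns` and `is_excluded` are hypothetical helpers, and it assumes exclude patterns are regular expressions searched for anywhere in the path.

```python
import re


def parse_patterns(value):
    """Split a comma-separated setting into patterns, dropping empty ones."""
    return [p.strip() for p in value.split(",") if p.strip()]


def is_excluded(path, patterns):
    """Return True if any exclude pattern matches somewhere in the path."""
    return any(re.search(p, path) for p in patterns)


raw = "foo, bar,"  # note the trailing comma

# Naive splitting keeps the empty string after the last comma.
naive = [p.strip() for p in raw.split(",")]
print(naive)  # ['foo', 'bar', '']

# The empty pattern matches every path, so everything is excluded.
print(is_excluded("/home/liw/notes.txt", naive))  # True

# With empty patterns dropped, only real matches are excluded.
print(is_excluded("/home/liw/notes.txt", parse_patterns(raw)))  # False
```

An empty regular expression matches at the start of any string, which is why a single stray comma could silently exclude the whole backup.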
NEWS for larch:
- Bug fix in how Larch handles partly-committed B-tree journals in read-only mode. Previously, this would result in a crash if, say, a node had been removed, but the B-tree metadata hadn't been committed.
For more, see the archive.