Continuing my earlier musings about test-driven distro development, and the tools it would require.

I imagine something like this:

  • create a virtual machine with a particular configuration of stable
  • run a test suite verifying the VM works
  • upgrade the VM to testing and reboot
  • re-run test suite, verifying things work in testing

We might also want this scenario:

  • install testing directly in a VM (instead of upgrading from stable)
  • run test suite
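
Just to make that concrete, the two scenarios might be written down as data, roughly like this (a purely hypothetical sketch in Python; the step names and structure are my assumptions, not any existing format):

# Hypothetical sketch: the two scenarios above expressed as data.
# Step names and structure are made up; this is not an existing format.
SCENARIOS = {
    'upgrade-stable-to-testing': [
        ('create-vm', {'suite': 'stable'}),
        ('run-tests', {'suite-under-test': 'stable'}),
        ('upgrade', {'to': 'testing'}),
        ('reboot', {}),
        ('run-tests', {'suite-under-test': 'testing'}),
    ],
    'fresh-testing-install': [
        ('create-vm', {'suite': 'testing'}),
        ('run-tests', {'suite-under-test': 'testing'}),
    ],
}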

There might be other scenarios that would be useful to test as well. Even from these two, it's clear there's a need for at least three separate tools:

  • create a machine image with a particular configuration
    • might be based on stable or testing or unstable
    • might also be using some other package sources
    • might want particular packages installed
    • might want particular configuration settings
  • run a suite of tests against the running image
    • which tests to run will depend on the scenario
  • run various scenarios using the above two tools
    • needs reasonably easy ways to specify scenarios
    • needs to support tests specific to scenarios
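
To make the first tool's inputs concrete, here's a hypothetical sketch of what a machine image specification might carry (the field names are my assumptions, not the syntax of any existing tool):

# Hypothetical sketch of a machine image specification.  Field names are
# made up for illustration; they are not the options of any existing tool.
IMAGE_SPEC = {
    'suite': 'testing',                        # stable, testing or unstable
    'extra_sources': [                         # other package sources
        'deb http://ftp.debian.org/debian experimental main',
    ],
    'packages': ['openssh-server', 'bind9'],   # particular packages to install
    'config': {                                # particular configuration settings
        'hostname': 'testvm',
        'user': 'tomjon',
    },
}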

I've written a rudimentary version of the first tool: vmdebootstrap. I've since learned there are a bunch of others that might also work. There's room for more: someone should write (or find) a tool that takes a snapshot of a real system and creates a VM image that mimics it, for example. Anyway, for now, I'll assume that one of the existing tools is good enough to get started.

For the second tool, I wrote a quick-and-dirty proof-of-concept thing; see systest.py. Here's a sample of how it might be used:

liw@havelock$ ./systest -v --target 192.168.122.139 --user tomjon
test 1/6: cat
test 2/6: only-ssh-port
ERROR: Assertion failed: ['22/tcp', '139/tcp', '445/tcp'] != ['22/tcp']
[status 1]
liw@havelock$ ./systest -v --target 192.168.122.139 --user tomjon \
    cat ping6-localhost ping-localhost simple-dns-lookup ssh-login
test 1/5: cat
test 2/5: ping6-localhost
test 3/5: ping-localhost
test 4/5: simple-dns-lookup
test 5/5: ssh-login
liw@havelock$ 

The first run failed because the VM I'm testing against has some extra ports open. Some of the tests require logging into the machine via ssh, and for that one needs to specify which user to log in as.
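
For a flavour of what an individual test might look like, here is a small sketch that mimics the only-ssh-port check using nothing but the Python standard library. This is not the actual systest.py code: the function names, the port list, and the "only 22/tcp should answer" policy are my assumptions.

# Sketch of a check in the spirit of only-ssh-port.  Not the actual
# systest.py code; the names, ports and policy are assumptions.
import socket

COMMON_PORTS = [22, 25, 80, 139, 443, 445]

def open_tcp_ports(host, ports, timeout=2.0):
    """Return the subset of ports on host that accept a TCP connection."""
    open_ports = []
    for port in ports:
        try:
            conn = socket.create_connection((host, port), timeout)
            conn.close()
            open_ports.append(port)
        except socket.error:
            pass
    return open_ports

def check_only_ssh_port(host):
    """Fail unless ssh (22/tcp) is the only common port that answers."""
    found = open_tcp_ports(host, COMMON_PORTS)
    assert found == [22], 'Assertion failed: %r != %r' % (found, [22])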

systest may overlap heavily with system monitoring tools, and the production implementation should possibly be based on those.

I think it's best to design such a tool for the more general purpose of testing whether a system currently works, rather than as an integrated part of a more specific, larger tool. This keeps the tool useful for more than just testing specific things about Debian. (The production implementation would then need to avoid hardcoding all the tests, of course. SMOP.)
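
One possible way to do that (purely hypothetical; the naming convention is made up) would be to collect the tests from plugin modules instead of baking them in:

# Hypothetical plugin loading: collect every function named test_* from a
# named module.  The test_ naming convention is an assumption.
import importlib
import inspect

def load_tests(module_name):
    """Return a dict mapping test names to the functions implementing them."""
    module = importlib.import_module(module_name)
    return dict(
        (name[len('test_'):], func)
        for name, func in inspect.getmembers(module, inspect.isfunction)
        if name.startswith('test_'))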

The third tool I have not spent a lot of time on yet. One thing at a time.
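
Even so, here is a very rough, hypothetical sketch of how it might drive the other two tools; only the systest command line mirrors the session above, everything else (the create_vm stub, the scenario function, the user name) is an assumption:

# Very rough sketch of the third tool.  Only the systest invocation
# mirrors the session shown earlier; the rest is made up.
import subprocess

def create_vm(suite):
    """Build and boot a VM running the given suite; return its IP address.

    Stub: this is where vmdebootstrap (or a similar image builder) and
    the virtualisation layer would be driven.
    """
    raise NotImplementedError

def run_tests(target, user, tests):
    """Run the named systest tests against the booted machine."""
    argv = ['./systest', '-v', '--target', target, '--user', user] + list(tests)
    return subprocess.call(argv) == 0

def fresh_testing_install(user='tomjon'):
    """The 'install testing directly in a VM' scenario from the list above."""
    target = create_vm('testing')
    return run_tests(target, user,
                     ['cat', 'ping-localhost', 'simple-dns-lookup', 'ssh-login'])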

Given these tools, one would then need to decide how to use them. The easiest way would be to use them like lintian and piuparts: run them frequently on whatever packages happen to be in testing or unstable or experimental, put links to the test reports in the PTS, and hope that people fix things.

That is the easiest way to start things.

Once there's a nice set of test cases and scenarios, it may be interesting to think about more aggressive ways: for example, preventing packages from migrating to testing unless the test suite passes with them. If the tests do not pass, one of four things is broken:

  • the package or packages in question
  • other packages already in testing
  • the tests themselves
  • the test environment

If things are set up properly, the last one should be rare. The other three always require manual inspection: it is not possible to automatically know whether the test itself or the code it tests is at fault. It is, however, enough to know that something is wrong. If the tests are written well, they should be robust enough not to be the culprits very often.

(Someone wanting to make a rolling distribution, or, even better, a monthly mini-release, might benefit from this sort of automated testing.)

Hi:

I think the ideas in your last posts about distribution testing are very interesting. When I read this post about systest.py, I thought I should tell you about our testing tool: atheist. Take a look if you can. Perhaps it may be useful for your goals.

http://arco.esi.uclm.es/~david.villa/atheist/html/

Comment by David Wed Jun 8 14:12:07 2011

Hi,

I like this idea very much. I have set up some similar automated testing on http://openqa.opensuse.org/. The core Perl test code also has a Debian section that can do a fresh install of Debian/squeeze (and possibly later releases, if not too much has changed) using only keyboard input and VGA/video/image output. Of course you cannot detect broken drivers for disk/net/graphics, but apart from that, it can keep the core software (bootloader, init scripts, kernel, upgrade, X11) of a rolling release like openSUSE-Factory or Debian-testing in a useful state.

Ciao Bernhard M.

Comment by bmwiedemann Wed Jun 8 15:27:12 2011

Hi,

first of all, your idea sounds good, but it is not entirely new. What you suggest is part of a development strategy called 'continuous integration'. This is already standard in the industrial sector and in some open source projects. Anyway, keep on improving the quality of this great project!

Best regards

Martin

Comment by Martin Thu Jun 9 07:29:27 2011

@David: Thanks, atheist looks like a capable tool. I'm a bit worried that it wants me to use a limited DSL and a whole bunch of external scripts to write the tests. I'd prefer the full power of Python, I think, but we'll see.

@Bernhard: Cool, I'm very glad to hear openSUSE does this already. I'll have a look at the os-autoinst stuff, but since it requires running things under qemu and assumes VGA output, it might not be suitable for all the cases I need vmdebootstrap for. But we'll see.

@Martin: yes, continuous integration is one of the names for what I'm thinking about.

Comment by Lars Wirzenius Thu Jun 9 09:32:26 2011