Welcome to my web log. See the first post for an introduction. See the archive page for all posts, and comments for a feed of comments only. (There is an English-language feed if you don't want to see Finnish.)


All content outside of comments is copyrighted by Lars Wirzenius, and licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. Comments are copyrighted by their authors.


I have just released version 1.13 of Obnam, my backup program. See the website at http://obnam.org for details on what it does. The new version is available from git (see http://git.liw.fi) and as Debian packages from http://code.liw.fi/debian. It has also been uploaded to Debian, and should soon be in unstable.

The NEWS file extract below gives the highlights of what's new in this version.

Version 1.13, released 2015-08-01

Bug fixes:

  • Lukáš Poláček found and fixed a repository corruption problem: if obnam forget was interrupted at the wrong moment, it might remove a chunk, but not the reference to it. This would cause a future run of obnam forget to crash due to a missing chunk (error code R43272X). obnam forget will now ignore such a missing chunk, since it would've deleted it anyway.

    Lars Wirzenius then changed things so that chunk files are only removed once references to the chunks have been committed.
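
The ordering described above can be illustrated with a small sketch. This is not Obnam's actual code: the ToyRepository class and its method names are hypothetical, and in-memory containers stand in for the on-disk repository. It only shows the idea that chunk data is deleted after the reference removals have been committed, and that an already-missing chunk is ignored.

```python
class ToyRepository:
    """Hypothetical in-memory stand-in for a backup repository."""

    def __init__(self):
        self.chunks = {}              # chunk id -> data (the "chunk files")
        self.committed_refs = set()   # chunk ids referenced in committed state
        self.pending_unrefs = set()   # reference removals not yet committed

    def add_chunk(self, chunk_id, data):
        self.chunks[chunk_id] = data
        self.committed_refs.add(chunk_id)

    def unreference(self, chunk_id):
        # Only record the intent; no chunk data is deleted yet.
        self.pending_unrefs.add(chunk_id)

    def commit(self):
        # Step 1: commit the reference removals.
        to_delete = set(self.pending_unrefs)
        self.committed_refs -= to_delete
        self.pending_unrefs.clear()
        # Step 2: only now remove the chunk data itself.
        for chunk_id in to_delete:
            # pop() with a default ignores a chunk that is already gone.
            self.chunks.pop(chunk_id, None)
```

With this ordering, an interruption before commit() leaves both the chunk and its reference in place, so the repository never refers to a chunk that has been deleted.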

Improvements:

  • obnam forget now commits changes after each generation it has removed. This means that if the operation is interrupted, less work is lost. Suggested by Lukáš Poláček, re-implemented by Lars Wirzenius.
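
A minimal sketch of the commit-per-generation idea, assuming hypothetical remove_generation and commit callbacks; these names are made up for illustration and are not Obnam's real API.

```python
def forget_generations(generation_ids, remove_generation, commit):
    """Remove generations one at a time, committing after each one."""
    removed = []
    for gen_id in generation_ids:
        remove_generation(gen_id)
        commit()  # everything removed so far survives a later interruption
        removed.append(gen_id)
    return removed
```

If the loop is interrupted while handling the third generation, the removal of the first two has already been committed and does not need to be redone.
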
Posted Sat Aug 1 17:06:47 2015 Tags:

I've used Reddit for many years. I used it for many years without an account, but eventually I made one. The site has always had its share of unpleasantness, people who're more interested in tearing down than in building. In recent years, it's gotten worse, and it's getting out of hand.

During the fairly short reign of Ellen Pao as CEO, I found things to be getting better. The site was starting to make it clear that harassment, for example, was unacceptable. Unsurprisingly, this made some of the nastier people quite upset.

Pao has now resigned, and a new CEO has started. He had an "Ask Me Anything" session yesterday, and made it clear that he's changing things. From my point of view, it's changing for the worse. He made it clear that as long as Reddit itself does not get into legal trouble, and harassment isn't too overt or particularly public, it's OK now.

I've closed my Reddit account.

Posted Sun Jul 12 06:17:47 2015 Tags:

I have just released version 1.12 of Obnam, my backup program. See the website at http://obnam.org for details on what it does. The new version is available from git (see http://git.liw.fi) and as Debian packages from http://code.liw.fi/debian. It has also been uploaded to Debian, and should soon be in unstable.

The NEWS file extract below gives the highlights of what's new in this version. It includes the changes for version 1.11, which was a bug fix for 1.10 and not announced separately.

Version 1.12, released 2015-07-08

Bug fixes:

  • Steven Monai reported that using --one-file-system would crash, and it turned out to be a missing import.

  • Jan Niggemann reported that --exclude-caches no longer worked. This was due to a bug introduced when the option was moved to its own plugin (for cleaner code). The bug was masked by another bug, in the Yarn test suite. Both bugs have now been fixed.

Improvements:

  • Jan Niggemann translated the Obnam manpage to German. Due to cliapp not yet supporting languages other than English, the manual page lacks option descriptions.

Version 1.11, released 2015-07-02

  • The 1.10 release failed to correctly include the Green Albatross code, due to a missing line in setup.py. This has been fixed.
Posted Wed Jul 8 14:46:14 2015 Tags:

I have just released version 1.10 of Obnam, my backup program. See the website at http://obnam.org for details on what it does. The new version is available from git (see http://git.liw.fi) and as Debian packages from http://code.liw.fi/debian. It has also been uploaded to Debian, and will hopefully soon be in unstable.

The NEWS file extract below gives the highlights of what's new in this version.

Version 1.10, released 2015-07-01

Major bug fixes:

  • Lars Wirzenius fixed the obnam backup command to lock the whole repository, the same way as obnam forget does, when it removes checkpoint generations. This means that during checkpoint removal, no other client can make a backup, which is unfortunate. To avoid that, set leave-checkpoints = yes in the configuration. That will prevent obnam backup from removing checkpoints.

Minor new features:

  • Lars Wirzenius added the obnam list-formats command to list all repository formats.

  • The default value for the upload-queue-size setting is now 1024, chosen based on some benchmarking made by Lars Wirzenius to balance speed and memory use.

  • An EXPERIMENTAL new repository format, green-albatross, has been introduced. It is not ready for actual use, and is only added so that its code doesn't diverge far from the main line of development.

  • Teemu Hukkanen reported that the Synology NAS device returns EACCES instead of ENOENT when the user tries to remove a non-existent file. Obnam now copes with either error code.
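
Coping with either error code when removing a file might look roughly like the sketch below. It uses the local filesystem and a hypothetical helper name, remove_if_exists, purely for illustration; it is not taken from Obnam, where the Synology case presumably arises over a remote connection.

```python
import errno
import os


def remove_if_exists(path):
    """Remove path; return False if it was (apparently) not there."""
    try:
        os.remove(path)
    except OSError as e:
        # ENOENT is the usual "no such file" error. Some servers are
        # reported to answer EACCES in the same situation.
        if e.errno in (errno.ENOENT, errno.EACCES):
            return False
        raise
    return True
```

The trade-off is that a genuine permission problem is treated the same way as a missing file, so a caller that cares about the difference needs an extra check.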

Minor fixes:

  • python setup.py build no longer formats the manual page into plain text. This is now done in python setup.py docs instead. The latter is an optional build step, and probably only works on Debian.

  • obnam restore --to=DIR now requires that the directory DIR either doesn't exist, or it is empty when the restore starts. This is to prevent users from restoring on top of a running system.
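
The rule for the restore target can be expressed as a short check. This is a sketch under the assumption that the target is a local directory; the function name target_is_acceptable is hypothetical and not part of Obnam.

```python
import os


def target_is_acceptable(dirname):
    """Return True if dirname does not exist or is an empty directory."""
    if not os.path.exists(dirname):
        return True
    return os.path.isdir(dirname) and not os.listdir(dirname)
```

An existing path that is a regular file, or a directory with anything in it, is rejected.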

Posted Thu Jul 2 05:10:32 2015 Tags:

Acceptable estimations for software development:

  • Almost certainly doable in less than a day.
  • Probably doable in less than a day, almost certainly not going to take more than three days.
  • Probably doable in less than a week, but who knows?
  • Certainly going to take longer than a week, and nobody can say how long, but if you press me, the estimate is between two weeks and four months.

Reality prevents better accuracy.

Posted Tue May 19 19:50:38 2015

There will be a gathering of Debian people to celebrate the release of jessie this Saturday in Helsinki. For details, see the wiki page. Welcome, everyone.

Posted Wed Apr 22 16:48:53 2015 Tags:

The Debian Project Leader elections are going on. This is the yearly election for the leader, where members of the project vote for a new leader for a year. The debate this year seemed to me to be quite quiet, and voting activity seems to not be very high, either. Pity. Many years ago, the election period used to be quite energetic, bringing up some quite good viewpoints.

There also seems not to have been the usual repeat of the voting announcement; I'm not sure what's going on there. There's time until next Tuesday midnight (in the UTC time zone) to vote. Below are links to the vote page (with instructions for voting) and the (corrected) initial announcement.

I voted for Neil as my top candidate. I think he's got the best background and personality for being the leader of this project of ours.

Posted Thu Apr 9 15:51:40 2015 Tags:

It is with great pleasure and satisfaction that I release version 4.1 of Obnam, my backup program. This version includes radically innovative approaches to data compression and de-duplication, as well as some other changes and bug fixes.

Major user-visible changes:

  • Obnam now recognises most common image types, and de-duplicates them by substituting a standard picture of a cat or a baby. Statistical research has shown that almost all pictures are of either cats or babies, and most people can't tell cats or babies apart. If you have other kinds of pictures, use the --naughty-pictures option to disable this new feature.

  • Obnam now compresses data by finding a sequence in the value of pi (3.14159...) that matches the data, and stores the offset into pi and the length of the data. This means almost all data can be stored using two BIGNUM integers, plus some computation time to compute the value of pi with necessary precision. The extreme compression level is deemed worth the somewhat slower speed. To disable this new feature, use the --i-like-big-bits-and-i-cannot-lie option.

  • Obnam now uses one-time pad encryption in the repository. It is a form of encryption that is guaranteed to be unbreakable. Given the large amounts of data Obnam users have, the infinitely long value of the mathematical constant e is used as the encryption pad, since it would be bad security practice to use a pad that's shorter than the data being encrypted. To disable this new feature and use the old style encryption using GnuPG, use --i-read-schneier.

Minor user-visible changes:

  • There is a new subcommand obnam resize-disk, which resizes the filesystem on which the backup repository resides. In this version, it works on LVM logical volumes and RAID-0, RAID-5, and RAID-6 drive arrays using mdadm. The subcommand optionally arranges more space by deleting live data files and reducing corresponding LV sizes to make more space for backups. If live data is deleted, the backup generations containing the data are tagged as un-removeable so it's not lost. In the future, the subcommand may get support for purchasing more disk space from popular online storage providers.

  • To reduce unnecessary bloat, the obnam restore subcommand has been removed. It was considered unnecessary, since nobody ever reported any problems with it.

  • Obnam now has a new repository option, --swap-in-repository, which starts a daemon process that holds all backup data in memory. Once the process grows enough, this will result in most of the data being written to the swap partition. This makes excellent use of the excessively large swap partitions on many Linux systems. This feature does not work on Windows.

Bug fixes:

  • The obnam donate command to send the Obnam developers some money now again works with Bitcoin. There was a bug that prevented Obnam's built-in Bitcoin mining software from working.

  • The obnam help command again speaks the user's preferred language (LC_MESSAGES locale setting), rather than Finnish, despite pressure from the Finnish government's office for language export.

Posted Wed Apr 1 04:01:41 2015 Tags:

I have just released version 1.9 of Obnam, my backup program. See the website at http://obnam.org for details. The new version is available from git (see http://git.liw.fi) and as Debian packages from http://code.liw.fi/debian. Due to the freeze of Debian for the jessie release, I've not uploaded this version to Debian yet (not experimental and not backports).

This is the first Obnam release since May 13, 2014, 313 days ago. That's a long time. I make no excuses: Obnam is a hobby project, which I work on when I have the time and energy. The past year has been a very /interesting/ year for me, in all sorts of stressful ways: I've changed jobs, moved to another country, and dealt with the loss of a close relative. Because of this, I've not been able to spend as much time on Obnam as I'd like.

The NEWS file extract below gives the highlights of what has happened to Obnam during this time. There's been a lot of things, actually.

My plans for Obnam next are mainly centered around performance. This will require developing a new repository format, to allow things that are not possible with the current format. For example, the current format stores each data chunk in its own file in the repository, and that is quite wasteful when live data files (and therefore their chunks) are quite small.

As preparation for this work, the silly-looking "simple" format has been added, mostly to make sure the internal code infrastructure is ready to support multiple repository formats in the same Obnam version.

Those interested in discussing ways to make Obnam fast should join the obnam-dev mailing list.

Version 1.9, released 2015-03-22

New features:

  • James Vasile changed Obnam so it can back up an individual file, instead of an entire directory.

  • James Vasile added the --include option to Obnam, allowing one to include files that would otherwise be excluded (see --exclude).

  • Carlo Teubner changed obnam fsck to remove unused chunks, if the --fsck-fix or --fsck-rm-unused settings are used. He also made it not check for unused chunks when it's useless to do so, because various --fsck-skip settings are used.

  • A start of a French translation of the manual by pedrito2.

  • Ian Cambell provided a new Obnam command, obnam kdirstat, which lets the KDE k4dirstat utility show graphically which parts of a backup generation use most space.

  • Lars Wirzenius added the simple repository format, which is for demonstration only. It is much too simplistic to be used for real.

Minor changes:

  • The manual page and obnam --help are now clearer that the --root setting and command line arguments to obnam backup can be SFTP URLs. Thanks to Simone Piccardi for reporting the issue.

  • David Fries filled in the displayed file permission mode bits.

  • Grammar and typo fixes for the obnam.1 manual page, from Jean Jordaan.

  • Tom Chiverton suggested a clarification to the manual page for "obnam mount" to say that each generation is a subdirectory.

  • David Fries changed restore to set the group ownership if possible even when not root. No warnings are issued if the attempt fails.

  • Jan Niggemann added a little to the German translation of the Obnam manual.

  • Lars Wirzenius added the path to the error message about a missing chunk (R43272X).

  • Lars Wirzenius made the message at the end of a backup report more statistics about transfers during the backup.

Bug fixes:

  • The Obnam SFTP plugin would loop infinitely if it lost the connection to the SSH server while creating a temporary file. Itamar Turner-Trauring provided a fix for this.

  • Will Dyson fixed a bug about locking while removing checkpoint generations.

  • Michel Alexandre Salim fixed a Python 2.6 compatibility problem in the unit tests (use of assertRaises as a context manager).

  • Lars Kruse fixed a bug with backing up of overlapping backup roots (e.g., / and /boot), given a test case by Adrien Clerc.

  • Thomas Eschenbacher fixed a bug in the format 6 repository code that would crash when there is an obscure problem and a B-tree node can't be found in the tree.

  • Tom Chiverton pointed out that the manual page was using "obnam restore" instead of "obnam mount" in an example for "obnam mount".

  • The yarn test suite now runs FUSE tests (obnam mount) when fusermount is available, rather than checking for membership in the group fuse. The latter is a Debianism (fixed in Debian jessie).

  • Thomas Waldmann noticed that obnam verify didn't notice that a file had new data, when the modification time was the same. Obnam now notices this.

  • Thomas Waldmann fixed many typos and minor bugs in the source code.

  • Laurence Perkins reported that the Tahoe-LAFS SFTP server returned some stat fields as None. Fixed to change those to be 0 instead.

  • Lars Wirzenius fixed double-downloading of chunks during restores.
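
The Python 2.6 compatibility item in the list above concerns the two ways of calling assertRaises. The context manager form was added in Python 2.7, while the callable form also works on 2.6. The sketch below shows both forms side by side; parse_size is a hypothetical helper that exists only to have something to test, and the actual change in Obnam's unit tests may have looked different.

```python
import unittest


def parse_size(text):
    """Hypothetical helper used only as something to test."""
    if not text.isdigit():
        raise ValueError("not a number: %r" % text)
    return int(text)


class ParseSizeTests(unittest.TestCase):

    def test_context_manager_form(self):
        # Needs Python 2.7 or later.
        with self.assertRaises(ValueError):
            parse_size("abc")

    def test_callable_form(self):
        # Also works on Python 2.6: pass the callable and its arguments.
        self.assertRaises(ValueError, parse_size, "abc")
```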

Posted Sun Mar 22 16:12:22 2015 Tags:

Each software tool exists to solve some problem. For each problem, there are many possible solutions. Even when different programs basically do the same thing, they can have quite different shapes.

As an example, this morning I was wondering if it would be possible for me to use notmuch to index my entire mail archive. For that, I needed to convert a number of mbox folders to Maildir format. That's a reasonably easy problem, given access to suitable programming libraries, but there's an existing tool for that, called mb2md. Unfortunately, it has the wrong shape for my needs.

mb2md doesn't just convert one mbox to one maildir. It's designed for a mail admin converting all server-side mbox folders for a user into a corresponding structure of Maildir folders. This seems to be necessary when switching IMAP servers. That's a fairly specialised problem, and the program has been written to make it easy for a mail admin to do that.

What I need is part of the problem solved by mb2md and indeed it can do just that part. However, the overall shape of mb2md is such that my part is hard to do. The incantation is quite unintuitive and requires careful reading of the documentation.

The shape of a solution matters. mb2md could easily have been written in a way that provides a simple tool for the single folder conversion, and then a more complex tool for the mail admin's more complicated problem. This would have resulted in a much more general tool, and that would make it easier for more people to use it without much effort.
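
As an illustration of what the simple single-folder tool could look like, here is a minimal sketch using the mailbox module from the Python standard library. It is not how mb2md works, and the function name mbox_to_maildir is made up for this example.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from one mbox file into one Maildir."""
    source = mailbox.mbox(mbox_path, create=False)
    target = mailbox.Maildir(maildir_path, create=True)
    count = 0
    try:
        for message in source:
            # Wrapping in MaildirMessage carries over mbox status
            # flags (such as "read") where an equivalent exists.
            target.add(mailbox.MaildirMessage(message))
            count += 1
        target.flush()
    finally:
        source.close()
        target.close()
    return count
```

A more complex tool for the mail admin's problem could then be built on top of this by walking a directory tree and calling the function once per folder.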

Mail folder format conversions are a fairly esoteric thing to do. However, the lack of generality is a frequent issue with how programs are designed. It is easy to fall into the trap of writing a highly specialised tool, instead of taking a step back and making a more general purpose tool. The specialised tool will help a small number of people. The general tool will help many people.

Examples of this are fairly common. Debian has a set of tools for making Debian live CDs; they are not quite able to make a bootable hard disk image as well (thus, vmdebootstrap). There's programs for computing cyclomatic complexity, which produce HTML reports, rather than something that can be processed by other programs without too much effort. There's tools for managing address books that are limited to specific cultures, e.g., by hardcoding assumptions of what a person's name looks like (thus, clab).

One of my favourite examples is xargs, which by default does the wrong thing by assuming its input is whitespace delimited. Any whitespace, not just newlines. Any sensible use requires adding the -0 option, which makes xargs that much more tedious to use.
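
The difference between the two ways of delimiting input can be mimicked in a few lines of Python. This is only a rough model: real xargs also treats quotes and backslashes specially by default, which the sketch ignores, and the file names are invented.

```python
names = ["my report.txt", "notes.txt"]

# Newline-separated input, split on any whitespace, which is roughly
# what xargs does by default.
by_whitespace = "\n".join(names).split()

# NUL-separated input, as produced by "find -print0" and read by
# "xargs -0".
by_nul = "\0".join(names).split("\0")

print(by_whitespace)  # ['my', 'report.txt', 'notes.txt']
print(by_nul)         # ['my report.txt', 'notes.txt']
```

The file name containing a space is broken in two by whitespace splitting, but survives intact when NUL is the delimiter.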

Furthermore, I've often found that the more general tool is simpler. Its functional specification is simpler; its implementation is simpler, and has fewer special cases; its user experience is simpler. That's not always true, but often it is.

Sometimes the general solution shape is not worth it. But it's always worth considering whether it might be.

One of the parts of the Unix culture I really like is the preference for general tools that are easy to combine together.

Posted Thu Jan 1 15:31:03 2015 Tags:

For more, see the archive.