

Another weekend, another big mailing list thread

This weekend, those interested in Debian development have been having a discussion on the debian-devel mailing list about "What can Debian do to provide complex applications to its users?". I'm commenting on that in my blog rather than the mailing list, since this got a bit too long to be usefully done in an email.

directhex's recent blog post "Packaging is hard. Packager-friendly is harder." is also relevant.

The problem

To start with, I don't think the email that started this discussion poses the right question. The problem is not really about complex applications; we already have those in Debian. See, for example, LibreOffice. The discussion is really about how Debian should deal with the way some types of applications are developed upstream these days. They're not all complex, and they're not all big, but as usual, things only get interesting when n is big.

A particularly clear example is the whole nodejs ecosystem, but it's not limited to that and it's not limited to web applications. This is also not the first time this topic arises, but we've never come to any good conclusion.

My understanding of the problem is as follows:

A current trend in software development is to use programming languages, often interpreted high-level languages, combined with heavy use of third-party libraries and a language-specific package manager for installing those libraries, both for the developer and sometimes also for the sysadmin installing the software for production. This bypasses the Linux distributions entirely. The benefit is that it has allowed ecosystems to grow around specific programming languages in which there is very little friction for developers to use libraries written in that language, speeding up development cycles a lot.

When I was young(er) the world was horrible

In comparison, in the old days, which for me means the 1990s, and before Debian took over my computing life, the cycle was something like this:

I would be writing an application, and would need to use a library to make some part of my application easier to write. To use that library, I would download the source code archive of the latest release, and laboriously decipher and follow the build and installation instructions, fix any problems, rinse, repeat. After getting the library installed, I would get back to developing my application. Often the installation of the dependency would take hours, so not a thing to be undertaken lightly.

Debian made some things better

With Debian, and apt, and having access to hundreds upon hundreds of libraries packaged for Debian, this became a much easier process. But only for the things packaged for Debian.

For those developing and publishing libraries, Debian didn't make the process any easier. They would still have to publish a source code archive, but also hope that it would eventually be included in Debian. And updates to libraries in the Debian stable release would not get into the hands of users until the next Debian stable release. This is a lot of friction. For C libraries, that friction has traditionally been tolerable. The effort of making the library in the first place is considerable, so any friction added by Debian is small by comparison.

The world has changed around Debian

In the modern world, developing a new library is much easier, and so also the friction caused by Debian is much more of a hindrance. My understanding is that things now happen more like this:

I'm developing an application. I realise I could use a library. I run the language-specific package manager (pip, cpan, gem, npm, cargo, etc), it downloads the library, installs it in my home directory or my application source tree, and in less than the time it takes to have a sip of tea, I can get back to developing my application.

This has a lot less friction than the Debian route. The attraction to application programmers is clear. For library authors, the process is also much more streamlined. Writing a library, especially in a high-level language, is fairly easy, and publishing it for others to use is quick and simple. This can lead to a virtuous cycle where I write a useful little library, you use it and tell me about a bug or a missing feature, I add it, publish the new version, you use it, and we're both happy as can be. Where this might have taken weeks or months in the old days, it can now happen in minutes.
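To make the low-friction cycle concrete, here is roughly what it looks like in the Python world. (This is only an illustration; the library and URL are arbitrary examples.)

# One command fetches and installs the library:
#
#     pip install --user requests

# Then I just use it, and get back to my application:
import requests

response = requests.get("https://www.debian.org/")
print(response.status_code)

The whole detour takes less time than making the tea, let alone sipping it.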

The big question: why Debian?

In this brave new world, why would anyone bother with Debian anymore? Or any traditional Linux distribution, since this isn't particularly specific to Debian. (But I mention Debian specifically, since it's what I know best.)

A number of things have been mentioned or alluded to in the discussion mentioned above, but I think it's good for the discussion to be explicit about them. As a computer user, software developer, system administrator, and software freedom enthusiast, I see the following reasons to continue to use Debian:

  • The freeness of software included in Debian has been vetted. I have a strong guarantee that software included in Debian is free software. This goes beyond the licence of that particular piece of software, and includes practical considerations, like whether the software can actually be built using free tooling, and whether I have access to that tooling, because the tooling, too, is included in Debian.

    • There was a time when Debian debated (with itself) whether it was OK to include a binary that needed to be built using a proprietary C compiler. We decided that it isn't, or not in the main package archive.

    • These days we have the question of whether "minimised Javascript" is OK to be included in Debian, if it can't be produced using tools packaged in Debian. My understanding is that we have already decided that it's not, but the discussion continues. To me, this seems equivalent to the above case.

  • I have a strong guarantee that software in a stable Debian release won't change underneath me in incompatible ways, except in special circumstances. This means that if I'm writing my application and targeting Debian stable, the library API won't change, at least not until the next Debian stable release. Likewise for every other bit of software I use. Having things continue to work without having to worry is a good thing.

    • Note that a side-effect of the low friction of library development in current ecosystems is that library APIs sometimes change. This would mean my application would need to change to adapt to the API change. That's friction for my work.
  • I have a strong guarantee that a dependency won't just disappear. Debian has a large mirror network of its package archive, and there are easy tools to run my own mirror, if I want to. While running my own mirror is possible for other package management systems, each one adds to the friction.

    • The nodejs NPM ecosystem seems to be especially vulnerable to this. More than once, packages have gone missing, causing other projects, which depend on the missing packages, to start failing.

    • The way the Debian project is organised, it is almost impossible for this to happen in Debian. Not only are package removals carefully co-ordinated, packages that are depended on by other packages aren't removed.

  • I have a strong guarantee that a Debian package I get from a Debian mirror is the official package from Debian: either the actual package uploaded by a Debian developer or a binary package built by a trusted Debian build server. This is because Debian uses cryptographic signatures of the package lists and I have a trust path to the Debian signing key.

    • At least some of the language-specific package managers fail to have such a trust path. This means that I have no guarantee that the library package I download today is the same code that was uploaded by the library author.

    • Note that https does not help here. It protects the transfer from the package manager's web server to me, but makes absolutely no guarantees about the validity of the package. There have been enough cases of package repositories being attacked that this matters to me. Debian's signatures protect against malicious changes on mirror hosts. (A sketch of what such signature verification looks like follows this list.)

  • I have a reasonably strong guarantee that any problem I find can be fixed, by me or someone else. This is not a strong guarantee, because Debian can't do anything about insanely complicated code, for example, but at least I can rely on being able to rebuild the software. That's a basic requirement for fixing a bug.

  • I have a reasonably strong guarantee that, after upgrading to the next Debian stable release, my stuff continues to work. Upgrades may always break, but at least Debian tests them and treats it as a bug if an upgrade doesn't work, or loses user data.
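To illustrate the signature point above: here is a rough Python sketch of verifying a downloaded package list against the Debian archive keyring, much as apt does internally with gpgv. The file names are illustrative.

import subprocess

# Verify the detached signature (Release.gpg) over the Release file,
# trusting only the keys in the Debian archive keyring.
result = subprocess.run(
    ["gpgv",
     "--keyring", "/usr/share/keyrings/debian-archive-keyring.gpg",
     "Release.gpg",   # detached signature
     "Release"],      # the signed file listing package checksums
    capture_output=True, text=True)

if result.returncode == 0:
    print("signature OK: these package lists really come from Debian")
else:
    print("verification FAILED:", result.stderr)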

These are the reasons why I think Debian and the way it packages and distributes software is still important and relevant. (You may disagree. I'm OK with that.)

What about non-Linux free operating systems?

I don't have much personal experience with non-Linux systems, so I've only talked about Linux here. I don't think the BSD systems, for example, are actually all that different from Linux distributions. Feel free to substitute "free operating system" for "Linux" throughout.

What is it Debian tries to do, anyway?

The previous section is one level of abstraction too low. It's important, but it's beneficial to take a further step back and consider what it is Debian actually tries to achieve. Why does Debian exist?

The primary goal of Debian is to enable its users to use their computers using only free software. The freedom aspect is fundamentally important and a principle that Debian is not willing to compromise on.

The primary approach to achieve this goal is to produce a "distribution" of free software, to make installing a free software operating system and applications, and maintaining such a computer, feasible for our users.

This leads to secondary goals, such as:

  • Making it easy to install Debian on a computer. (For values of easy that should be compared to toggling boot sector bytes manually.)

    We've achieved this, though of course things can always be improved.

  • Making it easy to install applications on a computer with Debian. (Again, compared to the olden days, when that meant configuring and compiling everything from scratch, with no guidance.)

    We've achieved this, too.

  • A system with Debian installed is reasonably secure, and easy to keep reasonably secure.

    This means Debian will provide security support for software it distributes, and has ways in which to install security fixes. We've achieved this, though this, too, can always be improved.

  • A system with Debian installed should keep working for extended periods of time. This is important to make using Debian feasible. If it takes too much effort to have a computer running Debian, it's not feasible for many people to do that, and then Debian fails its primary goal.

    This is why Debian has stable releases with years of security support. We've achieved this.

The disconnect

On the one hand, we have Debian, which has pretty much achieved what I declare to be its primary goal. On the other hand, a lot of developers now expect much less friction than what Debian offers. This disconnect is, I believe, the cause of the debian-devel discussion, and of variants of that discussion all over the open source landscape.

These discussions often go one of two ways, depending on which community is talking.

  • In the distribution and more old-school communities, the low-friction approach of language-specific package managers is often considered to be a horror, and an abandonment of all the good things that the Linux world has achieved. "Young saplings, who do they think they are, all agile and bendy and with no principles at all, get off our carefully cultivated lawn."

  • In the low-friction communities, Linux distributions are something only old, stodgy, boring people care about. "Distributions are dead, they only get in the way, nobody bothers with them anymore."

This disconnect will require effort by both sides to close the gap.

On the one hand, so much new software is being written by people using the low-friction approach that Linux distributions may fail to attract new users and especially new developers, and this will hurt them and their users.

On the other hand, the low-friction people may be sawing off the branch they're sitting on. If distributions suffer, the base on which low-friction development relies will wither away, and we'll be left running low-friction free software on proprietary platforms.

Things for low-friction proponents to improve

Here are a few things I've noticed that go wrong in the various communities oriented towards the low-friction approach.

  • Not enough care is given to copyright licences. This is a boring topic, but it's the legal basis on which all of free software and open source rests. If copyright licences are violated, or copyrights are not respected, or copyrights or licences are not expressed well enough, or incompatible licences are mixed, the result can very easily be neither free software nor open source.

    It's boring, but be sufficiently pedantic here. It's not even all that difficult.

  • Do provide actual source. It seems quite a number of Javascript projects only distribute "minimised" versions of code. That's not actually source code, any more than, say, Java byte code is, even if a de-compiler can make it kind of editable. If source isn't available, it's not free software or open source.

  • Please try to be careful with API changes. What used to work should still work with a new version of a library. If you need to make an API change that breaks compatibility, find a way to still support those who rely on the old API, using whatever mechanisms are available to you. Ideally, support the old API for a long time: years. Two weeks is really not enough. (One way to do this in Python is sketched after this list.)

  • Do be careful with your dependencies. Locking dependencies to a specific version makes things difficult for distributions, because they can often provide only one or a very small number of versions of any one package. Likewise, avoid embedding dependencies in your own source tree, because that explodes the amount of work distributions have to do to patch security holes. (No, distributions can't rely on tens of thousands of upstreams to each do the patching correctly and promptly.)
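As a sketch of the API-compatibility point: in Python, one low-effort way to keep an old API working after a rename is a thin wrapper that warns. (This is my own illustration, not any particular library's code.)

import warnings

def fetch_page(url, timeout=30):
    """The new, preferred API."""
    return "<html>...</html>"  # stand-in for the real work

def get_page(url):
    # The old API, kept as a thin wrapper so existing callers keep
    # working. Warn so they know to migrate, but keep the wrapper
    # around for years, not weeks.
    warnings.warn("get_page() is deprecated, use fetch_page()",
                  DeprecationWarning, stacklevel=2)
    return fetch_page(url)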

Things for Debian to improve

There are many sources of friction that come from Debian itself. Some of them are unavoidable: if upstream projects don't take care of copyright licence hygiene, for example, then Debian will impose that on them and that can't be helped. Other things are more avoidable, however. Here's a list off the top of my head:

  • A lot of stuff in Debian happens over email that might, were it not for historical reasons, happen in a web application. For example, the Debian bug tracking system (bugs.debian.org) requires using email, and spam filtering can add delays of more than fifteen minutes. This is a source of friction that could be avoided.

  • Likewise, Debian voting happens over email, which can cause friction from delays.

  • Debian lets its package maintainers use any version control system, any packaging helper tooling, and any packaging workflow they want. This means that every package is, to some extent, new territory for anyone other than its primary maintainers. Even when the same tools are used, they can be used in a variety of different ways. Consistency should reduce friction.

  • There's too little infrastructure to do things like collecting copyright information into debian/copyright. This really shouldn't be a manual task.

  • Debian packaging uses arcane file formats, loosely based on email headers. More standard formats might make things easier, and reduce friction. (A toy illustration of the email-header heritage follows this list.)

  • There's not enough automated testing, or it's too hard to use, making it too hard to know if a new package will work, or a modified package doesn't break anything that used to work.

  • Overall, making a Debian package tends to require too much manual work. Packaging helpers like dh certainly help, but not enough. I don't have a concrete suggestion how to reduce it, but it seems like an area Debian should work on.

  • Maybe consider supporting installing multiple versions of a package, even if only for, say, Javascript libraries. Possibly with a caveat that only specific versions will be security supported, and a way to alert the sysadmin if vulnerable packages are installed. Dunno, this is a difficult one.

  • Maybe consider providing something where the source package gets automatically updated to every new upstream release (or commit), with binary packages built from that, and those automatically tested. This might be a separate section of the archive, and packages would be included into the normal part of the archive only by manual decision.

  • There's more, but mostly not relevant to this discussion, I think. For example, Debian is a big project, and the mere size is a cause of friction.
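As a toy illustration of the email-header heritage mentioned above: Python's standard library email parser can read a simplified control-file stanza, though real tools use proper parsers such as the python-debian library.

import email.parser

# A shortened, illustrative stanza in the style of debian/control.
stanza = """\
Package: hello
Version: 2.10-1
Depends: libc6 (>= 2.14)
Description: example package stanza
"""

fields = email.parser.Parser().parsestr(stanza)
print(fields["Package"])   # hello
print(fields["Depends"])   # libc6 (>= 2.14)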

Comments?

I don't allow comments on my blog, and I don't want to debate this in private. If you have comments on anything I've said above, please post to the debian-devel mailing list. Thanks.

Baits

To ensure I get some responses, I will leave these baits here:

Anyone who's been programming less than 12332 days is a young whipper-snapper and shouldn't be taken seriously.

Depending on the latest commit of a library is too slow. The proper thing to do for really fast development is to rely on the version in the unsaved editor buffer of the library developer.

You shouldn't have read any of this. I'm clearly a troll.

Posted Sat Feb 17 15:05:00 2018 Tags:

My company, QvarnLabs Ab, has today released the first alpha version of our new product, Qvisqve. Below is the press release. I wrote pretty much all the code, and it's free software (AGPL+).


Helsinki, Finland - 2018-02-09. QvarnLabs Ab is happy to announce the first public release of Qvisqve, an authorisation server and identity provider for web and mobile applications. Qvisqve aims to be secure, lightweight, fast, and easy to manage. "We have big plans for Qvisqve, and helping customers manage cloud identities", says Kaius Häggblom, CEO of QvarnLabs.

In this alpha release, Qvisqve supports the OAuth2 client credentials grant, which is useful for authenticating and authorising automated systems, including IoT devices. Qvisqve can be integrated with any web service that can use OAuth2 and JWT tokens for access control.

Future releases will provide support for end-user authentication by implementing the OpenID Connect protocol, with a variety of authentication methods, including username/password, U2F, TOTP, and TLS client certificates. Multi-factor authentication will also be supported. "We will make Qvisqve flexible for any serious use case", says Lars Wirzenius, software architect at QvarnLabs. "We hope Qvisqve will be useful to the software freedom ecosystem in general", Wirzenius adds.

Qvisqve is developed and supported by QvarnLabs Ab, and works together with the Qvarn software, which is award-winning free and open-source software for managing sensitive personal information. Qvarn is in production use in Finland and Sweden and manages over a million identities. Both Qvisqve and Qvarn are released under the Affero General Public Licence.
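(For the technically curious: the client credentials grant means an automated client exchanges its client id and secret for an access token, and then presents the token on API calls. A minimal Python sketch, with purely hypothetical endpoints and credentials, not a real Qvisqve deployment:)

import requests

# Exchange client credentials for an access token. The URLs and
# credentials here are placeholders.
token_response = requests.post(
    "https://auth.example.com/token",
    data={"grant_type": "client_credentials"},
    auth=("my-client-id", "my-client-secret"))
access_token = token_response.json()["access_token"]

# Present the (JWT) token as a bearer token on an API call.
api_response = requests.get(
    "https://api.example.com/resource",
    headers={"Authorization": "Bearer " + access_token})
print(api_response.status_code)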

Posted Fri Feb 9 16:30:00 2018 Tags:

TL;DR: Ick is a continuous integration or CI system. See http://ick.liw.fi/ for more information.

More verbose version follows.

First public version released

The world may not need yet another continuous integration (CI) system, but I do. I've been unsatisfied with the ones I've tried or looked at. More importantly, I am interested in a few things more powerful than any I've even heard of. So I've started writing my own.

My new personal hobby project is called ick. It is a CI system, which means it can run automated steps for building and testing software. The home page is at http://ick.liw.fi/, and the download page has links to the source code and .deb packages and an Ansible playbook for installing it.

I have now made the first publicly advertised release, dubbed ALPHA-1, version number 0.23. It is of alpha quality, which means it doesn't have all the intended features, and if any of the features it does have work, you should consider yourself lucky.

Invitation to contribute

Ick has so far been my personal project. I am hoping to make it more than that, and invite contributions. See the governance page for the constitution, the getting started page for tips on how to start contributing, and the contact page for how to get in touch.

Architecture

Ick has an architecture consisting of several components that communicate over HTTPS using RESTful APIs and JSON for structured data. See the architecture page for details.
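To give a flavour of what that means for a client: talking to an ick component is ordinary HTTP plus JSON. The sketch below is hypothetical; the endpoint and the JSON fields are made up for illustration, so see the architecture page for the real APIs.

import requests

# Hypothetical sketch: list the projects an ick controller knows about.
response = requests.get("https://ick.example.com/projects")
response.raise_for_status()
for project in response.json()["projects"]:
    print(project["name"])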

Manifesto

Continuous integration (CI) is a powerful tool for software development. It should not be tedious, fragile, or annoying. It should be quick and simple to set up, and work quietly in the background unless there's a problem in the code being built and tested.

A CI system should be simple, easy, clear, clean, scalable, fast, comprehensible, transparent, and reliable, and it should boost your productivity to get things done. It should not take a lot of effort to set up, require a lot of hardware just for the CI, or need frequent attention to keep working; developers should never have to wonder why something isn't working.

A CI system should be flexible to suit your build and test needs. It should support multiple types of workers, as far as CPU architecture and operating system version are concerned.

Also, like all software, CI should be fully and completely free software and your instance should be under your control.

(Ick is little of this yet, but it will try to become all of it. In the best possible taste.)

Dreams of the future

In the long run, I would like ick to have features like the ones described below. It may take a while to get them all implemented.

  • A build may be triggered by a variety of events. Time is an obvious event, as is the project's source code repository changing. More powerfully, any build dependency changing should trigger a build, regardless of whether the dependency comes from another project built by ick or from a package from, say, Debian: ick should keep track of all the packages that get installed into the build environment of a project, and if any of their versions change, it should trigger the project's build and tests again. (A toy sketch of this idea follows this list.)

  • Ick should support building in (or against) any reasonable target, including any Linux distribution, any free operating system, and any non-free operating system that isn't brain-dead.

  • Ick should manage the build environment itself, and be able to do builds that are isolated from the build host or the network. This partially works: one can ask ick to build a container and run a build in the container. The container is implemented using systemd-nspawn. This can be improved upon, however. (If you think Docker is the only way to go, please contribute support for that.)

  • Ick should support any workers that it can control over ssh, a serial port, or another such neutral communication channel, without having to install an agent of any kind on them. Ick won't assume that it can have, say, a full Java runtime, so that the worker can be, say, a microcontroller.

  • Ick should be able to effortlessly handle very large numbers of projects. I'm thinking here that it should be able to keep up with building everything in Debian, whenever a new Debian source package is uploaded. (Obviously whether that is feasible depends on whether there are enough resources to actually build things, but ick itself should not be the bottleneck.)

  • Ick should optionally provision workers as needed. If all workers of a certain type are busy, and ick's been configured to allow using more resources, it should do so. This seems like it would be easy to do with virtual machines, containers, cloud providers, etc.

  • Ick should be flexible in how it can notify interested parties, particularly about failures. It should allow an interested party to ask to be notified over IRC, Matrix, Mastodon, Twitter, email, SMS, or even by a phone call and speech synthesiser. "Hello, interested party. It is 04:00 and you wanted to be told when the hello package has been built for RISC-V."
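The first item in the list above, triggering a build when any build dependency changes, could at its simplest be no more than comparing snapshots of installed package versions. A toy sketch:

# Toy sketch: snapshot the versions of everything installed in the
# build environment, and trigger a rebuild when a snapshot differs
# from the previous one. The package names and versions are examples.

def snapshot(installed):
    # installed: iterable of (package, version) pairs
    return dict(installed)

old = snapshot([("libc6", "2.24-11"), ("gcc", "6.3.0-4")])
new = snapshot([("libc6", "2.24-12"), ("gcc", "6.3.0-4")])

if old != new:
    print("a dependency changed: trigger the build and tests again")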

Please give feedback

If you try ick, or even if you've just read this far, please share your thoughts on it. See the contact page for where to send it. Public feedback is preferred over private, but if you prefer private, that's OK too.

Posted Mon Jan 22 20:11:00 2018 Tags:

In mid-2017, I decided to experiment with using pull-requests (PRs) on Github. I've read that they make development using git much nicer. The end result of my experiment is that I'm not going to adopt a PR based workflow.

The project I chose for my experiment is vmdb2, a tool for generating disk images with Debian. I put it up on Github, and invited people to send pull requests or patches, as they wished. I got a bunch of PRs, mostly from two people. For a little while, there was a flurry of activity. It has now calmed down, I think primarily because the software has reached a state where the two contributors find it useful and don't need it to be fixed or have new features added.

This was my first experience with PRs. I decided to give it until the end of 2017 before drawing any conclusions. I've found good things about PRs and a workflow based on them:

  • they reduce some of the friction of contributing, making it easier for people to contribute; from a contributor's point of view, PRs certainly seem like a better way than sending patches over email or sending a message asking to pull from a remote branch
  • merging a PR in the web UI is very easy

I also found some bad things:

  • I really don't like the Github UI or UX, in general or for PRs in particular
  • especially the emails Github sends about PRs seemed useless beyond a basic "something happened" notification, prompting me to check the web UI anyway
  • PRs are a centralised feature, which is something I prefer to avoid; further, they're tied to Github, which is something I object to on principle, since it's not free software
    • note that Gitlab provides support for PRs as well, but I've not tried it; it's an "open core" system, which is not fully free software in my opinion, and so I'm wary of Gitlab; it's also a centralised solution
    • a "distributed PR" system would be nice
  • merging a PR is perhaps too easy, and I worry that it leads me to merging without sufficient review (that is of course a personal flaw)

In summary, PRs seem to me to prioritise making life easier for contributors, especially occasional contributors or "drive-by" contributors. I think I prefer to care more about frequent contributors, and myself as the person who merges contributions. For now, I'm not going to adopt a PR based workflow.

(I expect people to mock me for this.)

Posted Tue Jan 9 17:25:00 2018 Tags:

I wrote these when I woke up one night and had trouble getting back to sleep, and spent a while in a very philosophical mood thinking about life, success, and productivity as a programmer.

Imagine you're developing a piece of software.

  • You don't know it works, unless you've used it.

  • You don't know it's good, unless people tell you it is.

  • You don't know you can do it, unless you've already done it.

  • You don't know it can handle a given load, unless you've already tried it.

  • The real bottlenecks are always a surprise, the first time you measure.

  • It's not ready for production until it's been used in production.

  • Your automated tests always miss something, but with only manual tests, you always miss more.

Posted Sun Dec 17 11:09:00 2017 Tags:

A unit is a part of your program you can test in isolation. You write unit tests to test all aspects of it that you care about. If all your unit tests pass, you should know that your unit works well.

Integration tests are for testing that when your various well-tested, high quality units are combined, integrated, they work together. Integration tests test the integration, not the individual units.

You could think of building a car. Your units are the ball bearings, axles, wheels, brakes, etc. Your unit tests for the ball bearings might test, for example, that they can handle a billion rotations, at various temperatures, etc. Your integration test would assume the ball bearings work, and should instead test that the ball bearings are installed in the right way, so that the car, as a whole, can run for many kilometers, accelerating and braking every kilometer, using only so much fuel, producing only so much pollution, and not killing passengers in case of a crash.
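In code, the same idea looks something like this minimal Python sketch (the classes are invented for illustration):

import unittest

class Bearing:
    def __init__(self):
        self.rotations = 0
    def rotate(self):
        self.rotations += 1

class Wheel:
    def __init__(self, bearing):
        self.bearing = bearing
    def turn(self):
        self.bearing.rotate()

class TestBearingUnit(unittest.TestCase):
    # Unit test: exercise the bearing in isolation.
    def test_counts_rotations(self):
        bearing = Bearing()
        bearing.rotate()
        self.assertEqual(bearing.rotations, 1)

class TestWheelIntegration(unittest.TestCase):
    # Integration test: assume Bearing works, and test that Wheel
    # and Bearing are put together correctly.
    def test_turning_wheel_rotates_bearing(self):
        bearing = Bearing()
        wheel = Wheel(bearing)
        wheel.turn()
        self.assertEqual(bearing.rotations, 1)

if __name__ == "__main__":
    unittest.main()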

Posted Mon Nov 13 00:09:00 2017 Tags:

GDPR is a new EU regulation for privacy. The name is short for "General Data Protection Regulation" and it covers all organisations that handle personal data of EU citizens and EU residents. It will become enforceable May 25, 2018 (Towel Day). This will affect Debian. I think it's time for Debian to start working on compliance, mainly because the GDPR requires sensible things.

I'm not an expert on GDPR legislation, but here's my understanding of what we in Debian should do:

  • do a privacy impact assessment, to review and document what data we have and collect, and what risks a leak would pose to the people whose personal data it is

  • only collect personal information for specific purposes, and only use the data for those purposes

  • get explicit consent from each person for all collection and use of their personal information; archive this consent (e.g., list subscription confirmations)

  • allow each person to get a copy of all the personal information we have about them, in a portable manner, and let them correct it if it's wrong

  • allow people to have their personal information erased

  • maybe appoint one or more data protection officers (not sure this is required for Debian)

There's more, but let's start with those.

I think Debian has at least the following systems that will need to be reviewed with regards to the GDPR:

  • db.debian.org - Debian project members, "Debian developers"
  • nm.debian.org
  • contributors.debian.org
  • lists.debian.org - at least membership lists, maybe archives
  • possibly irc servers and log files
  • mail server log files
  • web server log files
  • version control services and repositories

There may be more; these are just off the top of my head.

I expect that mostly Debian will be OK, but we can't just assume that.

Posted Tue Oct 10 18:08:00 2017 Tags:

How do you attract contributors to a new free software project?

I'm in the very early stages of a new personal project. It is irrelevant for this blog post what the new project actually is. Instead, I am thinking about the following question:

Do I want the project to be mainly for myself, and maybe a handful of others, or do I want to try to make it a more generally useful, possibly even a well-known, popular project? In other words, do I want to just solve a specific problem I have or try to solve it for a large group of people?

If it's a personal project, I'm all set. I can just start writing code. (In fact, I have.) If it's the latter, I'll need to attract contributions from others, and how do I do that?

I asked that question on Twitter and Mastodon and got several suggestions. This is a summary of those, with some editorialising from me.

  • The most important thing is probably that the project should aim for something that interests other people. The more people it interests, the easier it will be to attract contributors. This should be written up and displayed prominently: what does (or will) the software do, and what can it be used for.

  • Having something that kind of works, and is easy to improve, seems to be key, too. An empty project is daunting to do anything with. Part of this is that the software the project is producing should be easy to install and get running. It doesn't have to be fully featured. It doesn't even have to be alpha-level quality. It needs to do something.

    If the project is about producing a spell checker, say, and it doesn't even try to read an input file, it's probably too early for anyone else to contribute. A spell checker that lists every word in the input file as badly spelt is probably more attractive to contribute to. (A sketch of such a minimal checker follows this list.)

  • It helps to document where a new contributor should start, and how they would submit their contribution. A list of easy things to work on may also help. Having a roadmap of near-future development steps and a long-term vision will make things easier. Having an architectural document to explain how the system hangs together will help.

  • A welcoming, constructive atmosphere helps. People should get quick feedback on questions, issues, and patches, in order to build momentum. Make it fun for people to contribute, and they'll contribute more.

  • A public source code repository, and a public ticketing system, and public discussion forums (mailing lists, web forums, IRC channels, etc) will help.

  • Share the power in the project. Give others the power to make decisions, or merge things from other contributors. Having a clear, functioning governance structure from the start helps.
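To illustrate the spell checker example above, here is about the smallest thing that would qualify: a checker that reads its input and lists every word as badly spelt. Useless as a spell checker, but visibly improvable, which is the point.

import sys

# A spell checker with an empty dictionary: every word in the input
# is reported as badly spelt. A contributor's first patch is obvious.
DICTIONARY = frozenset()

for word in sys.stdin.read().split():
    if word.lower() not in DICTIONARY:
        print("badly spelt:", word)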

I don't know if these things are all correct, or that they're enough to grow a successful, popular project.

Karl Fogel's seminal book Producing Open Source Software should also be mentioned.

Posted Sun Oct 1 10:21:00 2017 Tags:

This is a difficult announcement to write. The summary is: if you use Obnam, you should switch to another backup program in the coming months.

The first commit to Obnam's current code base is this:

commit 7eaf5a44534ffa7f9c0b9a4e9ee98d312f2fcb14
Author: Lars Wirzenius <liw@iki.fi>
Date:   Wed Sep 6 18:35:52 2006 +0300

    Initial commit.

It's followed by over 5200 more commits until the latest one, which is from yesterday. The NEWS file contains 58 releases. There are 20761 lines of Python, 15384 words in the English language manual, with translations in German and French. The yarn test suite, which is a kind of a manual, is another 13382 words in English and pseudo-English. That's a fair bit of code and prose. Not all of it mine, I've had help from some wonderful people. But most of it mine.

I wrote all of that because backups were fun. It was pleasing to use my own program to guarantee the safety of my own data. The technical challenges of implementing the kind of backup program I wanted were interesting, and solving interesting problems is a big part of why I am a programmer.

Obnam has a kind user base. It's not a large user base: the Debian "popularity contest" service estimates it at around 500. But it's a user base that is kind and has treated me well. I have tried to reciprocate.

Unfortunately, I have not had fun while developing Obnam for some time now. That is a change. A few years ago, I lived in Manchester, UK, and commuted by train to work. It was a short train ride, about 15 minutes. At times I would sit on the floor with my laptop on my knees, writing code or the manual. Back then Obnam was a lot of fun. I was excited, and enthusiastic.

In the past two years or so, I've not been able to feel that excitement again. My native language, Finnish, has an expression describing unpleasant tasks: something is as much fun as drinking tar. That describes Obnam in recent years for me.

Obnam has not turned out well, from a maintainability point of view. It seems that every time I try to fix something, I break something else. Usually what breaks is speed or memory use: Obnam gets slower or starts using even more memory.

For several years now I've been working on a new repository format for Obnam, code named GREEN ALBATROSS. It was meant to solve Obnam's problems as far as extensibility, performance, and resource use were concerned. It seems to have failed.

I'm afraid I've had enough. I'm going to retire Obnam as a project and as a program, and move on to doing something else, so I can feel excitement and pleasure again.

After some careful thought, I fear that the maintainability problems of Obnam can realistically only be solved by a complete rewrite from scratch, and I'm not up to doing that.

If you use Obnam, you should migrate to some other backup solution. Don't worry, you have until the end of the year. I will be around and I intend to fix any serious bugs in Obnam; in particular, security flaws. But you should start looking for a replacement sooner rather than later.

I will be asking Obnam to be removed from the Debian unstable and testing branches. The next Debian release (buster, Debian 10) won't include Obnam.

The Obnam mailing lists are kindly hosted by Daniel Silverstone, and they will remain, but later this year I will change them to be moderated. The Obnam git repository will remain. The web site will remain, but I will add a note that Obnam is no longer maintained. Other Obnam online resources may disappear.

If you would like to take over the Obnam project, and try to resolve the various issues, please contact me to discuss that.

Thank you, and may you never need to restore.

Posted Sun Aug 13 21:48:00 2017 Tags:

I realised recently that my laptop isn't set up to send TRIM or DISCARD commands to its SSD. That means the SSD firmware has a harder time doing garbage collection (see the linked Wikipedia page for more details).

After some searching, I found two articles by Christopher Smart: one, update. Those, plus some additional reading of documentation, and a little experimentation, allowed me to do this. Since the information is a bit scattered, here are the details, for Debian stretch, as much for my own memory as to make sure this is collected into one place. (A small script to sanity-check discard support follows the list.)

  • Append ,discard to the fourth column on relevant lines in /etc/crypttab. For me, this means the fourth column should be luks,discard.
  • Change the line in /etc/lvm/lvm.conf that says issue_discards to enable it (assign 1 instead of 0).
  • Append rd.luks.options=discard to the GRUB_CMDLINE_LINUX_DEFAULT value in /etc/default/grub
  • Run sudo update-grub
  • Run sudo update-initramfs -u
  • Reboot.
  • Run sudo fstrim -av - if this works, you're good! If it gives you errors, then you get to debug. I have no idea what I'm talking about.
  • Copy /usr/share/doc/util-linux/examples/fstrim.* to /etc/systemd/system and run sudo systemctl enable fstrim.timer. This will tell systemd to run fstrim every week. (If you don't use systemd you'll have to adapt the systemd bits mentioned here. I've no idea how.)
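To sanity-check the hardware side, the kernel reports discard support in sysfs. Here's a small Python script of my own devising (not part of any standard tooling) that reads it:

import glob

# A block device supports TRIM/DISCARD if its discard_granularity
# is non-zero. Note this only checks the device itself; the layers
# above it (LUKS, LVM, filesystem) must pass discards through too.
for path in glob.glob("/sys/block/*/queue/discard_granularity"):
    device = path.split("/")[3]
    with open(path) as f:
        granularity = int(f.read().strip())
    print(device, "supports discard" if granularity else "no discard support")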

Note that TRIM on encrypted devices seems to be a possible information leak. I don't know the details, but if that bothers you, don't do it.

I don't know of any harmful effects of enabling TRIM for everything, except the crypto bit above, so I wonder if it wouldn't make sense for the Debian installer to do this by default.

Posted Sat Aug 5 20:58:00 2017 Tags: