Feed for Planet Debian.

In mid-2017, I decided to experiment with using pull-requests (PRs) on Github. I've read that they make development using git much nicer. The end result of my experiment is that I'm not going to adopt a PR based workflow.

The project I chose for my experiment is vmdb2, a tool for generating disk images with Debian. I put it up on Github, and invited people to send pull requests or patches, as they wished. I got a bunch of PRs, mostly from two people. For a little while, there was a flurry of activity. It has has now calmed down, I think primarily because the software has reached a state where the two contributors find it useful and don't need it to be fixed or have new features added.

This was my first experience with PRs. I decided to give it until the end of 2017 until I made any conclusions. I've found good things about PRs and a workflow based on them:

  • they reduce some of the friction of contributing, making it easier for people to contribute; from a contributor point of view PRs certainly seem like a better way than sending patches over email or sending a message asking to pull from a remote branch
  • merging a PR in the web UI is very easy

I also found some bad things:

  • I really don't like the Github UI or UX, in general or for PRs in particular
  • especially the emails Github sends about PRs seemed useless beyond a basic "something happened" notification, which prompt me to check the web UI
  • PRs are a centralised feature, which is something I prefer to avoid; further, they're tied to Github, which is something I object to on principle, since it's not free software
    • note that Gitlab provides support for PRs as well, but I've not tried it; it's an "open core" system, which is not fully free software in my opinion, and so I'm wary of Gitlab; it's also a centralised solution
    • a "distributed PR" system would be nice
  • merging a PR is perhaps too easy, and I worry that it leads me to merging without sufficient review (that is of course a personal flaw)

In summary, PRs seem to me to prioritise making life easier for contributors, especially occasional contributors or "drive-by" contributors. I think I prefer to care more about frequent contributors, and myself as the person who merges contributions. For now, I'm not going to adopt a PR based workflow.

(I expect people to mock me for this.)

Posted Tue Jan 9 17:25:00 2018 Tags:

I wrote these when I woke up one night and had trouble getting back to sleep, and spent a while in a very philosophical mood thinking about life, success, and productivity as a programmer.

Imagine you're developing a piece of software.

  • You don't know it works, unless you've used it.

  • You don't know it's good, unless people tell you it is.

  • You don't know you can do it, unless you've already done it.

  • You don't know it can handle a given load, unless you've already tried it.

  • The real bottlenecks are always a surprise, the first time you measure.

  • It's not ready for production until it's been used in production.

  • Your automated tests always miss something, but with only manual tests, you always miss more.

Posted Sun Dec 17 11:09:00 2017 Tags:

A unit is a part of your program you can test in isolation. You write unit tests to test all aspects of it that you care about. If all your unit tests pass, you should know that your unit works well.

Integration tests are for testing that when your various well-tested, high quality units are combined, integrated, they work together. Integration tests test the integration, not the individual units.

You could think of building a car. Your units are the ball bearings, axles, wheels, brakes, etc. Your unit tests for the ball bearings might test, for example, that they can handle a billion rotations, at various temperatures, etc. Your integration test would assume the ball bearings work, and should instead test that the ball bearings are installed in the right way so that the car, as whole, can run a kilometers, and accelerate and brake every kilometer, uses only so much fuel, produces only so much pollution, and doesn't kill passengers in case of a crash.

Posted Mon Nov 13 00:09:00 2017 Tags:

GDPR is a new EU regulation for privacy. The name is short for "General Data Protection Regulation" and it covers all organisations that handle personal data of EU citizens and EU residents. It will become enforceable May 25, 2018 (Towel Day). This will affect Debian. I think it's time for Debian to start working on compliance, mainly because the GDPR requires sensible things.

I'm not an expert on GDPR legislation, but here's my understanding of what we in Debian should do:

  • do a privacy impact assessment, to review and document what data we have, and collect, and what risks that has for the people whose personal data it is if the data leaks

  • only collect personal information for specific purposes, and only use the data for those purposes

  • get explicit consent from each person for all collection and use of their personal information; archive this consent (e.g., list subscription confirmations)

  • allow each person to get a copy of all the personal information we have about them, in a portable manner, and let them correct it if it's wrong

  • allow people to have their personal information erased

  • maybe appoint one or more data protection officers (not sure this is required for Debian)

There's more, but let's start with those.

I think Debian has at least the following systems that will need to be reviewed with regards to the GDPR:

  • db.debian.org - Debian project members, "Debian developers"
  • nm.debian.org
  • contributors.debian.org
  • lists.debian.org - at least membership lists, maybe archives
  • possibly irc servers and log files
  • mail server log files
  • web server log files
  • version control services and repositories

There may be more; these are just off the top of my head.

I expect that mostly Debian will be OK, but we can't just assume that.

Posted Tue Oct 10 18:08:00 2017 Tags:

How do you attract contributors to a new free software project?

I'm in the very early stages of a new personal project. It is irrelevant for this blog post what the new project actually is. Instead, I am thinking about the following question:

Do I want the project to be mainly for myself, and maybe a handful of others, or do I want to try to make it a more generally useful, possibly even a well-known, popular project? In other words, do I want to just solve a specific problem I have or try to solve it for a large group of people?

If it's a personal project, I'm all set. I can just start writing code. (In fact, I have.) If it's the latter, I'll need to attract contributions from others, and how do I do that?

I asked that question on Twitter and Mastodon and got several suggestions. This is a summary of those, with some editorialising from me.

  • The most important thing is probably that the project should aim for something that interests other people. The more people it interests, the easier it will be to attract contributors. This should be written up and displayed prominently: what does (or will) the software do and what can it e used for.

  • Having something that kind of works, and easy to improve, seems to also be key. An empty project is daunting to do anything with. Part of this is that the software the project is producing should be easy to install and get running. It doesn't have to be fully featured. It doesn't even have to be alpha level quality. It needs to do something.

    If the project is about producing a spell checker, say, and it doesn't even try to read an input file, it's probably too early for anyone else to contribute. A spell checker that lists every word in the input file as badly spelt is probably more attractive to contribute to.

  • It helps to document where a new contributor should start, and how they would submit their contribution. A list of easy things to work on may also help. Having a roadmap of near future developent steps and a long-term vision will make things easier. Having an architectural document to explain how the system hangs together will help.

  • A welcoming, constructive atmosphere helps. People should get quick feedback to questions, issues, patches, in order to build momentum. Make it fun for people to contibute, and they'll contribute more.

  • A public source code repository, and a public ticketing system, and public discussion forums (mailing lists, web forums, IRC channels, etc) will help.

  • Share the power in the project. Give others the power to make decisions, or merge things from other contributors. Having a clear, functioning governance structure from the start helps.

I don't know if these things are all correct, or that they're enough to grow a successful, popular project.

Karl Foger'l seminal book Producing Open Source Software should also be mentioned.

Posted Sun Oct 1 10:21:00 2017 Tags:

This is a difficult announcement to write. The summary is if you use Obnam you should switch to another backup program in the coming months.

The first commit to Obnam's current code base is this:

commit 7eaf5a44534ffa7f9c0b9a4e9ee98d312f2fcb14
Author: Lars Wirzenius <liw@iki.fi>
Date:   Wed Sep 6 18:35:52 2006 +0300

    Initial commit.

It's followed by over 5200 more commits until the latest one, which is from yesterday. The NEWS file contains 58 releases. There are 20761 lines of Python, 15384 words in the English language manual, with translations in German and French. The yarn test suite, which is a kind of a manual, is another 13382 words in English and pseudo-English. That's a fair bit of code and prose. Not all of it mine, I've had help from some wonderful people. But most of it mine.

I wrote all of that because backups were fun. It was pleasing to use my own program to guarantee the safety of my own data. The technical challenges of implmenting the kind of backup program I wanted were interesting, and solving interesting problems is a big part of why I am a programmer.

Obnam has a kind user base. It's not a large user base: the Debian "popularity contest" service estimates it at around 500. But it's a user base that is kind and has treated me well. I have tried to reciprocate.

Unfortunately, I have not had fun while developing Obnam for some time now. This has changed. A few years ago, I lived in Manchester, UK, and commuted by train to work. It was a short train ride, about 15 minutes. At times I would sit on the floor with my laptop on my knees, writing code or the manual. Back then Obnam was a lot of fun. I was excited, and enthusiastic.

In the past two years or so, I've not been able to feel that excitement again. My native language, Finnish, has an expression describing unpleasant tasks: something is as much fun as drinking tar. That describes Obnam in recent years for me.

Obnam has not turned out well, from a maintainability point of view. It seems that every time I try to fix something, I break something else. Usuaully what breaks is speed or memory use: Obnam gets slower or starts using even more memory.

For several years now I've been working on a new repository format for Obnam, code names GREEN ALBATROSS. It was meant to solve Obnam's problems as far as extensibility, performance, and resource use were concerned. It seems to have failed.

I'm afraid I've had enough. I'm going to retire Obnam as a project and as a program, and move on to doing something else, so I can feel excitement and pleasure again.

After some careful thought, I fear that the maintainability problems of Obnam can realistically only be solved by a complete rewrite from scratch, and I'm not up to doing that.

If you use Obnam, you should migrate to some other backup solution. Don't worry, you have until the end of the year. I will be around and I intend to fix any serious bugs in Obnam; in particular, security flaws. But you should start looking for a replacement sooner rather than later.

I will be asking Obnam to be removed from the Debian unstable and testing branches. The next Debian release (buster, Debian 10) won't include Obnam.

The Obnam mailing lists are kindly hosted by Daniel Silverstone, and they will remain, but later this year I will change them to be moderated. The Obnam git repository will remain. The web site will remain, but I will add a note that Obnam is no longer maintained. Other Obnam online resources may disappear.

If you would like to take over the Obnam project, and try to resolve the various issues, please contact me to discuss that.

Thank you, and may you never need to restore.

Posted Sun Aug 13 21:48:00 2017 Tags:

I realised recently that my laptop isn't set up to send TRIM or DISCARD commands to its SSD. That means the SSD firmware has a harder time doing garbage collection (see whe linked Wikipedia page for more details.)

After some searching, I found two articles by Christopher Smart: one, update. Those, plus some addition reading of documentation, and a little experimentation, allowed me to do this. Since the information is a bit scattered, here's the details, for Debian stretch, as much for my own memory as to make sure this is collected into one place.

  • Append ,discard to the fourth column on relevant lines in /etc/crypttab. For me, this means the fourth column should be luks,discard.
  • Change in /etc/lvm/lvm.conf that says issue_discards to enable it (assign 1 instead of 0).
  • Append rd.luks.options=discard to the GRUB_CMDLINE_LINUX_DEFAULT value in /etc/default/grub
  • Run sudo update-grub
  • Run sudo update-initramfs -u
  • Reboot.
  • Run sudo fstrim -av - if this works, you're good! If it gives you errors, then you get to debug. I have no idea what I'm talking about.
  • Copy /usr/share/doc/util-linux/examples/fstrim.* to /etc/systemd/system and run sudo systemctl enable fstrim.timer. This will tell systemd to run fstrim every week. (If you don't use systemd you'll have to adapt the systemd bits mentioned here. I've no idea how.)

Note that it seems to be a possible information leak to TRIM encryped devices. I don't know the details, but if that bothers you, don't do it.

I don't know of any harmful effects for enabling TRIM for everything, except the crypto bit above, so I wonder if it wouldn't make sense for the Debian installer to do this by default.

Posted Sat Aug 5 20:58:00 2017 Tags:

A couple of people objected to having Yakking on Planet Debian, so I've removed it.

Posted Wed Jul 19 08:53:00 2017 Tags:

In a case of blatant self-promotion, I am going to add the Yakking RSS feed to the Planet Debian aggregation. (But really because I think some of the readership of Planet Debian may be interested in the content.)

Yakking is a group blog by a few friends aimed at new free software contributors. From the front page description:

Welcome to Yakking.

This is a blog for topics relevant to someone new to free software development. We assume you are already familiar with computers, and are curious about participating in the production of free software. You don't need to be a programmer: software development requires a wide variety of skills, and you can be a valued core contributor to a project without being a programmer.

If anyone objects, please let me know.

Posted Thu Jul 13 13:01:00 2017

I've just released version 1.22 of Obnam, my backup application. It is the first release for this year. Packages are available on code.liw.fi/debian and in Debian unstable, and source is in git. A summary of the user-visible changes is below.

For those interested in living dangerously and accidentally on purpose deleting all their data, the link below shows that status and roadmap for FORMAT GREEN ALBATROSS. http://distix.obnam.org/obnam-dev/182bd772889544d5867e1a0ce4e76652.html

Version 1.22, released 2017-06-25

  • Lars Wirzenius made Obnam log the full text of an Obnam exception/error message with more than one line. In particular this applies to encryption error messages, which now log the gpg output.

  • Lars Wirzenius made obnam restore require absolute paths for files to be restored.

  • Lars Wirzenius made obnam forget use a little less memory. The amount depends on the number of genrations and the chunks they refer to.

  • Jan Niggemann updated the German translation of the Obnam manual to match recent changes in the English version.

  • SanskritFritz and Ian Cambell fixed the kdirstat plugin.

  • Lars Wirzenius changed Obnam to hide a Python stack trace when there's a problem with the SSH connection (e.g., failure to authenticate, or existing connection breaks).

  • Lars Wirzenius made the Green Albatross version of obnam forget actually free chunks that are no longer used.

Posted Sun Jun 25 15:40:00 2017 Tags: