Welcome to my web log. See the first post for an introduction. See the archive page for all posts. (There is an English-language feed if you don't want to see Finnish.)


All content outside of comments is copyrighted by Lars Wirzenius, and licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. Comments are copyrighted by their authors. (No new comments are allowed.)


I now have a rudimentary roadmap for reaching 1.0 of vmdb2, my Debian image building tool.

Visual roadmap

The visual roadmap is generated from the following YAML file:

vmdb2_1_0:
  label: |
    vmdb2 is production ready
  depends:
    - ci_builds_images
    - docs
    - x220_install

docs:
  label: |
    vmdb2 has a user
    manual of acceptable
    quality

x220_install:
  label: |
    x220 can install Debian
    onto a Thinkpad x220
    laptop

ci_builds_images:
  label: |
    CI builds and publishes
    images using vmdb2
  depends:
    - amd64_images
    - arm_images

amd64_images:
  label: |
    CI: amd64 images

arm_images:
  label: |
    CI: arm images of
    various kinds
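
The dependencies in the roadmap also imply an order in which the steps can be worked on. As a rough sketch (this is not how the visual roadmap is generated, and the dict below is just the YAML above typed in by hand, without the labels), Python's standard graphlib module can compute one such order:

```python
from graphlib import TopologicalSorter

# The roadmap above, reduced to step names and their "depends" lists.
roadmap = {
    "vmdb2_1_0": {"depends": ["ci_builds_images", "docs", "x220_install"]},
    "docs": {},
    "x220_install": {},
    "ci_builds_images": {"depends": ["amd64_images", "arm_images"]},
    "amd64_images": {},
    "arm_images": {},
}


def work_order(roadmap):
    """Return the step names so that every step comes after its dependencies."""
    graph = {name: step.get("depends", []) for name, step in roadmap.items()}
    return list(TopologicalSorter(graph).static_order())


order = work_order(roadmap)
for name in order:
    print(name)
```

Whatever order comes out, vmdb2_1_0 is last, since everything else is a direct or indirect dependency of it.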
Posted Thu Sep 20 10:58:00 2018 Tags:

I've set up a new website for vmdb2, my tool for building Debian images (basically "debootstrap, except in a disk image"). As usual for my websites, it's ugly. Feedback welcome.

Posted Thu Sep 13 19:43:00 2018 Tags:

I'm starting a new job in about a month. Until then, it'd be really helpful if I could earn some money via a short-term contracting or consulting job. If your company or employer could benefit from any of the following, please get in touch. I will invoice via a Finnish company, not as a person (within the EU, at least, this makes it easier for the clients). I also reside in Finland, if that matters (meaning, meeting outside of Helsinki gets tricky).

  • software architecture design and review
  • coding in Python, C, shell, or code review
  • documentation: writing, review
  • git training
  • help with automated testing: unit tests, integration tests
  • help with Ansible
  • packaging and distributing software as .deb packages
Posted Sat Sep 8 09:12:00 2018

In the modern world, a lot of computing happens on other people's computers. We use a lot of services provided by various parties. This is a problem for user freedom and software freedom. For example, when I use Twitter, the software runs on Twitter's servers, and it's entirely proprietary. Even if it were free software, even if it were using the Affero GPL license (AGPL), my freedom would be limited by the fact that I can't change the software running on Twitter's servers.

If I could, it would be a fairly large security problem. If I could, then anyone could, and they might not be good people like I am.

If the software were free, instead of proprietary, I could run it on my own server, or find someone else to run the software for me. This would make me more free.

That still leaves the data. My data would still be on Twitter's servers: all my tweets, direct messages, the lists of people I follow, or who follow me. Probably other things as well.

For true freedom in this context, I would need to have a way to migrate my data from Twitter to another service. For practical freedom, the migration should not be excessively much work, or be excessively expensive, not just possible in principle.

For Twitter specifically, there are freer alternatives, such as Mastodon.

For ick, my CI / CD engine, here is my current thinking: ick should not be a centralised service. It should be possible to pick and choose between instances of its various components: the controller, the workers, the artifact store, and Qvisqve (authentication server). Ditto for any additional components in the future.

Since users and the components need to have some trust in each other, and there may be payment involved, this may need some co-ordination, and it may not be possible to pick entirely freely. However, as a thought experiment, let's consider a scenario.

Alice has a bunch of non-mainstream computers she doesn't use herself much: Arm boards, RISC-V boards, PowerPC Macs, Amigas, etc. All in good working condition. She'd be happy to set them up as build workers, and let people use them, for a small fee to cover her expenses.

Bettina has a bunch of servers with lots of storage space. She'd be happy to let people use them as artifact stores, for a fee.

Cecilia has a bunch of really fast x86-64 machines, with lots of RAM and very fast NVMe disks. She'd also be happy to rent them out as build workers.

Dinah needs a CI system, but only has one small server, which would work fine as a controller for her own projects, but is too slow to comfortably do any actual building.

Eliza also needs a CI system, but wants to keep her projects separate from Dinah's, so wants to have her own controller. (Eliza and Dinah can't tolerate each other and do not trust each other.)

Fatima is trusted by everyone, except Eliza, and would be happy to run a secure server with Qvisqve.

Georgina is like Fatima, except Eliza trusts her, and Dinah doesn't.

The setup would be like this:

  • Alice and Cecilia run build workers. The workers trust both Fatima's and Georgina's Qvisqves. All of their workers are registered with both Qvisqves, and both Dinah's and Eliza's controllers.

  • Bettina's artifact store also trusts both Qvisqves.

  • Dinah creates an account on Fatima's Qvisqve. Eliza on Georgina's Qvisqve. They each get an API token from the respective Qvisqve.

  • When Dinah's project builds, her controller uses the API token to get an identity token from Fatima's Qvisqve, and gives that to each worker used in her builds. The worker checks the ID token, and then accepts work from Dinah's controller. The worker reports the time used to do the work to its billing system, and Alice or Cecilia uses that information to bill Dinah.

  • If a build needs to use an artifact store, the ID token is again used to bill Dinah.

  • For Eliza, the same thing happens, except with another Qvisqve, and costs from her builds go to her, not Dinah.

This can be generalised to any number of ick components, which can be used criss-cross. Each component needs to be configured as to which Qvisqves it trusts.
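
To make the trust configuration a little more concrete, here is a minimal sketch of the thought experiment in Python. Everything in it is hypothetical: the class names, the fields, and the way usage is recorded are made up for illustration, and none of it is ick's or Qvisqve's actual API. In particular, a real ID token would be cryptographically signed and verified, which this sketch skips entirely.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class IdToken:
    issuer: str   # which Qvisqve issued the token (hypothetical field)
    subject: str  # whom the token identifies, and who gets billed


@dataclass
class Component:
    """A worker or artifact store that trusts a configured set of Qvisqves."""

    name: str
    trusted_issuers: set
    usage: list = field(default_factory=list)

    def accept_work(self, token, seconds):
        # Refuse tokens from a Qvisqve this component does not trust.
        if token.issuer not in self.trusted_issuers:
            return False
        # Remember who used the component, and for how long, for billing.
        self.usage.append((token.subject, seconds))
        return True

    def bill(self):
        totals = {}
        for subject, seconds in self.usage:
            totals[subject] = totals.get(subject, 0) + seconds
        return totals


alice_worker = Component("alice-arm-board", {"fatima", "georgina"})
picky_worker = Component("fatima-only-worker", {"fatima"})

dinah = IdToken(issuer="fatima", subject="dinah")
eliza = IdToken(issuer="georgina", subject="eliza")

alice_worker.accept_work(dinah, 120)
alice_worker.accept_work(eliza, 300)
alice_worker.accept_work(dinah, 30)
rejected = not picky_worker.accept_work(eliza, 10)

print(alice_worker.bill())
print("picky worker rejected Eliza:", rejected)
```

The only per-component configuration here is the set of trusted Qvisqves, which is the point of the scenario: a controller never needs an account on the worker, only a token from a Qvisqve that the worker trusts.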

I think this would be a nicer thing to have than the centralised hosted ick I've been thinking about so far. Much more complicated, and much more work, of course. But interesting.

There are some interesting and difficult questions about security to solve. I don't want to start thinking about the details yet, I'll play with the general idea first.

What do you think? Send me your thoughts by email.

Posted Thu Aug 30 17:09:00 2018 Tags:

I was watching the Matthew Garrett "Heresies" talk from DebConf17 today. The following thought struck me:

True software freedom for this age: you can get the source code of a service you use, and can set it up on your own server. You can also get all your data from the service, and migrate it to another service (hosted by you or someone else). Further, all of this needs to be easy, fast, and cheap enough to be feasible, and there can't be "network effects" that lock you into a specific service instance.

I will need to think hard what this means for my future projects.

Posted Fri Aug 24 23:20:00 2018 Tags:

This week's issue of LWN has a quote by Linus Torvalds on translating kernel messages to something other than English. He's against it:

Really. No translation. No design for translation. It's a nasty nasty rat-hole, and it's a pain for everybody.

There's another reason I fundamentally don't want translations for any kernel interfaces. If I get an error report, I want to be able to just 'git grep" it. Translations make that basically impossible.

So the fact is, I want simple English interfaces. And people who have issues with that should just not use them. End of story. Use the existing error numbers if you want internationalization, and live with the fact that you only get the very limited error number.

I can understand Linus's point of view. The LWN readers are having a discussion about it, and one of the comments there provoked this blog post:

It somewhat bothers me that English, being the lingua franca of of free software development, excludes a pretty huge parts of the world from participation. I thought that for a significant part of the world, writing an English commit message has to be more difficult than writing code.

I can understand that point of view as well.

Here's my point of view:

  • It is entirely true that if a project requires English for communication within the project, it discriminates against those who don't know English well.

  • Not having a common language within a project, between those who contribute to the project, now and later, would pretty much destroy any hope of productive collaboration.

    If I have a free software project, and you ask me to merge something where commit messages are in Hindi, error messages in French, and code comments in Swahili, I'm not going to understand any of them. I won't merge what I don't understand.

    If I write my commit messages in Swedish, my code comments in Finnish, and my error messages by entering randomly chosen words from /usr/share/dict/words into a search engine, and taking the page title of the fourteenth hit, then you're not going to understand anything either. You're unlikely to make any changes to my project.

    When Bo finds the project in 2038, and needs it to prevent the apocalypse from 32-bit timestamps ending, and can't understand the README, humanity is doomed.

    Thus, on balance, I'm OK with requiring the use of a single language for intra-project communication.

  • Users should not be presented with text in a language foreign to them. However, this raises a support issue, where a user may copy-paste an error message in their native language, and ask for help, but the developers don't understand the language, and don't even know what the error is. If they knew the error was "permission denied", they could tell the user to run the chmod command to fix the permissions. This is a dilemma.

    I've solved the dilemma by having a unique error code for each error message. If the user tells me "R12345X: Xscpft ulkacsx ggg: /etc/obnam.conf!" I can look up R12345X and see that the error is that /etc/obnam.conf is not in the expected file format.

    This could be improved by making the "parameters" for the error message easy to parse. Perhaps something like this:

    R12345X: Xscpft ulkacsx ggg! filename=/etc/obnam.conf

    Maintaining such error codes by hand would be quite tedious, of course. I invented a module for doing that. Each error message is represented by a class, and the class creates its own error code by taking its Python module and class name, and computing an MD5 of that. The first five hexadecimal digits are the code, and get surrounded by R and X to make it easier to grep.

    (I don't know if something similar might be used for the Linux kernel.)

  • Humans and inter-human communication are difficult. In many cases, there is no solution that's good for everyone. But let's not give up.
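
The error code scheme described above is small enough to sketch. What follows is an illustration written for this post, not the module itself: the class names, the way the module and class name are joined before hashing, the upper-casing of the digits, and the key=value suffix are all assumptions.

```python
import hashlib


class CodedError(Exception):
    """Base class: every subclass is one kind of error, with its own code."""

    template = "Unknown error"

    def __init__(self, **params):
        super().__init__()
        self.params = params

    @classmethod
    def code(cls):
        # Hash the module and class name; keep the first five hex digits
        # and wrap them in R...X so the code is easy to grep for.
        name = f"{cls.__module__}.{cls.__qualname__}"
        digest = hashlib.md5(name.encode("utf-8")).hexdigest()
        return f"R{digest[:5].upper()}X"

    def __str__(self):
        # The template is what would get translated; the code and the
        # key=value parameters stay the same in every language.
        params = " ".join(f"{k}={v}" for k, v in sorted(self.params.items()))
        return f"{self.code()}: {self.template} {params}".rstrip()


class ConfigFormatError(CodedError):
    template = "Configuration file is not in the expected format!"


error = ConfigFormatError(filename="/etc/obnam.conf")
print(error)
```

Because the code depends only on where the class lives, it stays the same no matter what language the message text is shown in. The flip side is that renaming or moving the class changes the code.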

Posted Fri Aug 3 15:49:00 2018

I've recently made the first release of ick, my CI engine, which was built by ick itself. It went OK, but the process needs improvement. This blog post is some pondering on how the process of building Debian packages should happen in the best possible taste.

I'd appreciate feedback, preferably by email (liw@liw.fi).

Context

I develop a number of (fairly small) programs, as a hobby. Some of them I also maintain as packages in Debian. All of them I publish as Debian packages in my own APT repository. I want to make the process for making a release of any of my programs as easy and automated as possible, and that includes building Debian packages and uploading them to my personal APT repository, and to Debian itself.

My personal APT repository contains builds of my programs against several Debian releases, because I want people to have the latest version of my stuff regardless of what version of Debian they run. (This is somewhat similar to what OEMs that provide packages of their own software as Debian packages need to do. I think. I'm not an OEM and I'm extrapolating wildly here.)

I currently don't provide packages for anything but Debian. That's mostly because Debian is the only Linux distribution I know well, or use, or know how to make packages for. I could do Ubuntu builds fairly easily, but supporting Fedora, RHEL, Suse, Arch, Gentoo, etc, is not something I have the energy for at this time. I would appreciate help in doing that, however.

I currently don't provide Debian packages for anything other than the AMD64 (x86-64, "Intel 64-bit") architecture. I've previously provided packages for i386 (x86-32), and may in the future want to provide packages for other architectures (RISC-V, various Arm variants, and possibly more). I want to keep this in mind for this discussion.

Overview

For the context of this blog post, let's assume I have a project Foo. Its source code is stored in foo.git. When I make a release, I tag it using a signed git tag. From this tag, I want to build several things:

  • A release tarball. I will publish and archive this. I don't trust git and related tools (tar, compression programs, etc) to be able to reproducibly produce the same bit-by-bit compressed tarball in perpetuity. There are too many things that can go wrong. For security reasons it's important to be able to have the exact same tarball in the future as today. The simplest way to achieve this is to not try to reproduce, but to archive.

  • A Debian source package.

  • A Debian binary package built for each target version of Debian, and each target hardware architecture (CPU, ABI, possibly toolchain version). The binary package should be built from the source package, because otherwise we don't know the source package can be built.

The release tarball should be put in a (public) archive. A digital signature using my personal PGP key should also be provided.

The Debian source and binary packages should be uploaded to one or more APT repositories: my personal one, and selected packages also the Debian one. For uploading to Debian, the packages will need to be signed with my personal PGP key.

(I am not going to give my CI access to my PGP key. Anything that needs to be signed with my own PGP key needs to be a manual step.)

Package versioning

In Debian, packages are uploaded to the "unstable" section of the package archive, and then automatically copied into the "testing" section, and from there to the "stable" section, unless there are problems in a specific version of a package. Thus all binary packages are built against unstable, using versions of build dependencies in unstable. The process of copying via testing to stable can take years, and is a core part of how Debian achieves quality in its releases. (This is simplified and skips considerations like security updates and other updates directly to stable, which bypass unstable. These details are not relevant to this discussion, I think.)

In my personal APT repository, no such copying takes place. A package built for unstable does not get copied into a section with packages built for a released version of Debian, when Debian makes a release.

Thus, for my personal APT repository, there may be several builds of any one version of Foo available.

  • foo 1.2, built for unstable
  • foo 1.2, built for Debian 9
  • foo 1.2, built for Debian 8

In the future, that list may be expanded by having builds for several architectures:

  • foo 1.2, built for unstable, on amd64
  • foo 1.2, built for Debian 9, on amd64
  • foo 1.2, built for Debian 8, on amd64

  • foo 1.2, built for unstable, on riscv
  • foo 1.2, built for Debian 9, on riscv
  • foo 1.2, built for Debian 8, on riscv

When I or my users upgrade our Debian hosts, say from Debian 8 to Debian 9, any packages from my personal APT archive should be updated accordingly. When a host running Debian 8, with foo 1.2 built for Debian 8, gets upgraded to Debian 9, foo should be upgraded to the version of 1.2 built for Debian 9.

Because the Debian package manager works on combinations of package name and package version, that means that the version built for Debian 8 should have a different, and lesser, version than the one built for Debian 9, even if the source code is identical except for the version number. The easiest way to achieve this is probably to build a different source package for each target Debian release. That source package has no other differences than the debian/changelog entry with a new version number, so it doesn't necessarily need to be stored persistently.

(This is effectively what Debian's "binary NMU" uploads do: use the same source package version, but do a build varying only the version number. Debian does this, among other reasons, to force a re-build of a package using a new version of a build dependency, for which it is unnecessary to do a whole new sourceful upload. For my CI build purposes, it may be useful to have a new source package, for cases where there are other changes than the version number. This will need further thought and research.)
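
The reason a ~debianN suffix gives a "lesser" version is that a tilde sorts before everything in Debian version comparison, even before the end of the string. Below is a simplified Python port of those comparison rules, meant only to show the ordering. It is a sketch, not a replacement for dpkg --compare-versions, and the function names are made up.

```python
from functools import cmp_to_key


def _is_digit(ch):
    return "0" <= ch <= "9"


def _order(ch):
    # Sort weight of a non-digit character: "~" first, then end of string
    # and digits, then letters, then everything else.
    if ch == "" or _is_digit(ch):
        return 0
    if ch == "~":
        return -1
    if ch.isalpha():
        return ord(ch)
    return ord(ch) + 256


def _compare_part(a, b):
    i = j = 0
    while i < len(a) or j < len(b):
        # Compare the non-digit runs character by character.
        while (i < len(a) and not _is_digit(a[i])) or (
            j < len(b) and not _is_digit(b[j])
        ):
            ac = _order(a[i] if i < len(a) else "")
            bc = _order(b[j] if j < len(b) else "")
            if ac != bc:
                return -1 if ac < bc else 1
            i += 1
            j += 1
        # Compare the digit runs numerically; an empty run counts as 0.
        start_i, start_j = i, j
        while i < len(a) and _is_digit(a[i]):
            i += 1
        while j < len(b) and _is_digit(b[j]):
            j += 1
        na = int(a[start_i:i] or "0")
        nb = int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def _split(version):
    epoch, sep, rest = version.partition(":")
    if not sep:
        epoch, rest = "0", version
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision


def compare_versions(a, b):
    epoch_a, upstream_a, revision_a = _split(a)
    epoch_b, upstream_b, revision_b = _split(b)
    if epoch_a != epoch_b:
        return -1 if epoch_a < epoch_b else 1
    return _compare_part(upstream_a, upstream_b) or _compare_part(
        revision_a, revision_b
    )


versions = ["1.2-1", "1.2-1~debian9", "1.2-1~debian8"]
ordered = sorted(versions, key=cmp_to_key(compare_versions))
print(ordered)
```

With this ordering, a host that moves from Debian 8 to Debian 9 sees 1.2-1~debian9 as newer than the installed 1.2-1~debian8, and a later move to a build for unstable sees 1.2-1 as newer still.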

Thus, I need to produce the following source and binary packages:

  • foo_1.2-1.dsc: source package for unstable
  • foo_1.2.orig.tar.xz: upstream tarball
  • foo_1.2-1.debian.tar.xz: Debian packaging and changes
  • foo_1.2-1_amd64.deb: binary package for unstable, amd64
  • foo_1.2-1_riscv.deb: binary package for unstable, riscv

  • foo_1.2-1~debian8.dsc: source package for Debian 8
  • foo_1.2-1~debian8.debian.tar.xz: Debian packaging and changes
  • foo_1.2-1~debian8_amd64.deb: binary package for Debian 8, amd64
  • foo_1.2-1~debian8_riscv.deb: binary package for Debian 8, riscv

  • foo_1.2-1~debian9.dsc: source package for Debian 9
  • foo_1.2-1~debian9.debian.tar.xz: Debian packaging and changes
  • foo_1.2-1~debian9_amd64.deb: binary package for Debian 9, amd64
  • foo_1.2-1~debian9_riscv.deb: binary package for Debian 9, riscv

The orig.tar.xz file is a bit-by-bit copy of the upstream release tarball. The debian.tar.xz files have the Debian packaging files, plus any Debian specific changes. (For simplicity, I'm assuming a specific Debian source package format. The actual list of files may vary, but the .dsc file is crucial, and references the other files in the source package. Again, these details don't really matter for this discussion.)

To upload to Debian, I would upload the foo_1.2-1.dsc source package from the list above, after downloading the files and signing them with my PGP key. To upload to my personal APT repository, I would upload all of them.
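
The set of files follows mechanically from the package name, the version, the target Debian releases, and the target architectures. A small sketch of that enumeration, with a made-up function name, assuming the source package layout described above, in which the orig tarball is named after the upstream version only and is shared by all the source packages:

```python
def expected_files(package, upstream, revision, releases, architectures):
    """List the files CI should produce for one release of a package."""
    # One orig tarball, shared by every source package.
    files = [f"{package}_{upstream}.orig.tar.xz"]
    # "" is the build for unstable; the others get a ~debianN suffix.
    suffixes = [""] + [f"~debian{n}" for n in releases]
    for suffix in suffixes:
        version = f"{upstream}-{revision}{suffix}"
        files.append(f"{package}_{version}.dsc")
        files.append(f"{package}_{version}.debian.tar.xz")
        for arch in architectures:
            files.append(f"{package}_{version}_{arch}.deb")
    return files


files = expected_files("foo", "1.2", "1", [8, 9], ["amd64", "riscv"])
for name in files:
    print(name)
```

Adding a target release or an architecture then only changes the arguments, not the logic.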

Where should Debian packaging be stored in version control?

There seems to be no strong consensus in Debian about where the packaging files (the debian/ subdirectory and its contents) should be stored in version control. Several approaches are common. The examples below use git as the version control system, as it's clearly the most common one now.

  • The "upstream does the packaging" approach: upstream's foo.git also contains the Debian packaging. Packages are built using that. This seems to be especially common for programs, where upstream and the Debian package maintainer are the same entity. That's also the OEM model.

  • The "clone upstream and add packaging" approach: the Debian package maintainer clones the upstream repository, and adds the packaging files in a separate branch. When upstream makes a release, the master branch in the packaging repository is updated to match the upstream's master branch, and the packaging branch is rebased on top of that.

  • The "keep it separate" approach: the Debian packager puts the packaging files in their own repository, and the source tree is constructed from both the upstream repository and the packaging repository.

For my own use, I prefer the "upstream does packaging" approach, as it's the least amount of friction for me. For ick, I want to support any approach.

There are various tools for maintaining package source in git (e.g., dgit and git-buildpackage), but those seem to not be relevant to this blog post, so I'm not discussing them in any detail.

The build process

Everything starts from a signed git tag in foo.git, plus additional tags in any packaging repository. The tags are made by the upstream developers and Debian package maintainers. CI will notice the new tag, and build a release from that.

  • Create the upstream tarball (foo-1.2.tar.gz).

  • Manually download and sign the upstream tarball with PGP.

  • Manually publish the upstream tarball and its signature in a suitable place.

  • Create the Debian source package for unstable (foo_1.2-1.dsc), using a copy of the upstream tarball, renamed.

  • Using the Debian source package, build a Debian binary package for unstable for each target architecture (foo_1.2-1_amd64.deb etc).

  • For each target Debian release other than unstable, create a new source package by unpacking the source package for unstable, and adding a debian/changelog entry with ~debianN appended to the version number. If there is a need, make any additional Debian release specific changes to the source package.

  • Build each of those source packages for each target architecture, in a build environment with the target Debian release. (foo_1.2-1~debianN_amd64.deb etc).

  • Upload all the Debian source and binary packages to an APT repository that allows upload by CI. Have that APT repository sign the resulting Packages file with its own PGP key.

  • Manually download the Debian packages, sign the unstable build, and upload it to Debian. (Source package only, except in cases where the binary package also needs to be uploaded, such as for new packages.)
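
The step that creates a source package for a specific Debian release boils down to adding one debian/changelog entry with ~debianN appended to the version. A sketch of generating such an entry follows. The function name, the changelog text, the maintainer, and the choice of "stretch" as the distribution field are assumptions made for illustration; a real setup would more likely use an existing tool such as dch.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def release_changelog_entry(package, version, release_number,
                            distribution, maintainer, when):
    """Return a debian/changelog entry for a per-release rebuild."""
    new_version = f"{version}~debian{release_number}"
    # The changelog trailer uses an RFC 2822 style date.
    date = format_datetime(when)
    return (
        f"{package} ({new_version}) {distribution}; urgency=medium\n"
        "\n"
        f"  * Rebuild for Debian {release_number}.\n"
        "\n"
        f" -- {maintainer}  {date}\n"
    )


entry = release_changelog_entry(
    package="foo",
    version="1.2-1",
    release_number=9,
    distribution="stretch",  # assumed value for a Debian 9 build
    maintainer="CI Builder <ci@example.com>",  # hypothetical identity
    when=datetime(2018, 7, 19, 18, 57, tzinfo=timezone.utc),
)
print(entry)
```

The new entry would go at the top of the existing debian/changelog before the source package is rebuilt, so the resulting .dsc and .deb files carry the ~debianN version.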

Posted Thu Jul 19 18:57:00 2018 Tags:

I have just made a new release of ick, my CI system. The new version number is 0.53, and a summary of the changes is below. The source code is pushed to my git server (git.liw.fi), and Debian packages to my APT repository (code.liw.fi/debian). See https://ick.liw.fi/download/ for instructions.

See the website for more information: https://ick.liw.fi/

A notable change from previous releases should be invisible to users: the release is built by ick2 itself, instead of my old mostly-manual CI script. This means I can abandon the old script and live in a brave, new world with tea, frozen-bubble, and deep meaningful relationships with good people.

Version 0.53, released 2018-07-18

  • Notification mails now include controller URL, so it's easy to see which ick instance they come from. They also include the exit code (assuming the notification itself doesn't fail), and a clear SUCCESS or FAILURE in the subject.

  • Icktool shows a more humane error message if getting a token fails, instead of a Python stack trace.

  • Icktool will now give a more humane error message if user triggers the build of a project that doesn't exist, instead of a Python stack trace.

  • Icktool now looks for credentials using both the controller URL, and the authentication URL.

  • Icktool can now download artifacts from the artifact store, with the new get-artifact subcommand.

  • The archive: workspace action now takes an optional globs field, which is a list of Unix filename globs, for what to include in the artifact. Also, optionally the field name_from can be used to specify the name of a project parameter, which contains the name of the artifact. The default is the artifact_name parameter.

  • A Code of Conduct has been added to the ick project. https://ick.liw.fi/conduct/ has the canonical copy.

Posted Wed Jul 18 18:14:00 2018 Tags:

It gives me no small amount of satisfaction to announce the ALPHA-6 version of ick, my fledgling continuous integration and deployment engine. Ick has now been deployed and used by people other than myself.

Ick can, right now:

  • Build system trees for containers.
  • Use system trees to run builds in containers.
  • Build Debian packages.
  • Publish Debian packages via its own APT repository.
  • Deploy to a production server.

There are still many missing features. Ick is by no means ready to replace your existing CI/CD system, but if you'd like to have a look at ick, and help us make it the CI/CD system of your dreams, now is a good time to give it a whirl.

(Big missing features: web UI, building for multiple CPU architectures, dependencies between projects, good documentation, a development community. I intend to make all of these happen in due time. Help would be welcome.)

Posted Thu Jun 21 19:28:00 2018 Tags:

I've been slowly writing a would-be novel, Hacker Noir. See also my Patreon post. I've just pushed out a new public chapter, Assault, to the public website, and a patron-only chapter to Patreon: "Ambush", where the Team is ambushed, and then something bad happens.

The Assault chapter was hard to write. It's based on something that happened to me earlier this year. The Ambush chapter was much more fun.

Posted Sat Jun 9 21:41:00 2018 Tags: