
This week's issue of LWN has a quote by Linus Torvalds on translating kernel messages into languages other than English. He's against it:

Really. No translation. No design for translation. It's a nasty nasty rat-hole, and it's a pain for everybody.

There's another reason I fundamentally don't want translations for any kernel interfaces. If I get an error report, I want to be able to just "git grep" it. Translations make that basically impossible.

So the fact is, I want simple English interfaces. And people who have issues with that should just not use them. End of story. Use the existing error numbers if you want internationalization, and live with the fact that you only get the very limited error number.

I can understand Linus's point of view. The LWN readers are having a discussion about it, and one of the comments there provoked this blog post:

It somewhat bothers me that English, being the lingua franca of free software development, excludes a pretty huge part of the world from participation. I thought that for a significant part of the world, writing an English commit message has to be more difficult than writing code.

I can understand that point of view as well.

Here's my point of view:

  • It is entirely true that if a project requires English for communication within the project, it discriminates against those who don't know English well.

  • Not having a common language within a project, between those who contribute to the project, now and later, would pretty much destroy any hope of productive collaboration.

    If I have a free software project, and you ask me to merge something where commit messages are in Hindi, error messages in French, and code comments in Swahili, I'm not going to understand any of them. I won't merge what I don't understand.

    If I write my commit messages in Swedish, my code comments in Finnish, and my error messages by entering randomly chosen words from /usr/share/dict/words into a search engine, and taking the page title of the fourteenth hit, then you're not going to understand anything either. You're unlikely to make any changes to my project.

    When Bo finds the project in 2038, and needs it to prevent the apocalypse when 32-bit timestamps run out, and can't understand the README, humanity is doomed.

    Thus, on balance, I'm OK with requiring the use of a single language for intra-project communication.

  • Users should not be presented with text in a language foreign to them. However, this raises a support issue, where a user may copy-paste an error message in their native language, and ask for help, but the developers don't understand the language, and don't even know what the error is. If they knew the error was "permission denied", they could tell the user to run the chmod command to fix the permissions. This is a dilemma.

    I've solved the dilemma by having a unique error code for each error message. If the user tells me "R12345X: Xscpft ulkacsx ggg: /etc/obnam.conf!" I can look up R12345X and see that the error is that /etc/obnam.conf is not in the expected file format.

    This could be improved by making the "parameters" for the error message easy to parse. Perhaps something like this:

    R12345X: Xscpft ulkacsx ggg! filename=/etc/obnam.conf

    Maintaining such error codes by hand would be quite tedious, of course. I invented a module for doing that. Each error message is represented by a class, and the class creates its own error code by taking its Python module and class name, and computing an MD5 of that. The first five hexadecimal digits are the code, and get surrounded by R and X to make it easier to grep.
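    Here's a minimal sketch of the idea in Python; the class and error names are made up for illustration, and the real module differs in details:

        import hashlib

        class StructuredError(Exception):

            msg = 'something went wrong'

            def __init__(self, **kwargs):
                # Prefix the formatted message with the computed code.
                text = self.msg.format(**kwargs)
                super().__init__('{}: {}'.format(self.code(), text))

            def code(self):
                # MD5 of "module.ClassName"; the first five hex digits,
                # wrapped in R...X, give a stable, greppable error code.
                name = '{}.{}'.format(self.__module__, self.__class__.__name__)
                digest = hashlib.md5(name.encode('utf-8')).hexdigest()
                return 'R{}X'.format(digest[:5])

        class WrongFileFormatError(StructuredError):

            msg = 'not in the expected file format! filename={filename}'

        # Usage: raises with a message like
        # "RabcdeX: not in the expected file format! filename=/etc/obnam.conf"
        raise WrongFileFormatError(filename='/etc/obnam.conf')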

    (I don't know if something similar might be used for the Linux kernel.)

  • Humans and inter-human communication are difficult. In many cases, there is no solution that's good for everyone. But let's not give up.

Posted Fri Aug 3 15:49:00 2018

I've recently made the first release of ick, my CI engine, which was built by ick itself. It went OK, but the process needs improvement. This blog post is some pondering on how the process of building Debian packages should happen in the best possible taste.

I'd appreciate feedback, preferably by email (liw@liw.fi).

Context

I develop a number of (fairly small) programs, as a hobby. Some of them I also maintain as packages in Debian. All of them I publish as Debian packages in my own APT repository. I want to make the process for making a release of any of my programs as easy and automated as possible, and that includes building Debian packages and uploading them to my personal APT repository, and to Debian itself.

My personal APT repository contains builds of my programs against several Debian releases, because I want people to have the latest version of my stuff regardless of what version of Debian they run. (This is somewhat similar to what OEMs that provide packages of their own software as Debian packages need to do. I think. I'm not an OEM and I'm extrapolating wildly here.)

I currently don't provide packages for anything but Debian. That's mostly because Debian is the only Linux distribution I know well, or use, or know how to make packages for. I could do Ubuntu builds fairly easily, but supporting Fedora, RHEL, Suse, Arch, Gentoo, etc, is not something I have the energy for at this time. I would appreciate help in doing that, however.

I currently don't provide Debian packages for anything other than the AMD64 (x86-64, "Intel 64-bit") architecture. I've previously provided packages for i386 (x86-32), and may in the future want to provide packages for other architectures (RISC-V, various Arm variants, and possibly more). I want to keep this in mind for this discussion.

Overview

For the context of this blog post, let's assume I have a project Foo. Its source code is stored in foo.git. When I make a release, I tag it using a signed git tag. From this tag, I want to build several things:

  • A release tarball. I will publish and archive this. I don't trust git and related tools (tar, compression programs, etc) to be able to reproducibly produce the same bit-by-bit compressed tarball in perpetuity. There are too many things that can go wrong. For security reasons it's important to be able to have the exact same tarball in the future as today. The simplest way to achieve this is not to try to reproduce it, but to archive it.

  • A Debian source package.

  • A Debian binary package built for each target version of Debian, and each target hardware architecture (CPU, ABI, possibly toolchain version). The binary package should be built from the source package, because otherwise we don't know that the source package can be built.

The release tarball should be put in a (public) archive. A digital signature using my personal PGP key should also be provided.

The Debian source and binary packages should be uploaded to one or more APT repositories: my personal one, and, for selected packages, also the Debian one. For uploading to Debian, the packages will need to be signed with my personal PGP key.

(I am not going to give my CI access to my PGP key. Anything that needs to be signed with my own PGP key needs to be a manual step.)

Package versioning

In Debian, packages are uploaded to the "unstable" section of the package archive, and then automatically copied into the "testing" section, and from there to the "stable" section, unless there are problems in a specific version of a package. Thus all binary packages are built against unstable, using versions of build dependencies in unstable. The process of copying via testing to stable can take years, and is a core part of how Debian achieves quality in its releases. (This is simplified and skips considerations like security updates and other updates directly to stable, which bypass unstable. These details are not relevant to this discussion, I think.)

In my personal APT repository, no such copying takes place. A package built for unstable does not get copied into the section with packages built for a released version of Debian, when Debian makes a release.

Thus, for my personal APT repository, there may be several builds of any one version of Foo available.

  • foo 1.2, built for unstable
  • foo 1.2, built for Debian 9
  • foo 1.2, built for Debian 8

In the future, that list may be expanded by having builds for several architectures:

  • foo 1.2, built for unstable, on amd64
  • foo 1.2, built for Debian 9, on amd64
  • foo 1.2, built for Debian 8, on amd64

  • foo 1.2, built for unstable, on riscv

  • foo 1.2, built for Debian 9, on riscv
  • foo 1.2, built for Debian 8, on riscv

When I or my users upgrade our Debian hosts, say from Debian 8 to Debian 9, any packages from my personal APT archive should be updated accordingly. When a host running Debian 8, with foo 1.2 built for Debian 8, gets upgraded to Debian 9, foo should be upgraded to the build of 1.2 for Debian 9.

Because the Debian package manager works on combinations of package name and package version, that means that the version built for Debian 8 should have a different, and lesser, version than the one built for Debian 9, even if the source code is identical except for the version number. The easiest way to achieve this is probably to build a different source package for each target Debian release. That source package has no other differences than the debian/changelog entry with a new version number, so it doesn't necessarily need to be stored persistently.
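One way to get a "different, and lesser" version is the ~ suffix used in the lists below: in Debian version comparison, ~ sorts before everything, including the end of the version string. A quick check with dpkg (version strings illustrative):

    $ dpkg --compare-versions "1.2-1~debian8" lt "1.2-1~debian9" && echo yes
    yes
    $ dpkg --compare-versions "1.2-1~debian9" lt "1.2-1" && echo yes
    yes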

(This is effectively what Debian's "binary NMU" uploads do: use the same source package version, but do a build varying only the version number. Debian does this, among other reasons, to force a re-build of a package against a new version of a build dependency, when a whole new sourceful upload would be unnecessary. For my CI build purposes, it may be useful to have a new source package, for cases where there are changes beyond the version number. This will need further thought and research.)

Thus, I need to produce the following source and binary packages:

  • foo_1.2-1.dsc — source package for unstable
  • foo_1.2.orig.tar.xz — upstream tarball
  • foo_1.2-1.debian.tar.xz — Debian packaging and changes
  • foo_1.2-1_amd64.deb — binary package for unstable, amd64
  • foo_1.2-1_riscv.deb — binary package for unstable, riscv

  • foo_1.2-1~debian8.dsc — source package for Debian 8

  • foo_1.2-1~debian8.debian.tar.xz — Debian packaging and changes
  • foo_1.2-1~debian8_amd64.deb — binary package for Debian 8, amd64
  • foo_1.2-1~debian8_riscv.deb — binary package for Debian 8, riscv

  • foo_1.2-1~debian9.dsc — source package for Debian 9

  • foo_1.2-1~debian9.debian.tar.xz — Debian packaging and changes
  • foo_1.2-1~debian9_amd64.deb — binary package for Debian 9, amd64
  • foo_1.2-1~debian9_riscv.deb — binary package for Debian 9, riscv

The orig.tar.xz file is a bit-by-bit copy of the upstream release tarball. The debian.tar.xz files have the Debian packaging files, plus any Debian specific changes. (For simplicity, I'm assuming a specific Debian source package format. The actual list of files may vary, but the .dsc file is crucial, and references the other files in the source package. Again, these details don't really matter for this discussion.)

To upload to Debian, I would upload the foo_1.2-1.dsc source package from the list above, after downloading the files and signing them with my PGP key. To upload to my personal APT repository, I would upload all of them.

Where should Debian packaging be stored in version control?

There seems to be no strong consensus in Debian about where the packaging files (the debian/ subdirectory and its contents) should be stored in version control. Several approaches are common. The examples below use git as the version control system, as it's clearly the most common one now.

  • The "upstream does the packaging" approach: upstream's foo.git also contains the Debian packaging. Packages are built using that. This seems to be especially common for programs, where upstream and the Debian package maintainer are the same entity. That's also the OEM model.

  • The "clone upstream and add packaging" approach: the Debian package maintainer clonse the upstream repository, and adds the packaging files in a separate branch. When upstream makes a release, the master branch in the packaging repository is updated to match the upstream's master branch, and the packaging branch is rebased on top of that.

  • The "keep it separate" approach: the Debian packager puts the packaging files in their own repository, and the source tree is constructed from botht the upstream repository and the packaging repository.

For my own use, I prefer the "upstream does packaging" approach, as it's the least amount of friction for me. For ick, I want to support any approach.

There are various tools for maintaining package source in git (e.g., dgit and git-buildpackage), but those seem to not be relevant to this blog post, so I'm not discussing them in any detail.

The build process

Everything starts from a signed git tag in foo.git, plus additional tags in any packaging repository. The tags are made by the upstream developers and Debian package maintainers. CI will notice the new tag, and build a release from that.
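Creating such a tag is a one-liner; the tag name and message here are illustrative:

    git tag -s v1.2 -m "foo version 1.2"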

  • Create the upstream tarball (foo-1.2.tar.gz).

  • Manually download and sign the upstream tarball with PGP.

  • Manually publish the upstream tarball and its signature in a suitable place.

  • Create the Debian source package for unstable (foo_1.2-1.dsc), using a copy of the upstream tarball, renamed.

  • Using the Debian source package, build a Debian binary package for unstable for each target architecture (foo_1.2-1_amd64.deb etc).

  • For each target Debian release other than unstable, create a new source package by unpacking the source package for unstable, and adding a debian/changelog entry with ~debianN appended to the version number (see the sketch after this list). If there is a need, make any additional Debian release specific changes to the source package.

  • Build each of those source packages for each target architecture, in a build environment with the target Debian release. (foo_1.2-1~debianN_amd64.deb etc).

  • Upload all the Debian source and binary packages to an APT repository that allows upload by CI. Have that APT repository sign the resulting Packages file with its own PGP key.

  • Manually download the packages built for unstable, sign them with my PGP key, and upload them to Debian. (Source package only, except in cases where the binary packages also need to be uploaded, such as for new packages.)
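To make the middle steps concrete, here's a sketch of what the CI might run for one target release, assuming the standard dpkg-dev and devscripts tools; all names and versions are illustrative:

    # Create the upstream tarball from the signed tag.
    git archive --prefix=foo-1.2/ -o foo-1.2.tar.gz v1.2

    # Unpack the source package for unstable, and mangle the version
    # for the target Debian release.
    dpkg-source -x foo_1.2-1.dsc
    cd foo-1.2
    dch --newversion "1.2-1~debian9" --distribution debian9 \
        "Rebuild for Debian 9."

    # Build; a real setup would do this in a build environment with
    # the target Debian release (pbuilder, sbuild, a container, etc).
    dpkg-buildpackage -us -uc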

Posted Thu Jul 19 18:57:00 2018

I have just made a new release of ick, my CI system. The new version number is 0.53, and a summary of the changes is below. The source code is pushed to my git server (git.liw.fi), and Debian packages to my APT repository (code.liw.fi/debian). See https://ick.liw.fi/download/ for instructions.

See the website for more information: https://ick.liw.fi/

A notable change from previous releases should be invisible to users: the release is built by ick2 itself, instead of my old mostly-manual CI script. This means I can abandon the old script and live in a brave, new world with tea, frozen-bubble, and deep meaningful relationships with good people.

Version 0.53, released 2018-07-18

  • Notification mails now include the controller URL, so it's easy to see which ick instance they come from. They also include the exit code (assuming the notification itself doesn't fail), and a clear SUCCESS or FAILURE in the subject.

  • Icktool shows a more humane error message if getting a token fails, instead of a Python stack trace.

  • Icktool will now give a more humane error message if the user triggers the build of a project that doesn't exist, instead of a Python stack trace.

  • Icktool now looks for credentials using both the controller URL, and the authentication URL.

  • Icktool can now download artifacts from the artifact store, with the new get-artifact subcommand.

  • The archive: workspace action now takes an optional globs field, which is a list of Unix filename globs, for what to include in the artifact. Also, optionally, the field name_from can be used to specify the name of a project parameter, which contains the name of the artifact. The default is the artifact_name parameter. (See the example after this list.)

  • A Code of Conduct has been added to the ick project. https://ick.liw.fi/conduct/ has the canonical copy.
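As an example of the new archive fields, a pipeline action might look roughly like this; I'm writing it out from memory, so treat the exact layout as illustrative, and the glob patterns and parameter name are made up:

    - archive: workspace
      globs:
        - "*.deb"
        - "*.changes"
      name_from: my_artifact_name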

Posted Wed Jul 18 18:14:00 2018

It gives me no small amount of satisfaction to announce the ALPHA-6 version of ick, my fledgling continuous integration and deployment engine. Ick has now been deployed and used by people other than myself.

Ick can, right now:

  • Build system trees for containers.
  • Use system trees to run builds in containers.
  • Build Debian packages.
  • Publish Debian packages via its own APT repository.
  • Deploy to a production server.

There are still many missing features. Ick is by no means ready to replace your existing CI/CD system, but if you'd like to have a look at ick, and help us make it the CI/CD system of your dreams, now is a good time to give it a whirl.

(Big missing features: web UI, building for multiple CPU architectures, dependencies between projects, good documentation, a development community. I intend to make all of these happen in due time. Help would be welcome.)

Posted Thu Jun 21 19:28:00 2018

I've been slowly writing my would-be novel, Hacker Noir. See also my Patreon post. I've just pushed out a new public chapter, Assault, to the public website, and a patron-only chapter to Patreon: "Ambush", where the Team is ambushed, and then something bad happens.

The Assault chapter was hard to write. It's based on something that happened to me earlier this year. The Ambush chapter was much more fun.

Posted Sat Jun 9 21:41:00 2018

"Code smells" are a well-known concept: they're things that make you worried your code is not of good quality, without necessarily being horribly bad in and of themsleves. Like a bad smell in a kitchen that looks clean.

I've lately been thinking of "architecture aromas", which indicate there might be something good about an architecture, but don't guarantee it.

An example: you can't think of any component that could be removed from an architecture without sacrificing main functionality.

Posted Mon Apr 9 09:12:00 2018

This year I've implemented a rudimentary authentication server for work, called Qvisqve. I am in the process of also using it for my current hobby project, ick, which provides HTTP APIs and needs authentication. Qvisqve stores passwords using scrypt: source. It's not been audited, and I'm not claiming it to be perfect, but it's at least not storing passwords in cleartext. (If you find a problem, do email me and tell me: liw@liw.fi.)
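For reference, the core of scrypt-based password storage is small. This is a minimal sketch using Python's standard hashlib, with illustrative cost parameters; it's not necessarily how Qvisqve does it:

    import hashlib, hmac, os

    def hash_password(password):
        # A fresh random salt per password; store the salt and the
        # derived key, never the password itself.
        salt = os.urandom(16)
        key = hashlib.scrypt(password.encode('utf-8'), salt=salt,
                             n=2**14, r=8, p=1)
        return salt, key

    def check_password(password, salt, key):
        candidate = hashlib.scrypt(password.encode('utf-8'), salt=salt,
                                   n=2**14, r=8, p=1)
        # Constant-time comparison, to avoid leaking timing information.
        return hmac.compare_digest(candidate, key)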

This week, two news stories have reached me about service providers storing passwords in cleartext. One is a Finnish system for people starting a new business. The password database has leaked, with about 130,000 cleartext passwords. The other is about T-mobile in Austria bragging on Twitter that they store customer passwords in cleartext, and some people not liking that.

In both cases, representatives of the company claim it's OK, because they have "good security". I disagree. Storing passwords in cleartext is itself shockingly bad security, regardless of how good your other security measures are, and whether your password database leaks or not. Claiming it's ever OK to store user passwords in cleartext in a service is incompetence at best.

When you have large numbers of users, storing passwords in cleartext becomes more than just a small "oops". It becomes a security risk for all your users. It becomes gross incompetence.

A bank is required to keep their customers' money secure. They're not allowed to store their customers' cash in a suitcase on the pavement without anyone guarding it. Even with a guard, it'd be negligent, incompetent, to do that. The bulk of the money gets stored in a vault, with alarms, and guards, and the bank spends much effort on making sure the money is safe. Everyone understands this.

Similar requirements should be placed on those storing passwords, or other such security-sensitive information of their users.

Storing passwords in cleartext, when you have large numbers of users, should be criminal negligence, and should carry legally mandated sanctions. This should happen when the situation is realised, even if the passwords haven't leaked.

Posted Sat Apr 7 10:28:00 2018

For the 2016 NaNoWriMo I started writing a novel about software development, "Hacker Noir". I didn't finish it during that November, and I still haven't finished it. I had a year-long hiatus, due to work and life being stressful, when I didn't write on the novel at all. However, inspired by both the Doctorow method and the Seinfeld method, I have recently started writing again.

I've just published a new chapter. However, unlike last year, I'm publishing it on my Patreon only, for the first month, and only for patrons. Then, next month, I'll be putting that chapter on the book's public site (noir.liw.fi), and another new chapter on Patreon.

I don't expect to make a lot of money, but I am hoping having active supporters will motivate me to keep writing.

I'm writing the first draft of the book. It's likely to be as horrific as every first-time author's first draft is. If you'd like to read it as raw as it gets, please do. Once the first draft is finished, I expect to read it myself, and be horrified, and throw it all away, and start over.

Also, I should go get some training on marketing.

Posted Thu Mar 8 13:20:00 2018

Random crazy Debian idea of today: add support to dpkg so that it uses containers (or namespaces, or whatever works for this) for running package maintainer scripts (pre- and postinst, pre- and postrm), to prevent them from accidentally or maliciously writing to unwanted parts of the filesystem, or from doing unwanted network I/O.
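To give a flavour of the mechanism, here's a sketch using tools that exist today; the path and package name are made up, and a real implementation would of course live inside dpkg:

    # Run a package's postinst with the network namespace unshared,
    # so unexpected network I/O fails instead of quietly succeeding.
    unshare --net sh /var/lib/dpkg/info/foo.postinst configure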

I think this would be useful for third-party packages, but also for packages from Debian itself. You heard it here first! Debian package maintainers have been known to make mistakes.

Obviously there needs to be ways in which these restrictions can be overridden, but that override should be clear and obvious to the user (sysadmin), not something they notice because they happen to be running strace or tcpdump during the install.

Corollary: dpkg could restrict where a .deb can place files based on the origin of the package.

Example: Installing chrome.deb from Google installs a file in /etc/apt/sources.list.d, which is a surprise to some. If dpkg were to not allow that (as a file in the .deb, or a file created in postinst), unless the user was told and explicitly agreed to it, it would be less of a nasty surprise.

Example: Some stupid Debian package maintainer is very busy at work and does Debian hacking when they should really be sleeping, and types the following into their postrm script, while being asleep:

#!/bin/sh

PKG=perfectbackup
LIB="/var/lib/ $PKG"

rm -rf "$LIB"

See the mistake? Ideally, this would be found during automated testing before the package gets uploaded, but that assumes said package maintainer uses tools like piuparts.

I think it'd be better if we didn't rely only on infallible, indefatigable people with perfect workflows and processes for safety.

Having dpkg make the whole filesystem read-only, except for the parts that clearly belong to the package, based on some sensible set of rules, or on a suitable override, would protect against mistakes like this.

Posted Mon Mar 5 12:58:00 2018

Another weekend, another big mailing list thread

This weekend, those interested in Debian development have been having a discussion on the debian-devel mailing list about "What can Debian do to provide complex applications to its users?". I'm commenting on that in my blog rather than the mailing list, since this got a bit too long to be usefully done in an email.

directhex's recent blog post "Packaging is hard. Packager-friendly is harder." is also relevant.

The problem

To start with, I don't think the email that started this discussion poses the right question. The problem is not really about complex applications; we already have those in Debian. See, for example, LibreOffice. The discussion is really about how Debian should deal with the way some types of applications are developed upstream these days. They're not all complex, and they're not all big, but as usual, things only get interesting when n is big.

A particularly clear example is the whole nodejs ecosystem, but it's not limited to that and it's not limited to web applications. This is also not the first time this topic arises, but we've never come to any good conclusion.

My understanding of the problem is as follows:

A current trend in software development is to use programming languages, often interpreted high level languages, combined with heavy use of third-party libraries, and a language-specific package manager for installing libraries, both for the developer and sometimes also for the sysadmin installing the software for production. This bypasses the Linux distributions entirely. The benefit is that this has created per-language ecosystems with very little friction for developers using libraries written in that language, speeding up development cycles a lot.

When I was young(er) the world was horrible

In comparison, in the old days, which for me means the 1990s, and before Debian took over my computing life, the cycle was something like this:

I would be writing an application, and would need to use a library to make some part of my application easier to write. To use that library, I would download the source code archive of the latest release, and laboriously decipher and follow the build and installation instructions, fix any problems, rinse, repeat. After getting the library installed, I would get back to developing my application. Often the installation of the dependency would take hours, so not a thing to be undertaken lightly.

Debian made some things better

With Debian, and apt, and having access to hundreds upon hundreds of libraries packaged for Debian, this became a much easier process. But only for the things packaged for Debian.

For those developing and publishing libraries, Debian didn't make the process any easier. They would still have to publish a source code archive, but also hope that it would eventually be included in Debian. And updates to libraries in the Debian stable release would not get into the hands of users until the next Debian stable release. This is a lot of friction. For C libraries, that friction has traditionally been tolerable. The effort of making the library in the first place is considerable, so any friction added by Debian is small by comparison.

The world has changed around Debian

In the modern world, developing a new library is much easier, and so also the friction caused by Debian is much more of a hindrance. My understanding is that things now happen more like this:

I'm developing an application. I realise I could use a library. I run the language-specific package manager (pip, cpan, gem, npm, cargo, etc), it downloads the library, installs it in my home directory or my application source tree, and in less than the time it takes to have a sip of tea, I can get back to developing my application.

This has a lot less friction than the Debian route. The attraction to application programmers is clear. For library authors, the process is also much more streamlined. Writing a library, especially in a high-level language, is fairly easy, and publishing it for others to use is quick and simple. This can lead to a virtuous cycle where I write a useful little library, you use it and tell me about a bug or a missing feature, I add it, publish the new version, you use it, and we're both happy as can be. Where this might have taken weeks or months in the old days, it can now happen in minutes.

The big question: why Debian?

In this brave new world, why would anyone bother with Debian anymore? Or any traditional Linux distribution, since this isn't particularly specific to Debian. (But I mention Debian specifically, since it's what I know best.)

A number of things have been mentioned or alluded to in the discussion mentioned above, but I think it's good for the discussion to be explicit about them. As a computer user, software developer, system administrator, and software freedom enthusiast, I see the following reasons to continue to use Debian:

  • The freeness of software included in Debian has been vetted. I have a strong guarantee that software included in Debian is free software. This goes beyond the licence of that particular piece of software, and includes practical considerations like whether the software can actually be built using free tooling, and that I have access to that tooling, because the tooling, too, is included in Debian.

    • There was a time when Debian debated (with itself) whether it was OK to include a binary that needed to be built using a proprietary C compiler. We decided that it isn't, or not in the main package archive.

    • These days we have the question of whether "minimised Javascript" is OK to be included in Debian, if it can't be produced using tools packaged in Debian. My understanding is that we have already decided that it's not, but the discussion continues. To me, this seems equivalent to the above case.

  • I have a strong guarantee that software in a stable Debian release won't change underneath me in incompatible ways, except in special circumstances. This means that if I'm writing my application and targeting Debian stable, the library API won't change, at least not until the next Debian stable release. Likewise for every other bit of software I use. Having things to continue to work without having to worry is a good thing.

    • Note that a side-effect of the low friction of library development in current ecosystems is that library APIs sometimes change. That would mean my application would need to change to adapt to the API change. That's friction for my work.
  • I have a strong guarantee that a dependency won't just disappear. Debian has a large mirror network of its package archive, and there are easy tools to run my own mirror, if I want to. While running my own mirror is possible for other package management systems, each one adds to the friction.

    • The nodejs NPM ecosystem seems to be especially vulnerable to this. More than once, packages have gone missing, causing other projects that depend on the missing packages to start failing.

    • The way the Debian project is organised, it is almost impossible for this to happen in Debian. Not only are package removals carefully co-ordinated, packages that are depended on by other packages aren't removed.

  • I have a strong guarantee that a Debian package I get from a Debian mirror is the official package from Debian: either the actual package uploaded by a Debian developer or a binary package built by a trusted Debian build server. This is because Debian uses cryptographic signatures of the package lists and I have a trust path to the Debian signing key.

    • At least some of the language specific package managers fail to have such a trust path. This means that I have no guarantee that the library package I download today is the same code uploaded by the library author.

    • Note that https does not help here. It protects the transfer from the package manager's web server to me, but makes absolutely no guarantees about the validity of the package. There have been enough cases of package repositories having been attacked that this matters to me. Debian's signatures protect against malicious changes on mirror hosts.

  • I have a reasonably strong guarantee that any problem I find can be fixed, by me or someone else. This is not a strong guarantee, because Debian can't do anything about insanely complicated code, for example, but at least I can rely on being able to rebuild the software. That's a basic requirement for fixing a bug.

  • I have a reasonably strong guarantee that, after upgrading to the next Debian stable release, my stuff continues to work. Upgrades may always break, but at least Debian tests them and treats it as a bug if an upgrade doesn't work, or loses user data.

These are the reasons why I think Debian and the way it packages and distributes software is still important and relevant. (You may disagree. I'm OK with that.)

What about non-Linux free operating systems?

I don't have much personal experience with non-Linux systems, so I've only talked about Linux here. I don't think the BSD systems, for example, are actually all that different from Linux distributions. Feel free to substitute "free operating system" for "Linux" throughout.

What is it Debian tries to do, anyway?

The previous section is one level of abstraction too low. It's important, but it's beneficial to take a further step back and consider what it is Debian actually tries to achieve. Why does Debian exist?

The primary goal of Debian is to enable its users to use their computers using only free software. The freedom aspect is fundamentally important and a principle that Debian is not willing to compromise on.

The primary approach to achieve this goal is to produce a "distribution" of free software, to make installing a free software operating system and applications, and maintaining such a computer, feasible for our users.

This leads to secondary goals, such as:

  • Making it easy to install Debian on a computer. (For values of easy that should be compared to toggling boot sector bytes manually.)

    We've achieved this, though of course things can always be improved.

  • Making it easy to install applications on a computer with Debian. (Again, compared to the olden days, when that meant configuring and compiling everything from scratch, with no guidance.)

    We've achieved this, too.

  • A system with Debian installed is reasonably secure, and easy to keep reasonably secure.

    This means Debian will provide security support for software it distributes, and has ways in which to install security fixes. We've achieved this, though this, too, can always be improved.

  • A system with Debian installed should keep working for extended periods of time. This is important to make using Debian feasible. If it takes too much effort to keep a computer running Debian, it's not feasible for many people to do that, and then Debian fails its primary goal.

    This is why Debian has stable releases with years of security support. We've achieved this.

The disconnect

On the one hand, we have Debian, which pretty much has achieved what I declare to be its primary goal. On the other hand, a lot of developers now expect much less friction than what Debian offers. This disconnect is, I believe, the cause of the debian-devel discussion, and of variants of that discussion all over the open source landscape.

These discussions often go one of two ways, depending on which community is talking.

  • In the distribution and more old-school communities, the low-friction approach of language-specific package managers is often considered to be a horror, and an abandonment of all the good things that the Linux world has achieved. "Young saplings, who do they think they are, all agile and bendy and with no principles at all, get off our carefully cultivated lawn."

  • In the low-friction communities, Linux distributions are something only old, stodgy, boring people care about. "Distributions are dead, they only get in the way, nobody bothers with them anymore."

This disconnect will require effort by both sides to close the gap.

On the one hand, so much new software is being written using the low-friction approach that Linux distributions may fail to attract new users and especially new developers, and this will hurt them and their users.

On the other hand, the low-friction people may be sawing off the tree branch they're sitting on. If distributions suffer, the base on which low-friction development relies will wither away, and we'll be left with running low-friction free software on proprietary platforms.

Things for low-friction proponents to improve

Here's a few things I've noticed that go wrong in the various communities oriented towards the low-friction approach.

  • Not enough care is given to copyright licences. This is a boring topic, but it's the legal basis that all of free software and open source is based on. If copyright licences are violated, or copyrights are not respected, or copyrights or licences are not expressed well enough, or incompatible licences are mixed, the result is very easily not actually either free software or open source.

    It's boring, but be sufficiently pedantic here. It's not even all that difficult.

  • Do provide actual source. It seems quite a number of Javascript projects only distribute "minimised" versions of code. That's not actually source code, any more than, say, Java byte code is, even if a de-compiler can make it kind of editable. If source isn't available, it's not free software or open source.

  • Please try to be careful with API changes. What used to work should still work with a new version of a library. If you need to make an API change that breaks compatibility, find a way to still support those who rely on the old API, using whatever mechanisms available to you. Ideally, support the old API for a long time, years. Two weeks is really not enough.

  • Do be careful with your dependencies. Locking dependencies down to a specific version makes things difficult for distributions, because they can often provide only one or a very small number of versions of any one package. Likewise, avoid embedding dependencies in your own source tree, because that explodes the amount of work distributions have to do to patch security holes. (No, distributions can't rely on tens of thousands of upstreams to each do the patching correctly and promptly.)

Things for Debian to improve

There are many sources of friction that come from Debian itself. Some of them are unavoidable: if upstream projects don't take care of copyright licence hygiene, for example, then Debian will impose that on them and that can't be helped. Other things are more avoidable, however. Here's a list off the top of my head:

  • A lot of stuff in Debian happens over email, which might happen using a web application, if it were not for historical reasons. For example, the Debian bug tracking system (bugs.debian.org) requires using email, and given delays caused by spam filtering, this can cause delays of more than fifteen minutes. This is a source of friction that could be avoided.

  • Likewise, Debian voting happens over email, which can cause friction from delays.

  • Debian lets its package maintainers use any version control system, any packaging helper tooling, and any packaging workflow they want. This means that every package is, to some extent, new territory for someone other than its primary maintainers. Even when the same tools are used, they can be used in a variety of different ways. Consistency should reduce friction.

  • There's too little infrastructure for things like collecting copyright information into debian/copyright. This really shouldn't be a manual task.

  • Debian packaging uses arcane file formats, loosely based on email headers. More standard formats might make things easier, and reduce friction.

  • There's not enough automated testing, or it's too hard to use, making it too hard to know if a new package will work, or a modified package doesn't break anything that used to work.

  • Overall, making a Debian package tends to require too much manual work. Packaging helpers like dh certainly help, but not enough. I don't have a concrete suggestion how to reduce it, but it seems like an area Debian should work on.

  • Maybe consider supporting installing multiple versions of a package, even if only for, say, Javascript libraries. Possibly with a caveat that only specific versions will be security supported, and a way to alert the sysadmin if vulnerable packages are installed. Dunno, this is a difficult one.

  • Maybe consider providing something where the source package gets automatically updated to every new upstream release (or commit), with binary packages built from that, and those automatically tested. This might be a separate section of the archive, and packages would be included into the normal part of the archive only by manual decision.

  • There's more, but mostly not relevant to this discussion, I think. For example, Debian is a big project, and the mere size is a cause of friction.

Comments?

I don't allow comments on my blog, and I don't want to debate this in private. If you have comments on anything I've said above, please post to the debian-devel mailing list. Thanks.

Baits

To ensure I get some responses, I will leave these bait here:

Anyone who's been programming less than 12332 days is a young whipper-snapper and shouldn't be taken seriously.

Depending on the latest commit of a library is too slow. The proper thing to do for really fast development is to rely on the version in the unsaved editor buffer of the library developer.

You shouldn't have read any of this. I'm clearly a troll.

Posted Sat Feb 17 15:05:00 2018