Welcome to my web log. See the first post for an introduction. See the archive page for all posts. (There is an English language feed if you don't want to see Finnish.)

Me on Mastodon, for anything that is too small to warrant a blog post.

All content outside of comments is copyrighted by Lars Wirzenius, and licensed under a Creative Commons Attribution-Share Alike 3.0 Unported License. Comments are copyrighted by their authors. (No new comments are allowed.)


I was curious about how fast I can transfer data in various contexts. I did some informal benchmarks: these are not scientific, and your results might be different.

The data transferred is a one terabyte sparse file:

$ truncate --size 1T terabyte

First, just reading a file from disk: copying from the file to /dev/null.

software   sender    receiver  time (s)  speed (MB/s)
cat        exolobe1  exolobe1       274          3826
dd         exolobe1  exolobe1      2729           384
dd bs=1M   exolobe1  exolobe1       304          3449
dd bs=16M  exolobe1  exolobe1       314          3338
dd bs=1G   exolobe1  exolobe1       328          3196

Conclusion: cat is quite good, and setting almost any buffer size on dd is a win over the default. (dd has other options that cat lacks, such as oflag=direct, which means it's still interesting to me, despite being a little slower than cat.)
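As a rough illustration of the measurement, here is a minimal Python sketch that creates a sparse file, reads it with a few block sizes, and reports the rate. It is not how the numbers above were produced (those came from cat and dd); the function names and the 256 MiB demo size are assumptions made for the example. The speeds in the table appear consistent with one tebibyte divided by the elapsed time, expressed in MiB/s: 1048576 MiB / 274 s is about 3826.9.

```python
import os
import tempfile
import time

MIB = 1024 * 1024


def throughput_mib_per_s(size_bytes, seconds):
    """Return the transfer rate in MiB/s for size_bytes moved in seconds."""
    return size_bytes / MIB / seconds


def make_sparse_file(path, size_bytes):
    """Create a sparse file of the given size, similar to truncate --size."""
    with open(path, "wb") as f:
        f.truncate(size_bytes)


def read_file(path, block_size):
    """Read the whole file in block_size chunks; return (bytes read, seconds)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 so that each read() asks the kernel for block_size bytes.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical demo size: 256 MiB keeps the run short; the post used 1 TiB.
    size = 256 * MIB
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse")
        make_sparse_file(path, size)
        # 512 bytes is dd's default block size; the others mirror bs=1M and bs=16M.
        for block_size in (512, MIB, 16 * MIB):
            n, secs = read_file(path, block_size)
            print(f"bs={block_size}: {throughput_mib_per_s(n, secs):.0f} MiB/s")
```

Absolute numbers from an interpreted language will differ from cat and dd; the point is only the shape of the measurement and the effect of the block size.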

Then network transfers. I use the fling software, which is quite efficient when security is not needed. It does not handle sparse files specially.

The source of the transfer is my laptop (exolobe1). The target is the same machine over localhost, a VM under libvirt on the same laptop, a VM under qemu-system on the same laptop, my server (exolobe2), or a VM on the server (holywood2).

$ fling -p terabyte receiver 8888     # sender
$ receiver: fling -r 8888 > /dev/null # receiver

For extra kicks, I also did transfers between nested VMs inside a VM on the laptop, where the outer VM is running under libvirt or qemu-system. The nested VMs always run under libvirt.

software  sender                receiver            time (s)  speed (MB/s)
fling     exolobe1              exolobe1                 395          2655
fling     exolobe1              libvirt guest            613          1710
fling     exolobe1              qemu-system-x86_64      4653           225
fling     exolobe1              exolobe2            too long           111
fling     exolobe1              holywood2           too long            78
fling     nested libvirt-guest  other nested guest  too long           280
fling     nested qemu-system    other such          too long            12

The "too long" results are because I got impatient. The speeds in those cases are from fling -p output.

Conclusion: localhost is fast. libvirt networking inside the same physical machine is fast. Every other case is slow.
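For readers who want to try a comparable measurement without fling, here is a minimal Python sketch of an unencrypted TCP transfer over localhost, where the receiver discards everything it gets. It is a stand-in, not fling's implementation or protocol; the function names and the 256 MiB demo size are assumptions made for the example.

```python
import socket
import threading
import time

MIB = 1024 * 1024


def receive_and_discard(server, result):
    """Accept one connection and count received bytes, like writing to /dev/null."""
    conn, _ = server.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(MIB)
            if not data:
                break
            total += len(data)
    result["received"] = total


def send_zeroes(host, port, size_bytes, block_size=MIB):
    """Send size_bytes of zeroes to host:port; return elapsed seconds."""
    block = bytes(block_size)
    start = time.perf_counter()
    with socket.create_connection((host, port)) as sock:
        remaining = size_bytes
        while remaining > 0:
            n = min(remaining, block_size)
            sock.sendall(block[:n])
            remaining -= n
    return time.perf_counter() - start


def measure_localhost(size_bytes):
    """Transfer size_bytes over localhost; return (bytes received, MiB/s)."""
    result = {}
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        server.listen(1)
        port = server.getsockname()[1]
        receiver = threading.Thread(
            target=receive_and_discard, args=(server, result)
        )
        receiver.start()
        seconds = send_zeroes("127.0.0.1", port, size_bytes)
        receiver.join()
    return result["received"], size_bytes / MIB / seconds


if __name__ == "__main__":
    # Hypothetical demo size; the post transferred a 1 TiB sparse file.
    received, speed = measure_localhost(256 * MIB)
    print(f"received {received} bytes at {speed:.0f} MiB/s")
```

The time is taken on the sending side, so for small transfers it slightly overstates the speed: the last blocks may still be in kernel buffers when the sender finishes.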

I hope that's useful to someone.

Posted Thu May 21 13:44:00 2020 Tags:

We had our third book club meeting yesterday. Daniel posted his summary of our discussion.

Posted Tue May 12 10:02:00 2020 Tags:

I don't like the "infinite scroll" or "never-ending stream" type of communication application. I prefer to have a clear "inbox", where messages go, and from where I can remove them when I'm done with them. The reason I like inboxes is that they make it easier to keep track of things and harder to miss or forget about things. With the stream, I have to capture the message into my GTD system or I'll miss it. Obviously I can do that, and I do, but it's less convenient.

(Note that for the purposes of this rant, whether all your incoming messages go into one inbox or are automatically filtered into many is irrelevant.)

Another reason is that in the streaming model it's harder to look at only the messages that have arrived since I last looked, say overnight while I slept, or while I was busy doing other things. This makes it even easier to accidentally miss important things.

Examples of inboxes: email, SMS.

Examples of streams: Twitter, Mastodon, IRC, Matrix, Telegram, RSS feeds.

You may notice that all "modern" applications tend to be streams. In this, too, I feel like an old man shouting at clouds.

Posted Thu May 7 11:05:00 2020 Tags:

I am tired of the existing Internet email system, as a sender of email, as a recipient, and as an operator of an email server.

I've been thinking about ways to solve that, and have written an essay about it, without any plans to work on implementing a solution. I'm interested in inspiring more discussion. See HTML and PDF versions. It's a bit long, and does not yet address more than the spam problem.

TL;DR: I don't think the current email system can be improved enough to be worthwhile, but a new system could use digital signatures and digital stamps (tokens) to allow recipients to accept mail only from specific senders.
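To make the idea of stamps a little more concrete, here is a minimal Python sketch. It is not the design from the essay: the Recipient class and its methods are hypothetical, and an HMAC with a secret held by the recipient stands in for a real digital signature, because the Python standard library has no public-key signing. The recipient mints a stamp bound to one sender address and later accepts mail only if the stamp checks out for that sender.

```python
import hashlib
import hmac
import secrets


class Recipient:
    """Hypothetical recipient that accepts mail only with a valid stamp."""

    def __init__(self):
        # Secret known only to the recipient; used to mint and check stamps.
        self._key = secrets.token_bytes(32)

    def issue_stamp(self, sender):
        """Mint a stamp (token) bound to one sender address."""
        mac = hmac.new(self._key, sender.encode("utf-8"), hashlib.sha256)
        return mac.hexdigest()

    def accepts(self, sender, stamp):
        """Return True if this recipient issued the stamp for this sender."""
        expected = self.issue_stamp(sender)
        # Constant-time comparison, to avoid leaking the stamp via timing.
        return hmac.compare_digest(expected, stamp)


if __name__ == "__main__":
    me = Recipient()
    stamp = me.issue_stamp("alice@example.com")
    print(me.accepts("alice@example.com", stamp))
    print(me.accepts("mallory@example.com", stamp))
```

The sketch leaves out everything that makes the real problem hard, such as proving that a message really comes from the stamped sender, revoking stamps, and handing out stamps to strangers.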

Posted Sun Apr 12 12:09:00 2020 Tags:

Three friends sat down and discussed a book, er, blog post that they'd read: Daniel, Mark, and myself. We live in different countries, so we did this over video conferencing. This is a summary of the discussion.

The article: https://latacora.singles/2020/02/19/stop-using-encrypted.html published in February this year. The title is "Stop Using Encrypted Email". It's not long.

To start with, we all agree that using encryption with the current Internet email system is far from ideal. The blog post correctly points out problems:

  • email metadata (headers, routing) is public, even on encrypted messages
  • it's easy to reply to an encrypted email in cleartext
  • PGP is far from ideal
  • PGP users tend to have long-lived encryption keys, and if and when those keys are broken or leak, the security of all messages is in danger
  • personal email archives can leak an encrypted message long after it was sent

However, we think that the blog post argues too strongly that encrypted email is pointless.

Most importantly, they claim that encrypted email should not be used by anyone, ever, for anything. We find this to be too strong, if understandable. They don't describe an actual threat model, though they give some examples, and seem to mostly concentrate on a threat where a very powerful adversary, with pervasive surveillance capabilities, is trying to catch individuals so they can punish them, and possibly kill them, possibly long after the communication happens. That is certainly a threat model where current encrypted email fails.

However, we claim there are situations where encrypted email works well enough. For example, password reset emails that are encrypted to the PGP public key registered with the service. The value of the email disappears minutes after it's sent.

Or emails preparing a surprise party for someone's spouse. If the messages leak, it's a bummer, but it's not a big problem, especially after the party is over.

Thus we feel that rather than telling people to not use encrypted email at all, for anything, ever, a more sensible and useful approach is to discuss the risks and give people tools to decide for themselves. Accurate information is more valuable than overblown rhetoric, whether it's for or against email encryption.

We agree that the secure messaging systems they promote are good, but we don't agree that they're as good as the article implies. Signal, for example, routes all traffic through its own servers. A very powerful adversary with pervasive surveillance capabilities can deduce much from traffic patterns. This has already been used against Tor users (see for example 1 and 2).

We're also not entirely happy with messaging systems that require the use of phone numbers. Signal is one of these. Signal is also problematic when changing phones or phone numbers, as all trust relationships within it have to be re-established.

Messaging systems are also meant for use cases that aren't all the same as email's. For example, offline use, and long-form messages. We see messaging systems and email as complementary more than competing.

We also do not agree that improving email security is as hopeless as the blog post claims. Much could be done just by improving email client software. That said, we repeat that we agree that it's not going to be good enough against their implied threat model.

For example, email clients and servers could refuse to send or accept email over unencrypted or unverified channels, or emails that are unencrypted. This wouldn't help, say, Gmail users, but we would not expect people with the blog post's implied threat model to use Gmail. Or email at all.

In summary, we do think the email system could be improved. We just don't think it and its encryption are as useless as the blog post claims, and we don't think the blog post is making things better.

Posted Sat Apr 11 08:26:00 2020 Tags:

TL;DR: I wrote a little program to build and test software in a pair of nested virtual machines, to reduce the risk of bugs or malware in dependencies doing bad things. It's called the Contractor and it's just barely usable now. Feedback welcome.

Software development is a security risk.

Building software from source code and running it is a core activity of software development. Software developers do it on the machine they use for other things too. The process is roughly as follows:

  • install any dependencies
  • build the software
  • run the software, perhaps as part of unit testing

When the software is run, even if only a small unit of it, it can do anything that the person running the build can do, in principle:

  • delete files
  • modify files
  • log into remote hosts using SSH
  • decrypt or sign files with PGP
  • send email
  • delete email
  • commit to version control repositories
  • do anything with the browser that the person could do
  • run things as sudo
  • in general, cause mayhem and chaos

Normally, a software developer can assume that the code they wrote themselves doesn't do any of that. They can even assume that people they work with don't do any of that. In both cases, they may be wrong: mistakes happen. It's a well-guarded secret among programmers that they sometimes, even if rarely, make catastrophic mistakes.

Accidents aside, mayhem and chaos may be intentional. Your own project may not have malware, and you may have vetted all your dependencies, and you trust them. But your dependencies have dependencies, which have further dependencies, which have dependencies of their own. You'd need to vet the whole dependency tree. Even decades ago, in the 1990s, this could easily be hundreds of thousands of lines of code, and modern systems are much larger. Note that build tools are themselves dependencies, as is the whole operating system. Any code that is used in the build process is a dependency.

How certain are you that you can spot malicious code that's intentionally hidden and obfuscated?

Are you prepared to vet any changes to any transitive dependencies?

Does this really matter? Maybe it doesn't. If you can't ever do anything on your computer that would affect you or anyone else in a negative way, it probably doesn't matter. Most software developers are not in that position.

This risk affects every operating system and every programming language. The degree to which it exists varies a lot. Some programming language ecosystems seem more vulnerable than others: the nodejs/npm one, for example, values tiny and highly focused packages, which leads to immense dependency trees. The more direct or indirect dependencies there are, the higher the chance that one of them turns out to be bad.

The risk also exists for more traditional languages, such as C. Few C programs have no dependencies. They all need a C compiler, which in turn requires an operating system, at least.

The risk is there for both free software systems and non-free ones. As an example, the Debian system is entirely free software, but it's huge: the Debian 10 (buster) release has tens of thousands of software packages, maintained by thousands of people. While it's probable that none of those packages contains actual malware, it's not certain. Even if everyone who helps maintain it is completely trustworthy, the amount of software in Debian is much too large for all code to be comprehensively reviewed. Also, no amount of review will catch all bugs.

This is true for all operating systems that are not mere toys.

The conclusion here is that to build software securely, we can't assume all code involved in the build to be secure. We need something more secure. The Contractor aims to be a possible solution.

See the README for instructions on how to try it. See the subplot document for more about the architecture and how it works.

The Contractor has only just reached a state where it can build and test some of my other projects. It's ugly, buggy, and awkward, but I expect to have much fun using and improving it in the future. Maybe you'd like to join the adventure?

Posted Thu Apr 9 20:34:00 2020 Tags:

Mike Godwin in an essay on slate.com:

That’s the biggest thing I learned at the Wikimedia Foundation: When ordinary people are empowered to come together and work on a common, humanity-benefiting project like Wikipedia, unexpectedly great and positive things can happen. Wikipedia is not the anomaly my journalist friend thinks it is. Instead, it’s a promise of the good works that ordinary people freed by the internet can create. I no longer argue primarily that the explosion of freedom of expression and diverse voices, facilitated by the internet, is simply a burden we dutifully have to bear. Now, more than I ever did 30 years ago, I argue that it’s the solution.

I thought that was well said.

Posted Mon Mar 9 09:35:00 2020 Tags:

I asked a couple of weeks ago what people like or hate about email. Here's a summary of the responses. I admit the summary may be tainted by my current thinking about re-inventing email.

Like

  • It's not real time. Sender and recipient do not need to be participating in the communication at the same time. The sender can take their time to craft their message, the recipient can take their time to ponder on the message and how to respond.

  • It's established, ubiquitous.

  • It's de-centralized.

  • It's built on top of well-known data formats and protocols, and data can be stored locally under user control, and is highly portable. There is a variety of client software to choose from.

  • Separate discussions are kept separate.

  • Formatting, attachments, and length are flexible.

  • Mailing lists can be archived publicly.

  • One can have many accounts, and people comprehend this.

  • Subject lines.

  • Email providers are neutral, commodity entities. Choosing one doesn't imply membership in a community.

Not like

  • Unreliable for communication, often due to bad anti-spam.

  • People sending one-line replies that don't add actual value or that miss the point entirely.

  • Encryption, security, privacy, rich media content, formatted messages, etc, are all built on top of older protocols, often resulting in unfortunate consequences.

  • Top quoting.

  • De-facto oligopoly.

  • Spam.

  • Abuse.

  • Configuring and administering email servers is complex.

  • Filtering and organising email is often difficult. The tools provided are not always well suited for the task.

  • Threading is unreliable.

  • Email addresses are too tightly tied to your identity.

  • Searching is often inadequate.

Posted Sun Mar 8 09:33:00 2020 Tags:

A friend expressed interest in how I keep my journal, so I set up a demo site. In short:

Posted Sat Mar 7 11:32:00 2020

I wrote an alternative Debian installer as a toy, called v-i. One of the following two bullet points is correct:

  • v-i can install a very rudimentary Debian onto exactly one computer in the world: my very own spare Thinkpad x220 laptop. It might not work on your x220. v-i almost certainly won't work on any other kind of computer. If you try, it will probably delete all your data. Make sure your backups work.

  • v-i is perfect in every way. There are not even any typos in the manual. There are no bugs, and all features are fully implemented. Every possible use case is supported. Not only is there no danger to your data, v-i will prevent it from ever disappearing. Even your hardware will never break again. Your laptop will have infinite battery life, and your screen resolution will require 64 bit integers to express.

The v-i installer is based on the vmdb2 tool, which I also wrote. It has nothing to do with debian-installer, which is the official Debian installer, also known as d-i. I use d-i, but have a couple of things I wanted to change:

  • I'd like something I can easily modify. d-i requires building special udeb packages for any software that's to be part of the installer. v-i is happy with normal debs.

  • Debian in general uses preseeding for automating an installation. Preseeding means providing answers, in a file, to questions the package may ask during its installation. This is fine, if a little cumbersome, but only helps when the packages ask the right questions. v-i lets you have the full power of Ansible during initial installation, which is much more flexible.

On the other hand, d-i is mature software and tested by a very large number of people, on a very wide range of hardware. v-i is not. v-i might, at best, be the beginning of something useful for a small number of people.

I can now install Debian onto my x220 with v-i. It's a very basic install, without LVM2, full-disk encryption, or a graphical desktop, but it does have sshd and I can configure the laptop further with Ansible from another host. I've installed the GNOME desktop that way, after rebooting into a v-i installed system. (In theory, I could install GNOME directly from v-i. In practice, there are bugs in packages and/or how vmdb2 runs Ansible.)

The installed system is also highly configured to my needs and preferences. It uses Finnish locales, and requires my SSH key to log in. The root account has no password. All of this could be made better with a bit of work.

The code is at https://gitlab.com/larswirzenius/v-i. Check the README for more instructions if you're curious. If you do give it a try, I'd love to hear from you, unless you just lost all your data. Please don't lose all your data.

If you'd like to help build a more viable installer from v-i, please talk to me. I dream of a future where I can install a bare metal machine as easily as I can create and configure a VM.

PS. A 128 GB USB3 flash drive can be had for as little as 20 euros, and that has enough disk space for v-i and a Debian mirror.

If you want to respond to this blog post, please email me (liw@liw.fi) or respond to this fediverse post.

Posted Sat Feb 29 20:12:00 2020 Tags: