Recent comments on posts in the blog:
Thank you for this! I laughed so loud that folk in the office glared at me!
Just the kind of thing I needed today
I think backups that "just worked" by default would be awesome.
Your example assumes:
* An experienced user who realizes the value of backups
* A user who remembers to back up (experience isn't enough)
* has an empty thumb drive
* has a remote admin
* has a remote server
As mentioned by other comments, it also assumes the remote admin is willing to open up SSH. Why not set up some kind of simple get/put-based protocol? That way the server would have to trust the client less, and you could protect against compromised clients deleting everything.
If the server could support a nice simple object-based protocol then you could get other desirable features. Like say discovery of other peers to trade storage with. It would be good for clients as well. Say trade 64GB with 4 other peers and maintain 4 copies of your encrypted backups on those peers. Those with more expertise and resources would of course prefer to back up to their own storage, but the have-nots would rather N random peers have their encrypted backups than no backups at all.
Any reason Obnam wouldn't work with some simple object based key/value store? Maybe something simple using http://amp-protocol.net/ or protocol buffers to replace ssh.
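To make the get/put idea above concrete, here is a minimal sketch of a store that only exposes `put` and `get`. It is not how Obnam works. The class name `AppendOnlyStore` and the choice of SHA-256 content addressing are assumptions made for illustration, and a real server would add transport, authentication and quotas.

```python
import hashlib


class AppendOnlyStore:
    """Toy in-memory object store (hypothetical) exposing only put and get."""

    def __init__(self):
        self._objects = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the content, so a client cannot pick a key
        # that replaces somebody else's object.
        key = hashlib.sha256(data).hexdigest()
        # setdefault never overwrites: an existing object stays as it is.
        self._objects.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


store = AppendOnlyStore()
key = store.put(b"encrypted backup chunk")
restored = store.get(key)
```

Because there is no delete or overwrite operation, a compromised client can add junk but cannot erase existing backups. Expiring old data would have to be a separate, server-side decision.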
Manual work! You're suggesting I do manual work? Eww.
Luckily, I could write a throwaway ten-line Python script, or a hundred or so lines of Python if I wanted to make a proper tool. Maybe someday I will.
One thing I'd like to add. In some cases, like my personal workstation, there's no reason to back up all of that Debian stuff that will just be downloaded when I reinstall. Not backing up so many files that are easily replaced could speed up backups and reduce the amount of data to be deduplicated.
If debsums data is available, then it is fairly easy to generate a list of things that can safely be skipped - though we probably also want a dump of the packages installed to facilitate reinstallation.
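A rough sketch of that skip list, assuming the checksum data is in the format dpkg keeps under `/var/lib/dpkg/info/*.md5sums` (one line per file: an MD5 hash, whitespace, then a path relative to the filesystem root). The function name `unmodified_files` is made up for this example, and the demo runs against a temporary directory rather than a real system.

```python
import hashlib
import tempfile
from pathlib import Path


def unmodified_files(md5sums_text: str, root: Path) -> list[str]:
    """Return the listed paths whose content still matches the recorded MD5."""
    skippable = []
    for line in md5sums_text.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # blank or malformed line
        expected, rel_path = parts
        path = root / rel_path
        if not path.is_file():
            continue  # missing file: nothing to skip
        if hashlib.md5(path.read_bytes()).hexdigest() == expected:
            skippable.append(rel_path)
    return skippable


# Demo on a fake root: one file as shipped, one edited locally.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "usr").mkdir()
    (root / "usr" / "pristine").write_bytes(b"as shipped")
    (root / "usr" / "edited").write_bytes(b"changed locally")
    listing = (
        hashlib.md5(b"as shipped").hexdigest() + "  usr/pristine\n"
        + hashlib.md5(b"original").hexdigest() + "  usr/edited\n"
    )
    skippable = unmodified_files(listing, root)
```

Only files that still match their recorded checksum land in the skip list, so locally modified files are still backed up. For the list of installed packages, something like the output of `dpkg --get-selections` could be saved alongside the backup.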
First, "Has never seen the drive before" is an overly loose condition - if the system is going to magically offer to format the drive, it should restrict itself to drives that are unpartitioned, or only have empty partitions.
Second, the system you laid out for remote backups closely mimics Obnam's remote support (SFTP), so I'll air one of my concerns with that: Obnam's per-client encryption protects against disclosure and alteration, but not erasure. It does not provide sufficient isolation.
Third, SSH in general is tragically overused as a generic encrypted transport, when it is - in the end - a remote shell protocol. When the default behavior is "let the client run arbitrary commands" there's a lot of room to screw up the configuration in security-relevant ways.
Fourth, by wedding it to SSH rather than pluggable auth, you impose a presort on the possible implementations. Some people might prefer to use TLS - and not necessarily with PKI, since TLS has multiple authentication mechanisms including zero-knowledge password proofs.
Fifth (...and now I'm hearing Aahz comment that he's rather partial to fifths), Roger has a good point regarding priority and volume. Fortunately, a reasonably close approximate solution is to treat them as inversely related - and a better solution is that plus the ability to mark a directory or file as critical (which should apply recursively).
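One way to read the fifth point as code, as a sketch only: smaller files go first (priority inversely related to volume), and anything under a path marked critical jumps the queue, recursively. The function name `backup_order` and the example paths are invented for illustration.

```python
from pathlib import PurePosixPath


def backup_order(sizes: dict[str, int], critical: set[str]) -> list[str]:
    """Order paths for backup: critical first, then smallest first."""
    critical_paths = [PurePosixPath(c) for c in critical]

    def is_critical(path: str) -> bool:
        p = PurePosixPath(path)
        # A mark applies to the marked path itself and to everything below it.
        return any(p == c or c in p.parents for c in critical_paths)

    def sort_key(path: str):
        # Tuples sort element by element: critical flag, then size, then name.
        return (0 if is_critical(path) else 1, sizes[path], path)

    return sorted(sizes, key=sort_key)


sizes = {
    "/home/u/videos/trip.mp4": 4_000_000_000,
    "/home/u/notes.txt": 2_000,
    "/home/u/thesis/draft.tex": 500_000,
    "/home/u/thesis/figs/plot.png": 3_000_000,
}
order = backup_order(sizes, critical={"/home/u/thesis"})
```

Here both files under `/home/u/thesis` come before `notes.txt` even though they are larger, and the 4 GB video goes last.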
@Roger, there are actually a lot of reasons not to do that. The main one is that it will put performance straight into the toilet because of the additional seeks needed on any write. That gets to be less of an issue on SSDs... but those tend not to have all that much free space anyway.
Your Summary: "It just works"
There are two harder pieces to deal with. One is larger volume data items. For example, videos (and even photos) from cameras can be very large, certainly larger than the available local backup storage, and they take a long time to send over the net. Right now it is way easier to use something like Dropbox for that.
The other piece is prioritising the data. Some files are more important than others and you really want to make additional copies of them so that they can survive more failures. For example a single bad sector on that backup flash drive shouldn't be a problem for your more important data.
I've wanted the latter piece for filesystems. It seems insane to me that filesystems have unused space. That space should be filled with additional copies of your most important data.