I've spent most Sundays for the past half a year implementing Obnam 3, the third generation of my backup program. I've written a blog post about each three-hour session on the Obnam blog, but that's way too much detail for anyone not passionately interested in this project. Here is a summary of what I've done so far. There is also an appeal for help.
I've implemented the lowermost layer of backup storage: the chunk. A chunk is a piece of data, either a small file or a part of a longer file. The chunk is encrypted in a way that also allows verifying the chunk hasn't been modified while in backup storage.
Each chunk is encrypted with a random, computer-generated symmetric key, which the user never sees. There can be any number of such keys, for different groups of chunks, although the implementation doesn't yet make it convenient to choose the key to use when encrypting a chunk. The chunk keys are stored in a client chunk, which itself is encrypted with another random, computer-generated key, the client key.
The client key is encrypted in various ways, and the result of each of those encryption operations is stored in a credential chunk. I've implemented credential encryption methods using OpenPGP software keys and OpenPGP cards.
This part works, and although it needs polish, I'm pretty happy with it.
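To make the layering concrete, here is a minimal sketch in Rust of how a credential leads to a client key, which leads to a chunk key, which opens a chunk. It is not Obnam's actual code. All names (`toy_crypt`, `CredentialChunk`, `ClientChunk`, `open_chunk`) are made up for illustration, the keys are fixed bytes rather than random ones, and the "cipher" is a plain XOR that is not secure in any way and does none of the tamper detection that real chunks have.

```rust
/// Toy stand-in for real authenticated encryption: XOR with a repeating key.
/// NOT secure. It only exists to make the layering visible.
/// Applying it twice with the same key gives back the original data.
fn toy_crypt(key: &[u8], data: &[u8]) -> Vec<u8> {
    data.iter()
        .zip(key.iter().cycle())
        .map(|(d, k)| d ^ k)
        .collect()
}

/// A credential chunk: the client key, encrypted by one credential method.
struct CredentialChunk {
    method: String,
    wrapped_client_key: Vec<u8>,
}

/// The client chunk: the chunk keys, each encrypted with the client key.
struct ClientChunk {
    wrapped_chunk_keys: Vec<Vec<u8>>,
}

/// Walk the hierarchy: credential secret -> client key -> chunk key -> data.
fn open_chunk(
    credential_secret: &[u8],
    credential: &CredentialChunk,
    client: &ClientChunk,
    key_index: usize,
    encrypted_chunk: &[u8],
) -> Option<Vec<u8>> {
    let client_key = toy_crypt(credential_secret, &credential.wrapped_client_key);
    let wrapped_chunk_key = client.wrapped_chunk_keys.get(key_index)?;
    let chunk_key = toy_crypt(&client_key, wrapped_chunk_key);
    Some(toy_crypt(&chunk_key, encrypted_chunk))
}

fn main() {
    // Fixed bytes stand in for the random, computer-generated keys.
    let chunk_key = [0x11u8, 0x22, 0x33, 0x44];
    let client_key = [0xa1u8, 0xb2, 0xc3, 0xd4];
    let credential_secret = b"stand-in for an OpenPGP key";

    // Build the three layers from the bottom up.
    let encrypted_chunk = toy_crypt(&chunk_key, b"file data");
    let client = ClientChunk {
        wrapped_chunk_keys: vec![toy_crypt(&client_key, &chunk_key)],
    };
    let credential = CredentialChunk {
        method: "openpgp-soft".to_string(),
        wrapped_client_key: toy_crypt(credential_secret, &client_key),
    };

    // Opening the chunk only needs the credential secret and the stored chunks.
    let plain = open_chunk(credential_secret, &credential, &client, 0, &encrypted_chunk)
        .expect("key index exists");
    assert_eq!(plain, b"file data".to_vec());
    println!("opened chunk via {} credential", credential.method);
}
```

The point of the indirection is that adding another credential method only means storing one more credential chunk that wraps the same client key. The client chunk and the data chunks stay as they are.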
There is also a rudimentary backup repository, which stores chunks in a local directory and allows searching for chunks by id or label. Chunk labels are short strings cryptographically attached to the chunk to give the type of a chunk, or the encrypted checksum of the plaintext data in a chunk, for de-duplication.
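As a rough illustration of searching by id or label, and of how a label can drive de-duplication, here is a small Rust sketch. It is an assumption-laden toy, not the real repository: it keeps chunks in memory instead of a local directory, uses a counter for chunk ids, and treats the label as an ordinary string. The names `Chunk`, `MemoryRepo`, and `put_deduplicated` are invented for this example.

```rust
use std::collections::HashMap;

/// A stored chunk: a label plus the (already encrypted) data.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    label: String,
    data: Vec<u8>,
}

/// In-memory stand-in for a chunk repository.
#[derive(Default)]
struct MemoryRepo {
    chunks: HashMap<u64, Chunk>,
    by_label: HashMap<String, Vec<u64>>,
    next_id: u64,
}

impl MemoryRepo {
    /// Store a chunk unconditionally and return its new id.
    fn put(&mut self, chunk: Chunk) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        self.by_label.entry(chunk.label.clone()).or_default().push(id);
        self.chunks.insert(id, chunk);
        id
    }

    /// Look up a chunk by id.
    fn get(&self, id: u64) -> Option<&Chunk> {
        self.chunks.get(&id)
    }

    /// Find the ids of all chunks with a given label.
    fn find(&self, label: &str) -> &[u64] {
        self.by_label.get(label).map(Vec::as_slice).unwrap_or(&[])
    }

    /// Store a chunk only if no chunk with the same label exists yet.
    /// Returns the id of the existing or newly stored chunk.
    fn put_deduplicated(&mut self, chunk: Chunk) -> u64 {
        if let Some(&id) = self.find(&chunk.label).first() {
            return id;
        }
        self.put(chunk)
    }
}

fn main() {
    let mut repo = MemoryRepo::default();
    let chunk = Chunk {
        label: "checksum-abc".to_string(), // stand-in for an encrypted checksum
        data: b"hello".to_vec(),
    };

    // Storing the same content twice yields the same chunk id.
    let first = repo.put_deduplicated(chunk.clone());
    let second = repo.put_deduplicated(chunk);
    assert_eq!(first, second);
    assert_eq!(repo.find("checksum-abc").len(), 1);
    assert_eq!(repo.get(first).map(|c| c.data.as_slice()), Some(&b"hello"[..]));
}
```

Because the label is derived from the plaintext, two chunks with the same content get the same label, so the client can ask the repository whether the data is already there before uploading it again.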
I've intentionally limited myself to a single Sunday session per week, at most three hours per session. This has been quite enjoyable: I am not in a hurry, and I can try to be careful and deliberate. In my profession that is not as common as I would like. Three hours a week has been enough to make progress: slowly, but fast enough for a hobby project.
I'm not yet sure what I will do next, but supporting remote backup repositories seems like a sensible choice. I will need to do some research for that: I will need to learn about the S3 API, and look at the Rust iroh library for NAT hole punching.
Obnam is a large project, more than I can do by myself. Obnam needs, for example, documentation, even if at this stage for developers, not yet end users. There are code changes needed, too: more credential methods (password, TPM2 chip, ...), and all the code to actually make backups. Someone will need to research and implement ways of splitting different kinds of files into chunks. It would be good to have a better idea of what's needed: use cases, acceptance criteria. There is no shortage of things to do.
What part of building backup software interests you? How would you like to help?