Some day Obnam, my backup program rewrite, will be ready for others to use. I know from experience that I get motivation from knowing my software is useful to others. Usually I only find that out when people have problems and ask for help, which feels unsatisfactory.

I’d like to know how many people use Obnam, and how they use it.

I’ve started thinking about ways to get some indication of that. I’ve not decided on how, yet, and in any case it’s too early to start anything serious about this, when I don’t even use the software myself. However, I thought I’d write down my current thinking.

Some ground rules:

  • Whatever feedback mechanism I set up, it’s going to be optional. I am not going to require any kind of “call home” system, and I’m not even going to enable anything like that by default. It would be creepy. Explicit opt-in is the way to go. This will skew the numbers, but that’s OK.

  • It will also be transparent what data I collect. I will not collect any sensitive information. I don’t want to know who uses Obnam.

  • I won’t publish the raw data I collect, but I would like to publish some aggregate statistics.

Below are some ideas for what and how to collect. Feedback on these would be welcome.

  • Since the new Obnam will have a server component, I could run an instance and that would give me some of the information I want. However, I am not in a financial position to run a server for many people, and I don’t want to run this as a business. So that’s out.

  • I will provide an installation package for Debian, and possibly other ways to install. If I run the download server myself, I can count downloads. That gives me some indication of user count.

  • I can provide an obnam register command. It would invent a random identifier for the client instance, and send a message to my server. The message would basically only be this:

  • The server would collect the identifiers with registration timestamps in a log file. Counting unique identifiers would give me some indication of user count. This would not give me any other information. In theory, I could log the IP addresses of hosts that send the registration to my server, but I hope people register over Tor, which would make the IP addresses useless to me. I wonder, how hard would it be to write code to make it always happen over Tor?

    If people run the registration command multiple times, that gives me an indication of continued use. That would be interesting as well.

  • User counts don’t tell me how people use Obnam. In order to guide the architectural evolution of Obnam, it would be interesting to have some rough numbers about the kind of data people back up with my software. I could have a separate command obnam report that sends to my server things like how many Obnam backups have been made since the last report, how many live data files there were, how much data was in those files, and so on.

    The report could be sent to my server directly, or written to a file. There could be a little script (wrapper around curl) that sends it to my server. This would allow people to inspect exactly what information is sent.

    I don’t see a need to tie the reporting to the registration. However, I don’t want the statistics reporting to be the only option, either.
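
To make the registration and reporting ideas above a little more concrete, here is a minimal sketch in Python. Nothing in it is decided: the function names, the report field names, and the log format of one "timestamp identifier" pair per line are all assumptions made up for illustration. It only covers inventing the identifier, assembling a report that can be inspected as a file, and counting unique identifiers on the server side. It does not cover the registration message format or the network transport.

```python
import json
import secrets


def new_client_id() -> str:
    """Invent a random identifier that carries no user or host information."""
    # 16 random bytes, rendered as 32 hexadecimal characters.
    return secrets.token_hex(16)


def build_report(backups_since_last: int, live_files: int, live_bytes: int) -> dict:
    """Gather rough usage numbers into a plain dict (hypothetical field names).

    The client identifier is deliberately left out, so a report is not
    tied to a registration.
    """
    return {
        "backups_since_last_report": backups_since_last,
        "live_file_count": live_files,
        "live_data_bytes": live_bytes,
    }


def render_report(report: dict) -> str:
    """Render the report as readable JSON, exactly as it would be sent."""
    return json.dumps(report, indent=2, sort_keys=True) + "\n"


def write_report(report: dict, path: str) -> None:
    """Write the report to a file so it can be inspected before sending."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(render_report(report))


def count_unique_ids(log_lines) -> int:
    """Server side: count distinct identifiers in the registration log.

    Assumes each log line is '<timestamp> <identifier>'; anything else
    is skipped.
    """
    seen = set()
    for line in log_lines:
        parts = line.split()
        if len(parts) == 2:
            seen.add(parts[1])
    return len(seen)
```

Writing the report to a file first means anyone can read exactly what would leave the machine before a small curl wrapper sends it. Repeated registrations with the same identifier count once, while their timestamps still show continued use.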