I’m sometimes cruel to my tools to learn how they behave in unusual circumstances. For example, what is the fastest way to delete a very large number of files?

If the circumstances are just right, the answer may be mkfs to create a new, empty file system. But it’s not always the appropriate answer.

The more obvious answer is sudo rm -rf, or so you'd think. It turns out, however, that find -delete is sometimes a lot faster, as I learned when I did a little speed testing some years ago. I've been told that rsync --delete is faster still. What is the current situation?

How would you test that? I would start by creating a large number of files in a directory tree, and then delete them. How many should I create?

Performance tends to be interesting only for large values of N. Thus, I created a file system with a billion files. Since I pay for my own storage, I created empty files.
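To give a concrete idea of the setup, here is a minimal sketch of how such a tree could be populated. The directory layout, names, and counts are mine, not from the original test, and the sketch builds a tiny tree rather than a billion files:

```shell
# Sketch: build a directory tree of empty files to delete later.
# TARGET, DIRS, and FILES are illustrative assumptions.
set -eu

TARGET=$(mktemp -d)   # throwaway root for the sketch
DIRS=10
FILES=100

for d in $(seq 1 "$DIRS"); do
    mkdir -p "$TARGET/dir$d"
    # batch creation: xargs hands many names to each touch invocation,
    # which matters once the file count gets large
    ( cd "$TARGET/dir$d" && seq -f 'file%g' 1 "$FILES" | xargs touch )
done

echo "created $(find "$TARGET" -type f | wc -l) files under $TARGET"
```

Scaling DIRS and FILES up to a billion total is mostly a matter of patience and inode capacity on the target file system.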

I did this twice: once using ext4, once with btrfs. Those are the file systems I currently care about. The disks were overwritten with zeroes first, and afterwards I stashed away a copy compressed with xz. The ext4 one is 11 GiB, the btrfs one 31 GiB.
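The preparation step could look roughly like this. The image size and file names below are made up for the sketch (the real images were whole disks), and zeroing first is what keeps the compressed copies this small:

```shell
# Sketch: zero an image so unused space compresses well, put a file
# system on it, and stash an xz-compressed copy. Names/sizes are
# illustrative assumptions.
set -eu
work=$(mktemp -d) && cd "$work"

dd if=/dev/zero of=disk.img bs=1M count=64 status=none
if command -v mkfs.ext4 >/dev/null 2>&1; then
    mkfs.ext4 -q -F disk.img      # for the btrfs run: mkfs.btrfs -f disk.img
fi
xz -k -T0 disk.img                # -k keeps the original, -T0 uses all cores
ls -l disk.img disk.img.xz
```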

I unpacked the disk images to an LV attached to a VM, then ran various commands to delete the files.

command           ext4 (s)   btrfs (s)
rm -rf               18647      19783
find -delete         18964      19702
rsync --delete       24279      53516
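For reference, here is a sketch of the three approaches, exercised on a tiny throwaway tree rather than a billion files. The rsync variant is the usual trick of syncing an empty source directory over the target with --delete; paths and the helper function are my own illustration:

```shell
# Sketch: the three deletion methods on a small test tree.
set -eu

# Hypothetical helper: make a directory with 50 empty files in it.
make_tree() {
    mkdir -p "$1"
    ( cd "$1" && seq -f 'f%g' 1 50 | xargs touch )
}

root=$(mktemp -d)

# 1. Plain recursive remove.
make_tree "$root/a"
rm -rf "$root/a"

# 2. find's built-in -delete action: depth-first, no per-file exec.
make_tree "$root/b"
find "$root/b" -delete

# 3. The rsync trick: sync an empty source over the target with
#    --delete, then remove the emptied directory itself.
if command -v rsync >/dev/null 2>&1; then
    make_tree "$root/c"
    empty=$(mktemp -d)
    rsync -a --delete "$empty"/ "$root/c"/
    rmdir "$root/c" "$empty"
fi

echo "remaining entries: $(ls -A "$root" | wc -l)"
```

To reproduce the benchmark, each method would be wrapped in time and run against a freshly unpacked image, one method per run.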

Based on this entirely unscientific benchmark, the fastest way to delete files is either rm or find, with not much difference between them. rsync seems to be significantly slower.

I didn’t measure memory use or other factors. If someone knows of a CS student who could do this more formally for class, I’m sure it’d be a fascinating project. Have at it.