The write-intent bitmaps implemented by Linux software RAID seem to suck.

I’m building a file server for myself, and have been looking at various things to make it as good a server as possible. Write-intent bitmaps seem on the surface to be a good thing to use. Russell Coker explains them better than the mdadm manual page:

Linux software RAID has a feature known as write intent bitmaps which means that every time some data is about to be written the region of the RAID array is marked as dirty. Then after a power failure all that needs to be done to ensure that all disks in the array have matching data is to check the regions that are listed as dirty not the entire disk. So instead of spending an hour or more checking the data there would only be a few seconds of work required.
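
To make the idea concrete, here is a toy model of the bookkeeping in Python. It is only a sketch of the concept, not how the md driver does it: the class name, the region size, and the in-memory set standing in for the on-disk bitmap are all made up for illustration.

```python
# Toy model of a write-intent bitmap. NOT the md driver's implementation;
# ToyArray and REGION_SIZE are hypothetical names chosen for this sketch.
REGION_SIZE = 64 * 1024 * 1024  # assumed bytes covered by one bitmap bit


class ToyArray:
    def __init__(self, size_bytes):
        # Number of regions, rounding up so the tail of the array is covered.
        self.regions = (size_bytes + REGION_SIZE - 1) // REGION_SIZE
        self.dirty = set()  # stands in for the bitmap stored on disk

    def _span(self, offset, length):
        first = offset // REGION_SIZE
        last = (offset + length - 1) // REGION_SIZE
        return range(first, last + 1)

    def begin_write(self, offset, length):
        """Mark every region the write touches as dirty before writing data."""
        self.dirty.update(self._span(offset, length))

    def end_write(self, offset, length):
        """Clear the regions once all disks hold the new data.

        Simplification: ignores other writes still in flight to the region.
        """
        self.dirty.difference_update(self._span(offset, length))

    def regions_to_resync(self):
        """After a crash, only these regions need to be checked."""
        return sorted(self.dirty)


array = ToyArray(1024**4)                        # a 1 TiB array, for example
array.begin_write(0, 4096)                       # touches region 0
array.begin_write(5 * REGION_SIZE + 100, REGION_SIZE)  # touches regions 5 and 6
array.end_write(0, 4096)                         # first write completed
print(array.regions_to_resync())                 # crash now: prints [5, 6]
```

In the real thing the dirty mark has to reach the disk before the data does, which is an extra write on top of the data itself. That is presumably where the cost measured below comes from.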

This seems like a good idea. However, Joerg Jaspert found them to be quite slow:

Or a Desktop System with a RAID5 (made of 4 SATA discs), which normally unpacks a kernel tarball in some 10 seconds - now needs about 2 minutes to get it unpacked. Removing that tree now uses 5 minutes.

I did a little benchmark of my own:

  • create a RAID-5 out of four 250 gigabyte SATA hard disks
  • create an ext3 filesystem on the array
  • run bonnie++, Russell’s disk benchmarking program
  • run genbackupdata, my program to generate data sets for backup programs
  • run find|xargs cat to read all the data generated by genbackupdata
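
The first two steps could be scripted roughly like this. This is a sketch, not the exact commands used for the numbers below: the device names and array name are placeholders, and the helper functions are invented for the example. Nothing is run until timed() is called, since creating an array destroys whatever is on the disks.

```python
import subprocess
import time

# Hypothetical device names; substitute the real ones before running anything.
DISKS = ["/dev/sdb", "/dev/sdc", "/dev/sdd", "/dev/sde"]
ARRAY = "/dev/md0"


def mdadm_create_cmd(with_bitmap):
    """Build the mdadm command line for a four-disk RAID-5."""
    cmd = ["mdadm", "--create", ARRAY, "--level=5",
           f"--raid-devices={len(DISKS)}"]
    if with_bitmap:
        # Store the write-intent bitmap on the array's own member disks.
        cmd.append("--bitmap=internal")
    return cmd + DISKS


def timed(cmd):
    """Run a command and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


# Example (destructive, so left commented out):
# subprocess.run(mdadm_create_cmd(with_bitmap=True), check=True)
# print(timed(["mke2fs", "-j", ARRAY]))  # -j adds a journal, i.e. ext3
```

Newer versions of mdadm may add an internal bitmap by default on large arrays, in which case the run without a bitmap would need --bitmap=none to be passed explicitly.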

I did this once with an internal write-intent bitmap, and once without. In my file server situation, I don’t expect to have a separate drive for the bitmaps, so an internal one seems like the best way to go.

The results:

               no bitmap    bitmap       relative speed
mke2fs         247 s        600 s        0.4
bonnie, write  115 MiB/s    22 MiB/s     0.2
bonnie, read   126 MiB/s    124 MiB/s    1.0
genbackupdata  397 s        977 s        0.4
cat            471 s        523 s        0.9
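
The relative speed column can be reproduced from the other two, assuming it is the no-bitmap time divided by the bitmap time for the rows measured in seconds, and the bitmap throughput divided by the no-bitmap throughput for the rows measured in MiB/s. Either way, 1.0 means no slowdown and smaller is worse. A quick check in Python:

```python
# Each row: (unit, value without bitmap, value with bitmap), from the table.
results = {
    "mke2fs": ("s", 247, 600),
    "bonnie, write": ("MiB/s", 115, 22),
    "bonnie, read": ("MiB/s", 126, 124),
    "genbackupdata": ("s", 397, 977),
    "cat": ("s", 471, 523),
}


def relative_speed(unit, no_bitmap, bitmap):
    """Speed with a bitmap relative to speed without one (1.0 = same)."""
    if unit == "s":
        # Elapsed time: less is better, so invert the ratio.
        return no_bitmap / bitmap
    # Throughput: more is better.
    return bitmap / no_bitmap


for name, (unit, no_bitmap, bitmap) in results.items():
    print(f"{name:14} {relative_speed(unit, no_bitmap, bitmap):.1f}")
```

Rounded to one decimal, this gives 0.4, 0.2, 1.0, 0.4 and 0.9, matching the table.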

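The conclusion: the performance impact of write-intent bitmaps is huge. With an internal bitmap, the write-heavy tests ran at 0.2 to 0.4 times the speed of the same array without one, while reads were barely affected. I can’t imagine anyone wanting to use them in a situation where write speed is at all interesting.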

See some later discussion.