For my backup program, Obnam, I needed to find a way to insert many rows into an SQLite database table as quickly as possible. Obnam stores each backup generation in its own SQLite database file, which holds the names and other metadata of the backed-up files; the file content is stored separately. That means one row per backed-up file, and I have over a million files.
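
For context, the snippets below all insert into a files table along these lines. This is only a sketch: the column types are guesses for illustration, not necessarily Obnam's actual schema.

use rusqlite::Connection;

fn create_files_table(conn: &Connection) -> rusqlite::Result<()> {
    // Hypothetical schema matching the columns the INSERT statements below use.
    conn.execute_batch(
        "CREATE TABLE files (
            fileno INTEGER PRIMARY KEY,
            filename TEXT,
            json TEXT,
            reason TEXT,
            is_cachedir_tag BOOLEAN
        );",
    )
}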

Here’s where I started from, a very simple, obvious approach:

for i in 0..N {
    let filename = format!("file-{}", i);
    let t = conn.transaction()?;
    t.execute(
        "INSERT INTO files (fileno, filename, json, reason, is_cachedir_tag) VALUES (?1, ?2, ?3, ?4, ?5)",
        params![i, filename, "", "", false],
    )?;
    t.commit()?;
}

I looked on the web for suggestions on making this faster and found an article by Jason Wyatt with some good tips. Specifically:

  • use one transaction, not one transaction per insert
    • in Obnam, either all inserts succeed, or the backup fails
    • thus, per-insert transactions aren’t needed in this case
    • for me, this almost doubled the speed (a sketch of this intermediate version follows the list)
  • use prepared statements
    • for me, this was almost a 5x speedup
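
To illustrate the first tip on its own, here is a sketch of the intermediate version: the same INSERT as before, wrapped in a single transaction for all rows, but still without a prepared statement.

let t = conn.transaction()?;
for i in 0..N {
    let filename = format!("file-{}", i);
    t.execute(
        "INSERT INTO files (fileno, filename, json, reason, is_cachedir_tag) VALUES (?1, ?2, ?3, ?4, ?5)",
        params![i, filename, "", "", false],
    )?;
}
t.commit()?;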

Here is what I ended up with:

// Prepare the INSERT once and reuse it for every row.
let mut stmt = conn.prepare_cached("INSERT INTO files (fileno, filename, json, reason, is_cachedir_tag) VALUES (?1, ?2, ?3, ?4, ?5)")?;
// unchecked_transaction only needs a shared borrow of the connection,
// so the prepared statement can stay alive across the transaction.
let t = conn.unchecked_transaction()?;
for i in 0..N {
    let filename = format!("file-{}", i);
    stmt.execute(params![i, filename, "", "", false])?;
}
t.commit()?;

The results:

  • initial simple approach: 117,509 inserts/s
  • one transaction for all inserts: 209,512 inserts/s
  • one transaction plus a prepared statement: 970,874 inserts/s
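
For reference, inserts/s here is simply rows inserted divided by elapsed time for the loop; a minimal sketch for measuring it (not necessarily the exact harness behind the numbers above) is:

use std::time::Instant;

let n: u64 = 1_000_000;
let start = Instant::now();
// ... run one of the insert loops above for n rows ...
let secs = start.elapsed().as_secs_f64();
println!("{:.0} inserts/s", n as f64 / secs);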

It might be possible to improve on this further, but this is already so fast that it’s nowhere near being a bottleneck for Obnam.

I then reworked the Obnam database abstraction accordingly, which improves Obnam’s speed quite dramatically in one benchmark. The merge request is under review now.