Glen Pitt-Pladdy :: Blog
Filesystems for USB drive Backup
Large capacity USB drives have become popular for small scale backups (eg. home or small businesses) and offer good value and convenience. My weapon of choice for these types of backups is dirvish (rsync based), so I tested some popular filesystems to see which performs best.
USB has far more limited bandwidth than SATA/SAS as well as high processing overheads. Add to that the fact that the disks used in USB drives are typically low-power types (tuned for power efficiency rather than speed). The result is a significant bottleneck on backup speed, and the choice of filesystem can make a big difference.
There are a lot of old favourites (eg. ext2, ext3, jfs, xfs) as well as newcomers (eg. ext4, btrfs). As the aim here is benchmarks for backups, the better established filesystems with proven reliability are definitely where the real choice is, while the newcomers are probably best used with caution until they are better established.
One filesystem which is often very fast is ReiserFS. Version 3.6 is included in these tests, however for the purposes of backup it does have some high risk areas, such as problems recovering a filesystem on which other ReiserFS images are stored. Past experience has also made me cautious of ReiserFS in that it doesn't seem as resilient or as easily recovered from corruption as others. For backups the robustness of a filesystem is critical and these factors need to be taken into account.
These tests were done on a Dell D630 (Core2 Duo 2.2GHz / 2GB) laptop running Ubuntu Lucid 64-bit with a Western Digital Element 2TB USB drive. No attempt was made to change the default (cfq) scheduler, which could have a significant effect on performance - tests were done with the USB device as it would be after connection.
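For anyone wanting to confirm which scheduler a drive is using before repeating tests like these, here is a minimal Python sketch. It assumes the usual Linux sysfs layout, where /sys/block/<device>/queue/scheduler lists the available schedulers with the active one in square brackets. The function names and the "sdb" device name are my own illustrations and not part of the original test setup.

```python
def active_scheduler(text):
    """Return the bracketed (active) entry from a scheduler file's contents."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None  # no bracketed entry found


def read_scheduler(device):
    """Read the active scheduler for a block device name such as "sdb" (hypothetical)."""
    with open(f"/sys/block/{device}/queue/scheduler") as handle:
        return active_scheduler(handle.read())


if __name__ == "__main__":
    # Sample file contents of the kind seen on kernels of that era
    print(active_scheduler("noop anticipatory deadline [cfq]\n"))
```

Calling read_scheduler("sdb") on a real system would need the device name adjusted to match the USB drive.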
The test data was a subset of directories under /usr of about 1.5GB. This was copied on with rsync, followed by a sync to flush the data. The data was then SHA1 checksummed, and finally removed, followed by another sync. The sync time has been added onto the time for the preceding operation, as the aim here is to see how long an operation takes rather than how much it is buffered.
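The procedure above can be sketched roughly as follows. This is not the script actually used for these tests, just a minimal Python illustration of folding the sync time into the preceding operation. The paths, the rsync options and the function names are assumptions for the sake of the example.

```python
import os
import subprocess
import time


def timed_with_sync(cmd, sync=os.sync):
    """Run cmd, then flush buffers, and return the total elapsed seconds."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    sync()  # count the flush so buffering does not flatter the result
    return time.monotonic() - start


def run_benchmark(source, target):
    """Time the write, read and remove phases; source and target are hypothetical paths."""
    results = {}
    # Write phase: copy the test data on, sync included in the time
    results["write"] = timed_with_sync(["rsync", "-a", source, target])
    # Read phase: checksum every file, discarding the output; no sync needed
    results["read"] = timed_with_sync(
        ["sh", "-c", 'find "$1" -type f -exec sha1sum {} + > /dev/null', "sh", target],
        sync=lambda: None,
    )
    # Remove phase: delete the copied data, sync included in the time
    results["remove"] = timed_with_sync(["rm", "-rf", target])
    return results
```

Something like run_benchmark("/usr/share/", "/mnt/usb/test/") would then give the three timings in seconds, with both paths being placeholders for the real data set and mount point.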
As data in the backups may be useful to people with malicious intent, it is often a good idea to encrypt it if the secure storage of the backup media can't be guaranteed. For this reason I also repeated the tests with the partition LUKS encrypted.
These will hopefully give a reasonable representation of backup performance.
Write (rsync) Results
Some interesting things show up here. Most striking is that btrfs has a convincing advantage, but the real surprise is that there is only a tiny difference between its performance with and without encryption - the encryption has a far smaller effect than with other filesystems. A conservative choice would be ext3 or perhaps ext4, which are also good performers after btrfs, though it is worth noting that ext3 seems to perform more poorly with encryption. jfs is another serious consideration here.
Read (sha1sum) Results
Read performance seems to be far more consistent across all filesystems, with btrfs again taking the lead. A surprise here is that while jfs write performance was good with encryption, its read performance with encryption puts it at the rear of the pack.
Remove (rm) Results
This is often far less significant and only applies when expiring old backups. Most filesystems are fast enough for this not to be important, with btrfs and ext4 again the leaders. The surprise here is the drastic impact encryption seems to have on xfs, to the extent that I have gone back and re-checked my results. Also becoming significant is the impact of encryption on jfs. These are the only filesystems where the performance of removing old backups may become a significant concern.
With the bottlenecks involved in USB drives, filesystem choice can have a significant impact on backup times - up to nearly double the time depending on your choices. Other tuning (eg. schedulers and filesystem options) is also likely to have a noticeable effect on performance.
It's also worth noting that these tests apply only to this particular configuration and that with a different drive, processor or data set the results may be very different.
While it may be too new and unproven for use for backups, btrfs is the clear winner on performance, though it is also worth noting that this was with kernel 2.6.32-40 on Ubuntu Lucid and subsequent updates to btrfs are reputed to have reduced its performance.
The more conservative choices are largely between ext4 and ext3, depending on how well proven you want your backup filesystem to be. Other filesystems such as jfs are also contenders, though one thing that makes ext3 and ext4 attractive is the prospect of migrating ext3->ext4->btrfs in the longer term.
One thing that can also be useful with backups (though it impacts performance) is compression, which allows far longer periods to be backed up, giving improved protection against late discovery of corruption, bit-rot and accidental damage/deletions.
I was intending to test with fusecompress as well, however every combination I have tried seems to result in the USB subsystem flaking out, sometimes completely locking up the whole machine. I suspect some kind of race condition or a specific bug with the kernel or USB chipset, as I have not seen this on other machines where I do use fusecompress with USB devices successfully. Hardware tests show good health so I have no reason to suspect a hardware problem is to blame.
Recent versions of btrfs also support compression, which again makes btrfs an attractive option in the longer term when it is better proven.
Even given a well proven filesystem, for extreme safety it may still be wise to have multiple backup devices in rotation with different filesystems to mitigate the risks of filesystem bugs.
Copyright Glen Pitt-Pladdy 2008-2013