Glen Pitt-Pladdy :: Blog

Benchmarking Disk performance with RAID and iostat

Benchmarking drive performance is always a tricky business, and accurately emulating real-world usage patterns is near impossible.

There is, however, an alternative approach that I discovered on my workstation, which over time has evolved an mdadm RAID5 array built from 3 different brands of drive. Because the array spreads real-world IO across all 3 drives, their comparative performance can be seen with iostat. Any RAID level that distributes IO requests evenly will work for this (RAID1 / RAID10 are perhaps not good choices due to the seek optimisation that may occur):

$ iostat -x
Linux 2.6.32-40-generic (****)     11/04/12     _x86_64_    (4 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.86    0.27    1.54   12.68    0.00   81.65

Device: rrqm/s  wrqm/s    r/s    w/s  rsec/s   wsec/s avgrq-sz avgqu-sz  await  %util
sda       6.45   11.80   5.74   2.95  169.92   113.26    32.58     0.19  22.03   3.18
sdb       6.33   11.68   5.69   2.99  168.90   112.53    32.39     0.06   6.89   2.68
sdc       6.38   12.03   5.66   2.88  169.17   114.47    33.21     0.11  12.45   3.28
.....

The significant point here is that sdb is much faster than sdc or sda: its await (average wait time) is 6.89 ms, against 12.45 ms for sdc and 22.03 ms for sda, even though all three drives see almost identical request rates. Note the long average wait for sda. Interestingly, the brand/model of sda is often regarded as a good performer, and there is much debate in forums over the relative merits of the brands/models of sda and sdb, yet here one of them is clearly a significantly better performer.

The drive in sdc is actually a very old model from an outsider brand, but it seems to give a convincing performance.
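The comparison can also be made mechanically. The following is a minimal sketch, assuming the column layout of the output shown above (other sysstat versions print different columns, so the parser would need adjusting). The function names are my own for illustration and not part of any tool.

```python
# Rows copied from the iostat -x output in this post.
IOSTAT_TEXT = """\
Device: rrqm/s  wrqm/s    r/s    w/s  rsec/s   wsec/s avgrq-sz avgqu-sz  await  %util
sda       6.45   11.80   5.74   2.95  169.92   113.26    32.58     0.19  22.03   3.18
sdb       6.33   11.68   5.69   2.99  168.90   112.53    32.39     0.06   6.89   2.68
sdc       6.38   12.03   5.66   2.88  169.17   114.47    33.21     0.11  12.45   3.28
"""


def parse_iostat(text):
    """Return {device: {column: value}} for the 'Device:' table in text."""
    stats = {}
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the table
            continue
        if fields[0].startswith("Device"):
            columns = fields[1:]  # header row names the columns
            continue
        # Only accept rows with one value per header column.
        if columns is not None and len(fields) == len(columns) + 1:
            stats[fields[0]] = dict(zip(columns, map(float, fields[1:])))
    return stats


def rank_by_await(stats):
    """Device names ordered from shortest to longest average wait."""
    return sorted(stats, key=lambda dev: stats[dev]["await"])


stats = parse_iostat(IOSTAT_TEXT)
ranking = rank_by_await(stats)
for dev in ranking:
    load = stats[dev]["r/s"] + stats[dev]["w/s"]
    print(f"{dev}: await {stats[dev]['await']:.2f} ms at {load:.2f} requests/s")
```

Printing the request rate next to await is a quick check that the array really is loading the drives evenly, which is what makes the comparison fair.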

One thing to note with all this is that immediately after boot sda appears to be by far the strongest performer, but the await (which, with iostat run this way, is the average wait since boot) drifts to the ordering above with time.
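Because iostat without an interval reports averages since boot, the figures move slowly. To look at a recent window instead, the same kind of await figure can be derived from the difference between two readings of /proc/diskstats. This is a rough sketch, assuming the standard Linux field layout (reads completed and ms spent reading in fields 4 and 7, writes completed and ms spent writing in fields 8 and 11). The snapshot numbers below are made up purely to show the arithmetic, and `sample_live` is a hypothetical helper.

```python
import time


def read_counters(text):
    """Map device -> (completed IOs, ms spent on IO) from diskstats text."""
    counters = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 11:
            continue  # not a full diskstats row
        ios = int(f[3]) + int(f[7])      # reads + writes completed
        ticks = int(f[6]) + int(f[10])   # ms reading + ms writing
        counters[f[2]] = (ios, ticks)
    return counters


def interval_await(before, after):
    """Average wait in ms per IO between two counter snapshots."""
    result = {}
    for dev, (ios1, ticks1) in after.items():
        if dev not in before:
            continue
        ios0, ticks0 = before[dev]
        d_ios = ios1 - ios0
        result[dev] = (ticks1 - ticks0) / d_ios if d_ios > 0 else 0.0
    return result


def sample_live(seconds=60, path="/proc/diskstats"):
    """Hypothetical helper: take two live snapshots on a Linux machine."""
    with open(path) as fh:
        before = read_counters(fh.read())
    time.sleep(seconds)
    with open(path) as fh:
        after = read_counters(fh.read())
    return interval_await(before, after)


# Made-up snapshots, not measurements from the drives in this post.
BEFORE = """\
   8       0 sda 1000 0 8000 5000 500 0 4000 10000 0 9000 15000
   8      16 sdb 1000 0 8000 3000 500 0 4000 4000 0 5000 7000
"""
AFTER = """\
   8       0 sda 1100 0 8800 5600 550 0 4400 11200 0 9900 16800
   8      16 sdb 1100 0 8800 3300 550 0 4400 4450 0 5400 7750
"""

recent = interval_await(read_counters(BEFORE), read_counters(AFTER))
print(recent)
```

On a real machine, `sample_live(60)` would give the average wait over the last minute rather than since boot, which makes it easier to see how the ordering changes between boot and normal running.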

In many ways that's just typical of benchmarking - unless you are able to replicate usage patterns very accurately, it's easy to get benchmarks that are not relevant to your situation. In this case, other tests suggest that sda is slightly quicker on seek but has a lower linear read speed. The result is that during the massive thrash at boot, when many files are being accessed (lots of seeks), it comes out in front, but in normal running without heavy seeking the higher linear read speed of the other drives wins out.

This is an important effect, as it means that if I were to change usage habits, run a different OS, or perhaps just use a different IO scheduler, the results might be completely different.

I am purposely avoiding saying what makes/models these drives are, because the aim of this posting is to highlight the importance of benchmarking with real access patterns, and of taking benchmarks made with different access patterns (this posting included) in context: a benchmark is only valid if your access patterns have been accurately replicated.
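The trade-off between seek time and linear read speed can be illustrated with a toy model, in which the time for a workload is simply the number of seeks multiplied by the seek time, plus the data volume divided by the linear read speed. This is only a sketch: the drive parameters and workload sizes below are hypothetical values chosen to show the crossover, not measurements of the drives in this post.

```python
def workload_time_ms(seeks, megabytes, seek_ms, mb_per_s):
    """Toy model: total time = seek cost + linear transfer cost."""
    return seeks * seek_ms + megabytes / mb_per_s * 1000.0


# Hypothetical drives: one quicker on seek, one quicker on linear reads.
DRIVES = {
    "fast_seek": {"seek_ms": 8.0, "mb_per_s": 80.0},
    "fast_linear": {"seek_ms": 10.0, "mb_per_s": 120.0},
}

# Hypothetical workloads: boot is seek-heavy, normal running is not.
WORKLOADS = {
    "boot": {"seeks": 2000, "megabytes": 100.0},
    "normal": {"seeks": 100, "megabytes": 500.0},
}

# Find which drive finishes each workload first under the toy model.
winners = {}
for wname, w in WORKLOADS.items():
    times = {
        dname: workload_time_ms(w["seeks"], w["megabytes"], **d)
        for dname, d in DRIVES.items()
    }
    winners[wname] = min(times, key=times.get)
    print(wname, {k: round(v) for k, v in times.items()}, "->", winners[wname])
```

With these numbers, the drive that is quicker on seek wins the boot-like workload and loses the normal one, which is the same reversal described above.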
