Filesystems & Fragmentation
Everyone knows how rapidly Windows filesystems fragment and the consequent impact on performance, but it's something that is seldom talked about with Unix. Apple goes as far as to say that it isn't necessary to defrag their filesystems, though people have found significant performance improvements by defragging anyway. While it may be true that Unix filesystems are more resistant to fragmentation, that's not to say files never get fragged up to the extent that it becomes a performance problem, and many Linux filesystems have no defragmentation tools. ext4 is a bit of an exception in that it goes all the way to having online defragmentation, but at this stage it's not considered release-ready so it's not available in Debian, Ubuntu or probably many other distros.

How bad is it?

There is a Perl script in the Gentoo forums for finding fragmentation. Inspired by this I have written my own script along the same theme, with increased safety measures and further information: the most fragmented files, the number of bytes per fragment, plus a histogram of fragmentation in .csv format that can be loaded into a spreadsheet for further analysis or graphing.

Download: fragmentation analysis Perl script

To find all the fragmentation on your mounted filesystems try something like this:

# for d in `mount | grep ^/ | awk '{ print $3 }'`; do n=`echo $d|sed 's/\//_/g'`; echo $n; /path/to/findfrag $d >/somewhere/safe/$n.csv; done

I did 3 machines: my workstation, laptop and server. The results were surprising. Considering there had never been any defragmentation done on any of these filesystems in several years, there was a very small proportion of fragmented files on them, though some massively fragmented files in specific areas - in some cases files with >10k extents. Typically only a few % of files are fragmented on each filesystem, which shows how resilient these filesystems are to fragmentation.

While this is far more resilient than I would have expected from Windows, it's still a major performance hit when badly fragmented files are accessed frequently. As an example, take a CD image that is badly fragmented with ~13k extents. That's a seek every 55kB of data read. I copied it and the copy had 208 extents, or a seek every 3.5MB, which is far more sensible. The question is how much that affects performance. After a reboot (to ensure the cache is clear), copying the files to /dev/null with dd using 8k blocks yields:

The ~13k extents (highly fragmented) file: 731453440 bytes (731 MB) copied, 78.8668 s, 9.3 MB/s

The 208 extents file: 731453440 bytes (731 MB) copied, 36.3631 s, 20.1 MB/s

The figures speak for themselves on the effects of fragmentation. It may not be a huge problem in many cases, but on badly fragged files it is having an effect nonetheless. What is also important to consider is that even if the performance degradation doesn't matter for the fragmented file itself, it is still causing disk seeks which also affect access to other (possibly non-fragmented) files.

The range of fragmentation on my systems is huge - one filesystem is at 7GB/fragment (negligible), another 17kB/fragment (severe!). On my home server /var and the squid cache are particularly bad. One place where this was having a big impact was Cacti .rrd files, which were badly fragmented and have around 1000 updates performed on them when a poll occurs every 5 minutes. A quick way to spot-check suspect files like these is sketched below.
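My analysis script (linked above) does rather more than this, but to illustrate the basic idea here is a minimal sketch of a filefrag-based spot-check - it just reports the extent count and approximate kB per extent for each file given. It is a simplified sketch of the approach, not the downloadable script, and like the real thing it generally needs to run as root.

#!/usr/bin/perl
# Minimal fragmentation spot-check - an illustrative sketch only, not the
# analysis script linked above. Wraps filefrag (e2fsprogs), so it normally
# needs to be run as root.
use strict;
use warnings;

foreach my $file ( @ARGV ) {
    next unless -f $file;                 # plain files only
    my $size = -s $file;
    # filefrag output looks like: "/path/to/file: 13 extents found"
    my $out = `filefrag '$file' 2>/dev/null`;
    if ( $out =~ /:\s+(\d+) extents? found/ ) {
        my $extents = $1;
        my $kbperextent = $extents ? $size / $extents / 1024 : 0;
        printf "%6d extents  ~%d kB/extent  %s\n", $extents, $kbperextent, $file;
    } else {
        warn "could not get fragmentation info for $file\n";
    }
}

Pointing something like this at the Cacti .rrd directory quickly shows which files are worth worrying about.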
The resultant disk thrash was causing MySQL to log slow queries during polling, as well as many other noticeable slow-downs coinciding with Cacti polling.

Poor man's online-defragger

Given the absence of practical defraggers an alternative approach is needed: if I make an archive copy (cp -a) of a file, everything remains the same (assuming no hard links) but there is a reasonable chance that the copy will have fewer fragments. I wrote a script to do just that - it looks at the input files and, if they are fragmented, makes an archive copy; if that is an improvement it replaces the original file. It's not efficient, but it does improve fragmentation dramatically, and with that performance. The core of this approach is sketched after the usage notes below.

Danger!

Before we go any further, make sure you have backups and can restore completely from them. It seems obvious advice, but I am always surprised how many people don't have backups or don't test restoring from them. Although there are precautions in the script, there are a few serious problems here which we can't fully protect against.
To discourage use of the defragger script without sufficient knowledge it ships in a state where it will not run - you have to make some minor modifications to the script in order to be able to run it.

In use

In order to get the fragmentation info for a file (by running filefrag), the script has to be run as root. As a safety precaution, the script will drop down to the user and group privileges you specify on the command line when it can. This is useful for reducing risk when defragging areas where users may create files and there could be something malicious, but remember this is a precaution and not a guaranteed solution. The remaining arguments are the files to defrag. Additionally there are some settings which go first on the command line, shown in the examples below.
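Before the examples, here is a rough sketch of the copy-and-compare idea described above. It is not the actual defragfiles script (which adds the privilege dropping, disabled-by-default safety and command-line settings just described), and the temporary file naming is purely illustrative. It skips anything that isn't a plain, single-link file and only replaces the original if the copy really is less fragmented.

#!/usr/bin/perl
# Rough sketch of the copy-and-compare defrag idea - NOT the real defragfiles
# script. Do not run it on files that are being written to at the time.
use strict;
use warnings;

# count extents via filefrag, as in the spot-check sketch earlier
sub count_extents {
    my $file = shift;
    my $out = `filefrag '$file' 2>/dev/null`;
    return ( $out =~ /:\s+(\d+) extents? found/ ) ? $1 : undef;
}

foreach my $file ( @ARGV ) {
    next unless -f $file;                   # plain files only
    next if ( lstat $file )[3] > 1;         # skip hard-linked files
    my $before = count_extents( $file );
    next unless defined $before and $before > 1;   # already contiguous
    my $tmp = "$file.defragtmp";            # illustrative temporary name
    # archive copy preserves ownership, permissions and timestamps
    system( 'cp', '-a', '--', $file, $tmp ) == 0 or next;
    my $after = count_extents( $tmp );
    if ( defined $after and $after < $before ) {
        rename $tmp, $file or warn "replacing $file failed: $!\n";
    } else {
        unlink $tmp;                        # no improvement - keep the original
    }
}

Because rename() over the original is atomic within a filesystem, an interruption leaves either the old file or the complete copy in place rather than something half-replaced - but nothing here protects against the file being modified while it is being copied, which is exactly the danger warned about above.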
Download: defragger Perl script

Examples

# defragfiles --dorecent www-data www-data /var/lib/cacti/rra/*.rrd

Will defrag the .rrd files used by Cacti, which are highly prone to serious fragmentation. I suggest you watch the logs and only do this between Cacti finishing polling and the next poll starting to avoid the risk of corruption, or better yet disable Cacti while doing this.

# find ~joebloggs/.thunderbird/ -xdev -type f | defragfiles --verbose --readstdin joebloggs joebloggs

Thunderbird can get things like mail files really fragged up, so you may want to do this from time to time. Obviously make sure Thunderbird is not running, else bad stuff could happen to your mail.

# defragfiles --verbose --skiplowfrag=512 joebloggs joebloggs ~joebloggs/VMImages/*

Defrags VM images, but we really don't want to defrag these large files if we don't have to, so we use --skiplowfrag=512 to skip files which have less than one fragment per 512kB.
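To make that last threshold concrete, here is a tiny worked illustration using the figures from the badly fragmented CD image earlier - this is my interpretation of the setting's description, not code taken from the script.

#!/usr/bin/perl
# Worked illustration of the --skiplowfrag threshold as I read it: a file is
# only worth defragging if it averages less than the given kB per extent.
use strict;
use warnings;

my $skiplowfrag_kb = 512;
# figures from the badly fragmented CD image earlier
my ( $size, $extents ) = ( 731453440, 13000 );
my $kb_per_extent = $size / $extents / 1024;
printf "~%.0f kB/extent: %s\n", $kb_per_extent,
    $kb_per_extent < $skiplowfrag_kb ? 'worth defragging' : 'skipped';

At roughly 55kB per extent the CD image falls well under the 512kB threshold and would be defragged, whereas its tidy 3.5MB-per-extent copy would be skipped.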
Copyright Glen Pitt-Pladdy 2008-2023