Glen Pitt-Pladdy :: Blog

Linux RAID (mdadm) Rewriting Bad Blocks

On my Home Lab storage server I have 15TB of storage in RAID6. With large drives there appears to be increased risk (more space = more opportunity) of developing a bad spot as they wear. That's exactly what has happened on one drive.

RAID6 gives N+2 redundancy, so this is not a major problem, but smartmontools warns me about the pending sectors every time the machine boots. It would be good to force the drive to reallocate them and stop the warnings.

Drives are built with spare capacity so that wearing areas on the drive can be reallocated to the spare area. This typically happens on a write, when the data on the bad spot can't be read. The simple way of doing this would be to completely rebuild the array, but that takes a very long time, so I worked out a trick.

Bitmaps

A useful feature with Linux RAID is the ability to have a bitmap which tracks writes so that when the array needs to be rebuilt it knows specifically what blocks to resync. This saves enormous time. You can see if there are bitmaps with:

# cat /proc/mdstat

This gives the bitmap status, for example 1/23, which is essentially the number of pages that have bits set. It also gives the bitmap chunk size, which we will use later.
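As a rough sketch of pulling those two values out programmatically, the function below reads mdstat-formatted text on stdin and prints the array name, the page count and the chunk size. The function name `mdstat_bitmaps` is made up for this example, and it assumes the bitmap line has the usual `bitmap: 1/23 pages [4KB], 65536KB chunk` layout with the chunk size reported in KB.

```shell
# Print "<array> <pages> <chunk size in KB>" for every array in
# mdstat-formatted text on stdin that has a bitmap line.
# Assumption: the chunk size is the second-to-last field and ends in "KB".
mdstat_bitmaps() {
    awk '
        /^md/ { array = $1 }          # e.g. "md0 : active raid6 ..."
        $1 == "bitmap:" {
            chunk = $(NF - 1)         # e.g. "65536KB"
            sub(/KB$/, "", chunk)
            print array, $2, chunk
        }
    '
}

# On a live system:
#   mdstat_bitmaps < /proc/mdstat
```

Arrays without a bitmap simply produce no output, which is a quick way to spot the ones that still need one added.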

If the array doesn't have a bitmap then it's easily added with:

# mdadm --grow /dev/mdX --bitmap=internal

After this the bitmap allocation should hopefully fall to 0 (or close to it, depending on ongoing activity). This enables devices to be removed and re-added to the array, with rebuilds only having to hit the blocks where there are outstanding writes.

There are other things that you can do like optimising the bitmap chunk size, but this is really beyond what I'm looking at here.

Bad Blocks

I have identified some bad blocks on the wearing device. There are various ways of doing this including looking in logs, scanning with badblocks, running a long SMART test and checking the error position, and even just looking at the SMART logs.
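For the long SMART test route, the error position can be picked out of the self-test log with a small filter. This is only a sketch: `selftest_error_lbas` is a made-up helper name, and it assumes the log lines look like the usual `smartctl -l selftest` table for ATA drives, where a failed test says "read failure" and the LBA of the first error is the last column. The LBA reported there is relative to the whole disk, not to a partition.

```shell
# Print the LBA of the first error for every failed entry in
# `smartctl -l selftest` output supplied on stdin.
# Assumption: failed entries contain "read failure" and end with the LBA;
# entries without an error end with "-" and are skipped.
selftest_error_lbas() {
    awk '/read failure/ && $NF ~ /^[0-9]+$/ { print $NF }'
}

# On a live system (as root):
#   smartctl -l selftest /dev/sdY | selftest_error_lbas
```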

Next we need to do some maths. We need to work out which bitmap chunk (hence bitmap number) the bad blocks fall in. This is basically dividing the position on the device by chunk size and rounding down.

In my case I've identified sector (512 byte) 4127647976 is a bad spot and have the default 65536k (64M) chunks. This turns into:

4127647976 / 2 = 2063823988k

2063823988 / 65536 = 31491.45....

Rounded down this is chunk 31491

This is the chunk number that we will need to use in the bitmap to force the array to re-write this area on our device.
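The same arithmetic can be wrapped in a small shell function to avoid slips when there are several bad spots to work through. This is a minimal sketch: `chunk_for_sector` is a name invented here, and it assumes 512-byte sectors and a chunk size given in KB, as in the example above.

```shell
# chunk_for_sector SECTOR CHUNK_KB
#   SECTOR:   512-byte sector offset within the array member device
#   CHUNK_KB: bitmap chunk size in KB, as shown in /proc/mdstat
chunk_for_sector() {
    # Two 512-byte sectors per KB; shell integer division rounds down.
    echo $(( $1 / 2 / $2 ))
}

chunk_for_sector 4127647976 65536   # prints 31491
```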

It's important to keep in mind that this number has to be calculated based on the actual device added to the array. If this is a partition then you need to ensure it's the position of the bad block within that partition that you are using.
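If the bad sector was reported relative to the whole disk, one way to get the position within the partition is to subtract the partition's start sector. The sketch below assumes both values are in 512-byte sectors; `sector_in_partition` is a hypothetical helper and the numbers in the usage comment are made up for illustration.

```shell
# sector_in_partition DISK_SECTOR PART_START
#   DISK_SECTOR: bad sector counted from the start of the whole disk
#   PART_START:  first sector of the partition, counted the same way
sector_in_partition() {
    if [ "$1" -lt "$2" ]; then
        echo "sector $1 lies before partition start $2" >&2
        return 1
    fi
    echo $(( $1 - $2 ))
}

# On a live system the start sector can be read from sysfs, e.g.:
#   sector_in_partition 4127650024 "$(cat /sys/class/block/sdYZ/start)"
# With a partition starting at sector 2048 that would give 4127647976.
```

The function only checks that the sector is not before the partition; it does not check that it falls before the partition's end.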

Rewriting

Essentially the process I'm following is:

  1. Manually fail and remove the device with bad blocks
  2. Manually set the bit for the region with bad blocks
  3. Re-add the device to the array resulting in a re-sync and forcing the re-sync of the chunk we manually set

To fail and remove the device from the array I use:

# mdadm --fail /dev/mdX /dev/sdYZ
# mdadm --remove /dev/mdX /dev/sdYZ

Now we need to manually set the bitmap with:

# echo 31491 >/sys/devices/virtual/block/mdX/md/bitmap_set_bits

If you want to do a range then it's easy enough to do a for loop:

# for (( i=31400; i<31600; i++ )); do echo "$i" >/sys/devices/virtual/block/mdX/md/bitmap_set_bits; done
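When a group of bad sectors spans more than one chunk, the range of chunk numbers can be derived from the first and last bad sector rather than guessed. The following is a sketch under the same assumptions as before (512-byte sectors relative to the array member device, chunk size in KB); `chunks_for_sector_range` is a name made up for this example. It only prints the chunk numbers, so the output can be checked before anything is written to sysfs.

```shell
# chunks_for_sector_range FIRST_SECTOR LAST_SECTOR CHUNK_KB
# Prints every bitmap chunk number touched by the inclusive sector range.
# The body runs in a subshell so its variables do not leak.
chunks_for_sector_range() (
    i=$(( $1 / 2 / $3 ))
    last=$(( $2 / 2 / $3 ))
    while [ "$i" -le "$last" ]; do
        echo "$i"
        i=$(( i + 1 ))
    done
)

# Review the list first, then feed it to the array (as root):
#   chunks_for_sector_range 4127647976 4127719424 65536 |
#   while read -r bit; do
#       echo "$bit" >/sys/devices/virtual/block/mdX/md/bitmap_set_bits
#   done
```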

Then just add the device back:

# mdadm --re-add /dev/mdX /dev/sdYZ

Then wait for the rebuild to complete:

# cat /proc/mdstat

If you check the SMART attributes for the drive, there will hopefully now be fewer Pending Sectors and/or Offline Uncorrectable sectors, and possibly more Reallocated Sectors.
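To compare the counts before and after, the relevant lines can be filtered out of the `smartctl -A` attribute table. This is a sketch only: `smart_sector_counts` is a made-up helper, and it assumes the drive reports the common attribute names Reallocated_Sector_Ct, Current_Pending_Sector and Offline_Uncorrectable with the raw value in the tenth column. Some drives use different names or do not report all three.

```shell
# Print "<attribute name> <raw value>" for the three sector-health
# attributes from `smartctl -A` output supplied on stdin.
# Assumption: standard ATA attribute table with RAW_VALUE as column 10.
smart_sector_counts() {
    awk '
        $2 == "Reallocated_Sector_Ct"  ||
        $2 == "Current_Pending_Sector" ||
        $2 == "Offline_Uncorrectable"  { print $2, $10 }
    '
}

# On a live system (as root):
#   smartctl -A /dev/sdY | smart_sector_counts
```

Saving the output before the re-add and again after the rebuild makes it easy to see whether the pending count actually dropped.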

In my case I have a few groups of sectors so need to hit a few areas with this method. Bad sectors have gone from 1168 to 24 after doing this. The remaining sectors are probably in another location that I haven't discovered yet.