Friday, September 10, 2010

Server disk and RAID problems

I have two KVM servers that are pretty much identical, both running RAID1. I rarely log into the host systems, but did so today because I needed to create a new guest on one of them. It turned out that I couldn't do anything because the host's filesystem had been remounted read-only. I checked the other system, and it too was read-only!

Upon researching the problem, I found that sdb appeared to be hosed. When ext4 detects a serious error, it can abort the journal and remount the filesystem read-only to protect the data. I didn't notice because I haven't been good at monitoring the host servers, and all of the guest servers have been running just fine. Here is a snippet from the dmesg output:

[2116582.870838] Aborting journal on device dm-0:8.
[2116582.871461] EXT4-fs error (device dm-0): ext4_journal_start_sb: Detected aborted journal
[2116582.871512] EXT4-fs (dm-0): Remounting filesystem read-only

Upon looking at /var/log/syslog, I found the last entry to be dated Jun 18 on one server and Aug 7 on the other, which is presumably when each filesystem went read-only. I need to monitor these servers better...
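A minimal sketch of the kind of check that could catch a read-only remount sooner, for example from cron. The function name readonly_mounts is made up for illustration, and the sketch assumes the filesystems of interest are ext2/3/4 and that the input is in /proc/mounts format.

```shell
# Print the mount point of every ext2/3/4 filesystem mounted read-only.
# Reads /proc/mounts-formatted text on stdin:
#   device mountpoint fstype options dump pass
readonly_mounts() {
    awk '
        $3 ~ /^ext[234]$/ {
            # Options are comma-separated; look for an exact "ro" entry so
            # that "errors=remount-ro" on a healthy mount is not a match.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") { print $2; break }
        }
    '
}
```

On a live system it would be fed with readonly_mounts < /proc/mounts; any output means something needs attention. Since a read-only host may not be able to write logs or queue mail, the check is probably more useful when run from another machine over ssh.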

I also found that both systems were running on only a single drive out of the RAID1 set. The output of cat /proc/mdstat showed only one active drive on each system. I don't know if this was related to the problems with sdb on each system, but it needed to be dealt with.
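The degraded state shows up in /proc/mdstat as an underscore in the member map, e.g. [U_] instead of [UU]. As a rough sketch, that can be checked automatically; the function name degraded_arrays is hypothetical, and the parsing assumes the usual mdstat layout where the member map follows the array's "mdN :" line.

```shell
# Print the name of every md array whose member map shows a missing drive.
# Reads /proc/mdstat-formatted text on stdin.
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md0 : active raid1 sda1[0]".
        /^md[0-9]+ :/ { name = $1 }
        # The member map looks like "[UU]" (healthy) or "[U_]" (one missing).
        /\[[U_]+\]/ && name != "" {
            if ($0 ~ /\[[U_]*_[U_]*\]/) print name
            name = ""
        }
    '
}
```

It would be run as degraded_arrays < /proc/mdstat, with no output meaning every array has all of its members.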

To fix the problem, I had to reboot each server (shutdown -r now). When the server came up, it told me I needed to manually run fsck, which I did. Then I rebooted again and the server came up properly. Next, I added the missing drive back into the RAID array:

mdadm --manage /dev/md0 --add /dev/sdb1

Checking /proc/mdstat showed the partitions syncing. I rebooted just to make sure it all came back up properly, which it did. One of the servers finished syncing a little more than an hour later; the other one looks like it will take closer to 2 hours.
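While a rebuild is running, /proc/mdstat includes a progress line with a percentage and an estimated finish time. The sketch below pulls those out; the function name sync_progress is made up for illustration, and it assumes the usual "recovery = 12.6% ... finish=127.5min" layout of that line.

```shell
# Print "array percent eta" for every md array that is rebuilding.
# Reads /proc/mdstat-formatted text on stdin.
sync_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        /recovery =|resync =/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/)       pct = $i            # e.g. "12.6%"
                if ($i ~ /^finish=/) eta = substr($i, 8) # text after "finish="
            }
            print name, pct, eta
        }
    '
}
```

Running sync_progress < /proc/mdstat every so often gives a one-line summary per array; no output means nothing is currently syncing.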

I do this so rarely that I had forgotten the commands to use, but an earlier blog entry helped jog my memory.
