[SOLVED] Random change of RAID from /dev/md0 to /dev/md127

Postby Quasimodo on Thu Apr 05, 2012 12:18 pm

After considerable help in getting my RAID5 array (5 x Seagate 320GB SATA drives) to wake up after I changed from Ubuntu 11.04 to Linux Mint 12, I have sporadic problems on startup:

1. On power up, sometimes my system boots to the login screen with no problems; other times there's a little disk activity then a blank screen, no matter how long I wait.

2. If I use <Ctrl> <Alt> <Del> from the blank screen, the system reboots and presents me with the GRUB menu, offering normal boot (the default), boot into recovery mode and memtest. I then select the default.

3. The system sometimes boots to the login screen, but more worryingly sometimes reports that one or more of the partitions (sdb1, sdc1, sdd1, sde1, sdf1) in the RAID array is faulty.

4. If the system reports problems with the RAID array, I select "N" to decline booting with the degraded RAID array - this drops me to a Busybox prompt. <Ctrl> <Alt> <Del> reboots the system, which *usually* gets me to a login screen OK.

After this performance, I find that the RAID array has switched from being /dev/md0 to /dev/md127; according to the advice I received in the exchange of posts about waking up my RAID array, this is because mdadm gets confused by the content of /etc/mdadm/mdadm.conf. I've copied the content of this file below.
Code: Select all
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 metadata=1.2 UUID=f1802834-b239-4bd0-a805-d43954b77f95


The UUID of the RAID array matches what I get from
Code: Select all
sudo blkid


I've also made sure that the entry in /etc/fstab to mount the raid array at /home/shared uses the same UUID. Because I'm identifying the RAID array using the UUID in /etc/fstab, it mounts OK regardless of whether it's /dev/md0 or /dev/md127.

The most puzzling thing is that the output from sudo update-initramfs -u is:
Code: Select all
W: mdadm: the array /dev/md0 with UUID 66698c52:4c458d41:87b74864:110faf9b
W: mdadm: is currently active, but it is not listed in mdadm.conf. if
W: mdadm: it is needed for boot, then YOUR SYSTEM IS NOW UNBOOTABLE!
W: mdadm: please inspect the output of /usr/share/mdadm/mkconf, compare
W: mdadm: it to /etc/mdadm/mdadm.conf, and make the necessary changes.


When I do as recommended, and run /usr/share/mdadm/mkconf, the output is:
Code: Select all
# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md/TheBigOne metadata=1.2 UUID=66698c52:4c458d41:87b74864:110faf9b name=:TheBigOne


I noted that the output of /usr/share/mdadm/mkconf shows the array as /dev/md/TheBigOne, which matches the name shown by the Gnome disk utility.
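
The difference between the two ARRAY lines above can be checked mechanically. A minimal sketch, assuming the two lines have been copied into shell variables by hand (on a live system they would come from /etc/mdadm/mdadm.conf and from /usr/share/mdadm/mkconf); array_uuid is a hypothetical helper name, not part of mdadm:

```shell
#!/bin/sh
# array_uuid is a hypothetical helper: print the value of the UUID= field
# of an mdadm.conf-style ARRAY line passed as the first argument.
array_uuid() {
    # Put each space-separated field on its own line, then keep only the
    # one that starts with UUID= and strip that prefix.
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^UUID=//p'
}

# ARRAY lines copied from the two listings above.
conf_line='ARRAY /dev/md0 metadata=1.2 UUID=f1802834-b239-4bd0-a805-d43954b77f95'
mkconf_line='ARRAY /dev/md/TheBigOne metadata=1.2 UUID=66698c52:4c458d41:87b74864:110faf9b name=:TheBigOne'

conf_uuid=$(array_uuid "$conf_line")
mkconf_uuid=$(array_uuid "$mkconf_line")

if [ "$conf_uuid" = "$mkconf_uuid" ]; then
    echo "match: $conf_uuid"
else
    echo "mismatch: mdadm.conf has $conf_uuid, mkconf reports $mkconf_uuid"
fi
```

With the values from this thread the script reports a mismatch, which is consistent with the update-initramfs warning that the active array is not listed in mdadm.conf.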

I'm not sure whether the booting problem is linked to the peculiarities with mdadm, but I have my suspicions. I should add that I doubt whether it's a hardware problem; when I was running Ubuntu 11.04, I had no problems with erratic booting or the random change from the RAID array being /dev/md0 to being /dev/md127.

Can anyone point me to which log files I should check to track down what's happening with the boot process, as a starter, please?

Any help would be much appreciated...

Ian
Last edited by Quasimodo on Wed Apr 11, 2012 7:32 am, edited 1 time in total.
Quasimodo
Level 1
Level 1
 
Posts: 48
Joined: Mon Oct 31, 2011 9:18 am
Location: Newbury, Berkshire, UK

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby kwisher on Fri Apr 06, 2012 3:09 pm

What happens when you remove the UUID from your fstab file?
The instructions suggested Windows XP or better, so I installed Linux :)
kwisher
Level 5
Level 5
 
Posts: 635
Joined: Wed Mar 05, 2008 12:54 pm
Location: Greentown, Indiana USA

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby Quasimodo on Sat Apr 07, 2012 9:17 am

Thanks for the suggestion. I tried that (details below), and the system came up to the point where it tried to mount /home/shared/, then failed because it couldn't find the RAID array identified in /etc/fstab.

Overall procedure:
1. I unmounted /home/shared if necessary
2. Using gnome-disk-utility, I stopped the RAID array (which was identified as /dev/md127)
3. Using gnome-disk-utility, I started the RAID array (which came up as /dev/md0)
4. I edited /etc/fstab and rebooted. Both times, I wasn't all that surprised to find, when I ran sudo blkid, that the RAID array was /dev/md127.

Changes to /etc/fstab:

1. I commented out the line which identified the RAID array by its UUID, and replaced it with a line which identified it as /dev/md0.

2. I modified the line which identified the RAID array as /dev/md0 to identify it as /dev/md/TheBigOne (as indicated when I ran sudo /usr/share/mdadm/mkconf).

It seems as though gnome-disk-utility is quite happy to use the data in /etc/mdadm/mdadm.conf to set up the RAID array, but when the system restarts, *something* is trying to bring up the RAID array using spurious data from somewhere else (as witness the different UUID for the RAID array which comes up when I run sudo update-initramfs -u and when I run sudo /usr/share/mdadm/mkconf).

I really would like to be able to track down why this is happening (at the "user" level, rather than the "climb into the workings" level). It would also be nice to have a system which just boots up properly without having to mess around using <Ctrl><Alt><Del> several times before I can log in!

Ian

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby kwisher on Sat Apr 07, 2012 12:06 pm


Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby Quasimodo on Mon Apr 09, 2012 12:03 pm

Thanks for the suggestion.

Yes, it can - I keep a backup of the contents of my RAID array on a separate file server (also with a RAID5 array...). However, I've already tried this once, using the instructions at http://www.ainer.org/raid-5-6-install-setup-configuration-guide-for-ubuntu-10-04-lts-lucid-lynx, which is what I used originally to set up the RAID when I was running Ubuntu 11.04 (and, incidentally, to set up the RAID on my file server, which is running Debian Squeeze). Things seem no better than before I spent the several hours rebuilding the RAID array.

Since I have quite a lot of space free on my primary hard drive, I think I'll use gparted to tweak my partitions and free up some space to install another distro alongside Mint, and see how I get on with that...

Ian

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby kwisher on Mon Apr 09, 2012 12:28 pm

Go to this forum discussion and read the post by bigj with the date of 4/9/2012. This might provide the key for you if you are willing to rebuild your array again. http://www.technibble.com/forums/showthread.php?t=36159

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby Quasimodo on Mon Apr 09, 2012 12:46 pm

Thanks for the speedy reaction - I see it's about lunchtime with you (nearly teatime for me, I must go and get the loaf out of the oven...).

OK, I've looked up the post by bigj at the URL you gave me, and I'll give that a go - it may well be the cause of the strangeness with different UUIDs for the same array... I'd already synchronised my RAID array with the file server, so I'll be ready to scrub it and rebuild (probably tomorrow, now).

I'll report back when I've done the rebuild.

Thanks

Ian

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby Quasimodo on Tue Apr 10, 2012 12:37 pm

Right, I've done it, but no improvement, I'm afraid :( . However I think I have a bit better idea of the *cause* of the problem, even though I'm no nearer finding a *cure*.

To the minor question of why I get the GRUB boot screen after I've used <Ctrl><Alt><Del> to escape from the blank screen at boot-up: it happened that last night I aborted from a boot-up on my netbook, and when I restarted it, *that* came up with the GRUB boot screen. I'm pretty sure that's a precautionary measure which Mint takes after a failed boot sequence.

To the main problem: I followed the procedure recommended by bigj in http://www.technibble.com/forums/showthread.php?t=36159 to clear the partition tables on the 5 drives in my RAID array, then worked through the recipe in http://www.linuxhomenetworking.com/wiki/index.php/Quick_HOWTO_:_Ch26_:_Linux_Software_RAID. I noticed that after I'd run
Code: Select all
sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=5 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1
, each of the drives in the RAID array had a 2-part UUID; see the output from blkid below.
Code: Select all
ian@arthur ~ $ sudo blkid
/dev/sda1: UUID="ef71c093-77e1-46ef-a077-ff5236a25568" TYPE="ext4"
/dev/sda2: UUID="3fa4d34d-34b5-4ead-9c0c-0a0c8cc62e1e" TYPE="ext4"
/dev/sda5: UUID="2b7617dd-54a6-43d6-8b21-c78d0b86726b" TYPE="ext4"
/dev/sda6: UUID="07e53944-6ef6-400d-b230-4d389ac50e05" TYPE="swap"
/dev/sdb1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="536efee0-e89a-5da9-3826-1381a558da8e" LABEL="arthur:0" TYPE="linux_raid_member"
/dev/sdc1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="411d64bf-6637-3665-16c7-9e727ee320e1" LABEL="arthur:0" TYPE="linux_raid_member"
/dev/md127: UUID="943d0174-7cbd-48af-8871-2cc8700b9d1b" TYPE="ext4"
/dev/sdd1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="9d5a1c11-ec0c-ef09-18c4-6434519aa048" LABEL="arthur:0" TYPE="linux_raid_member"
/dev/sde1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="de5384e6-c972-e5a7-b178-0e1728a0c1c2" LABEL="arthur:0" TYPE="linux_raid_member"
/dev/sdf1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="6f43bf01-3434-d9b5-6b56-a6a283ec3d96" LABEL="arthur:0" TYPE="linux_raid_member"

The UUID_SUB varies from one drive to another; the UUID 3d0b7e37-4689-7b65-0e8d-1b3559dac058 is the UUID of the RAID array. However when I formatted the volume on the RAID array using
Code: Select all
sudo mkfs.ext4 /dev/md0

the RAID array finished up with a different UUID. I think *that* is why mdadm gets confused when the system boots up, and allocates /dev/md127 to the RAID array.
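
For reference, the blkid listing above holds two different kinds of UUID: the one shared by the linux_raid_member partitions identifies the array, while the one on /dev/md127 identifies the ext4 filesystem created on it. A rough sketch for pulling them apart, using three lines copied from that listing as sample input. uuids_of_type and blkid_to_mdadm are hypothetical helper names, and the colon regrouping is an assumption that the array uses 1.x metadata, where mdadm prints the same 32 hex digits in four groups of eight:

```shell
#!/bin/sh
# Three lines copied from the blkid listing above, used as sample input.
sample='/dev/sdb1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="536efee0-e89a-5da9-3826-1381a558da8e" LABEL="arthur:0" TYPE="linux_raid_member"
/dev/md127: UUID="943d0174-7cbd-48af-8871-2cc8700b9d1b" TYPE="ext4"
/dev/sda1: UUID="ef71c093-77e1-46ef-a077-ff5236a25568" TYPE="ext4"'

# Hypothetical helper: read blkid-style lines on stdin and print the UUID
# field of every line whose TYPE equals the first argument.
uuids_of_type() {
    grep "TYPE=\"$1\"" | sed -n 's/^[^ ]* UUID="\([^"]*\)".*/\1/p'
}

# Hypothetical helper: drop the dashes from a UUID and regroup the 32 hex
# digits into four colon-separated groups of eight.
blkid_to_mdadm() {
    printf '%s\n' "$1" |
        sed -e 's/-//g' \
            -e 's/\(.\{8\}\)\(.\{8\}\)\(.\{8\}\)\(.\{8\}\)/\1:\2:\3:\4/'
}

# UUID shared by the member partitions, i.e. the UUID of the array itself.
raid_uuid=$(printf '%s\n' "$sample" | uuids_of_type linux_raid_member | sort -u)
echo "array UUID, colon form: $(blkid_to_mdadm "$raid_uuid")"

# UUID of the filesystem that mkfs.ext4 created on the assembled array.
fs_uuid=$(printf '%s\n' "$sample" | grep '^/dev/md' | uuids_of_type ext4)
echo "filesystem UUID on the md device: $fs_uuid"
```

The two values live in different places (the md superblock on each member partition versus the ext4 superblock inside the array), so formatting the array adds a second UUID rather than replacing the first.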

However I can't think how I can avoid the UUID which is allocated to the RAID array by mdadm being overwritten by mkfs.ext4. It seems odd that this problem hasn't been observed before - I find it hard to believe that nobody who's using Mint 12 has tried to set up a RAID5 array before...

I haven't yet restored the contents of the RAID array from the backup, so I'll have another go at working through what I did today, but hold off using mkfs.ext4...

Thanks for your patience - I hope to hear from you again!

Ian

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby kwisher on Tue Apr 10, 2012 1:54 pm

I would suggest either eliminating the UUID of the array or updating mdadm.conf with the new UUID after formatting. On the two RAID-5 servers I have built for customers I have not used the UUID and have had no issues.

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby DrHu on Tue Apr 10, 2012 3:40 pm

http://superuser.com/questions/117824/h ... king-again
--that is your data set loader sequence
    This user was initially using labels instead of uuid
    --re-sequencing from label to uuid solved it for that poster..

And one item that post suggests is to check this file
Code: Select all
cat /proc/mdstat
DrHu
Level 16
Level 16
 
Posts: 6882
Joined: Wed Jun 17, 2009 8:20 pm

Re: Random change of RAID array from /dev/md0 to /dev/md127

Postby Quasimodo on Tue Apr 10, 2012 3:50 pm

Hurrah! I think I've cracked it!!!

When I re-read the instructions at http://www.ainer.org/raid-5-6-install-setup-configuration-guide-for-ubuntu-10-04-lts-lucid-lynx/4, I realised that I'd *mis-read* them before - the UUID which should go into /etc/mdadm/mdadm.conf is the UUID of *the RAID array* as revealed by
Code: Select all
sudo mdadm --detail --scan
rather than the UUID of the *file system* as revealed by
Code: Select all
sudo blkid
. Once I'd fixed that, I ran
Code: Select all
sudo update-initramfs -u
and it didn't object. I then rebooted, and the system went to the login screen with no hiccups, with the RAID array as /dev/md0, mounted at /home/shared.
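
The fix described above comes down to replacing the ARRAY line in /etc/mdadm/mdadm.conf with one that carries the array UUID. A minimal sketch that does this on a scratch copy rather than the real file. set_array_line is a hypothetical helper, and the sample lines are the ARRAY entries quoted earlier in the thread (on a live system the replacement would come from sudo mdadm --detail --scan):

```shell
#!/bin/sh
# Hypothetical helper: drop every existing ARRAY line from the config file
# given as $1 and append the replacement ARRAY line given as $2.
set_array_line() {
    conf=$1
    new_line=$2
    tmp=$(mktemp) || return 1
    # Keep all non-ARRAY lines (DEVICE, MAILADDR, comments, ...) unchanged.
    grep -v '^ARRAY ' "$conf" > "$tmp"
    printf '%s\n' "$new_line" >> "$tmp"
    # Overwrite the contents in place so the file keeps its permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Work on a scratch copy, not on /etc/mdadm/mdadm.conf itself.
scratch=$(mktemp) || exit 1
trap 'rm -f "$scratch"' EXIT
printf '%s\n' \
    'DEVICE partitions' \
    'MAILADDR root' \
    'ARRAY /dev/md0 metadata=1.2 UUID=f1802834-b239-4bd0-a805-d43954b77f95' \
    > "$scratch"

# Sample replacement: the ARRAY line that mkconf printed earlier in the thread.
scan_line='ARRAY /dev/md/TheBigOne metadata=1.2 UUID=66698c52:4c458d41:87b74864:110faf9b name=:TheBigOne'

set_array_line "$scratch" "$scan_line"
cat "$scratch"
```

After editing the real file, sudo update-initramfs -u would need to be run again, as in the post above, so that the copy of mdadm.conf used at boot is refreshed.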

I'm afraid this was definitely a case of WAEFRTI (When All Else Fails, Read The Instructions)!

Now to restore the contents of the RAID array from the file server...

Thanks again for your patience.

All the best

Ian