[SOLVED] Random change of RAID from /dev/md0 to /dev/md127

Quasimodo
Level 2
Posts: 55
Joined: Mon Oct 31, 2011 9:18 am
Location: Newbury, Berkshire, UK

[SOLVED] Random change of RAID from /dev/md0 to /dev/md127

Post by Quasimodo »

After considerable help in getting my RAID5 array (5 x Seagate 320GB SATA drives) to wake up after I changed from Ubuntu 11.04 to Linux Mint 12, I have sporadic problems on startup:

1. On power up, sometimes my system boots to the login screen with no problems; other times there's a little disk activity then a blank screen, no matter how long I wait.

2. If I use <Ctrl> <Alt> <Del> from the blank screen, the system reboots and presents me with the GRUB menu, offering normal boot (the default), boot into recovery mode and memtest. I then select the default.

3. The system sometimes boots to the login screen, but more worryingly sometimes reports that one or more of the partitions (sdb1, sdc1, sdd1, sde1, sdf1) in the RAID array is faulty.

4. If the system reports problems with the RAID array, I select "N" to decline booting with the degraded RAID array - this drops me to a Busybox prompt. <Ctrl> <Alt> <Del> reboots the system, which *usually* gets me to a login screen OK.

After this performance, I find that the RAID array has switched from being /dev/md0 to /dev/md127; according to the advice I received in the exchange of posts about waking up my RAID array, this is because mdadm gets confused by the content of /etc/mdadm/mdadm.conf. I've copied the content of this file below.

Code: Select all

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md0 metadata=1.2 UUID=f1802834-b239-4bd0-a805-d43954b77f95
The UUID of the RAID array matches what I get from

Code: Select all

sudo blkid
I've also made sure that the entry in /etc/fstab to mount the raid array at /home/shared uses the same UUID. Because I'm identifying the RAID array using the UUID in /etc/fstab, it mounts OK regardless of whether it's /dev/md0 or /dev/md127.

The most puzzling thing is that the output from sudo update-initramfs -u is:

Code: Select all

W: mdadm: the array /dev/md0 with UUID 66698c52:4c458d41:87b74864:110faf9b
W: mdadm: is currently active, but it is not listed in mdadm.conf. if
W: mdadm: it is needed for boot, then YOUR SYSTEM IS NOW UNBOOTABLE!
W: mdadm: please inspect the output of /usr/share/mdadm/mkconf, compare
W: mdadm: it to /etc/mdadm/mdadm.conf, and make the necessary changes.
When I do as recommended, and run /usr/share/mdadm/mkconf, the output is:

Code: Select all

# mdadm.conf
#
# Please refer to mdadm.conf(5) for information about this file.
#

# by default, scan all partitions (/proc/partitions) for MD superblocks.
# alternatively, specify devices to scan, using wildcards if desired.
DEVICE partitions

# auto-create devices with Debian standard permissions
CREATE owner=root group=disk mode=0660 auto=yes

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR root

# definitions of existing MD arrays
ARRAY /dev/md/TheBigOne metadata=1.2 UUID=66698c52:4c458d41:87b74864:110faf9b name=:TheBigOne
I noted that the output of /usr/share/mdadm/mkconf shows the array as /dev/md/TheBigOne, which matches the name shown by the Gnome disk utility.

I'm not sure whether the booting problem is linked to the peculiarities with mdadm, but I have my suspicions. I should add that I doubt whether it's a hardware problem; when I was running Ubuntu 11.04, I had no problems with erratic booting or the random change from the RAID array being /dev/md0 to being /dev/md127.

Can anyone point me to which log files I should check to track down what's happening with the boot process, as a starter, please?

Any help would be much appreciated...

Ian
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 2 times in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
kwisher

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by kwisher »

What happens when you remove the UUID from your fstab file?
Quasimodo

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by Quasimodo »

Thanks for the suggestion. I tried that (details below), and the system came up to the point where it tried to mount /home/shared/, then failed because it couldn't find the RAID array identified in /etc/fstab.

Overall procedure:
1. I unmounted /home/shared if necessary
2. Using gnome-disk-utility, I stopped the RAID array (which was identified as /dev/md127)
3. Using gnome-disk-utility, I started the RAID array (which came up as /dev/md0)
4. I edited /etc/fstab and rebooted. Both times, I wasn't all that surprised to find, when I ran sudo blkid, that the RAID array was /dev/md127.

Changes to /etc/fstab:

1. I commented out the line which identified the RAID array by its UUID, and replaced it with a line which identified it as /dev/md0.

2. I modified the line which identified the RAID array as /dev/md0 to identify it as /dev/md/TheBigOne (as indicated when I ran sudo /usr/share/mdadm/mkconf).

It seems as though gnome-disk-utility is quite happy to use the data in /etc/mdadm/mdadm.conf to set up the RAID array, but when the system restarts, *something* is trying to bring up the RAID array using spurious data from somewhere else (as witness the different UUID for the RAID array which comes up when I run sudo update-initramfs -u and when I run sudo /usr/share/mdadm/mkconf).

I really would like to be able to track down why this is happening (at the "user" level, rather than the "climb into the workings" level). It would also be nice to have a system which just boots up properly without having to mess around using <Ctrl><Alt><Del> several times before I can log in!

Ian
kwisher

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by kwisher »

Can the data be moved to another location so that you can rebuild the array from scratch? Below are some links to mdadm info I used on previous systems.

http://www.hscripts.com/tutorials/linux ... mdadm.html
http://www.linuxhomenetworking.com/wiki ... tware_RAID
http://www.howtoforge.com/software-raid ... ebian-etch
Quasimodo

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by Quasimodo »

Thanks for the suggestion.

Yes, it can - I keep a backup of the contents of my RAID array on a separate file server (also with a RAID5 array...). However, I've already tried this once, using the instructions at http://www.ainer.org/raid-5-6-install-s ... lucid-lynx, which is what I used originally to set up the RAID when I was running Ubuntu 11.04 (and, incidentally, to set up the RAID on my file server, which is running Debian Squeeze). Things seem no better than before I spent the several hours rebuilding the RAID array.

Since I have quite a lot of space free on my primary hard drive, I think I'll use gparted to tweak my partitions and free up some space to install another distro alongside Mint, and see how I get on with that...

Ian
kwisher

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by kwisher »

Go to this forum discussion and read the post by bigj with the date of 4/9/2012. This might provide the key for you if you are willing to rebuild your array again. http://www.technibble.com/forums/showthread.php?t=36159
Quasimodo

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by Quasimodo »

Thanks for the speedy reaction - I see it's about lunchtime with you (nearly teatime for me, I must go and get the loaf out of the oven...).

OK, I've looked up the post by bigj at the URL you gave me, and I'll give that a go - it may well be the cause of the strangeness with different UUIDs for the same array... I'd already synchronised my RAID array with the file server, so I'll be ready to scrub it and rebuild (probably tomorrow, now).

I'll report back when I've done the rebuild.

Thanks

Ian
Quasimodo

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by Quasimodo »

Right, I've done it, but no improvement, I'm afraid :( . However I think I have a bit better idea of the *cause* of the problem, even though I'm no nearer finding a *cure*.

To the minor question of why I get the GRUB boot screen after I've used <Ctrl><Alt><Del> to escape from the blank screen at boot-up: it happened that last night I aborted from a boot-up on my netbook, and when I restarted it, *that* came up with the GRUB boot screen. I'm pretty sure that's a precautionary measure which Mint takes after a failed boot sequence.

To the main problem: I followed the procedure recommended by bigj in http://www.technibble.com/forums/showthread.php?t=36159 to clear the partition tables on the 5 drives in my RAID array, then worked through the recipe in http://www.linuxhomenetworking.com/wiki ... tware_RAID. I created the array with

Code: Select all

sudo mdadm --create --verbose /dev/md0 --level=5 --raid-devices=5 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1

and noticed that each of the drives in the RAID array then had a 2-part UUID; see the output from blkid below.

Code: Select all

ian@arthur ~ $ sudo blkid
/dev/sda1: UUID="ef71c093-77e1-46ef-a077-ff5236a25568" TYPE="ext4" 
/dev/sda2: UUID="3fa4d34d-34b5-4ead-9c0c-0a0c8cc62e1e" TYPE="ext4" 
/dev/sda5: UUID="2b7617dd-54a6-43d6-8b21-c78d0b86726b" TYPE="ext4" 
/dev/sda6: UUID="07e53944-6ef6-400d-b230-4d389ac50e05" TYPE="swap" 
/dev/sdb1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="536efee0-e89a-5da9-3826-1381a558da8e" LABEL="arthur:0" TYPE="linux_raid_member" 
/dev/sdc1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="411d64bf-6637-3665-16c7-9e727ee320e1" LABEL="arthur:0" TYPE="linux_raid_member" 
/dev/md127: UUID="943d0174-7cbd-48af-8871-2cc8700b9d1b" TYPE="ext4" 
/dev/sdd1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="9d5a1c11-ec0c-ef09-18c4-6434519aa048" LABEL="arthur:0" TYPE="linux_raid_member" 
/dev/sde1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="de5384e6-c972-e5a7-b178-0e1728a0c1c2" LABEL="arthur:0" TYPE="linux_raid_member" 
/dev/sdf1: UUID="3d0b7e37-4689-7b65-0e8d-1b3559dac058" UUID_SUB="6f43bf01-3434-d9b5-6b56-a6a283ec3d96" LABEL="arthur:0" TYPE="linux_raid_member"
The UUID_SUB varies from one drive to another; the UUID 3d0b7e37-4689-7b65-0e8d-1b3559dac058 is the UUID of the RAID array. However when I formatted the volume on the RAID array using

Code: Select all

sudo mkfs.ext4 /dev/md0
the RAID array finished up with a different UUID. I think *that* is why mdadm gets confused when the system boots up, and allocates /dev/md127 to the RAID array.
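
For anyone comparing the two values, the blkid listing above carries UUIDs from two different layers: the partitions tagged TYPE="linux_raid_member" all share the UUID of the RAID array, while the /dev/md127 line carries the UUID of the ext4 file system sitting on top of it. A minimal sketch for separating them, assuming blkid's one-line-per-device output as shown above and labels without spaces; classify_blkid is a made-up helper name, not part of any package:

```shell
# Read blkid-style lines on stdin and print "<kind> <device> <uuid>",
# where kind is "array" for RAID member partitions and "fs" for anything else.
# Assumption: fields are separated by whitespace and no LABEL contains a space.
classify_blkid() {
    awk '{
        uuid = ""; kind = "fs"
        for (i = 2; i <= NF; i++) {
            # Match UUID="..." only; UUID_SUB="..." does not start with UUID="
            if ($i ~ /^UUID="/) { uuid = $i; gsub(/^UUID="|"$/, "", uuid) }
            if ($i == "TYPE=\"linux_raid_member\"") kind = "array"
        }
        dev = $1; sub(/:$/, "", dev)   # strip the trailing colon from the device
        if (uuid != "") print kind, dev, uuid
    }'
}
```

On a live system this would be fed with something like sudo blkid | classify_blkid; the "array" lines should then all show one UUID and the "fs" line for the md device another.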

However I can't think how I can avoid the UUID which is allocated to the RAID array by mdadm being overwritten by mkfs.ext4. It seems odd that this problem hasn't been observed before - I find it hard to believe that nobody who's using Mint 12 has tried to set up a RAID5 array before...

I haven't yet restored the contents of the RAID array from the backup, so I'll have another go at working through what I did today, but hold off using mkfs.ext4...

Thanks for your patience - I hope to hear from you again!

Ian
kwisher

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by kwisher »

I would suggest either eliminating the UUID of the array or updating mdadm.conf with the new UUID after formatting. On the two RAID-5 servers I have built for customers I have not used the UUID and have had no issues.
DrHu

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by DrHu »

http://superuser.com/questions/117824/h ... king-again
--that is your data set loader sequence
  • This user was initially using labels instead of uuid
    --re-sequencing from label to uuid solved it for that poster..
And one item that post suggests is to check this file

Code: Select all

cat /proc/mdstat
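
To see at a glance which name the array actually came up under, the same file can be filtered for active arrays. A rough sketch, assuming the usual /proc/mdstat layout in which each array line starts with something like "md127 : active"; active_md_devices is a hypothetical helper name:

```shell
# Print the /dev path of every active md array described on stdin.
# Assumption: array lines look like "mdN : active <level> <members...>".
active_md_devices() {
    awk '$1 ~ /^md[0-9]+$/ && $2 == ":" && $3 == "active" { print "/dev/" $1 }'
}
```

Usage would be active_md_devices < /proc/mdstat, which should print /dev/md0 or /dev/md127 depending on how the array was assembled at boot.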
Quasimodo

Re: Random change of RAID array from /dev/md0 to /dev/md127

Post by Quasimodo »

Hurrah! I think I've cracked it!!!

When I re-read the instructions at http://www.ainer.org/raid-5-6-install-s ... cid-lynx/4, I realised that I'd *mis-read* them before - the UUID which should go into /etc/mdadm/mdadm.conf is the UUID of *the RAID array* as revealed by

Code: Select all

sudo mdadm --detail --scan
rather than the UUID of the *file system* as revealed by

Code: Select all

sudo blkid
Once I'd fixed that, I ran

Code: Select all

sudo update-initramfs -u
and it didn't object. I rebooted, the system went to the login screen with no hiccups, and the RAID array came up as /dev/md0, mounted at /home/shared.
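
The check that would have caught this earlier can be sketched as a comparison between the UUID on the ARRAY line of mdadm.conf and the one mdadm itself reports. This is only a sketch: array_uuids and conf_matches_scan are made-up helper names, and it assumes each ARRAY line carries a single UUID= token. Punctuation is stripped before comparing because mdadm prints array UUIDs with colons, whereas blkid shows them with hyphens.

```shell
# Print the normalised UUID of every ARRAY line read from standard input.
# Accepts mdadm.conf content or saved `mdadm --detail --scan` output.
array_uuids() {
    grep '^ARRAY' |
        tr -s ' \t' '\n' |          # one token per line
        sed -n 's/^UUID=//p' |      # keep only the UUID values
        tr -d ':-' |                # drop punctuation
        tr 'A-F' 'a-f' |            # lower-case the hex digits
        sort -u
}

# Succeed only if the config lists at least one array and its UUIDs
# are exactly the ones found in the scan output.
# $1: path to mdadm.conf   $2: file holding saved scan output
conf_matches_scan() {
    conf=$(array_uuids < "$1")
    scan=$(array_uuids < "$2")
    [ -n "$conf" ] && [ "$conf" = "$scan" ]
}
```

On a live system the scan file could be produced with sudo mdadm --detail --scan > /tmp/scan.txt, followed by conf_matches_scan /etc/mdadm/mdadm.conf /tmp/scan.txt; a non-zero exit status would point to the kind of mismatch described in this thread.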

I'm afraid this was definitely a case of WAEFRTI (When All Else Fails, Read The Instructions)!

Now to restore the contents of the RAID array from the file server...

Thanks again for your patience.

All the best

Ian