fikiz Apprentice

Joined: 07 Mar 2005 Posts: 282 Location: Italy
|
Posted: Fri Dec 27, 2013 7:59 pm Post subject: software raid-1: data loss & "super non-persistent"
|
|
Hello everybody.
Yesterday I did a clean poweroff of my Gentoo machine while a software raid-1 rebuild was running, expecting the rebuild to start over at the next boot.
At the next boot no rebuild was running, and the filesystem was corrupt (but mounted), giving some I/O errors. I manually stopped the array just to check the contents of the two partitions, and realized I had lost everything (don't worry... lots of backups lying around!). I didn't like this... in 10+ years of running software raid-1, this is the first trick it has played on me.
Then I created a fresh new raid-1 array on the same 2 partitions and recovered the data from backup. A full rebuild of the mirror completed successfully.
Now I see something in /proc/mdstat that I hadn't noticed before:
Code: |
# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid1 sdb12[1] sda7[0]
188743680 blocks super non-persistent [2/2] [UU]
|
non-persistent? What? I googled around but didn't find anything telling me exactly what this means. Could somebody kindly explain what it means and what I did wrong when creating this array (I probably made the same mistake before losing my data)?
thanks! |
sabayonino l33t


Joined: 03 Jan 2012 Posts: 716
|
Posted: Fri Dec 27, 2013 8:40 pm Post subject: |
|
|
Hi,
the "non-persistent" item refers to the metadata type of your array(s).
See Linux Raid.
I found this: https://raid.wiki.kernel.org/index.php/RAID_setup#External_Metadata
and
http://www.mjmwired.net/kernel/Documentation/md.txt
Quote: | metadata_version
This indicates the format that is being used to record metadata
about the array. It can be 0.90 (traditional format), 1.0, 1.1,
1.2 (newer format in varying locations) or "none" indicating that
the kernel isn't managing metadata at all.
Alternately it can be "external:" followed by a string which
is set by user-space. This indicates that metadata is managed
by a user-space program. Any device failure or other event that
requires a metadata update will cause array activity to be
suspended until the event is acknowledged. |
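For context (and not necessarily what happened here): mdadm also has a --build mode, which assembles legacy arrays without writing any superblock to the members, and that is one way to end up with "super non-persistent" in /proc/mdstat. A hedged sketch reusing the partition names from the first post:
Code: | # build a raid-1 with NO on-disk metadata (illustration only)
# without a superblock the array cannot be auto-assembled after a reboot
mdadm --build /dev/md0 --level=1 --raid-devices=2 /dev/sda7 /dev/sdb12 |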
My RAID 1 is set up like this:
Quote: | cat /proc/mdstat
Personalities : [raid0] [raid1] [linear] [multipath]
md0 : active raid1 sdd1[0] sde1[1]
976629568 blocks super 1.2 [2/2] [UU]
|
"1.2" is the metadata applied
My raid was created with:
Code: | # mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1 |
and the metadata version applied was 1.2 (the default for mdadm v3.x).
More metadata info ---> md(4) Linux man page
or simply 'man md'
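If you want to be explicit rather than rely on the default, --create also accepts a --metadata option, and the version written to the members can be checked afterwards; a small sketch (array and device names are only examples):
Code: | # create with an explicit metadata version instead of the mdadm default
mdadm --create /dev/md0 --metadata=1.2 --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1
# verify the metadata version on the array and on a member
mdadm --detail /dev/md0 | grep Version
mdadm --examine /dev/sdd1 | grep Version |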
_________________ BOINC and distributed computing
My LiveRecoverySystem Repo |
frostschutz Advocate


Joined: 22 Feb 2005 Posts: 2971 Location: Germany
|
Posted: Sat Dec 28, 2013 3:40 am Post subject: |
|
|
Which kernel/mdadm version are you using?
What command(s) exactly did you use when you replaced your disk and/or set up the new array? |
fikiz Apprentice

Joined: 07 Mar 2005 Posts: 282 Location: Italy
|
Posted: Sat Dec 28, 2013 9:25 am Post subject: |
|
|
Sabayonino:
Thank you for your help; maybe I searched for the wrong terms. In particular, I had missed 'man md'.
frostschutz:
I'm running kernel 3.10.17, mdadm 3.2.6-r1. This is a brand new Gentoo installation.
After my opening post, I rebooted: the md raid block device wasn't available and the filesystem was mounted directly on /dev/sda7 (the mount is done through a LABEL reference). This time it was expected: 'non-persistent' means there is no superblock on the raid member devices and no metadata at all, hence the direct mounting of the filesystem and no md raid block device after a reboot. At least this is how I understood it after reading the links suggested by Sabayonino, and that's fine with me.
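As an aside, this can be seen from user space too: with no md superblock and no data offset, the member partition starts with the filesystem itself, so a LABEL lookup matches the bare partition. A small, purely illustrative sketch with the device name from above:
Code: | # on a superblock-less (non-persistent) member the filesystem signature
# sits at the very start, so its LABEL/UUID is reported on the partition itself;
# with 1.2 metadata the filesystem is offset and only the md device would show it
blkid /dev/sda7 |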
So I re-created the raid:
Code: | mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda7 /dev/sdb12 |
This correctly created a new array with superblock metadata 1.2. While the rebuild was running, I rebooted and the rebuild resumed where it had been interrupted. The only difference is that the raid md device is now /dev/md127.
Before the data loss I had done things differently: I first created a raid-1 with only 1 device, planning to add the second hard disk later. These are the steps I took:
Code: | mdadm --create /dev/md0 --level=1 --force --raid-devices=1 /dev/sda7 |
After physically installing the second hard disk, I booted and the (obviously non-redundant) raid device was still there, named /dev/md127, with metadata version 1.2. Then I added the second partition:
Code: | mdadm --manage /dev/md127 --add /dev/sdb12
mdadm --grow /dev/md127 --raid-devices=2 |
A new rebuild started as expected, still with metadata version 1.2. I'm quite sure that after rebooting everything will be ok.
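As a side note, the resync progress can be watched while it runs; a trivial sketch using the same device name:
Code: | # overall progress, speed and ETA of the running resync
cat /proc/mdstat
# more detail on the array state, including rebuild status
mdadm --detail /dev/md127 |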
End of story: I most likely messed something up when I created the array the first time, so this shouldn't be a bug in the raid software. I feel safer now.
I learned something. Thank you for your patience and support. |
frostschutz Advocate


Joined: 22 Feb 2005 Posts: 2971 Location: Germany
|
Posted: Sat Dec 28, 2013 8:09 pm Post subject: |
|
|
fikiz wrote: | The only difference is that the raid md device is now /dev/md127. |
That happens when the array is not listed in /etc/mdadm.conf.
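If the /dev/md0 name is wanted back, one way (a minimal sketch, not necessarily the only option) is to record the running array there:
Code: | # append an ARRAY line describing the running array to mdadm.conf
mdadm --detail --scan >> /etc/mdadm.conf |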
Quote: |
Code: | mdadm --create /dev/md0 --level=1 --force --raid-devices=1 /dev/sda7 |
|
If you already know there will be a 2nd disk later on, you can also use:
Code: | mdadm --create /dev/md42 --level=1 --raid-devices=2 /dev/sdx1 missing |
The array will then be listed as [U_] as the 2nd disk is not there yet. |
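Once the second disk is physically there it can be added in the usual way, and the array goes back to [UU] after the resync; a sketch with the same hypothetical names (/dev/sdy1 stands in for the new disk's partition):
Code: | mdadm /dev/md42 --add /dev/sdy1 |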
fikiz Apprentice

Joined: 07 Mar 2005 Posts: 282 Location: Italy
|
Posted: Sun Dec 29, 2013 11:05 am Post subject: |
|
|
Your way of doing the same thing sounds better.
Thank you! |
NeddySeagoon Administrator


Joined: 05 Jul 2003 Posts: 44171 Location: 56N 3W
|
Posted: Sun Dec 29, 2013 4:57 pm Post subject: |
|
|
fikiz,
The way that frostschutz proposed lets you practice replacing a failed drive in an array.
This is generally a good thing, since you get to practice while your array is almost empty and therefore expendable (a sketch of such a drill follows below).
_________________ Regards,
NeddySeagoon
Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail. |
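For reference, the drill NeddySeagoon mentions can also be rehearsed on a healthy array; a hedged sketch reusing the device names from earlier in the thread:
Code: | # mark one member as failed, remove it, then re-add it and let the resync run
mdadm /dev/md127 --fail /dev/sdb12
mdadm /dev/md127 --remove /dev/sdb12
mdadm /dev/md127 --add /dev/sdb12 |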