Gentoo Forums
software raid-1: data loss & "super non-persistent"
fikiz
Apprentice


Joined: 07 Mar 2005
Posts: 282
Location: Italy

Posted: Fri Dec 27, 2013 7:59 pm    Post subject: software raid-1: data loss & "super non-persistent"

Hello everybody.

Yesterday I cleanly powered off my Gentoo machine while a software RAID-1 array was rebuilding, expecting the rebuild to start over at the next boot.
At the next boot no rebuild was running, and the filesystem was corrupt (though mounted) and throwing I/O errors. I stopped the array manually to check the contents of the two partitions and realized I had lost everything (don't worry, there are lots of backups lying around!). I didn't like this. In 10+ years of running software RAID-1, this is the first nasty surprise I've had.

Then I created a fresh RAID-1 array on the same two partitions and restored the data from backup. A full rebuild of the mirror completed successfully.

Now I see something in /proc/mdstat that I hadn't noticed before:

Code:

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md0 : active raid1 sdb12[1] sda7[0]
      188743680 blocks super non-persistent [2/2] [UU]


Non-persistent? What? I googled around but didn't find anything telling me exactly what this means. Could somebody kindly explain what it means and what I did wrong when creating this array (I probably made the same mistake before losing the data)?

Thanks!
sabayonino
l33t


Joined: 03 Jan 2012
Posts: 697

Posted: Fri Dec 27, 2013 8:40 pm

Hi,

The "non-persistent" item refers to the metadata type of your array(s).

See Linux Raid.

I found this: https://raid.wiki.kernel.org/index.php/RAID_setup#External_Metadata

and
http://www.mjmwired.net/kernel/Documentation/md.txt
Quote:
metadata_version
This indicates the format that is being used to record metadata
about the array. It can be 0.90 (traditional format), 1.0, 1.1,
1.2 (newer format in varying locations) or "none" indicating that
the kernel isn't managing metadata at all.
Alternately it can be "external:" followed by a string which
is set by user-space. This indicates that metadata is managed
by a user-space program. Any device failure or other event that
requires a metadata update will cause array activity to be
suspended until the event is acknowledged.
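
For a quick check of which arrays are affected, the superblock field can be pulled out of /proc/mdstat. The following is a minimal sketch, not an official tool: the function name md_super_report is made up for illustration, and it assumes the mdstat layout shown in this thread, where an array using the old 0.90 format prints no "super" field at all. The metadata_version attribute quoted above should also be readable per array in sysfs, under /sys/block/mdX/md/.

```shell
#!/bin/sh
# Print "<array> <superblock format>" for every md array found in
# mdstat-style input (defaults to /proc/mdstat).
# Assumption: the status line directly follows the "mdN : ..." line, and
# a missing "super" field means the traditional 0.90 format.
md_super_report() {
    awk '
        /^md[0-9]+ :/ { name = $1; next }
        name != "" {
            fmt = "0.90"
            for (i = 1; i < NF; i++)
                if ($i == "super") fmt = $(i + 1)
            print name, fmt
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

Fed the mdstat output from the first post, this prints `md0 non-persistent`; an array created with the current mdadm default prints `md0 1.2`.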




My RAID-1 is set up like this:
Quote:
cat /proc/mdstat
Personalities : [raid0] [raid1] [linear] [multipath]
md0 : active raid1 sdd1[0] sde1[1]
976629568 blocks super 1.2 [2/2] [UU]


"1.2" is the metadata version in use.

My array was created with
Code:
# mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1


and the metadata version applied was 1.2 (the default for mdadm v3.x).

For more information on metadata, see the md(4) Linux man page

or simply
Code:
# man md

_________________
BOINC ed il calcolo distribuito

My LiveRecoverySystem Repo
frostschutz
Advocate


Joined: 22 Feb 2005
Posts: 2970
Location: Germany

Posted: Sat Dec 28, 2013 3:40 am

Which kernel/mdadm version are you using?

What command(s) exactly did you use when you replaced your disk and/or set up the new array?
fikiz
Apprentice


Joined: 07 Mar 2005
Posts: 282
Location: Italy

Posted: Sat Dec 28, 2013 9:25 am

Sabayonino:

Thank you for your help; maybe I searched for the wrong terms. In particular, I missed 'man md'.

frostschutz:

I'm running kernel 3.10.17, mdadm 3.2.6-r1. This is a brand new Gentoo installation.


After my opening post I rebooted: the md RAID block device wasn't available, and the filesystem was mounted directly on /dev/sda7 (the mount is made by LABEL). This time it was expected: 'non-persistent' means there is no superblock at the beginning of the RAID member devices (hence the direct mounting of the filesystem) and no metadata at all, and therefore no md RAID block device after a reboot. This is how I understood it from reading the links Sabayonino suggested, and that's OK with me.

So I re-created the RAID:
Code:
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda7 /dev/sdb12

This correctly created a new array with version 1.2 superblock metadata. While the rebuild was running I rebooted, and the rebuild resumed where it had been interrupted. The only difference is that the md device is now /dev/md127.
Before the data loss I had done something different: I first created a RAID-1 with only one device, planning to add the second hard disk later. I have just repeated those steps:
Code:
mdadm --create /dev/md0 --level=1 --force --raid-devices=1 /dev/sda7

After physically installing the second hard disk I booted, and the (obviously non-redundant) RAID device was still there, named /dev/md127, with metadata version 1.2. Then I added the second partition:
Code:
mdadm --manage /dev/md127 --add /dev/sdb12
mdadm --grow /dev/md127 --raid-devices=2

A new rebuild started as expected, still with metadata version 1.2. I'm quite sure that everything will be OK after rebooting.
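
Rebuild progress can be followed in /proc/mdstat while it runs. The following is a small sketch, assuming the kernel prints a progress line of the form "recovery = 12.6% (...)" (or "resync = ...") below the array's status line; md_rebuild_progress is a hypothetical helper name, not part of mdadm.

```shell
#!/bin/sh
# Print "<array> <percent>" for every array that is currently
# recovering or resyncing, based on mdstat-style input
# (defaults to /proc/mdstat). Arrays that are idle print nothing.
md_rebuild_progress() {
    awk '
        /^md[0-9]+ :/ { name = $1 }
        match($0, /(recovery|resync) *= *[0-9]+\.[0-9]+%/) {
            pct = substr($0, RSTART, RLENGTH)
            sub(/^[a-z]+ *= */, "", pct)   # keep only the number and "%"
            print name, pct
        }
    ' "${1:-/proc/mdstat}"
}
```

Calling it repeatedly (for example from `watch`) gives a compact view of how far the mirror has synchronized.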

End of the story: I most likely messed something up when creating the array the first time, so this shouldn't be a bug in the RAID software. I feel safer now :-)

I learned something. Thank you for your patience and support.
frostschutz
Advocate


Joined: 22 Feb 2005
Posts: 2970
Location: Germany

Posted: Sat Dec 28, 2013 8:09 pm

fikiz wrote:
The only difference is now the raid md device is /dev/md127.


That happens when the array is not listed in /etc/mdadm.conf
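
Whether an array is listed can be checked before rebooting. The following is a minimal sketch, assuming the /etc/mdadm.conf path mentioned here and the plain "ARRAY <device> ..." line form; md_is_listed is a hypothetical helper name, and the match is deliberately simplistic compared with what mdadm itself accepts.

```shell
#!/bin/sh
# Succeed (exit 0) if the given md device has an ARRAY line in the
# config file, fail otherwise.
# Usage: md_is_listed /dev/md0 [conf]   (conf defaults to /etc/mdadm.conf)
md_is_listed() {
    dev=$1
    conf=${2:-/etc/mdadm.conf}
    awk -v dev="$dev" '
        $1 == "ARRAY" && $2 == dev { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$conf"
}

# If the array is missing, the usual way to generate its line is
# (as root, reviewing the output before appending it):
#   mdadm --detail --scan >> /etc/mdadm.conf
```

For example, `md_is_listed /dev/md0 || echo "md0 is not in mdadm.conf"` reports an array that is likely to come up under a different name such as /dev/md127.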

Quote:

Code:
mdadm --create /dev/md0 --level=1 --force --raid-devices=1 /dev/sda7



If you already know there will be a 2nd disk later on, you can also use:

Code:
mdadm --create /dev/md42 --level=1 --raid-devices=2 /dev/sdx1 missing


The array will then be listed as [U_] as the 2nd disk is not there yet.
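
A degraded mirror can be spotted by the underscore in that status field. The sketch below assumes the usual mdstat layout, where the member map such as [U_] is the last field on the line following the array name; md_degraded is a name invented here for illustration.

```shell
#!/bin/sh
# Print the name of every md array whose member map contains a "_"
# (a missing or failed member), based on mdstat-style input
# (defaults to /proc/mdstat).
md_degraded() {
    awk '
        /^md[0-9]+ :/ { name = $1; next }
        name != "" {
            # The last field looks like [UU] when healthy, [U_] when degraded.
            if ($NF ~ /^\[[U_]*_[U_]*\]$/) print name
            name = ""
        }
    ' "${1:-/proc/mdstat}"
}
```

An array created with the `missing` keyword would be reported by this function until the second disk has been added.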
fikiz
Apprentice


Joined: 07 Mar 2005
Posts: 282
Location: Italy

Posted: Sun Dec 29, 2013 11:05 am

Your way of doing the same thing sounds better.

Thank you!
NeddySeagoon
Administrator


Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W

Posted: Sun Dec 29, 2013 4:57 pm

fikiz,

The way that frostschutz proposed lets you practice replacing a failed drive in an array.
This is generally a good thing, since you get to practice while your array is almost empty and therefore expendable.
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.
All times are GMT