Gentoo Forums :: Kernel & Hardware
RAID array broken, can't boot

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sat Feb 09, 2019 5:25 pm

Hi Neddy,

it's now on the first retry. The error count was at 270 when it started, and now it's down to 268. Errsize is 574kB. It says it has 14h 30m remaining, so I'm going to let it run another few hours, then Ctrl-C, copy the file to a USB stick, and shut it down. Is that more or less the correct procedure?

Thanks,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sat Feb 09, 2019 5:50 pm

ExecutorElassus,

That's it. When you start it up again, leave the log on the USB stick and point ddrescue at it there.
Then it will be maintained on the USB stick.
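
A minimal sketch of what the resumed run might look like (untested; the device names are assumptions based on this thread, old sdb4 being copied onto the new drive's sdd4, with the USB stick mounted at /mnt/usb):

Code:
# hypothetical devices: failing old partition -> matching partition on the new disk,
# with the log (mapfile) kept on the mounted USB stick
ddrescue -d /dev/sdb4 /dev/sdd4 /mnt/usb/rescue.log

ddrescue reads the log first and carries on from where it stopped, so nothing already recovered is read twice.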

Code:
Errsize is 574kB
That's a lot.
The damage depends on what is stored there, so that needs to be minimised.

Don't be in a hurry :)
_________________
Regards,

NeddySeagoon

Computer users fall into two groups:-
those that do backups
those that have never had a hard drive fail.

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sat Feb 09, 2019 5:58 pm

Hi Neddy,

well, the first of the partitions that live inside the logical volume on that array are system partitions (/usr, /var, /etc) and /home, but altogether that is maybe only a couple hundred GB out of the array. The vast majority of the later partitions are just mass storage for media, and I can live with those suffering some data corruption.

Errsize is now 560kB, errors is 255.

The errors in the RAID recovery only started at around 75%; how likely is it that those errors affect the later partitions, and not anything necessary to assemble the array or boot?

My / is on a separate array that is properly synced with the new drive, so it's really only a matter of getting it to assemble at boot time.

But I'll shut it down in a couple hours, let it cool off, and start again in the morning.

Is there anything else I can do?

Oh, one other question: how do I prevent this array from attempting to re-assemble on boot?

Cheers,
EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sat Feb 09, 2019 6:35 pm

ExecutorElassus,

You need to know the mdadm and LVM data structures on the drive.
I think mdadm metadata version 1.2 puts its metadata at the start of the raid volume. That's what mdadm -E shows you.
I think LVM puts its metadata at the start of the volume group too.
That would make the metadata safe if it's correct.

I know that extents are allocated to logical volumes from the start of the volume group but it all gets messy when you extend a logical volume.

Previously we had
Code:
#       pos               size     status
0x00000000   0xAAA4A18000   +

That's 732,906,487,808 bytes good from the start of the partition.

We also know
Code:
Recovery Offset: 1389843520 sectors
which is 711,599,882,240 bytes from the start of the raid set.
The start of the partition and start of the raid set are not quite identical.

It looks like the first 1.5TB approx of user data space is intact.
The worrying things are filesystem metadata. Some is at the start of the filesystem, some is dynamically created/destroyed, e.g. directories.
And the writes that the raid sustained while sdc4 was disconnected.

We know that the first 1.5TB approx of user data space was resynced ...

If you use the trick with overlay filesystems, you can assemble that raid and have a look round without writing to the HDD at all.
The raid metadata changes due to using --force will go to the overlay.
If you don't like what you see, the changes are all in the overlay, so will drop out. I understand the theory but I've never done it.

It may come down to rock ... hard place.

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sat Feb 09, 2019 6:45 pm

What is the overlay trick? I found this page, but it's confusing to me.

Right now, this array is identified as /dev/md127. It stopped recovery when the old sdb4 stopped reading, so sdc4 is still not recovered. But I couldn't stop the array, so I removed each sdX4 from it. Now I still can't stop the array. So, later on, when I get to the point where ddrescue either completes or gives up, how do I put the array back together again and test it with the overlay trick?

ddrescue is now on Retry 2. Errsize is 531kB, errors is 226.

Cheers,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sat Feb 09, 2019 7:14 pm

ExecutorElassus,

Read all of that page, but pay particular attention to the section "Making the harddisks read-only using an overlay file".

It's a shame that the page uses parallel everywhere. That makes it hard to read and understand.
All parallel does is run the command inside the single quotes once per listed device, several at a time.
It's so rarely useful to me that I never use it.

So
Code:
parallel 'test -e /dev/loop{#} || mknod -m 660 /dev/loop{#} b 7 {#}' ::: $DEVICES
sets up lots of /dev/loopX devices in one line.
You need three, one each for sd[abc]4.

Code:
parallel truncate -s4000G overlay-{/} ::: $DEVICES
creates overlay files. I recalled this being done with USB devices.
That was incorrect. The advice was to practice on something expendable, like USB devices.

Like you, I shuddered reading this for a broken 5-spindle raid that was mostly my DVD collection.
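
For what it's worth, a hedged sketch of the same steps without parallel, as a plain shell loop (the device list, overlay size and names are assumptions; the dmsetup line is the standard device-mapper copy-on-write snapshot from that wiki page):

Code:
# hypothetical non-parallel version of the wiki's overlay setup
for dev in /dev/sda4 /dev/sdb4 /dev/sdc4; do
    name=$(basename "$dev")                        # sda4, sdb4, sdc4
    truncate -s4000G "overlay-$name"               # sparse file: only writes consume space
    size=$(blockdev --getsize "$dev")              # device size in 512-byte sectors
    loop=$(losetup -f --show -- "overlay-$name")   # back the overlay file with a loop device
    # copy-on-write snapshot: reads come from $dev, writes land in the overlay
    echo "0 $size snapshot $dev $loop P 8" | dmsetup create "$name"
done

You would then assemble from /dev/mapper/sd[abc]4 instead of the real partitions, so any metadata changes from --force land in the overlay files.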

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sat Feb 09, 2019 7:20 pm

Hi Neddy,

well, I don't have parallel on this LiveCD, so how would I create all these overlays without it? Also, I don't have a USB stick anywhere near big enough for three 1TB disks.

But I'll have to look into that in the morning. Before I shut down: How do I make sure the LiveCD doesn't try to re-start this array when I boot back up, altering the data further and using the old drive?

Errsize is now 528kB, errors is 228. I'm on Retry 3.

Cheers,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sat Feb 09, 2019 7:39 pm

ExecutorElassus,

It's OK to let it try to assemble the raid and fail. If it can't be assembled, it can't be used.
You could also unplug sda and sdc.
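
If you want belt and braces as well, most live environments honour mdadm.conf, so something like this should stop auto-assembly entirely (an assumption about how the LiveCD's udev rules are wired up, so treat it as a sketch):

Code:
# /etc/mdadm.conf on the LiveCD
AUTO -all    # never auto-assemble any array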

You don't need much space for the overlay filesystems. They will only contain the writes that would have gone to the HDD.
As long as you don't sync into the overlay :)

I have no idea how much is enough.

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sat Feb 09, 2019 7:49 pm

OK, so I'll interrupt ddrescue, copy rescue.log to a usb stick, and re-start it in the morning.

Can I stop /dev/md127 while it's still attempting to rebuild when I boot? For some reason, even with all drives removed, it still says it can't get exclusive access to /dev/md127.

Errsize: 525kB, errors: 229. Retry 4.

Cheers,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sat Feb 09, 2019 8:15 pm

ExecutorElassus,

Yes. Stopping the rebuild is safe. However, while rebuilding the raid can still be used normally. You don't want that, in case there are writes.
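
For completeness, stopping it is a one-liner, but only once nothing holds the array open; a still-active LVM volume group is a common cause of the "can't get exclusive access" error you saw (the vgchange below deactivates all volume groups, assuming LVM is the holder):

Code:
vgchange -an             # release any LVM volumes sitting on top of md127
mdadm --stop /dev/md127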

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sun Feb 10, 2019 7:59 am

Hi Neddy,

good morning. On reboot, mdraid refused to build the array (which, given that I don't want it assembled and writing, is good). Here's what /proc/mdstat now shows:

Code:
cat /proc/mdstat
Personalities [SNIP]
md1 : active raid1 sdb1[1]
      97536 blocks [3/1] [_U_]

md124 : active raid1 sdc3[0] sda3[2] sdd3[1]
      9765505 blocks [3/3] [UUU]

md126 : inactive sdb4[3](S)
      965920692 blocks super 1.2

md127 : inactive sda4[2](S) sdc4[4](S) sdd4[3](S)
      2897762076 blocks super 1.2

md125 : active raid1 sdc1[0] sda1[2] sdd1[1]
      97536 blocks [3/3] [UUU]

I'm a bit worried that md127 seems to think it has four or more members, but it seems like at least the system recognized the metadata of sdd4 and could conceivably restart the array if I forced it. But now I'm re-running ddrescue; Errsize is 524kB, errors 227.

Anything else I should do in the meantime?

Cheers,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sun Feb 10, 2019 10:34 am

ExecutorElassus,

It sees sd[abcd]4 all with the same raid UUID. That's expected.
ddrescue has copied the raid metadata already.

If you run mdadm -E /dev/sd[bd]4 you will see that they are both in the same slot, as one is almost a copy of the other.

With software raid, what goes where in terms of hardware is not important.
mdadm will find all the bits if you assemble by uuid. You don't want any clones lying around when the time comes though.
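
A hedged sketch of assembling by UUID, using the Array UUID from your mdadm -E output; listed members that don't carry that UUID are excluded, and the sdd4 clone is deliberately left off the list:

Code:
mdadm --assemble /dev/md127 --uuid=d42e5336:b75b0144:a502f2a0:178afc11 /dev/sd[abc]4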

-- edit --

Keep rerunning ddrescue with gravity assist in various directions.
The idea is to use gravity to help compensate for bearing wear, by running on a less worn part of the bearings.
It seems to help, even with modern air bearings, where there is no contact once the platter is close to the nominal speed.

Just one more read ...

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sun Feb 10, 2019 11:24 am

Hi Neddy,

unfortunately, the power connector for sdd is too tight to turn the drive much around. I could maybe shut down again, unplug sda and sdc, and keep going, trying the different orientations that way. But right now it looks like it's recovered about 10kB in the last 3h40m, which would mean it would take about 4 days for everything. I … maybe could live with that, if it meant I got all my data back.

If I added sdd into the array and tried to force assembly with four members, would that make it impossible to recover if I went back down to three (with sdc still marked as partially rebuilt)?

Errsize 515kB, errors 225.

Cheers,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sun Feb 10, 2019 11:57 am

ExecutorElassus,

It will do better if you can move the drive. Can you rearrange power cables?

Typically, you try a new face/edge and you get lots of good reads very quickly.
The data recovery rate is not linear. It's retrying over that 515kB spread across 225 error regions. If it gets lucky and some data is read, it won't try that sector again.

You can't assemble the raid with four members. Two are in the same slot, you have two sdb4s.

It should be safe to bring up the raid read-only in degraded mode with sdb-new and sda.
It's mdadm that gets the read-only option, not the filesystem. You only want to look.

man mdadm:
       -o, --readonly
              Start  the array read only rather than read-write as normal.  No
              writes will be allowed to the array, and no resync, recovery, or
              reshape  will be started. It works with Create, Assemble, Manage
              and Misc mode.


You need to examine every file to see what's damaged.

If you bring up the array in degraded mode with sdb-old and sda, you can try to cp -a the filesystem to /dev/null.
If it finishes, it didn't read any damaged blocks, so you don't need to recover them.
The first read error will tell the first encountered damaged file and the cp will stop.
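
A hedged sketch of that inspection pass (the volume group and logical volume names are hypothetical; the find/cat variant forces a full read of every file and, like the cp, stops loudly at the first unreadable block):

Code:
mdadm --assemble --readonly /dev/md127 /dev/sda4 /dev/sdb4   # degraded, 2 of 3; add --force if event counts block it
vgchange -ay                                  # activate the volume group on md127
mount -o ro /dev/mapper/vg-home /mnt/check    # hypothetical VG/LV name
find /mnt/check -type f -exec cat {} + > /dev/null   # first I/O error names the damaged file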

You can try the copy --readonly with all three original drives too. That will be interesting.
I'm not sure what happens with the rebuild in progress. It may not use any data from sdc4 after the rebuild progress limit.

All this ddrescuing may have caused bad blocks to be reallocated on sdb too, in which case, it may appear to have partly healed.
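
One way to check for that (smartctl is part of smartmontools; the exact attribute names vary a little by vendor):

Code:
smartctl -A /dev/sdb | egrep -i 'realloc|pending|uncorrect'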

You really do need to try all 6 faces/edges.

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sun Feb 10, 2019 1:18 pm

Hi Neddy,

alright, I found a spare cable, shut down, plugged it in, and restarted. Now I'm rotating the drive around on retries. I'll let you know when it completes.

Errsize 512 kB, errors 225

Cheers,

EE

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sun Feb 10, 2019 4:59 pm

Hi Neddy,

I'm looking at the metadata for the four drives, and I have a couple questions.

sdb4 and sdd4 both show as Active device 1 of an array with three active members and an event count of 1343686. This is expected, as they should be copies of one another.

But sda4 and sdc4 both show as clean members of an array with two missing and one active, with sda4 set as Active device 2 and sdc4 set as spare. sda4 has an event count of 1347561, sdc4 has 1347560.

Is all of this something I can get into a workable state once ddrescue finishes? How do I start the array in degraded state?

ddrescue reported an I/O error with the USB stick, so I emergency-saved to a local file, swapped out the stick, and resumed. Errsize is now 503kB, errors 226.

By that count, I'm managing to recover about 3kB/hr, which means it would take another six days to recover everything (if that's even possible). I'm rotating the drive around now on every retry.

Thanks for the help,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sun Feb 10, 2019 5:13 pm

ExecutorElassus,

Where did the data in this post come from?

How did you post it: copy/paste or copy-type?

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sun Feb 10, 2019 5:28 pm

Hi Neddy,

Copy-typed. The metadata for sdc4 no longer has a recovery offset. Update time is now yesterday, and there is a Bad Block Log entry with 512 entries available at offset 264 sectors.

This is why I was asking about assembling the array. It's possible that mdadm assembled the array at some point and reset the disk to clean.

But if it's a spare in the new array, it would just get overwritten in rebuild anyway, yes?

Thanks for the help,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Sun Feb 10, 2019 5:46 pm

ExecutorElassus,

It looks like we will need to do the --create I wanted to avoid after all.

Both /dev/sda4 and /dev/sdc4 show
Code:
    Device Role: Active device 2
which can't be correct.

With three devices, there are six combinations for Role.
The Feature Map differs too: 0x0 on one, 0x2 on the other.

If we end up doing a --create we need a complete set of reliable metadata.

What does mdadm -E sdb-new say?
Please don't copy-type. Unless we know the metadata exactly, it can't be fed back to --create.

I believe
Code:
    Raid Level: raid5
    raid Devices: 3
    Layout: left-symmetric
    Chunk Size: 512k

but what about
Code:
    feature Map:
    Data Offset: 2048 sectors
    Super Offset: 8 sectors
    Unused Space: before=1968 sectors, after=872 sectors
it all has to be spot on because the defaults keep changing.

Quoting from the post I linked
Code:
    Update Time: fri Feb 8 12:13:10 2019
    Checksum: fdedb47b - correct
    Events: 1340931


Now it's
Quote:
sda4 has an event count of 1347561, sdc4 has 1347560
That's moved on by about 7000 writes. That's worrying.
Maybe it's the recovery?

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Sun Feb 10, 2019 6:06 pm

Hi Neddy,

one correction:

sdc4's metadata shows it as "spare", so only sda4 is Active device 2. Does this change anything?

Cheers,

EE
Addendum: as it's getting late again, and it's only on retry 7 of 20, should I let it run overnight, or shut down and let it cool off again? Errsize 500 kB, errors 224.

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Mon Feb 11, 2019 12:01 am

ExecutorElassus,

Let it run all night.

We still need accurate metadata for the raid set, even if the drive slots have been lost.

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Mon Feb 11, 2019 7:08 am

Hi Neddy,

having realized I can ssh into the machine from my laptop, here is what mdadm -E /dev/sdX4 reports:
Code:
/dev/sda4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : d42e5336:b75b0144:a502f2a0:178afc11
           Name : domo-kun:carrier
  Creation Time : Wed Apr 11 00:10:50 2012
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)
     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)
  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=872 sectors
          State : clean
    Device UUID : 4a8d21e3:15026b07:bfacaedc:b5160599

    Update Time : Sat Feb  9 11:01:48 2019
       Checksum : 8b9a9869 - correct
         Events : 1347561

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : ..A ('A' == active, '.' == missing, 'R' == replacing)
Code:
 % mdadm -E /dev/sdb4
/dev/sdb4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : d42e5336:b75b0144:a502f2a0:178afc11
           Name : domo-kun:carrier
  Creation Time : Wed Apr 11 00:10:50 2012
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)
     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)
  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=872 sectors
          State : clean
    Device UUID : 6484cb2a:b50e63db:eead2787:af47cecc

    Update Time : Sat Feb  9 10:49:30 2019
       Checksum : 857fc458 - correct
         Events : 1343686

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)
Code:
 % mdadm -E /dev/sdc4
/dev/sdc4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x8
     Array UUID : d42e5336:b75b0144:a502f2a0:178afc11
           Name : domo-kun:carrier
  Creation Time : Wed Apr 11 00:10:50 2012
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)
     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)
  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1768 sectors, after=872 sectors
          State : clean
    Device UUID : 9a99d7ad:9b5a9b75:42cb3258:cfb40e04

    Update Time : Sat Feb  9 10:56:23 2019
  Bad Block Log : 512 entries available at offset 264 sectors - bad blocks present.
       Checksum : ab16b9a7 - correct
         Events : 1347560

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : ..A ('A' == active, '.' == missing, 'R' == replacing)
Code:
% mdadm -E /dev/sdd4
/dev/sdd4:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : d42e5336:b75b0144:a502f2a0:178afc11
           Name : domo-kun:carrier
  Creation Time : Wed Apr 11 00:10:50 2012
     Raid Level : raid5
   Raid Devices : 3

 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)
     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)
  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=1951911088 sectors
          State : clean
    Device UUID : 6484cb2a:b50e63db:eead2787:af47cecc

    Update Time : Sat Feb  9 10:49:30 2019
       Checksum : 857fc458 - correct
         Events : 1343686

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAA ('A' == active, '.' == missing, 'R' == replacing)

I got your last message too late; I'd already shut down for the night. It's up and running again now, and this time I'm connected over ssh from my laptop, so I can copy-paste as needed.

ddrescue has now been running for five hours today. Errsize is 497kB, errors is 227.

Thanks for the help,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Mon Feb 11, 2019 2:36 pm

ExecutorElassus,

Lovely ...

Code:
/dev/sda4:
...
    Update Time : Sat Feb  9 11:01:48 2019
       Checksum : 8b9a9869 - correct
         Events : 1347561
...
   Device Role : Active device 2


/dev/sdb4:
...
    Update Time : Sat Feb  9 10:49:30 2019
       Checksum : 857fc458 - correct
         Events : 1343686
...
   Device Role : Active device 1

/dev/sdc4:
...
     Update Time : Sat Feb  9 10:56:23 2019
  Bad Block Log : 512 entries available at offset 264 sectors - bad blocks present.
       Checksum : ab16b9a7 - correct
         Events : 1347560
...
   Device Role : spare
 

and /dev/sdd4 is, or should be, a copy of /dev/sdb4, but
Code:
   Unused Space : before=1968 sectors, after=1951911088 sectors

That should be a copy of the metadata on sdb4, but it's not.

All the entries
Code:
 Avail Dev Size : 1931841384 (921.17 GiB 989.10 GB)
     Array Size : 1931840512 (1842.35 GiB 1978.20 GB)
  Used Dev Size : 1931840512 (921.17 GiB 989.10 GB)
    Data Offset : 2048 sectors
   Super Offset : 8 sectors
   Unused Space : before=1968 sectors, after=872 sectors
should be identical too.
Again, they differ.

We may need to try assembling all the combinations of two drives to see what happens.
At face value sdc4 does not look too healthy.
I've not seen
Code:
Bad Block Log : 512 entries available at offset 264 sectors - bad blocks present.
before but it wasn't there in your previous metadata post.

ExecutorElassus
Veteran
Joined: 11 Mar 2004
Posts: 1181
Location: Stuttgart, Germany
Posted: Mon Feb 11, 2019 2:55 pm

Hi Neddy,

well, as for the space after the data area on sdd4: the new drive is 2TB instead of 1TB, so having much more space afterwards is expected. That also explains the discrepancy in the Avail/Used Dev Size entries.

So the only remaining issue should be the bad block data that recently appeared on sdc4, yes? Is it possible that this data got written during one of the attempts at rebuilding?

In any case, ddrescue has now been running since morning. After just under 8h, Errsize is 496kB, errors 227. How should we proceed?

Thanks for the help,

EE

NeddySeagoon
Administrator
Joined: 05 Jul 2003
Posts: 43178
Location: 56N 3W
Posted: Mon Feb 11, 2019 3:40 pm

ExecutorElassus,

You are in the middle of doing a dd-style copy from sdb4 to sdd4.
That means the data being copied is *identical*: dd is a low-level, block-by-block copy.
dd knows nothing about the meaning of whatever is being copied.

When you are fed up with ddrescue, try to assemble the raid from sda4 and sdb4 using mdadm --assemble --readonly.
That's two old drives. If it assembles, mount it and look around. Even if it assembles, it may not mount; that requires the filesystem metadata to be intact too.

Code:
/dev/sda4:
...
    Update Time : Sat Feb  9 11:01:48 2019
       Checksum : 8b9a9869 - correct
         Events : 1347561
...
   Device Role : Active device 2

/dev/sdb4:
...
    Update Time : Sat Feb  9 10:49:30 2019
       Checksum : 857fc458 - correct
         Events : 1343686
...
   Device Role : Active device 1
That's almost 4000 events missing.

To add sdc4, we need to know where it goes. It's either Active device 0 or Active device 3.
One of us needs to read up on how mdadm numbers Active devices.
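
A quick way to eyeball exactly that, filtering mdadm -E down to the fields that matter here:

Code:
for d in /dev/sd[abcd]4; do
    echo "== $d"
    mdadm -E "$d" | egrep 'Device Role|Events|Update Time|Array State'
done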

If that works, try sda4 and sdd4, always with mdadm --assemble --readonly.

Since we have the metadata in this thread, you can try --force too.
If --force won't work, all that's left is --create, but that is very much a last-ditch thing.
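
For reference only, a heavily hedged sketch of what such a last-ditch --create could look like, built from the mdadm -E values above and aimed at the dmsetup overlays rather than the raw disks (slot 0 is lost, so it stays "missing"; --assume-clean prevents a resync; check the --data-offset unit against your man mdadm, 2048 sectors being 1024 KiB):

Code:
# DANGER: absolute last resort, and only ever against the overlay devices
mdadm --create /dev/md127 --assume-clean \
      --metadata=1.2 --level=5 --raid-devices=3 \
      --chunk=512K --layout=left-symmetric --data-offset=1024 \
      missing /dev/mapper/sdb4 /dev/mapper/sda4
# slot order from the metadata above: device 0 lost,
# sdb4 = Active device 1, sda4 = Active device 2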