Gentoo Forums
Kernel/ZFS issue while trying to read a device
Dwosky
Tux's lil' helper


Joined: 07 Nov 2018
Posts: 95

PostPosted: Wed May 13, 2020 2:37 pm    Post subject: Kernel/ZFS issue while trying to read a device Reply with quote

I've recently migrated my system to LVM and moved my current partition layout onto it (except /boot). I got the system working again without any visible issues, and everything I checked was behaving correctly. Today I wanted to check the ZFS raidz pool that holds my media files and noticed it wasn't mounted. When I try to import the pool I get an error saying that all devices are faulted:
Code:
   pool: kodi
     id: 9108213825698193774
  state: UNAVAIL
 status: One or more devices are faulted.
 action: The pool cannot be imported due to damaged devices or data.
 config:

        kodi                                          UNAVAIL  insufficient replicas
          raidz1-0                                    UNAVAIL  insufficient replicas
            ata-WDC_...1  FAULTED  too many errors
            ata-WDC_...2  FAULTED  too many errors
            ata-WDC_...3  FAULTED  too many errors


I've been checking the hardware in case some device ID shifted or something like that, and I didn't see any issue. But trying to clear the drives in order to start clean is giving me a headache, because the old layout seems to recreate itself.

I've tried clearing the drives with the following:
Code:
parted /dev/sdX rm 1                         # delete partition 1 from the disk
parted /dev/sdX rm 1
dd if=/dev/zero of=/dev/sdX bs=512 count=1   # zero the first sector (MBR / protective MBR)
zpool labelclear /dev/sdX                    # clear the ZFS label metadata on the device


That clears the current partition layout, but if I restart the system after those commands, on the next startup fdisk -l /dev/sdX shows the old partition layout again, as if I had never removed it.
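In case it helps to rule out stale kernel state, a rough cross-check (with sdX being the same placeholder as above) is to ask the kernel to re-read the table right away instead of waiting for a reboot:
Code:
partprobe /dev/sdX              # tell the kernel to re-read the partition table immediately
blockdev --rereadpt /dev/sdX    # util-linux alternative to partprobe
fdisk -l /dev/sdX               # show what the kernel currently believes is on the disk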

Aside from that, I've tried creating a new ZFS pool on the same HDDs, but it gives me an error, and in the message logs I see the following:
Code:
May 13 16:32:59 server kernel: CPU: 0 PID: 28876 Comm: vdev_open Tainted: P           O      5.4.38-gentoo #4
May 13 16:32:59 server kernel: Hardware name: System manufacturer System Product Name/P8H77-I, BIOS 1101 03/07/2014
May 13 16:32:59 server kernel: Call Trace:
May 13 16:32:59 server kernel:  dump_stack+0x50/0x70
May 13 16:32:59 server kernel:  spl_panic+0x1ab/0x1de [spl]
May 13 16:32:59 server kernel:  vdev_cache_stat_fini+0x8ee/0x16f0 [zfs]
May 13 16:32:59 server kernel:  ? vdev_queue_io+0x174/0x210 [zfs]
May 13 16:32:59 server kernel:  zio_push_transform+0x10d8/0x12d0 [zfs]
May 13 16:32:59 server kernel:  zio_nowait+0xa4/0x140 [zfs]
May 13 16:32:59 server kernel:  vdev_probe+0xff/0x270 [zfs]
May 13 16:32:59 server kernel:  vdev_open+0x4f0/0x6b0 [zfs]
May 13 16:32:59 server kernel:  vdev_open_children+0x169/0x180 [zfs]
May 13 16:32:59 server kernel:  taskq_dispatch_delay+0x58c/0x8a0 [spl]
May 13 16:32:59 server kernel:  ? __switch_to_asm+0x34/0x70
May 13 16:32:59 server kernel:  ? wake_up_q+0x60/0x60
May 13 16:32:59 server kernel:  kthread+0xf6/0x130
May 13 16:32:59 server kernel:  ? taskq_dispatch_delay+0x2d0/0x8a0 [spl]
May 13 16:32:59 server kernel:  ? kthread_park+0x80/0x80
May 13 16:32:59 server kernel:  ret_from_fork+0x35/0x40
May 13 16:32:59 server zed[28893]: eid=33 class=io pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...3-part1
May 13 16:32:59 server zed[28896]: eid=34 class=probe_failure pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...3-part1
May 13 16:32:59 server zed[28898]: eid=35 class=statechange pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...3-part1 vdev_state=FAULTED
May 13 16:32:59 server zed[28912]: eid=36 class=io pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...2-part1
May 13 16:32:59 server zed[28914]: eid=37 class=probe_failure pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...2-part1
May 13 16:32:59 server zed[28916]: eid=38 class=statechange pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...2-part1 vdev_state=FAULTED
May 13 16:32:59 server zed[28930]: eid=39 class=io pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...1-part1
May 13 16:32:59 server zed[28932]: eid=40 class=probe_failure pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...1-part1
May 13 16:32:59 server zed[28934]: eid=41 class=statechange pool_guid=0xC06C10AAC43EB994 vdev_path=/dev/disk/by-id/ata-WDC_...1-part1 vdev_state=FAULTED
May 13 16:32:59 server zed[28948]: eid=42 class=vdev.no_replicas pool_guid=0xC06C10AAC43EB994


I'm aware I made some changes as part of the LVM migration, since I updated the kernel parameters for LVM + initramfs. There was also a minor kernel update (from 5.4.28 to 5.4.38), but all of this was working without issues before... I don't mind losing the data on these drives; the problem is that I can't use them right now, since zpool won't let me add them and I don't seem to be able to clear them. Any idea why it's failing now, or what to check?
peje
Tux's lil' helper


Joined: 11 Jan 2003
Posts: 99

PostPosted: Wed May 13, 2020 3:08 pm    Post subject: Reply with quote

I normally clean disks with sgdisk --zap-all /dev/sd$X
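Roughly, with sdX standing in for the target disk, that would be:
Code:
sgdisk --zap-all /dev/sdX   # destroy the GPT data structures and the protective MBR
sgdisk --print /dev/sdX     # should now list no partitions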
cu peje
ununu
n00b


Joined: 19 Apr 2020
Posts: 31

PostPosted: Wed May 13, 2020 9:28 pm    Post subject: Reply with quote

peje wrote:
I normally clean disks with sgdisk --zap-all /dev/sd$X
cu peje


This is a good one; I'd forgotten about it.

I thought changing the partition table with mktable in parted, from gpt to mbr and back to gpt, would achieve the same end.
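Something along these lines, presumably (parted calls the MBR label type "msdos", and mktable is an alias for mklabel):
Code:
parted -s /dev/sdX mklabel msdos   # write a fresh MBR label over the old GPT
parted -s /dev/sdX mklabel gpt     # then write a fresh, empty GPT on top of that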

Anyway, about the ZFS side: remove the hostid and delete the zpool.cache. Perhaps when you're building the initramfs it is picking those up with the old layout and wreaking havoc afterwards.
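A minimal sketch of that cleanup, assuming the default OpenZFS locations for both files:
Code:
rm -f /etc/zfs/zpool.cache   # stale pool cache that the import / initramfs may be picking up
rm -f /etc/hostid            # stale hostid; it can be regenerated later (e.g. with zgenhostid)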

Besides that, did you RECOMPILE zfs and zfs-kmod against the new kernel? I mean after symlinking /usr/src/linux to the new version. That's also important, I believe. As a matter of fact, create the initramfs after building zfs, otherwise it won't include the zfs modules.
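On Gentoo that order would look roughly like this (the kernel index is just an example, and genkernel is only one option for the initramfs; dracut works too):
Code:
eselect kernel list                    # show the installed kernel sources
eselect kernel set 2                   # point /usr/src/linux at the new 5.4.38 sources
emerge -1 sys-fs/zfs-kmod sys-fs/zfs   # rebuild the ZFS modules against the new kernel
genkernel --zfs initramfs              # regenerate the initramfs afterwards so it picks up the new modules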

regards.
Dwosky
Tux's lil' helper


Joined: 07 Nov 2018
Posts: 95

PostPosted: Fri May 15, 2020 9:29 am    Post subject: Reply with quote

Ok, I think it might be something related to the cache, as you're saying. I removed the ZFS package and its directory, did a new dd over one HDD, restarted the system, and now it's correctly shown as a blank disk.

Another thing that was driving me crazy is that the /dev/disk/by-id/ layout wasn't being refreshed with the changes; for example, the old partitions were still there as broken symlinks. Now it seems to be working as expected. I'm going to see if I can clear the other two disks in order to rebuild the pool, but man, what a pain in the ass to hit a wall like this one...
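For what it's worth, those by-id links can usually be refreshed without a full reboot (plain udevadm, nothing ZFS-specific):
Code:
udevadm trigger --subsystem-match=block   # re-run the udev rules for all block devices
udevadm settle                            # wait until /dev/disk/by-id has been rebuilt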

As for the kernel modules, I always run emerge @module-rebuild after every new kernel compilation, so I don't believe that's the issue. I think it was something related to the ZFS cache or configuration being corrupted by the LVM move, but I'm not really sure.
paddlaren
Tux's lil' helper


Joined: 23 Nov 2005
Posts: 104
Location: Hörby, Sweden

PostPosted: Sat May 16, 2020 5:57 pm    Post subject: Reply with quote

There is always wipefs to clean out the partition signatures.
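For example (sdX again being a placeholder for each disk):
Code:
wipefs /dev/sdX         # list every partition-table / filesystem / ZFS signature it can see
wipefs --all /dev/sdX   # erase all of those signatures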

// Erik
Dwosky
Tux's lil' helper


Joined: 07 Nov 2018
Posts: 95

PostPosted: Tue May 19, 2020 5:29 pm    Post subject: Reply with quote

There are several tools that basically do the same thing (dd, wipefs, etc.). I believe the issue was corruption of the local ZFS cache file, since dd cleaned the disk without problems, yet upon restart the partition table was recreated.

In the end I removed the ZFS files, cleared the HDDs, reinstalled ZFS, and now it's back to normal again... sigh...
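For anyone else hitting this, the recovery described above boils down to roughly the following (not necessarily the exact commands used; the cache/hostid paths are the usual defaults and sdX is a placeholder):
Code:
rm -f /etc/zfs/zpool.cache /etc/hostid   # drop the possibly corrupted cache and hostid
wipefs --all /dev/sdX                    # clear the old signatures (repeat for each of the three disks)
emerge -1 sys-fs/zfs-kmod sys-fs/zfs     # reinstall/rebuild ZFS against the running kernel
# then recreate the raidz pool from the /dev/disk/by-id paths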