Gentoo Forums
LVM cache recurring corruption -- how to reinitialize device

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Thu Sep 14, 2017 9:39 am

Hello all!

Since yesterday I have had a strange problem with my LVM RAID 5 (HDD) + LVM cache (on SSD) setup. Here are some details:

- this is a relatively new, minimal setup (fresh install on 25 August 2017) on a J3455-ITX board (4-core Celeron): only cryptsetup, LVM RAID 5 (built-in), 1 x SSD as /dev/sda (the rootfs/system is not encrypted), 5 x Samsung 1.5 TB HDD, encrypted: /dev/sdb1 -> /dev/mapper/crypt1 and so on
- on top of the crypt[1-5] devices is LVM RAID 5 (no MD RAID layer)
- added a 100 GB SSD cache on the encrypted /dev/sda4 partition -- reference: https://rwmj.wordpress.com/2014/05/22/using-lvms-new-cache-feature/

Everything worked fine for about 3 weeks. Two days ago I had to replace one HDD because it began to fail.
So:
- I had to uncache the LV (went OK) -- reference: https://rwmj.wordpress.com/2014/05/23/removing-the-cache-from-an-lv/
- removed one drive
- added a new encrypted 1.5 TB drive
- resynced LVM RAID 5
- fsck -- all OK
- LVM, filesystem status -- healthy

- then I reattached the SSD cache. Everything seemed to work: the cache was up and running
- first reboot: the LV was missing, cache device corrupted

I had removed and reattached the cache a few times as a test before the drive was replaced, and it went without any errors then.

After the cache corruption I had to recover manually to uncache the LV and get access to the data: I edited the vgcfgbackup output by hand and ran vgcfgrestore.
This is similar to https://www.redhat.com/archives/linux-lvm/2016-December/msg00015.html -- not a single tool could help me uncache the LV with a corrupt cache.

Anyway, after manual recovery the data was intact, so I tried to do it again.

I ran pvremove on the SSD, then pvcreate, vgextend and so on. None of those commands displayed any error message, so I was sure the SSD cache was properly reinitialized.

The LV stayed cached until the next reboot (today), when it went missing again. It seems the cache is not usable now, for a reason unknown to me.

Is there any way to check / wipe the SSD cache partition (apart from overwriting it with /dev/zero)?
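
A minimal sketch of one way to clear leftover signatures without zeroing the whole partition. It assumes the cache PV is /dev/mapper/luks_cache in vg0 (the names used later in this thread), that the cache has already been detached from the LV, and that wipefs from util-linux is installed; the helper name reinit_plan is made up for illustration. It only prints the commands, so they can be reviewed before anything is run as root.

```shell
#!/bin/sh
# Print (do not run) the steps for re-initialising a cache PV.
# Assumed names, not confirmed by the post: VG "vg0", PV "/dev/mapper/luks_cache".
reinit_plan() {
    vg=$1
    pv=$2
    # Take the PV out of the VG and drop its LVM label.
    echo "vgreduce $vg $pv"
    echo "pvremove $pv"
    # wipefs --all erases every signature that libblkid still detects.
    echo "wipefs --all $pv"
    # Re-create the PV and put it back into the VG.
    echo "pvcreate $pv"
    echo "vgextend $vg $pv"
}

reinit_plan vg0 /dev/mapper/luks_cache
```

Running wipefs with only the device name lists the signatures it finds without erasing anything, which covers the "check" half of the question.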

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Thu Sep 14, 2017 5:01 pm

I don't understand this:

Code:
root@carbon:~# lvcreate -n cache0meta -L 120M vg0 /dev/mapper/luks_cache
  Logical volume "cache0meta" created.
root@carbon:~# lvcreate -n cache0 -l 25568 vg0 /dev/mapper/luks_cache
  Logical volume "cache0" created.
root@carbon:~# ls /dev/mapper/
control     luks_lvm1  luks_lvm3  luks_lvm5   vg0-cache0meta  vg0-lvol0_rimage_0  vg0-lvol0_rimage_2  vg0-lvol0_rimage_4  vg0-lvol0_rmeta_1  vg0-lvol0_rmeta_3
luks_cache  luks_lvm2  luks_lvm4  vg0-cache0  vg0-lvol0       vg0-lvol0_rimage_1  vg0-lvol0_rimage_3  vg0-lvol0_rmeta_0   vg0-lvol0_rmeta_2  vg0-lvol0_rmeta_4
root@carbon:~# cache_check
No input file provided.
Usage: cache_check [options] {device|file}
Options:
  {-q|--quiet}
  {-h|--help}
  {-V|--version}
  {--clear-needs-check-flag}
  {--super-block-only}
  {--skip-mappings}
  {--skip-hints}
  {--skip-discards}
root@carbon:~# cache_check /dev/mapper/vg0-cache0
examining superblock
  superblock is corrupt
    bad checksum in superblock
root@carbon:~# cache_check /dev/mapper/vg0-cache0meta
examining superblock
  superblock is corrupt
    bad checksum in superblock
root@carbon:~# lvconvert --type cache-pool --poolmetadata vg0/cache0meta vg0/cache0
  Using 128,00 KiB chunk size instead of default 64,00 KiB, so cache pool has less then 1000000 chunks.
  WARNING: Converting logical volume vg0/cache0 and vg0/cache0meta to cache pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert vg0/cache0 and vg0/cache0meta? [y/n]: y
  Converted vg0/cache0_cdata to cache pool.
root@carbon:~# ls /dev/mapper/
control     luks_lvm1  luks_lvm3  luks_lvm5  vg0-lvol0_rimage_0  vg0-lvol0_rimage_2  vg0-lvol0_rimage_4  vg0-lvol0_rmeta_1  vg0-lvol0_rmeta_3
luks_cache  luks_lvm2  luks_lvm4  vg0-lvol0  vg0-lvol0_rimage_1  vg0-lvol0_rimage_3  vg0-lvol0_rmeta_0    vg0-lvol0_rmeta_2  vg0-lvol0_rmeta_4
root@carbon:~# cache_check /dev/mapper/vg0-cache0meta
/dev/mapper/vg0-cache0meta: No such file or directory
root@carbon:~# cache_check /dev/mapper/luks_
luks_cache  luks_lvm1   luks_lvm2   luks_lvm3   luks_lvm4   luks_lvm5   
root@carbon:~# cache_check /dev/mapper/luks_cache
examining superblock
  superblock is corrupt
    bad checksum in superblock


This is on a newly wiped/trimmed SSD partition, with a NEW LUKS key, luksFormat, pvcreate, etc.

If I add the cache to my LVM RAID 5, it gets b0rked on reboot.

MageSlayer (Apprentice) | Joined: 26 Jul 2007 | Posts: 250 | Location: Ukraine
Posted: Fri Sep 15, 2017 9:15 am

Are you sure your SSD is ok?

Roman_Gruber (Advocate) | Joined: 03 Oct 2006 | Posts: 3806 | Location: Austro Bavaria
Posted: Fri Sep 15, 2017 11:59 am

Quote:
superblock is corrupt
bad checksum in superblock


Did you check your cables, connections and power supply, or redo the wiring?

And of course: is the latest firmware on the drive? Is the drive healthy?

Quote:
/dev/mapper/vg0-cache0meta: No such file or directory


Looks like it does not exist or is not visible to the operating system.

Sometimes I had to initialize it, i.e. make it visible to the OS, with vgscan and vgchange -ay (or whatever the commands are, please check the man page!). Sometimes only a reboot did the trick on some sysrescue-cd discs.

--

I have never had a broken SSD; I recently sold my 5-year-old, daily-used Plextor SSD. I usually sell or replace HDDs out of habit, every second or third year on average.

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Fri Sep 15, 2017 4:51 pm

The SSD seems to be healthy.
/dev/sda2 is a 16 GB system partition that has no trouble reading, writing or updating.
SMART info is clean, and so is dmesg: no errors, not even crc32.

But I'll convert sda4 to plain ext4 and run some tests with files.

vg0-cache0meta is hidden by LVM once it is added to the pool as a cache for the HDDs. Hence you can't check it explicitly.
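
For reference, a small sketch of where the hidden volumes end up. It assumes LVM's usual naming, in which a cache pool cache0 keeps its data and metadata in hidden sub-LVs cache0_cdata and cache0_cmeta (the _cdata name appears in the lvconvert output earlier in the thread, _cmeta is its counterpart), and that device-mapper joins VG and LV names with a single hyphen while doubling any hyphen inside either name. dm_path is a made-up helper.

```shell
#!/bin/sh
# Build the /dev/mapper path for an LV (hypothetical helper).
# device-mapper joins VG and LV with "-" and doubles any "-" inside a name.
dm_path() {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')
    lv=$(printf '%s' "$2" | sed 's/-/--/g')
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}

dm_path vg0 cache0_cmeta   # metadata sub-LV of pool cache0
dm_path vg0 cache0_cdata   # data sub-LV of pool cache0
```

`lvs -a vg0` lists the hidden sub-LVs in square brackets. cache_check is meant for a cache metadata device, so the _cmeta volume is the one it is expected to understand, and it is generally run only on metadata that is not in active use.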

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Sat Sep 16, 2017 6:44 am

The SSD is OK: I just ran a long SMART test plus a 25 GB copy and md5 checksum test on a BTRFS partition (of course I rebooted the machine in between):

Code:
smartctl -a /dev/sda
smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.12.0-1-amd64] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG MZNTD128HAGM-00000
Serial Number:    S15YNYAD625624
LU WWN Device Id: 5 002538 50003cf55
Firmware Version: DXT2300Q
User Capacity:    128,035,676,160 bytes [128 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Sep 16 08:29:39 2017 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

(...)

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       4090
 12 Power_Cycle_Count       0x0032   096   096   000    Old_age   Always       -       3664
177 Wear_Leveling_Count     0x0013   095   095   000    Pre-fail  Always       -       56
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   069   048   000    Old_age   Always       -       31
195 Hardware_ECC_Recovered  0x001a   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
235 Unknown_Attribute       0x0012   099   099   000    Old_age   Always       -       252
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       10202610775

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      4089         -


root@carbon:~# free && sync && echo 3 > /proc/sys/vm/drop_caches && free
              total        used        free      shared  buff/cache   available
Mem:        8021580      232696      137676        5104     7651208     7497636
Swap:       7812092        4608     7807484
              total        used        free      shared  buff/cache   available
Mem:        8021580      232176     7703052        5104       86352     7599356
Swap:       7812092        4608     7807484



[dest dir on SSD, btrfs, after reboot] md5sum *
c7caf4e97cadf52a2489a176284ed8f4  1.mkv
1095527685e2aba668bee2c2958229af  2.mkv
19e2153cce10b4317e2add27747c4356  3.mkv
62b1cac0498a245a69056506e4d6356c  4.mkv
26f9230e3da7158a87c60d526ed7eb26  5.mkv
79e4d0db765a93ac5dee1b2ed1b53e39  6.mkv
a8c438783ee10fe75fcb3ba2cd636238  7.mkv
81e141a09074e7a16756fb472458df9e  8.mkv
706a2ee617186be6796b20d425eb836d  9.mkv
a13e62ddad20cc0e796f7e9a46c09a83  10.mkv

[source dir on HDD] md5sum *
c7caf4e97cadf52a2489a176284ed8f4  1.mkv
1095527685e2aba668bee2c2958229af  2.mkv
19e2153cce10b4317e2add27747c4356  3.mkv
62b1cac0498a245a69056506e4d6356c  4.mkv
26f9230e3da7158a87c60d526ed7eb26  5.mkv
79e4d0db765a93ac5dee1b2ed1b53e39  6.mkv
a8c438783ee10fe75fcb3ba2cd636238  7.mkv
81e141a09074e7a16756fb472458df9e  8.mkv
706a2ee617186be6796b20d425eb836d  9.mkv
a13e62ddad20cc0e796f7e9a46c09a83  10.mkv
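
A minimal sketch of the same copy-and-verify check as a script, so it can be repeated after each reboot. The function name verify_copy and the demo file are made up for illustration; it assumes md5sum from coreutils and only compares files that sit directly in the two directories.

```shell
#!/bin/sh
# Compare md5 checksums of same-named files in two directories.
# Prints one line per problem and returns non-zero if anything differs.
verify_copy() {
    src=$1
    dst=$2
    rc=0
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        name=$(basename "$f")
        if [ ! -f "$dst/$name" ]; then
            echo "MISSING: $name"
            rc=1
            continue
        fi
        a=$(md5sum < "$f" | cut -d' ' -f1)
        b=$(md5sum < "$dst/$name" | cut -d' ' -f1)
        if [ "$a" != "$b" ]; then
            echo "MISMATCH: $name"
            rc=1
        fi
    done
    return "$rc"
}

# Self-contained demo on throwaway files. For a real test, run
# "sync && echo 3 > /proc/sys/vm/drop_caches" (as root) or reboot first,
# so the data is read back from disk rather than from the page cache.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dst"
printf 'sample data\n' > "$demo/src/1.mkv"
cp "$demo/src/1.mkv" "$demo/dst/1.mkv"
verify_copy "$demo/src" "$demo/dst" && echo "all files match"
rm -rf "$demo"
```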


mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Sat Sep 16, 2017 8:41 am

Seems I'm onto something:

Code:
root@carbon:~# dd if=/dev/zero of=/dev/mapper/luks_cache status=progress
433003520 bytes (433 MB, 413 MiB), 9 s, 48,1 MB/s      ^C^C^C


root@carbon:~# dd if=/dev/zero of=/dev/mapper/luks_cache bs=8M status=progress
3447717888 bytes (3,4 GB, 3,2 GiB), 3,00394 s, 1,1 GB/s
dd: error writing '/dev/mapper/luks_cache': No space left on device
489+0 records in
488+0 records out
4093915136 bytes (4,1 GB, 3,8 GiB), 3,80779 s, 1,1 GB/s


In short, I tried to wipe the encrypted block device (on top of the 100 GB /dev/sda4 partition) and the write failed after just over 4 GB of data with a "No space left on device" message.
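
A quick sketch of the arithmetic behind "just over 4 GB": how much of the device the wipe covered before dd gave up. The helper name coverage_pct is made up, and the 100 GiB total is an assumed round figure for the "100 GB" partition, not a measured size.

```shell
#!/bin/sh
# Integer percentage (rounded down) of a device that a wipe actually covered.
coverage_pct() {
    written=$1
    total=$2
    echo $(( written * 100 / total ))
}

# 4093915136 is the byte count from the failed dd run above.
# 107374182400 (100 GiB) is an assumed size for the "100 GB" partition;
# on a live system, "blockdev --getsize64 <device>" reports the real figure.
coverage_pct 4093915136 107374182400
```

With these figures the wipe covered under 4% of the device, so "No space left on device" cannot mean the partition was actually full.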

dmesg has this at the end:
Code:
Sep 16 10:27:27 carbon systemd[1]: Stopped target Encrypted Volumes.
Sep 16 10:27:27 carbon systemd[1]: Stopping Cryptography Setup for luks_cache...
Sep 16 10:27:27 carbon systemd[1]: Stopped Cryptography Setup for luks_cache.
Sep 16 10:27:37 carbon kernel: CMCI storm detected: switching to poll mode


The encrypted block device "luks_cache" was simply kicked off the system, and I think it is connected to the "CMCI storm" (first time I've seen this).
https://forums.gentoo.org/viewtopic-p-8115134.html <-- here is similar hardware (Intel Celeron J3355 CPU).

Slower writes to plain BTRFS ran at approx. 100 MB/s (copying from HDD) and the system handled 25 GB with no problem.
4 GB of high-speed writes (~1 GB/s) seem to overwhelm it. Where to look next -- software, hardware?
Is there any kernel switch that can help here (I'm still on the 4.12 series on this machine)?
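
A small sketch for spotting the storm message after a write test. The helper name count_cmci_storms is made up, and the sample lines are modelled on the log excerpt in this post rather than captured separately.

```shell
#!/bin/sh
# Count "CMCI storm" events in kernel log text read from stdin.
count_cmci_storms() {
    grep -c 'CMCI storm detected'
}

# Sample lines modelled on the log excerpt above; on a live system one
# would pipe in "dmesg" or "journalctl -k" instead.
printf '%s\n' \
    'Sep 16 10:27:27 carbon systemd[1]: Stopped Cryptography Setup for luks_cache.' \
    'Sep 16 10:27:37 carbon kernel: CMCI storm detected: switching to poll mode' \
    | count_cmci_storms
```

A non-zero count right after a dd run that ended early would tie the two events together more firmly than eyeballing the end of the log.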

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Sat Sep 16, 2017 1:19 pm

Writing to the raw (unencrypted) sda4 device seems OK, no storm here:
Code:
107487428608 bytes (107 GB, 100 GiB), 872,027 s, 123 MB/s
dd: error writing '/dev/sda4': No space left on device
25630+0 records in
25629+0 records out
107497914368 bytes (107 GB, 100 GiB), 887,286 s, 121 MB/s

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Sat Sep 16, 2017 2:23 pm

OK, the question is:
why is a raw write to the encrypted device much faster than a raw write to the unencrypted device (I suspect some kind of buffer in the device-mapper layer)?

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Sat Sep 16, 2017 4:50 pm

Small success here: just like in the referenced thread, upgrading the kernel to 4.13.x seems to have solved both the "superfast writes" and the device being kicked out of device-mapper:
Code:
dd if=/dev/zero of=/dev/mapper/luks_cache bs=4M status=progress
9441378304 bytes (9,4 GB, 8,8 GiB), 67,0111 s, 141 MB/s
...

Write speed seems normal, dmesg reports no storm.

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Sun Sep 17, 2017 6:45 am

I reinitialized the LVM cache and I'm at a loss here:

Code:
root@carbon:~# cache_check /dev/mapper/luks_cache
examining superblock
  superblock is corrupt
    bad checksum in superblock


EDIT:

I did a quick test:

Code:
root@carbon:~# lvconvert --type cache --cachepool vg0/cache0 vg0/lvol0
Do you want wipe existing metadata of cache pool vg0/cache0? [y/n]: y
  WARNING: Data redundancy is lost with writeback caching of raid logical volume!
  Logical volume vg0/lvol0 is now cached.

...

root@carbon:~# lvremove vg0/cache0
Do you really want to remove and DISCARD logical volume vg0/cache0? [y/n]: y
  Flushing 0 blocks for cache vg0/lvol0.
  Logical volume "cache0" successfully removed


No rebooting with cache enabled.

mbar (Veteran) | Joined: 19 Jan 2005 | Posts: 1979 | Location: Poland
Posted: Sun Sep 17, 2017 5:02 pm

This is the last episode in this series :) (I hope) -- or "how I learned to stop worrying and love the cache".

After extensive testing on the unencrypted device (I even moved the partition by 20 gigabytes, and also tried a smaller size) and wiping it with zeroes, I came to the conclusion that the cache_check "superblock corruption" status is probably a bug. I even tried a downgrade to version 0.6.1.
I disabled cache_check in lvm.conf, and my LVM RAID 5 with BTRFS has already survived 3 reboots, with btrfsck after each reboot (uncached and cached -- no errors).
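
The post does not say which lvm.conf setting was changed. One plausible reading, assumed here, is that cache_check_executable in the global section was set to an empty string, which lvm.conf describes as the way to skip the check (and does not recommend). The sketch below only reads such a file and reports the state; cache_check_state is a made-up helper and the demo file is a stand-in for /etc/lvm/lvm.conf.

```shell
#!/bin/sh
# Report how cache_check is configured in an lvm.conf-style file.
# Prints "default" if the option is not set, "disabled" if it is set to "",
# otherwise the configured path. Commented-out lines are ignored.
cache_check_state() {
    conf=$1
    pat='^[[:space:]]*cache_check_executable[[:space:]]*='
    if ! grep -q "$pat" "$conf"; then
        echo "default"
        return 0
    fi
    val=$(grep "$pat" "$conf" | tail -n 1 | sed 's/^[^"]*"\([^"]*\)".*$/\1/')
    if [ -z "$val" ]; then
        echo "disabled"
    else
        echo "$val"
    fi
}

# Demo on a throwaway file; the real file is usually /etc/lvm/lvm.conf.
conf=$(mktemp)
cat > "$conf" <<'EOF'
global {
    # cache_check_executable = "/usr/sbin/cache_check"
    cache_check_executable = ""
}
EOF
cache_check_state "$conf"
rm -f "$conf"
```

Since the check is there to catch damaged metadata before activation, it seems worth re-enabling once the false alarm is understood.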

Code:
root@carbon:~# ./lvmcache-statistics.sh
-------------------------------------------------------------------------
LVM [2.02.173(2)] cache report of found device /dev/vg0/lvol0
-------------------------------------------------------------------------
- Cache Usage: 4.6% - Metadata Usage: 23.7%
- Read Hit Rate: 27.1% - Write Hit Rate: 65.9%
- Demotions/Promotions/Dirty: 0/5412/0
- Feature arguments in use: metadata2 writeback
- Core arguments in use : migration_threshold 2048 smq 0
  - Cache Policy: stochastic multiqueue (smq)
- Cache Metadata Mode: rw
- MetaData Operation Health: ok
root@carbon:~# lvs
  LV    VG  Attr       LSize  Pool     Origin        Data%  Meta%  Move Log Cpy%Sync Convert
  lvol0 vg0 Cwi-aoC--- <5,46t [cache0] [lvol0_corig] 4,68   23,74           0,00     


Now we wait.

All times are GMT.