Gentoo Forums
ext4, linux-4.0.4 and data corruption
bammbamm808
Guru
Guru


Joined: 08 Dec 2002
Posts: 487
Location: Hawaii

Posted: Sat May 30, 2015 7:29 am    Post subject: ext4, linux-4.0.4 and data corruption

My reading strongly suggests that the corruption issues in kernels newer than linux-4.0.1 are limited to RAID0 setups, but a few people claim to have hit the issue without RAID.

What say you? Has it been fixed in 4.0.4? I want to play with the new kernel, but I just migrated my Gen2 to this hardware and don't want to lose any data.
_________________
Asrock X470 Taichi
Ryzen 2700x
32Gb Samsung B-die (16GB dual rank x2) DDR4
Geforce GTX 1060 6GB
Samsung Evo 840 500Gb +Seagate 1TB HDD
Etc....
Jack Hair
n00b
n00b


Joined: 10 Jul 2013
Posts: 21

Posted: Sat May 30, 2015 8:16 am

I've been running 4.0.4 since the day it was released and haven't had problems with my ext4 partitions so far. From what I understand, only (single) SSDs and RAID0 arrays are affected; a single HDD should be OK.
bammbamm808
Guru
Guru


Joined: 08 Dec 2002
Posts: 487
Location: Hawaii

Posted: Sat May 30, 2015 10:20 am

Hmmm .....single SSD here. I will let others test the waters further. Thanks.
kernelOfTruth
Watchman
Watchman


Joined: 20 Dec 2005
Posts: 6108
Location: Vienna, Austria; Germany; hello world :)

Posted: Mon Jun 01, 2015 11:59 pm

Just disable NCQ to be safer:

Append the following to the kernel command line in GRUB or whichever bootloader you use:

Code:
libata.force=noncq



There was also a data-corruption issue related to a faulty NCQ implementation with SSDs on kernel 4.0.y.
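To check after a reboot whether the parameter above actually took effect, something like the following may help. It is only a rough sketch and not from the original post: it assumes the kernel command line is readable at /proc/cmdline and that the drive exposes its queue depth at /sys/block/<dev>/device/queue_depth. The helper names and the default device sda are made up for illustration.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any existing tool.

# Succeeds if the given kernel command line has a libata.force= option
# whose comma-separated "[ID:]VAL" entries include "noncq".
cmdline_has_noncq() {
    printf '%s\n' "$1" |
        grep -Eq '(^|[[:space:]])libata\.force=([^[:space:]]*[,:])?noncq(,|[[:space:]]|$)'
}

# Prints the queue depth the kernel reports for a disk (default: sda).
# Assumption: the device exposes this attribute in sysfs.
queue_depth() {
    cat "/sys/block/${1:-sda}/device/queue_depth"
}

# Example use on a live system:
#   cmdline_has_noncq "$(cat /proc/cmdline)" && echo "noncq requested at boot"
#   queue_depth sda   # 1 normally means NCQ is off; a larger value means it is on
```

Checking both places is deliberate: the command line only shows what was requested, while the queue depth shows what the kernel ended up using for that drive.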


Edit: not sure whether yours is related or the same issue.
_________________
https://github.com/kernelOfTruth/ZFS-for-SystemRescueCD/tree/ZFS-for-SysRescCD-4.9.0
https://github.com/kernelOfTruth/pulseaudio-equalizer-ladspa

Hardcore Gentoo Linux user since 2004 :D


Last edited by kernelOfTruth on Thu Jun 04, 2015 10:50 pm; edited 1 time in total
EmaRsk
Apprentice
Apprentice


Joined: 07 Sep 2004
Posts: 158
Location: Italy

Posted: Tue Jun 02, 2015 10:12 am

From the Arch forum thread (https://bbs.archlinux.org/viewtopic.php?id=197400&p=2):
matthew02 wrote:
Disabling NCQ obviously didn't help.
EmaRsk
Apprentice
Apprentice


Joined: 07 Sep 2004
Posts: 158
Location: Italy

Posted: Tue Jun 02, 2015 11:51 am

There is a fix already, waiting to be pushed:
http://git.neil.brown.name/?p=md.git;a=commitdiff;h=a81157768a00e8cf8a7b43b5ea5cac931262374f
kernelOfTruth
Watchman
Watchman


Joined: 20 Dec 2005
Posts: 6108
Location: Vienna, Austria; Germany; hello world :)

Posted: Wed Jun 03, 2015 9:20 am

EmaRsk wrote:
From the Arch forum thread (https://bbs.archlinux.org/viewtopic.php?id=197400&p=2):
matthew02 wrote:
Disabling NCQ obviously didn't help.


There's yet another issue:

http://marc.info/?l=linux-kernel&m=143195981313563&w=2

<-- this is the one I'm referring to


http://marc.info/?l=linux-kernel&m=143326750830569&w=2

Quote:
> I've been running with NCQ disabled and been stress testing for awhile and the
> issue is indeed gone. Thanks for the workaround!
>
> So it seems the issue is somehow related to the combination of NCQ, dm-crypt,
> and possibly (some?) SSDs.

Hi

I suspect that this is a bug in kernel NCQ processing or in SSD firmware
and recent dm-crypt changes made the bug show up.

I suggest this:

If you have some test that reliably reproduces the bug, please do this:
take kernel 3.19 or 3.18 and apply dm-crypt parallelization patches
(commits f3396c58fd8442850e759843457d78b6ec3a9589,
cf2f1abfbd0dba701f7f16ef619e4d2485de3366,
7145c241a1bf2841952c3e297c4080b357b3e52d,
94f5e0243c48aa01441c987743dc468e2d6eaca2,
dc2676210c425ee8e5cb1bec5bc84d004ddf4179,
0f5d8e6ee758f7023e4353cca75d785b2d4f6abe,
b3c5fd3052492f1b8d060799d4f18be5a5438add) on it. If the bug doesn't show
up with the older kernel and dm-crypt parallelization patches, use git
bisect to find out which patch broke NCQ. When you test a kernel with
bisect, apply the above mentioned patches to it.

Mikulas
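
The procedure described in the mail could be scripted roughly as follows. This is only a sketch under several assumptions that are not in the mail: the current directory is a clone of the mainline kernel tree, the tags v3.19 and v4.0 mark the good and bad ends, and the commits cherry-pick cleanly in the order they are listed. The helper name and the DRY_RUN switch are hypothetical.

```shell
#!/bin/sh
# Sketch only: helper name and DRY_RUN switch are made up for illustration.

# dm-crypt parallelization commits, in the order listed in the mail.
DMCRYPT_COMMITS="
f3396c58fd8442850e759843457d78b6ec3a9589
cf2f1abfbd0dba701f7f16ef619e4d2485de3366
7145c241a1bf2841952c3e297c4080b357b3e52d
94f5e0243c48aa01441c987743dc468e2d6eaca2
dc2676210c425ee8e5cb1bec5bc84d004ddf4179
0f5d8e6ee758f7023e4353cca75d785b2d4f6abe
b3c5fd3052492f1b8d060799d4f18be5a5438add
"

# Apply the patches to the working tree without committing them.
# With DRY_RUN=1 the git commands are printed instead of executed.
apply_dmcrypt_patches() {
    for commit in $DMCRYPT_COMMITS; do
        ${DRY_RUN:+echo} git cherry-pick --no-commit "$commit" || return 1
    done
}

# Rough manual loop (run by hand, one bisect step at a time):
#   git bisect start
#   git bisect bad v4.0
#   git bisect good v3.19    # only if 3.19 + patches was confirmed clean
#   apply_dmcrypt_patches    # then build, boot, run the reproducer
#   git reset --hard         # drop the patches again before marking
#   git bisect good          # or: git bisect bad
#   ...repeat the last three steps until git names a commit
#
# Caveat: bisect points that already contain some of these commits will
# make the cherry-pick fail; those commits then have to be skipped by hand.
```

Using `--no-commit` keeps the patches out of the history that git is bisecting, so a plain `git reset --hard` restores the exact commit under test before it is marked good or bad.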

EmaRsk
Apprentice
Apprentice


Joined: 07 Sep 2004
Posts: 158
Location: Italy

Posted: Wed Jun 03, 2015 9:33 am

OK, so it's a different bug altogether. Is it the one mentioned by the OP? He was talking about RAID, not dm-crypt.
kernelOfTruth
Watchman
Watchman


Joined: 20 Dec 2005
Posts: 6108
Location: Vienna, Austria; Germany; hello world :)

Posted: Wed Jun 03, 2015 10:07 am

Well, he didn't mention it, but he also didn't exclude it,

so I'm posting it here in case it applies.
jfranz2
n00b
n00b


Joined: 24 Oct 2014
Posts: 37

Posted: Wed Jun 03, 2015 10:30 am

The problem seems to be very specific: it only affects people running SSDs in RAID 0 with the TRIM/discard options enabled. If you aren't doing that, you should be fine.
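
A quick way to see whether a machine matches that description might look like this. It is a minimal sketch, assuming md software RAID (so arrays show up in /proc/mdstat) and that online TRIM appears as a `discard` option in /proc/mounts; the helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical.

# Reads /proc/mdstat-style text on stdin and prints the names of
# active raid0 arrays, one per line.
raid0_arrays() {
    awk '$2 == ":" && $3 == "active" && $4 == "raid0" { print $1 }'
}

# Reads /proc/mounts-style text on stdin and prints the mount points
# whose option list contains "discard" (online TRIM).
discard_mounts() {
    awk '{
        n = split($4, opt, ",")
        for (i = 1; i <= n; i++)
            if (opt[i] == "discard") { print $2; break }
    }'
}

# Example use on a live system:
#   raid0_arrays < /proc/mdstat
#   discard_mounts < /proc/mounts
```

This only catches the `discard` mount option. A manual or scheduled fstrim run also issues TRIM and would not show up here.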
_________________
#TYBG
EmaRsk
Apprentice
Apprentice


Joined: 07 Sep 2004
Posts: 158
Location: Italy

Posted: Wed Jun 03, 2015 10:31 am

kernelOfTruth wrote:
posting it here in case it applies

kernelOfTruth wrote:
just disable NCQ and you should be mostly save

Still, this is false. Disabling NCQ does not fix the bug the OP was probably referring to: if the kernel is not patched, an fstrim operation can cause data loss.

If NCQ is relevant to another bug, thank you for reporting it; that's good to know. But please be aware (and make us aware) that it's a separate issue. Otherwise one could gain a false sense of security by working around it while still being exposed to the other bug.

EDIT: Reading my post again, I realized that it could come across as confrontational or rude. It wasn't meant that way, sorry.
I'm just not very good at conveying a friendly vibe in writing, and English not being my mother tongue doesn't help, I suppose.
I'll add some smilies to make up: :) :) :)
kernelOfTruth
Watchman
Watchman


Joined: 20 Dec 2005
Posts: 6108
Location: Vienna, Austria; Germany; hello world :)

Posted: Thu Jun 04, 2015 10:52 pm

EmaRsk wrote:
kernelOfTruth wrote:
posting it here in case it applies

kernelOfTruth wrote:
just disable NCQ and you should be mostly save

Still, this is false. Disabling NCQ does not fix the bug the OP was probably referring to: if the kernel is not patched, an fstrim operation can cause data loss.

If NCQ is relevant to another bug, thank you for reporting it; that's good to know. But please be aware (and make us aware) that it's a separate issue. Otherwise one could gain a false sense of security by working around it while still being exposed to the other bug.

EDIT: Reading my post again, I realized that it could come across as confrontational or rude. It wasn't meant that way, sorry.
I'm just not very good at conveying a friendly vibe in writing, and English not being my mother tongue doesn't help, I suppose.
I'll add some smilies to make up:
:) :) :)


Haha - yeah,

it ended up like that here :oops:


My bad, the fault is also on my side.

I corrected the wording in my post so that affected people don't get a false sense of safety.


Thanks for the push in the right direction :wink:
All times are GMT