Gentoo Forums
Kernel issue 4.8.15 with xen
MasterPrenium
Tux's lil' helper


Joined: 07 Dec 2006
Posts: 89

Posted: Wed Dec 21, 2016 1:02 pm    Post subject: Kernel issue 4.8.15 with xen

Hello Guys,

I'm having some trouble on a new system I'm setting up.

Dmesg errors:
Code:
Dec 21 13:49:23 Node_1 kernel: [  413.220915] ------------[ cut here ]------------
Dec 21 13:49:23 Node_1 kernel: [  413.221072] kernel BUG at drivers/md/raid5.c:527!
Dec 21 13:49:23 Node_1 kernel: [  413.221293] invalid opcode: 0000 [#1] SMP
Dec 21 13:49:23 Node_1 kernel: [  413.221381] Modules linked in: x86_pkg_temp_thermal coretemp crc32c_intel aesni_intel aes_x86_64 ablk_helper mei_me mei mpt3sas
Dec 21 13:49:23 Node_1 kernel: [  413.221735] CPU: 2 PID: 5598 Comm: btrfs-transacti Not tainted 4.8.15-gentoo #3
Dec 21 13:49:23 Node_1 kernel: [  413.221901] Hardware name: Supermicro Super Server/X10SDV-4C-7TP4F, BIOS 1.0b 11/21/2016
Dec 21 13:49:23 Node_1 kernel: [  413.222073] task: ffff880267851a00 task.stack: ffff8802520e8000
Dec 21 13:49:23 Node_1 kernel: [  413.222202] RIP: e030:[<ffffffff819c5cac>]  [<ffffffff819c5cac>] raid5_get_active_stripe+0x5cc/0x670
Dec 21 13:49:23 Node_1 kernel: [  413.222421] RSP: e02b:ffff8802520eb8a0  EFLAGS: 00010086
Dec 21 13:49:23 Node_1 kernel: [  413.222546] RAX: ffff8802520eb8e0 RBX: ffff880265d0d800 RCX: ffff880265d0d9c0
Dec 21 13:49:23 Node_1 kernel: [  413.222708] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880263d244e8
Dec 21 13:49:23 Node_1 kernel: [  413.222848] RBP: ffff8802520eb930 R08: ffff88026438f1c0 R09: 0000000000000000
Dec 21 13:49:23 Node_1 kernel: [  413.223010] R10: ffff880265d0d9c8 R11: 0000000000000000 R12: ffff880265d0d800
Dec 21 13:49:23 Node_1 kernel: [  413.223149] R13: ffff880263d244e8 R14: ffff88016d3b8400 R15: ffff880263d244e8
Dec 21 13:49:23 Node_1 kernel: [  413.223324] FS:  0000000000000000(0000) GS:ffff880270c80000(0000) knlGS:ffff880270c80000
Dec 21 13:49:23 Node_1 kernel: [  413.223496] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 21 13:49:23 Node_1 kernel: [  413.223623] CR2: 00007fb58b66c318 CR3: 0000000257aeb000 CR4: 0000000000042660
Dec 21 13:49:23 Node_1 kernel: [  413.223786] Stack:
Dec 21 13:49:23 Node_1 kernel: [  413.223834]  ffff88016d39ba80 00000000ffffffff 0000000000000000 ffff880265d0d9c8
Dec 21 13:49:23 Node_1 kernel: [  413.224018]  ffff8802520eb8e0 0000000000000000 000000000ee23800 ffff880265d0d808
Dec 21 13:49:23 Node_1 kernel: [  413.224225]  0000000000000001 ffff8802520eb930 ffff88016d3b8498 0000000000000000
Dec 21 13:49:23 Node_1 kernel: [  413.224431] Call Trace:
Dec 21 13:49:23 Node_1 kernel: [  413.224490]  [<ffffffff819c5ec7>] raid5_make_request+0x177/0xdb0
Dec 21 13:49:23 Node_1 kernel: [  413.224648]  [<ffffffff810b6c70>] ? wait_woken+0x80/0x80
Dec 21 13:49:23 Node_1 kernel: [  413.224749]  [<ffffffff819cf562>] md_make_request+0xe2/0x220
Dec 21 13:49:23 Node_1 kernel: [  413.224878]  [<ffffffff8147457b>] generic_make_request+0xcb/0x1a0
Dec 21 13:49:23 Node_1 kernel: [  413.225034]  [<ffffffff814746b9>] submit_bio+0x69/0x120
Dec 21 13:49:23 Node_1 kernel: [  413.225162]  [<ffffffff813be50e>] btrfs_map_bio+0xfe/0x340
Dec 21 13:49:23 Node_1 kernel: [  413.225266]  [<ffffffff81392d98>] btrfs_submit_bio_hook+0xb8/0x180
Dec 21 13:49:23 Node_1 kernel: [  413.225424]  [<ffffffff813ad796>] submit_one_bio+0x66/0xa0
Dec 21 13:49:23 Node_1 kernel: [  413.225551]  [<ffffffff813ad952>] flush_epd_write_bio+0x42/0x60
Dec 21 13:49:23 Node_1 kernel: [  413.225682]  [<ffffffff813b38a3>] extent_writepages+0x53/0x60
Dec 21 13:49:23 Node_1 kernel: [  413.225811]  [<ffffffff81394ef0>] ? btrfs_set_bit_hook+0x270/0x270
Dec 21 13:49:23 Node_1 kernel: [  413.225944]  [<ffffffff813ae64a>] ? free_extent_state+0x3a/0xa0
Dec 21 13:49:23 Node_1 kernel: [  413.226075]  [<ffffffff81391b03>] btrfs_writepages+0x23/0x30
Dec 21 13:49:23 Node_1 kernel: [  413.226205]  [<ffffffff8115aa39>] do_writepages+0x19/0x30
Dec 21 13:49:23 Node_1 kernel: [  413.226332]  [<ffffffff8114ebac>] __filemap_fdatawrite_range+0x6c/0x90
Dec 21 13:49:23 Node_1 kernel: [  413.226467]  [<ffffffff8114ec6e>] filemap_fdatawrite_range+0xe/0x10
Dec 21 13:49:23 Node_1 kernel: [  413.226627]  [<ffffffff813a789b>] btrfs_fdatawrite_range+0x1b/0x50
Dec 21 13:49:23 Node_1 kernel: [  413.226762]  [<ffffffff813d8ae2>] __btrfs_write_out_cache.isra.26+0x432/0x480
Dec 21 13:49:23 Node_1 kernel: [  413.226928]  [<ffffffff813d91a3>] btrfs_write_out_cache+0x93/0x130
Dec 21 13:49:23 Node_1 kernel: [  413.227062]  [<ffffffff8137e101>] btrfs_start_dirty_block_groups+0x211/0x430
Dec 21 13:49:23 Node_1 kernel: [  413.227203]  [<ffffffff8138fc5a>] btrfs_commit_transaction+0x15a/0xa40
Dec 21 13:49:23 Node_1 kernel: [  413.227362]  [<ffffffff813905d1>] ? start_transaction+0x91/0x4d0
Dec 21 13:49:23 Node_1 kernel: [  413.227496]  [<ffffffff8138a760>] transaction_kthread+0x1f0/0x220
Dec 21 13:49:23 Node_1 kernel: [  413.227630]  [<ffffffff8138a570>] ? btrfs_cleanup_transaction+0x5f0/0x5f0
Dec 21 13:49:23 Node_1 kernel: [  413.227792]  [<ffffffff81098cb4>] kthread+0xc4/0xe0
Dec 21 13:49:23 Node_1 kernel: [  413.232373]  [<ffffffff8102b855>] ? __switch_to+0x355/0x7a0
Dec 21 13:49:23 Node_1 kernel: [  413.236942]  [<ffffffff81c8d0bf>] ret_from_fork+0x1f/0x40
Dec 21 13:49:23 Node_1 kernel: [  413.241540]  [<ffffffff81098bf0>] ? kthread_park+0x50/0x50
Dec 21 13:49:23 Node_1 kernel: [  413.245941] Code: 0f 85 3f fd ff ff 0f 0b f3 90 8b 43 70 a8 01 75 f7 89 45 98 e9 33 fe ff ff f0 ff 83 48 02 00 00 e9 63 fd ff ff 0f 0b 0f 0b 0f 0b <0f> 0b 49 8b 84 24 a0 02 00 00 a8 10 0f 85 d4 fb ff ff f0 41 80
Dec 21 13:49:23 Node_1 kernel: [  413.255592] RIP  [<ffffffff819c5cac>] raid5_get_active_stripe+0x5cc/0x670
Dec 21 13:49:23 Node_1 kernel: [  413.260122]  RSP <ffff8802520eb8a0>
Dec 21 13:49:23 Node_1 kernel: [  413.264518] ---[ end trace 523662f52765a413 ]---
Dec 21 13:49:23 Node_1 kernel: [  417.769176] BUG: unable to handle kernel NULL pointer dereference at           (null)
Dec 21 13:49:23 Node_1 kernel: [  417.769211] IP: [<ffffffff810b6656>] __wake_up_common+0x26/0x80
Dec 21 13:49:23 Node_1 kernel: [  417.769228] PGD 257277067 PUD 257163067 PMD 0
Dec 21 13:49:23 Node_1 kernel: [  417.769271] Oops: 0000 [#2] SMP
Dec 21 13:49:23 Node_1 kernel: [  417.769283] Modules linked in: x86_pkg_temp_thermal coretemp crc32c_intel aesni_intel aes_x86_64 ablk_helper mei_me mei mpt3sas
Dec 21 13:49:23 Node_1 kernel: [  417.769314] CPU: 2 PID: 5598 Comm: btrfs-transacti Tainted: G      D         4.8.15-gentoo #3
Dec 21 13:49:23 Node_1 kernel: [  417.769394] Hardware name: Supermicro Super Server/X10SDV-4C-7TP4F, BIOS 1.0b 11/21/2016
Dec 21 13:49:23 Node_1 kernel: [  417.769412] task: ffff880267851a00 task.stack: ffff8802520e8000
Dec 21 13:49:23 Node_1 kernel: [  417.769412] RIP: e030:[<ffffffff810b6656>]  [<ffffffff810b6656>] __wake_up_common+0x26/0x80
Dec 21 13:49:23 Node_1 kernel: [  417.769412] RSP: e02b:ffff8802520ebe48  EFLAGS: 00010086
Dec 21 13:49:23 Node_1 kernel: [  417.769437] RAX: 0000000000000200 RBX: ffff8802520ebf18 RCX: 0000000000000000
Dec 21 13:49:23 Node_1 kernel: [  417.769439] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffff8802520ebf18
Dec 21 13:49:23 Node_1 kernel: [  417.769440] RBP: ffff8802520ebe80 R08: 0000000000000000 R09: 0000000000000000
Dec 21 13:49:23 Node_1 kernel: [  417.769445] R10: 0000000000000008 R11: 0000000000000000 R12: ffff8802520ebf20
Dec 21 13:49:23 Node_1 kernel: [  417.769450] R13: 0000000000000200 R14: 0000000000000000 R15: 0000000000000003
Dec 21 13:49:23 Node_1 kernel: [  417.769505] FS:  0000000000000000(0000) GS:ffff880270c80000(0000) knlGS:ffff880270c80000
Dec 21 13:49:23 Node_1 kernel: [  417.769510] CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec 21 13:49:23 Node_1 kernel: [  417.769543] CR2: 0000000000000000 CR3: 0000000257aeb000 CR4: 0000000000042660
Dec 21 13:49:23 Node_1 kernel: [  417.769627] Stack:
Dec 21 13:49:23 Node_1 kernel: [  417.769647]  0000000181022fc6 0000000000000000 ffff8802520ebf18 ffff8802520ebf10
Dec 21 13:49:23 Node_1 kernel: [  417.769652]  0000000000000200 0000000000000000 ffff8802520eb7f8 ffff8802520ebe90
Dec 21 13:49:23 Node_1 kernel: [  417.769652]  ffffffff810b66be ffff8802520ebeb8 ffffffff810b7172 ffff8802678520d8
Dec 21 13:49:23 Node_1 kernel: [  417.769652] Call Trace:
Dec 21 13:49:23 Node_1 kernel: [  417.769671]  [<ffffffff810b66be>] __wake_up_locked+0xe/0x10
Dec 21 13:49:23 Node_1 kernel: [  417.769679]  [<ffffffff810b7172>] complete+0x32/0x50
Dec 21 13:49:23 Node_1 kernel: [  417.769727]  [<ffffffff81078dc0>] mm_release+0xc0/0x160
Dec 21 13:49:23 Node_1 kernel: [  417.769757]  [<ffffffff8107ddc9>] do_exit+0x139/0xb80
Dec 21 13:49:23 Node_1 kernel: [  417.769846]  [<ffffffff81c8f227>] rewind_stack_do_exit+0x17/0x20
Dec 21 13:49:23 Node_1 kernel: [  417.769853]  [<ffffffff81098bf0>] ? kthread_park+0x50/0x50
Dec 21 13:49:23 Node_1 kernel: [  417.770077] Code: 00 00 00 00 00 55 48 89 e5 41 57 41 89 f7 41 56 41 89 ce 41 55 41 54 4c 8d 67 08 53 48 83 ec 10 89 55 cc 48 8b 57 08 4c 89 45 d0 <48> 8b 32 48 8d 42 e8 49 39 d4 4c 8d 6e e8 75 05 eb 38 49 89 d5
Dec 21 13:49:23 Node_1 kernel: [  417.770111] RIP  [<ffffffff810b6656>] __wake_up_common+0x26/0x80
Dec 21 13:49:23 Node_1 kernel: [  417.770198]  RSP <ffff8802520ebe48>
Dec 21 13:49:23 Node_1 kernel: [  417.770202] CR2: 0000000000000000
Dec 21 13:49:23 Node_1 kernel: [  417.770210] ---[ end trace 523662f52765a414 ]---
Dec 21 13:49:23 Node_1 kernel: [  417.789818] Fixing recursive fault but reboot is needed!
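
For anyone triaging a similar trace, here is a minimal Python sketch that pulls the BUG location, the faulting function, and the reliable call-trace frames out of a log like the one above. The function name `summarize_oops` and the regular expressions are my own illustration, not part of any kernel tool, and the patterns assume the 4.8-era `[<address>] symbol+0xoff/0xsize` frame format shown in this post.

```python
import re

# A few lines copied from the trace above; real input would come from dmesg or a log file.
SAMPLE = """\
Dec 21 13:49:23 Node_1 kernel: [  413.221072] kernel BUG at drivers/md/raid5.c:527!
Dec 21 13:49:23 Node_1 kernel: [  413.222202] RIP: e030:[<ffffffff819c5cac>]  [<ffffffff819c5cac>] raid5_get_active_stripe+0x5cc/0x670
Dec 21 13:49:23 Node_1 kernel: [  413.224490]  [<ffffffff819c5ec7>] raid5_make_request+0x177/0xdb0
Dec 21 13:49:23 Node_1 kernel: [  413.224648]  [<ffffffff810b6c70>] ? wait_woken+0x80/0x80
Dec 21 13:49:23 Node_1 kernel: [  413.224749]  [<ffffffff819cf562>] md_make_request+0xe2/0x220
"""

# "kernel BUG at <file>:<line>!"
BUG_RE = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")
# "[<address>] symbol+0xoff/0xsize", with an optional "? " marking an unreliable frame
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unsure>\? )?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def summarize_oops(text):
    """Return the first BUG location, the first RIP symbol, and the reliable frames."""
    bug = None
    rip = None
    frames = []
    for line in text.splitlines():
        m = BUG_RE.search(line)
        if m:
            if bug is None:
                bug = (m.group("file"), int(m.group("line")))
            continue
        m = FRAME_RE.search(line)
        if m is None:
            continue
        if "RIP" in line:
            # The RIP line names the function that was executing when the fault hit.
            if rip is None:
                rip = m.group("sym")
        elif not m.group("unsure"):
            # Frames prefixed with "? " are stack leftovers, so they are skipped.
            frames.append(m.group("sym"))
    return {"bug": bug, "rip": rip, "frames": frames}


print(summarize_oops(SAMPLE))
```

Run against the full log, this keeps only the first BUG and RIP entries, so the follow-up NULL pointer oops in `__wake_up_common` does not overwrite the original raid5 failure.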


Kernel config (gentoo-sources 4.8.15) (with or without experimental patch): http://pastebin.com/p0EcHjbu


Xen version:
app-emulation/xen-4.6.4-r3
app-emulation/xen-tools-4.6.4-r4
(same issue with xen 4.8)

This happens when I generate heavy I/O on the RAID 5 stack.
I have to reset the system to get it working again.

Here is the configuration:
- 3 hard drives in a software RAID 5 array created with mdadm
- On top of that, DRBD replicating to another node (active/passive cluster)
- On top of that, a Btrfs filesystem with a few subvolumes
- On top of that, the Xen VMs


Is this a kernel bug? Any idea how to fix it?

Best regards

[Moderator edit: changed [quote] tags to [code] tags to preserve output layout. -Hu]
bandreabis
Advocate


Joined: 18 Feb 2005
Posts: 2426
Location: Lodi, Italy

Posted: Thu Mar 02, 2017 8:26 am    Post subject:

Any update?
_________________
My post count doesn't make me an expert! Quite the opposite!
MasterPrenium
Tux's lil' helper


Joined: 07 Dec 2006
Posts: 89

Posted: Tue Oct 24, 2017 3:56 pm    Post subject:

Race condition ...
Fixed in 4.9.15 ;)
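
If you want to check a machine against that version, a small Python sketch follows. The 4.9.15 threshold is taken only from this reply and is not independently verified here, the helper names are hypothetical, and a plain version comparison will miss any distribution kernel that carries a backported fix under an older version number.

```python
import re

# Threshold as reported in this thread; treat it as an assumption, not a verified fact.
FIXED_IN = (4, 9, 15)


def parse_kernel_version(release):
    """Return the leading (major, minor, patch) of a release string like '4.8.15-gentoo'."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if m is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = m.groups()
    # A missing patch level (for example '4.9') is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def has_reported_fix(release, fixed_in=FIXED_IN):
    """True if the release is at or past the version reported as fixed."""
    return parse_kernel_version(release) >= fixed_in


print(has_reported_fix("4.8.15-gentoo"))
print(has_reported_fix("4.9.15-gentoo"))
```

On a running system the release string can be obtained with `platform.release()` from the standard library and passed to `has_reported_fix`.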
All times are GMT