Gentoo Forums
Intel C602 SAS/SATA controller causes kernel panic
mbar
Veteran


Joined: 19 Jan 2005
Posts: 1979
Location: Poland

Posted: Mon Nov 14, 2016 7:55 am    Post subject: Intel C602 SAS/SATA controller causes kernel panic

I recently (last week) upgraded my home server to a Supermicro X9DRL-iF board with a Xeon E5-2670 processor.
My problems started when I tried to use the additional SATA controller (ports 7 to 10 on the board), which is provided by the C602 chipset:
C602 chipset 4-Port SATA Storage Control Unit
https://cateee.net/lkddb/web-lkddb/SCSI_ISCI.html
SATA ports 1 to 6 are "internal" and are working OK.

My problems are similar to those in this topic: https://forums.gentoo.org/viewtopic-t-958726-start-0.html
The drives weren't detected at all. I found that I had to enable the CONFIG_SCSI_SAS_ATA config option; when I did that, boom: kernel panic, null pointer dereference (Gentoo hardened 4.8.7, same on 4.7.x):

https://goo.gl/photos/SgN9reyPw6jjm8LS8

The attached drives are 1.5 TB Samsung SATA drives, 3 Gbps. The C602 should support them.
Does anybody have any experience with this chipset?
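
For anyone wanting to confirm which of the options mentioned above are actually set in a given kernel build, here is a minimal sketch that parses the text of a .config file. The sample excerpt and the helper names (parse_kconfig, report) are made up for illustration; only the option names CONFIG_SCSI_ISCI and CONFIG_SCSI_SAS_ATA come from this post.

```python
import re

# "CONFIG_FOO=y" lines and the "# CONFIG_FOO is not set" form used in .config files.
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Return {option: value}; options marked 'is not set' map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        m = _SET.match(line)
        if m:
            options[m.group(1)] = m.group(2)
            continue
        m = _UNSET.match(line)
        if m:
            options[m.group(1)] = "n"
    return options


def report(text, wanted):
    """Map each wanted option to its value, or 'missing' if the file never mentions it."""
    options = parse_kconfig(text)
    return {name: options.get(name, "missing") for name in wanted}


# Hypothetical .config excerpt, for illustration only (not taken from the real machine).
SAMPLE = """\
CONFIG_SCSI=y
CONFIG_SCSI_ISCI=y
# CONFIG_SCSI_SAS_ATA is not set
"""

if __name__ == "__main__":
    print(report(SAMPLE, ["CONFIG_SCSI_ISCI", "CONFIG_SCSI_SAS_ATA"]))
```

On a real system the text would come from the kernel source tree's .config (or /proc/config.gz, if the kernel was built with that feature) instead of SAMPLE.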
Roman_Gruber
Advocate


Joined: 03 Oct 2006
Posts: 3806
Location: Austro Bavaria

Posted: Mon Nov 14, 2016 12:16 pm

You may check bugs.kernel.org (or whatever it's called) and report it there, please.

My only idea: check the wiring in the box and replug everything. Try a live medium to check whether it happens there too.
mbar
Veteran


Joined: 19 Jan 2005
Posts: 1979
Location: Poland

Posted: Mon Nov 14, 2016 6:40 pm

Thanks for the kernel bugzilla hint.

I found an old issue there for the C602 chipset: https://bugzilla.kernel.org/show_bug.cgi?id=60644

Quote:
I ran more detailed tests this weekend.

ASPM & MSI disabled = stable machine under zfs load

ASPM disabled / MSI enabled = stable machine under zfs load

ASPM enabled / MSI disabled = unstable, lost an HBA under zfs load


Hardware:
Supermicro X8DTH-iF, BIOS 2.1b (current)
2x Xeon X5670, 48GB DDR3 1333Mhz Reg/ECC
3x LSI 9207-8i, phase 18 firmware
36x Seagate ST32000444SS

It appears to be ASPM and vulnerability to issue may vary by chipset.


I remember that I enabled quite aggressive power management in the BIOS and also many PM options in the kernel. Worth investigating.
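
As a starting point for that investigation, the active PCIe ASPM policy can be read from sysfs. The sketch below assumes the kernel was built with ASPM support and exposes /sys/module/pcie_aspm/parameters/policy, where the active policy is shown in square brackets; the function names are illustrative.

```python
import re


def active_aspm_policy(policy_text):
    """Return the bracketed (active) policy name, or None if none is marked."""
    m = re.search(r"\[(\w+)\]", policy_text)
    return m.group(1) if m else None


def read_aspm_policy(path="/sys/module/pcie_aspm/parameters/policy"):
    """Read the policy file; return None if it is absent (e.g. ASPM not built in)."""
    try:
        with open(path) as f:
            return active_aspm_policy(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_aspm_policy())
```

If the quoted bug report applies here, booting with pcie_aspm=off would be one way to rule ASPM out, though that is only a guess for this board.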
mbar
Veteran


Joined: 19 Jan 2005
Posts: 1979
Location: Poland

Posted: Tue Nov 29, 2016 6:53 am

I just found out that I enabled this option:

Code:
CONFIG_SCSI_MQ_DEFAULT:

This option enables the new blk-mq based I/O path for SCSI
devices by default.  With the option the scsi_mod.use_blk_mq
module/boot option defaults to Y, without it to N, but it can
still be overridden either way.

If unsure say N.

Symbol: SCSI_MQ_DEFAULT [=y]
Type  : boolean
Prompt: SCSI: use blk-mq I/O path by default
  Location:
    -> Device Drivers
      -> SCSI device support
  Defined at drivers/scsi/Kconfig:49
  Depends on: SCSI [=y]


And the kernel panic clearly references blk-mq code. I'll try to disable it and test again. Maybe the bug is somewhere there.
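
The help text above says the scsi_mod.use_blk_mq boot option overrides the compiled-in default either way, so blk-mq can be switched off for a test without rebuilding the kernel. A minimal sketch of how the effective setting is resolved from a kernel command line follows; the function name is made up, and the parsing is a simplification of the kernel's boolean handling (it only looks at y/Y/1 and n/N/0, and treats a bare parameter as enabled).

```python
def effective_use_blk_mq(cmdline, config_default):
    """Resolve the blk-mq setting from a kernel command line string.

    config_default is True when CONFIG_SCSI_MQ_DEFAULT=y, False otherwise.
    If the parameter appears more than once, the last occurrence wins.
    """
    value = config_default
    for token in cmdline.split():
        key, sep, raw = token.partition("=")
        if key != "scsi_mod.use_blk_mq":
            continue
        if not sep:
            # A bare boolean parameter without "=value" means "set".
            value = True
        elif raw[:1] in ("y", "Y", "1"):
            value = True
        elif raw[:1] in ("n", "N", "0"):
            value = False
    return value


if __name__ == "__main__":
    # Hypothetical command line, for illustration only.
    print(effective_use_blk_mq("root=/dev/sda1 scsi_mod.use_blk_mq=0", True))
```

On kernels of this era the value actually in use should also be readable from /sys/module/scsi_mod/parameters/use_blk_mq.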
mbar
Veteran


Joined: 19 Jan 2005
Posts: 1979
Location: Poland

Posted: Wed Nov 30, 2016 4:43 pm

It kind of works after disabling the blk-mq code. The disks are accessible and working, but I get this in dmesg:

Code:
[   52.904287] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[   52.929201] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[   52.929205] sas: ata13: end_device-0:2: cmd error handler
[   52.929262] sas: ata11: end_device-0:0: dev error handler
[   52.929275] sas: ata12: end_device-0:1: dev error handler
[   52.929280] sas: ata13: end_device-0:2: dev error handler
[   52.936164] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[  172.948132] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[  172.948150] sas: ata13: end_device-0:2: cmd error handler
[  172.948212] sas: ata11: end_device-0:0: dev error handler
[  172.948230] sas: ata12: end_device-0:1: dev error handler
[  172.948236] sas: ata13: end_device-0:2: dev error handler
[  172.955224] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[  172.976115] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[  172.976121] sas: ata13: end_device-0:2: cmd error handler
[  172.976189] sas: ata11: end_device-0:0: dev error handler
[  172.976203] sas: ata12: end_device-0:1: dev error handler
[  172.976208] sas: ata13: end_device-0:2: dev error handler
[  172.983135] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[  173.004110] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[  173.004115] sas: ata13: end_device-0:2: cmd error handler
[  173.004172] sas: ata11: end_device-0:0: dev error handler
[  173.004200] sas: ata12: end_device-0:1: dev error handler
[  173.004205] sas: ata13: end_device-0:2: dev error handler
[  173.011160] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[  173.040125] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[  173.040130] sas: ata13: end_device-0:2: cmd error handler
[  173.040221] sas: ata11: end_device-0:0: dev error handler
[  173.040235] sas: ata12: end_device-0:1: dev error handler
[  173.040240] sas: ata13: end_device-0:2: dev error handler
[  173.047200] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[  173.068107] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[  173.068112] sas: ata13: end_device-0:2: cmd error handler
[  173.068168] sas: ata11: end_device-0:0: dev error handler
[  173.068194] sas: ata12: end_device-0:1: dev error handler
[  173.068199] sas: ata13: end_device-0:2: dev error handler
[  173.075148] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
[  173.096140] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[  173.096145] sas: ata13: end_device-0:2: cmd error handler
[  173.096199] sas: ata11: end_device-0:0: dev error handler
[  173.096215] sas: ata12: end_device-0:1: dev error handler
[  173.096220] sas: ata13: end_device-0:2: dev error handler
[  173.103144] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1


I also have a PCIe SiI 3132 Serial ATA Raid II Controller that works reliably, so I may just as well use that.
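
The repeated error-handler lines in the dmesg output above are easier to read when tallied per port. This is a rough sketch that counts "cmd error handler" and "dev error handler" messages for each ataN device; the regular expression is based only on the line format seen in this log and may not match other libsas messages.

```python
import re
from collections import Counter

# Matches e.g. "sas: ata13: end_device-0:2: cmd error handler"
LINE = re.compile(r"sas: (ata\d+): (end_device-\d+:\d+): (cmd|dev) error handler")


def tally_error_handlers(log_text):
    """Count error-handler invocations keyed by (port, handler kind)."""
    counts = Counter()
    for line in log_text.splitlines():
        m = LINE.search(line)
        if m:
            counts[(m.group(1), m.group(3))] += 1
    return counts


# One recovery cycle copied from the log above.
SAMPLE = """\
[   52.929201] sas: Enter sas_scsi_recover_host busy: 1 failed: 1
[   52.929205] sas: ata13: end_device-0:2: cmd error handler
[   52.929262] sas: ata11: end_device-0:0: dev error handler
[   52.929275] sas: ata12: end_device-0:1: dev error handler
[   52.929280] sas: ata13: end_device-0:2: dev error handler
[   52.936164] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 1 tries: 1
"""

if __name__ == "__main__":
    for (port, kind), n in sorted(tally_error_handlers(SAMPLE).items()):
        print(port, kind, n)
```

In the log above the cmd error handler line only ever appears for ata13 (end_device-0:2), so the drive or cable on that port may be the one worth checking first.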
All times are GMT