Gentoo Forums
[SOLVED] New PC build, random power failures

AaylaSecura
Tux's lil' helper

Joined: 09 Jun 2011
Posts: 122

Posted: Mon Aug 10, 2015 12:16 pm    Post subject: [SOLVED] New PC build, random power failures

Greetings! I just finished building my new PC and I'm having a problem the cause of which is a mystery to me. I am experiencing random power failures followed by system reboot. My hardware is:
ASUS X99 PRO
Intel Core i7 5930K
nVidia GTX 970 (which is probably irrelevant in this case)
Let me start by saying it is not a hardware problem, because everything is stable under Windows. Furthermore, it seems to run perfectly stably on the latest Gentoo minimal installation CD.

I first noticed these power failures under the Ubuntu 14.04 live CD. After some googling I found multiple bug reports from people seeing a similar issue starting with 14.04 (kernel v3.16, if I remember correctly), but ALL of them had MSI motherboards and ATI graphics cards and concluded that the issue was specific to that hardware. I don't know if my problem is the same or even related, but I decided to boot into the Gentoo minimal installation CD and was pleased when the PC did not reboot a single time in two days. So I proceeded to install Gentoo onto my SSD from there. All went well and I rebooted successfully, but after only a few minutes the system rebooted on its own.

What it does is lose power (all the fans stop), the Q-code transitions to 61, which is NVRAM initialization, and the system reboots. Under normal operation (Windows, minimal install CD) the Q-code is AA, which means the system has transitioned to ACPI mode and the interrupt controller is in APIC mode. Under the Ubuntu live CD the code is again AA, although I am experiencing power failures there. Under my local Gentoo install the code is FF, which the manual lists as "reserved for future AMI errors"...

I tried just copying the config from the install CD, building everything into the kernel as opposed to modules (since I'm not using an initrd), and compiling the kernel (v4.1.3) with it: same problem. I tried compiling v4.0.5 (same as on the install CD) and copying the initrd from the live CD: same. Although I haven't tested it for long enough to actually get the system to suddenly reboot (it is not reproducible, it's random), the Q-code is still FF, so I presume there is still something wrong. Nothing in dmesg hints at where the problem is.

For starters I'd like to run the exact configuration from the install CD and from there change the kernel config little by little until the problem is back, but apparently copying the kernel config isn't enough...


Last edited by AaylaSecura on Wed Aug 12, 2015 9:00 am; edited 1 time in total

Roman_Gruber
Advocate

Joined: 03 Oct 2006
Posts: 3806
Location: Austro Bavaria

Posted: Mon Aug 10, 2015 4:51 pm

OVERCLOCKING IS NOT SUPPORTED HERE !

afaik
Quote:
Intel Core i7 5930K
is an overclocker cpu, right? overclocking is not officially supported here...

http://geizhals.at/intel-core-i7-5930k-cm8064801548338-a1121207.html
Quote:
Hexa-Core: "Haswell-E" • Clock speed: 3.50GHz, Turbo: 3.70GHz • TDP: 140W • Process: 22nm • Interface: DMI, 5GT/s • L2 cache: 6x 256kB • L3 cache: 15MB shared • Stepping: R2 • Launch: 2014/Q3 • Graphics: N/A • PCIe lanes: 40x PCIe 3.0 • Socket: 2011-3, max. 1 CPU • Memory controller: Quad Channel PC4-17000U (DDR4-2133), 68GB/s, max. 64GB • Features: SSE4.1, SSE4.2, AVX, AVX2, FMA3, Turbo Boost 2.0, Hyper-Threading, VT-x EPT, VT-d, Intel 64, Idle States, EIST, Thermal Monitoring, IPT, AES-NI, XD Bit, freely selectable multiplier
=> free multiplier, so it's an overclocker cpu!


did you run a stresstest in windows?

did you update the firmware of your components?

you mention msi board but you have an asus branded one so how does this correlate?

you can still use kernel 3.10 but the longterm support ends quite soon for that ... 3.18 is the next kernel.org long-term stable kernel.

did you run memtest?

Quote:
Let me start by saying it is not a hardware problem,

Quote:
suddenly reboot (it is not reproducible, it's random)


So please evaluate why this is not a hardware problem when you get random reboots?

Is this an overclocked Computer? under/overvolting? any fancy settings in the bios?

did you check the wiring and connections of your components? sometimes it helps to replug every connection, even the mainboard and the other components.

one way to find the culprit is to remove all components. Then add only as much as necessary and run a stress test. when this works, add another component and go on until you get these reboots.
That means: cpu / 1 ram / mainboard / 1 harddisk / 1 gpu

AaylaSecura
Tux's lil' helper

Joined: 09 Jun 2011
Posts: 122

Posted: Tue Aug 11, 2015 12:22 am

Thank you for trying to help, however I don't think you read my post carefully.
tw04l124 wrote:
free multiplier, so it's an overclocker cpu!

I thought that for it to be overclocked I would have had to do that manually in the BIOS? In any case, this is irrelevant, as I will explain below.
tw04l124 wrote:
did you run a stresstest in windows?

tw04l124 wrote:
did you run memtest?

Yes, I ran Memtest on each DIMM, I ran Prime 95 overnight in Windows (the torture test for maximum power draw), I ran 3DMark for the GPU - no problems at all.
tw04l124 wrote:
did you update the firmware of your components?

I tried, the webpage for my model said (content not found) - I just looked again and was able to download the update, so I'll try that tonight.
tw04l124 wrote:
you mention msi board but you have an asus branded one so how does this correlate?

Yeah, exactly - it does not! I merely mentioned it and said it is probably an unrelated problem, since it's completely different hardware (and a different kernel version).
tw04l124 wrote:
you can still use kernel 3.10 but the longterm support ends quite soon for that ... 3.18 is the next kernel.org long-term stable kernel.

But I do not want to use an old version! It probably won't have support for half of the hardware I have.
tw04l124 wrote:
So please evaluate why this is not a hardware problem when you get random reboots?

Because, as I said, it runs perfectly fine on Windows even when stress tested, so apparently it is not a hardware problem. Furthermore, it runs fine on the Gentoo install CD, so it clearly CAN run fine on Linux kernel 4.0.5!
tw04l124 wrote:
Is this an overclocked Computer? under/overvolting? any fancy settings in the bios?

It is not overclocked, settings are default.
tw04l124 wrote:
did you check the wiring and connections of your components? sometimes it helps to replug every connection, even the mainboard and the other components.

one way to find the culprit is to remove all components. Then add only as much as necessary and run a stress test. when this works, add another component and go on until you get these reboots.
That means: cpu / 1 ram / mainboard / 1 harddisk / 1 gpu

As I said, I only get these reboots on certain kernel configurations, so this is irrelevant, correct me if I'm wrong.

Bottom line is - it works on the install CD, I want to transfer its configuration to my local Gentoo install, I can't seem to do that. Can someone help with that?

Tony0945
Advocate

Joined: 25 Jul 2006
Posts: 3100
Location: Illinois, USA

Posted: Tue Aug 11, 2015 12:58 am

tw04l124 wrote:
OVERCLOCKING IS NOT SUPPORTED HERE !

afaik
Quote:
Intel Core i7 5930K
is an overclocker cpu, right? overclocking is not officially supported here...

That indicates nothing. I run an AMD Phenom II Black Edition which also has an unlocked multiplier. I've never overclocked it. It was a good fast six core CPU at a sale price. I intend to replace it with an i7-4790K Devil's Canyon. I don't plan to overclock that either. Just because a CPU CAN be overclocked doesn't mean that it IS overclocked. I, for one, wouldn't risk blowing up my $300+ CPU.

Tony0945
Advocate

Joined: 25 Jul 2006
Posts: 3100
Location: Illinois, USA

Posted: Tue Aug 11, 2015 1:58 am

If it's OK on Ubuntu and it's OK on sysrescuecd, but not on your kernel, I'd guess that something is missing from your kernel. Do you have the "meld" package? I'd use it to compare the sysrescuecd kernel config with your kernel config, particularly in the CPU and acpi areas. Another approach is to build a genkernel image by emerging genkernel and running it with base configuration, i.e. no menuconfig. It will add itself to your boot menu and you can see if it is stable. If it is, again compare the configs. That's my best shot.
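
For readers who want the config comparison without a graphical diff tool, here is a minimal Python sketch of the same idea. It is not what anyone in the thread actually ran. The function names (`parse_kconfig`, `diff_kconfig`, `read_config`) and the prefix filter are made up for illustration, and it assumes both files use the standard `.config` syntax (`CONFIG_FOO=y` and `# CONFIG_FOO is not set`).

```python
import gzip
import re
import sys

# Lines such as CONFIG_FOO=y, CONFIG_FOO=m or CONFIG_FOO="some string"
_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
# Lines such as "# CONFIG_FOO is not set"
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kconfig(text):
    """Map each option in a kernel .config to its value ('n' if not set)."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_kconfig(reference, candidate, prefixes=()):
    """Return sorted (option, reference_value, candidate_value) differences.

    An option absent from one side is treated as 'n'. If prefixes is given,
    only options whose name starts with one of them are compared.
    """
    differences = []
    for name in sorted(set(reference) | set(candidate)):
        if prefixes and not name.startswith(tuple(prefixes)):
            continue
        ref_value = reference.get(name, "n")
        cand_value = candidate.get(name, "n")
        if ref_value != cand_value:
            differences.append((name, ref_value, cand_value))
    return differences


def read_config(path):
    """Read a plain or gzip-compressed (e.g. /proc/config.gz) config file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return handle.read()


if __name__ == "__main__" and len(sys.argv) == 3:
    ref = parse_kconfig(read_config(sys.argv[1]))
    cand = parse_kconfig(read_config(sys.argv[2]))
    # Hypothetical focus areas (CPU and ACPI); widen or drop as needed.
    focus = ("CONFIG_ACPI", "CONFIG_CPU", "CONFIG_X86")
    for name, old, new in diff_kconfig(ref, cand, focus):
        print(f"{name}: reference={old} candidate={new}")
```

Run it with the known-good config first and the suspect config second. Calling `diff_kconfig` with no prefixes compares every option, which is the safer starting point when the cause is unknown.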

AaylaSecura
Tux's lil' helper

Joined: 09 Jun 2011
Posts: 122

Posted: Tue Aug 11, 2015 3:00 am

Tony0945 wrote:
If it's OK on Ubuntu and it's OK on sysrescuecd, but not on your kernel, I'd guess that something is missing from your kernel. Do you have the "meld" package? I'd use it to compare the sysrescuecd kernel config with your kernel config, particularly in the CPU and acpi areas. Another approach is to build a genkernel image by emerging genkernel and running it with base configuration, i.e. no menuconfig. It will add itself to your boot menu and you can see if it is stable. If it is, again compare the configs. That's my best shot.

It is NOT ok on Ubuntu - the Q-code is AA, which is normal but it still reboots randomly on its own. It is only ok on the minimal install CD and I did copy the config from there as well as the initrd and still I'm having the problem... It's beyond me why it wouldn't work, it's the same kernel version, same config...

Roman_Gruber
Advocate

Joined: 03 Oct 2006
Posts: 3806
Location: Austro Bavaria

Posted: Tue Aug 11, 2015 7:56 am

do you use microcode? please install and set it up ...

Code:
eix microcode
[I] sys-apps/microcode-ctl
     Available versions:  1.23 (~)1.27 (~)1.28 {selinux}
     Installed versions:  1.28(19:24:38 18.04.2015)(-selinux)
     Homepage:            https://fedorahosted.org/microcode_ctl/
     Description:         Intel processor microcode update utility

[I] sys-apps/microcode-data
     Available versions:  20140430 (~)20140624 (~)20140913 20150121
     Installed versions:  20150121(10:20:32 09.02.2015)
     Homepage:            http://inertiawar.com/microcode/ https://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=24661
     Description:         Intel IA32 microcode update data



didn't those cpus have issues with tsx, or whatever it's called?

Feel free to overclock, but when it's not done right it also causes such issues. So I had to ask....
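
To see whether a microcode update actually took effect, one option is to read the revision the kernel reports. The sketch below is only an illustration: the function name `microcode_revisions` is invented here, and it assumes an x86 Linux system whose `/proc/cpuinfo` carries a `microcode` field per logical CPU.

```python
def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revision strings found in cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only lines of the form "microcode : 0x..." are of interest.
        if sep and key.strip() == "microcode":
            revisions.add(value.strip())
    return revisions


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            found = microcode_revisions(handle.read())
    except OSError:
        # Not on Linux, or /proc is unavailable.
        found = set()
    print("microcode revision(s):", ", ".join(sorted(found)) or "not reported")
```

Comparing the value before and after installing the microcode packages shows whether the update was loaded. More than one value in the set would mean the cores disagree, which is worth a closer look.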

AaylaSecura
Tux's lil' helper

Joined: 09 Jun 2011
Posts: 122

Posted: Wed Aug 12, 2015 8:59 am

Well ok, after a lot of trial and error playing with the kernel configuration and the cmdline passed via grub, I can conclude that the error Q-code is unrelated to the power failures, which are gone btw. So copying the kernel config from the install CD was enough to fix whatever was causing the reboots. As for the error code, it is related to the graphics: setting gfxmode for grub seems to have no effect (the desired resolution is not set, the code is FF). Passing vga=791 to the kernel does set the correct resolution, and for the first 1-2 seconds (while the kernel is loading, before OpenRC takes over) the code is AA (no error), but then it switches to FF.
I tried using the closed source nvidia driver and auto-loading it at boot: no change. I tried using nouveau, setting DRM for nouveau in the kernel config and passing nouveau.modeset=1 to the kernel: no effect (meaning the resolution is low and the code is FF; lshw does however list nouveau as being used for the device). I'll keep trying to get the graphics (as well as a bunch of other unrecognised hardware, including the wifi adapter) working. Other than that, I guess the original problem with the reboots is solved.

Edit: Fixed the GPU issue - I had forgotten to set the Linux gfxpayload (GRUB_GFXPAYLOAD_LINUX) to keep (it was set to text). Now everything works fine with the vesa fb driver + proprietary nvidia driver.
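
The graphics fix above comes down to two GRUB 2 settings. Assuming the setting the poster means is `GRUB_GFXPAYLOAD_LINUX` in `/etc/default/grub` (the post does not show the file, so this is a guess), a small Python check along these lines can flag the "set to text" case. The helper names `parse_grub_defaults` and `gfx_warnings` are made up for this sketch.

```python
import shlex


def parse_grub_defaults(text):
    """Parse simple KEY=value assignments from /etc/default/grub-style text."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        try:
            # shlex strips shell quoting and trailing "# ..." comments.
            words = shlex.split(raw, comments=True)
        except ValueError:
            continue  # unbalanced quotes; skip the line rather than guess
        settings[key.strip()] = words[0] if words else ""
    return settings


def gfx_warnings(settings):
    """Return human-readable notes about the two graphics-related settings."""
    notes = []
    if not settings.get("GRUB_GFXMODE"):
        notes.append("GRUB_GFXMODE is not set")
    payload = settings.get("GRUB_GFXPAYLOAD_LINUX", "")
    if payload != "keep":
        shown = payload or "unset"
        notes.append(f"GRUB_GFXPAYLOAD_LINUX is {shown}, not keep")
    return notes


if __name__ == "__main__":
    try:
        with open("/etc/default/grub") as handle:
            current = parse_grub_defaults(handle.read())
    except OSError:
        current = {}
    for note in gfx_warnings(current):
        print(note)
```

After changing `/etc/default/grub`, the generated `grub.cfg` has to be rebuilt (for example with `grub-mkconfig`) before the new values take effect.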


Last edited by AaylaSecura on Wed Aug 12, 2015 2:01 pm; edited 1 time in total

Buffoon
Veteran

Joined: 17 Jun 2015
Posts: 1074
Location: EU or US

Posted: Wed Aug 12, 2015 9:30 am

Did you have watchdog enabled in your kernel?

AaylaSecura
Tux's lil' helper

Joined: 09 Jun 2011
Posts: 122

Posted: Wed Aug 12, 2015 10:00 am

Buffoon wrote:
Did you have watchdog enabled in your kernel?

No, I never did and don't right now. Not sure if Ubuntu had it but it probably wasn't causing the problem with the reboots (if that's what you meant).
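
If you are unsure whether a watchdog driver is built into the running kernel, the question can be answered from the kernel config itself. The following is a rough Python sketch, not something from the thread. It assumes the kernel exposes its config at `/proc/config.gz` (which requires `CONFIG_IKCONFIG_PROC`), and it guesses that watchdog options can be recognised by `WATCHDOG` in the name or a `_WDT` suffix. The function name is invented.

```python
import gzip


def enabled_watchdog_options(config_text):
    """Return CONFIG_*=y/m entries whose option name mentions a watchdog."""
    hits = []
    for line in config_text.splitlines():
        name, sep, value = line.strip().partition("=")
        if not sep or not name.startswith("CONFIG_"):
            continue  # comments such as "# CONFIG_FOO is not set" land here
        looks_like_watchdog = "WATCHDOG" in name or name.endswith("_WDT")
        if looks_like_watchdog and value in ("y", "m"):
            hits.append(f"{name}={value}")
    return hits


if __name__ == "__main__":
    try:
        with gzip.open("/proc/config.gz", "rt") as handle:
            for hit in enabled_watchdog_options(handle.read()):
                print(hit)
    except OSError:
        print("no /proc/config.gz; check the .config in the kernel source tree")
```

An empty result suggests that no watchdog driver is enabled in that kernel, within the limits of the name matching used here.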

All times are GMT