Gentoo Forums
CPU temp above threshold
LIsLinuxIsSogood
Veteran

Joined: 13 Feb 2016
Posts: 1013

Posted: Sat Aug 19, 2017 8:04 pm    Post subject: CPU temp above threshold

Turning to the forum for some help with whatever is going on with ACPI and (especially!) CPU temp. What is this, and how should I go about fixing it? Do these messages seem related at all?

Code:
dmesg
[    0.580397] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[    0.581508] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT5._GTF] (Node ffff8802160af460), AE_NOT_FOUND (20170303/psparse-543)
[    0.583843] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[    0.585060] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT3._GTF] (Node ffff8802160af370), AE_NOT_FOUND (20170303/psparse-543)
[    0.587609] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[    0.588316] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT1._GTF] (Node ffff8802160af280), AE_NOT_FOUND (20170303/psparse-543)
[    0.589741] ata2.00: ATAPI: HL-DT-ST DVD-RW GSA-H60L, DC07, max UDMA/100
[    0.590490] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[    0.591246] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff8802160af208), AE_NOT_FOUND (20170303/psparse-543)
[    0.592843] ata6.00: ATA-9: WDC WD30EZRX-00DC0B0, 80.00A80, max UDMA/133
[    0.593623] ata6.00: 5860533168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[    0.594412] ata4.00: ATA-8: ST1000LM024 HN-M101MBB, 2AR10002, max UDMA/133
[    0.595193] ata4.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
[    0.596252] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[    0.597070] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT1._GTF] (Node ffff8802160af280), AE_NOT_FOUND (20170303/psparse-543)
[    0.598716] ata2.00: configured for UDMA/100
[    0.599565] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[    0.600423] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT5._GTF] (Node ffff8802160af460), AE_NOT_FOUND (20170303/psparse-543)
[    0.601820] usb 1-1: new high-speed USB device number 2 using ehci-pci
[    0.603619] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[    0.604547] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT3._GTF] (Node ffff8802160af370), AE_NOT_FOUND (20170303/psparse-543)
...
[  407.369289] ata1.00: exception Emask 0x10 SAct 0x400000 SErr 0x280100 action 0x6 frozen
[  407.369292] ata1.00: irq_stat 0x08000000, interface fatal error
[  407.369294] ata1: SError: { UnrecovData 10B8B BadCRC }
[  407.369296] ata1.00: failed command: READ FPDMA QUEUED
[  407.369300] ata1.00: cmd 60/00:b0:28:00:07/01:00:12:00:00/40 tag 22 ncq dma 131072 in
                        res 40/00:b0:28:00:07/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
[  407.369302] ata1.00: status: { DRDY }
[  407.369305] ata1: hard resetting link
[  407.679403] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[  407.690316] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[  407.690322] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff8802160af208), AE_NOT_FOUND (20170303/psparse-543)
[  407.710284] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[  407.710290] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff8802160af208), AE_NOT_FOUND (20170303/psparse-543)
[  407.720350] ata1.00: configured for UDMA/133
[  407.720363] ata1: EH complete
...
[  883.456241] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[  883.456243] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[  883.456246] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[  883.457245] CPU0: Core temperature/speed normal
[  883.457247] CPU1: Package temperature/speed normal
[  883.457249] CPU0: Package temperature/speed normal
[ 1183.689492] CPU0: Core temperature above threshold, cpu clock throttled (total events = 48376)
[ 1183.689494] CPU1: Package temperature above threshold, cpu clock throttled (total events = 48376)
[ 1183.689496] CPU0: Package temperature above threshold, cpu clock throttled (total events = 48376)
[ 1183.690530] CPU0: Core temperature/speed normal
[ 1183.690532] CPU0: Package temperature/speed normal
[ 1183.690543] CPU1: Package temperature/speed normal
[ 3869.784891] CPU0: Core temperature above threshold, cpu clock throttled (total events = 49284)
[ 3869.784893] CPU1: Package temperature above threshold, cpu clock throttled (total events = 49284)
[ 3869.784895] CPU0: Package temperature above threshold, cpu clock throttled (total events = 49284)
[ 3869.785917] CPU0: Core temperature/speed normal
[ 3869.785919] CPU1: Package temperature/speed normal
[ 3869.785921] CPU0: Package temperature/speed normal
eccerr0r
Watchman

Joined: 01 Jul 2004
Posts: 7128
Location: almost Mile High in the USA

Posted: Sat Aug 19, 2017 9:46 pm

They look unrelated to each other. Your CPU probably really was overheating if you were running it under heavy load. I get those messages a lot during the summer; I may also need to really clean my heatsink/fan and apply some fresh thermal interface material...
_________________
Intel Core i7 2700K@ 4.1GHz/HD3000 graphics/8GB DDR3/180GB SSD
What am I supposed watching?
russK
l33t

Joined: 27 Jun 2006
Posts: 630

Posted: Sat Aug 19, 2017 11:25 pm

I don't know what the ACPI issue is.
The CRC error with the disk is potentially bad, but the disk may have handled it gracefully. It happened around 8 minutes before the CPU messages.
It's a good idea to keep heatsinks free of dirt and fans running OK. If the machine is in a warm place, help it out.
Heat can affect a hard drive too. You may want to check the hard drive's smartctl info; smartctl can also tell you the drive's temperature.

https://wiki.gentoo.org/wiki/Smartmontools
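
For the temperature check mentioned above, here is a minimal sketch that pulls the drive temperature out of `smartctl -A` output. It assumes the drive reports attribute 194 (or 190) and that the raw value sits in the tenth column, which is the usual layout but not guaranteed for every drive. The function name `smart_temp` and the device `/dev/sda` are placeholders, not something from this thread.

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print the raw value of the first
# temperature attribute found (ID 194 or 190). Assumes the standard column
# layout: ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
smart_temp() {
    awk '$1 == 194 || $1 == 190 { print $10; exit }'
}

# Usage (as root; /dev/sda is a placeholder device name):
#   smartctl -A /dev/sda | smart_temp
```

If nothing is printed, the drive probably does not expose either attribute, and the full `smartctl -a` output is the place to look.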
Section_8
Guru

Joined: 22 May 2004
Posts: 566
Location: Arlington, TX, US

Posted: Sat Aug 19, 2017 11:41 pm

I see those temperature messages sometimes, as my system runs a couple of BOINC projects. I wouldn't be surprised to see them if you're compiling some big package.
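
To see how often throttling has kicked in, for example before and after a big compile, a rough sketch like the following picks the highest "total events" figure out of the kernel log. It relies only on the message format shown in the first post; the function name `throttle_events` is made up for illustration.

```shell
#!/bin/sh
# Read kernel log text on stdin and print the largest "total events" count
# found in "cpu clock throttled" messages. Prints nothing if none are present.
throttle_events() {
    sed -n 's/.*cpu clock throttled (total events = \([0-9][0-9]*\)).*/\1/p' |
        sort -n | tail -n 1
}

# Usage:
#   dmesg | throttle_events
```

Comparing the number before and after a heavy job gives a feel for how much of that job ran throttled. On Intel systems the kernel typically also exposes per-CPU counters under `/sys/devices/system/cpu/cpu*/thermal_throttle/`, though I have not checked that on this particular machine.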
LIsLinuxIsSogood
Veteran

Joined: 13 Feb 2016
Posts: 1013

Posted: Mon Aug 21, 2017 9:37 pm

The disk error is still happening today. Could I please get some help with these very foreign-looking messages? I do not require a detailed explanation, more a simple overview.

Thanks to those with suggestions, such as the steps I will be taking to set up SMART. In the meantime, what can I do to query the hardware to check it or find out more info? I still don't understand how the device comes to be referred to as ata1 in the log output. How does that correspond to the /dev/sdX naming of disks? Does it refer to the entire disk or just a partition of the disk, and where can I find this information?

I can't seem to understand the messages, except that "bus error" and "hard resetting link" don't seem good. I know that much, but that's all I know. Help!



Code:

[68437.088944] ata1.00: exception Emask 0x10 SAct 0x10 SErr 0x280100 action 0x6 frozen
[68437.088946] ata1.00: irq_stat 0x08000000, interface fatal error
[68437.088948] ata1: SError: { UnrecovData 10B8B BadCRC }
[68437.088950] ata1.00: failed command: READ FPDMA QUEUED
[68437.088954] ata1.00: cmd 60/00:20:80:00:de/01:00:03:00:00/40 tag 4 ncq dma 131072 in
                        res 40/00:20:80:00:de/00:00:03:00:00/40 Emask 0x10 (ATA bus error)
[68437.088955] ata1.00: status: { DRDY }
[68437.088959] ata1: hard resetting link
[68437.394033] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[68437.404948] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[68437.404957] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff8802160af208), AE_NOT_FOUND (20170303/psparse-543)
[68437.424951] ACPI Error: [DSSP] Namespace lookup failure, AE_NOT_FOUND (20170303/psargs-364)
[68437.424961] ACPI Error: Method parse/execution failed [\_SB.PCI0.SAT0.SPT0._GTF] (Node ffff8802160af208), AE_NOT_FOUND (20170303/psparse-543)
[68437.434961] ata1.00: configured for UDMA/133
[68437.434977] ata1: EH complete
russK
l33t

Joined: 27 Jun 2006
Posts: 630

Posted: Tue Aug 22, 2017 12:44 am

I'm no expert but I'll take a stab at some of it ...

The errors are a repeat of, and similar to, those in the first post.
Both times, all of the messages appeared within one second, as the [68437.nnnnnn] timestamps show for the second episode.

Within about 15 microseconds of the first message (68437.088944 to 68437.088959), the driver decided to reset the link. I suspect the drive was trying in earnest to read some unrecoverable data at 68437.088944, 68437.088946, 68437.088948 and 68437.088950, and the driver was either not prepared to hear the bad news or simply impatient. The driver may have been upset about how long it took for the drive to respond. Some of this is conjecture on my part.

Finally the driver decided to reset the link to get back to a known state. At 68437.434977 the reset was complete and the drive was ready.
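
The timing above can be worked out mechanically. This is only a sketch that assumes the log lines carry the usual `[seconds.microseconds]` prefix and that an episode starts with an "exception Emask" line and ends with "EH complete", as in the logs posted here; `eh_duration` is a name I made up.

```shell
#!/bin/sh
# Read kernel log text on stdin and print the number of seconds between the
# first "exception Emask" line and the next "EH complete" line.
eh_duration() {
    awk '
        # Extract the number from the leading "[ 1234.567890]" timestamp
        function ts(line) {
            sub(/^\[ */, "", line)
            sub(/\].*/, "", line)
            return line + 0
        }
        /exception Emask/ && !start { start = ts($0) }
        /EH complete/ && start      { printf "%.6f\n", ts($0) - start; exit }
    '
}

# Usage:
#   dmesg | eh_duration
```

For the second episode (68437.088944 to 68437.434977) that comes to roughly 0.35 seconds from the first error to the link being usable again.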

I suspect there is some data on the disk that was not recorded properly. Every time the drive tries to read the data, it retries over and over and then finally gives up. During this period, the driver gets a little impatient and resets the link.

smartctl can tell you about the health of the drive.
There are tools and web pages about discovering bad blocks and getting them re-allocated.
ddrescue is a good tool for recovering data from a bad drive.

Your drive may or may not be in need of replacement. I would use smartctl to see if the drive believes it is healthy.

Depending on how many drives you have, something like this might be useful:
Code:
# for d in /dev/sd? ; do echo "========= DISPLAY INFO FOR $d ==================" && smartctl -a "$d" ; done | less
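
On the question of how ata1 relates to /dev/sdX: in the kernel log, `ata1` is the SATA port and `ata1.00` is the first device on that port, so it means the whole disk, not a partition. One way to find the matching /dev name is to look at where the `/sys/block/sdX` symlinks point, since the resolved path normally contains an `ataN` component for libata disks. The sketch below assumes that sysfs layout and GNU `readlink -f` and `grep -o`; the function name `ata_to_sd` is my own.

```shell
#!/bin/sh
# Print "ataN /dev/sdX" for each sd* disk whose resolved sysfs path contains
# an ataN component. Optional argument: sysfs block directory (defaults to
# /sys/block).
ata_to_sd() {
    for dev in "${1:-/sys/block}"/sd*; do
        [ -e "$dev" ] || continue
        # Resolve the symlink and keep the first "/ataN/" path component
        port=$(readlink -f "$dev" | grep -o '/ata[0-9][0-9]*/' | head -n 1 | tr -d /)
        if [ -n "$port" ]; then
            echo "$port /dev/${dev##*/}"
        fi
    done
}

# Usage:
#   ata_to_sd
```

Disks that are not behind libata (USB sticks, for instance) simply produce no line. Whichever device shows up next to `ata1` is the one to point smartctl at.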
russK
l33t

Joined: 27 Jun 2006
Posts: 630

Posted: Tue Aug 22, 2017 3:47 am

Also note, sometimes errors like this can be due to the cables or bad connections.
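
To get a feel for whether one port keeps reporting CRC trouble, which would point toward that port's cable or connector, a sketch like this tallies the `BadCRC` SError lines per port. It assumes the message format shown earlier in the thread; `badcrc_by_port` is a hypothetical name.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "ataN count" for every port that
# logged an SError line containing BadCRC.
badcrc_by_port() {
    awk '
        /SError:/ && /BadCRC/ && match($0, /ata[0-9]+/) {
            count[substr($0, RSTART, RLENGTH)]++
        }
        END { for (port in count) print port, count[port] }
    ' | sort
}

# Usage:
#   dmesg | badcrc_by_port
```

Where the drive reports it, SMART attribute 199 (UDMA_CRC_Error_Count) in `smartctl -A` is another counter that tends to climb with cabling problems, so it is worth comparing before and after reseating or swapping the cable.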
Small_Penguin
Tux's lil' helper

Joined: 27 May 2005
Posts: 137

Posted: Tue Aug 22, 2017 1:14 pm

You can also investigate with sys-apps/gsmartcontrol which has a more user-friendly interface.
bunder
Bodhisattva

Joined: 10 Apr 2004
Posts: 5838

Posted: Fri Aug 25, 2017 7:51 am

Code:
[ 3869.784893] CPU1: Package temperature above threshold, cpu clock throttled (total events = 49284)
[ 3869.784895] CPU0: Package temperature above threshold, cpu clock throttled (total events = 49284)
[ 3869.785917] CPU0: Core temperature/speed normal
[ 3869.785919] CPU1: Package temperature/speed normal
[ 3869.785921] CPU0: Package temperature/speed normal


Intel Turbo Boost. You can disable it with

Code:
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo


but you'll probably have to keep toggling it if you're on a laptop since every time you plug/unplug it gets reset back to 0 (on).
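
Since the setting may need toggling repeatedly, a small wrapper could save some typing. This is only a sketch around the sysfs file named above: the `turbo` function and the `NO_TURBO` variable are my own inventions, it needs root to write, and it only applies when the intel_pstate driver is in use.

```shell
#!/bin/sh
# Show or set the intel_pstate turbo switch (1 = turbo disabled, 0 = enabled).
# NO_TURBO can be pointed at another file for testing; the default is the
# sysfs path from the post above.
NO_TURBO=${NO_TURBO:-/sys/devices/system/cpu/intel_pstate/no_turbo}

turbo() {
    case "${1:-}" in
        off) echo 1 > "$NO_TURBO" ;;
        on)  echo 0 > "$NO_TURBO" ;;
        "")  if [ "$(cat "$NO_TURBO")" = 1 ]; then
                 echo "turbo off"
             else
                 echo "turbo on"
             fi ;;
        *)   echo "usage: turbo [on|off]" >&2
             return 1 ;;
    esac
}

# Usage (as root):
#   turbo        # show the current state
#   turbo off    # disable Turbo Boost
```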
All times are GMT