Gentoo Forums
Kernel sleep debugging?
Zucca
Veteran


Joined: 14 Jun 2007
Posts: 1519
Location: KUUSANKOSKI, Finland

Posted: Sat May 13, 2017 3:57 pm    Post subject: Kernel sleep debugging?

Before I start:
  • I run systemd on ~amd64 Linux 4.11.0. However, for the systemd package I have ~amd64 masked (for reasons).
  • Yes. The problem has been there on earlier kernels too.
  • Sometimes systemd does not "see" that I've pressed the power button (this is also random). Nothing appears in the journal. However, initiating suspend/hibernate from the command line always works (meaning systemd starts the process of entering the sleep state), so this problem may not actually be systemd's fault.
  • shellcmd: inxi -zzzCGNPMc0 :
    Machine:   Device: desktop Mobo: ASRock model: 970M Pro3
               UEFI [Legacy]: American Megatrends v: P1.60 date: 06/17/2016
    CPU:       Octa core AMD FX-8350 Eight-Core (-MCP-) cache: 16384 KB
               clock speeds: max: 4000 MHz 1: 1400 MHz 2: 1400 MHz 3: 2100 MHz
               4: 1400 MHz 5: 2100 MHz 6: 2100 MHz 7: 1400 MHz 8: 1400 MHz
    Graphics:  Card: Advanced Micro Devices [AMD/ATI] Fiji [Radeon R9 FURY / NANO Series]
               Display Server: X.Org 1.19.3 drivers: amdgpu (unloaded: radeon)
               Resolution: 1920x1200@59.95hz, 1920x1080@60.00hz
               GLX Renderer: Gallium 0.4 on AMD FIJI (DRM 3.10.0 / 4.11.0-gentoo-wren, LLVM 5.0.0)
               GLX Version: 3.0 Mesa 17.2.0-devel (git-5c92b1bf07)
    Network:   Card-1: Mellanox MT25204 [InfiniHost III Lx HCA] driver: ib_mthca
               Card-2: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
               driver: r8169
    Partition: ID-1: / size: 1.1T used: 238G (25%) fs: btrfs dev: /dev/sda3
               ID-2: /var size: 1.1T used: 238G (25%) fs: btrfs dev: /dev/sda3
               ID-3: /home size: 1.1T used: 238G (25%) fs: btrfs dev: /dev/sda3
               ID-4: /boot size: 496M used: 452M (97%) fs: ext2 dev: /dev/md1
               ID-5: swap-1 size: 18.72GB used: 0.00GB (0%) fs: swap dev: /dev/md5


== The problem ==
I have about a 75% success rate when hibernating or suspending my desktop PC.
Now I want to find out why. I guess I need to raise the kernel's debugging verbosity.
I can do that by echoing 1 to /proc/sys/kernel/sysrq and then hitting [alt]+[SysRq]+[r] followed by [alt]+[SysRq]+[9]. But I guess there is a way to set the verbosity on the kernel command line, which I would probably need in order to see all the messages from the beginning.
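For reference, the console loglevel that the SysRq sequence changes can be read back from /proc/sys/kernel/printk, whose first field is the current console loglevel. The sketch below only reads that value; the function name console_loglevel and its optional path argument are assumptions made for illustration. The kernel's parameter documentation also lists loglevel= and ignore_loglevel for the kernel command line, which take effect from early boot.

```shell
# Print the current console loglevel: the first of the four
# fields in the kernel.printk sysctl.
console_loglevel() {
    # $1 is an optional path override (an assumption, handy for dry runs);
    # it defaults to the real sysctl file.
    awk '{ print $1; exit }' "${1:-/proc/sys/kernel/printk}"
}
```

Running console_loglevel before and after the SysRq sequence shows whether the level actually changed.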
Also I'm getting a plethora of errors like this in dmesg:
snippet of dmesg:
[  +0.000000] AMD-Vi: Event logged [
[  +0.000002] IO_PAGE_FAULT device=01:00.0 domain=0x000f address=0x000000f40029b6c0 flags=0x0010]
... and in heaps: wc -l reported a total of 511 IO_PAGE_FAULT messages.
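A rough way to break that count down per PCI device, in case more than one device is faulting, is sketched below. It assumes the log lines have the same shape as the snippet above (a device=BUS:DEV.FN field on each IO_PAGE_FAULT line); the function name count_io_page_faults is made up for this example.

```shell
# Count IO_PAGE_FAULT events per PCI device in kernel log text read from stdin.
# Prints one "device count" pair per line.
count_io_page_faults() {
    awk '/IO_PAGE_FAULT/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^device=/) {
                     sub(/^device=/, "", $i)   # keep only the bus address
                     n[$i]++
                 }
         }
         END { for (d in n) print d, n[d] }'
}
```

Typical use would be dmesg | count_io_page_faults, after which lspci -s with the reported address shows which device it is.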

== Symptoms ==
Usually the screens turn off, but the fans keep spinning and the LEDs stay on. The keyboard (backlit, USB), on the other hand, has all its LEDs turned off.
When hibernating, there's a chance that if I hold the power button down at this point to forcibly turn the system off, it resumes from the image successfully on the next boot. This, of course, fails if suspend was used to sleep.

Now... What's the best way to start digging into this? Is there a kernel command line parameter to enable verbose logging?

Is there any info I need to paste? Just tell me.
_________________
..: Zucca :..

Code:
ERROR: '--failure' is not an option. Aborting...
Zucca
Veteran


Joined: 14 Jun 2007
Posts: 1519
Location: KUUSANKOSKI, Finland

Posted: Mon May 15, 2017 10:10 am    Post subject: It happened again

It happened again. This time I was trying to hibernate my system, and just before writing the image the system froze/stopped. Both monitors were off, indicating that the GPU had cut the signal.
shellcmd: journalctl --boot=-1 | tail -n 30 :
May 15 03:01:16 wren backup-sync.sh[8381]: total size is 32.29G  speedup is 980.97
May 15 03:01:16 wren backup-sync.sh[8381]: Backup synced.
May 15 03:01:16 wren backup-sync.sh[8381]: No daily snapshotting needed.
May 15 03:01:16 wren backup-sync.sh[8381]: Currently storing 62 snaps.
May 15 03:01:16 wren systemd[1]: Started Backup script.
May 15 03:01:16 wren systemd[1]: Reached target Sleep.
May 15 03:01:16 wren systemd[1]: Starting Module un-load...
May 15 03:01:16 wren sh[8392]: modprobe: WARNING: Module ib_core is in use.
May 15 03:01:16 wren sh[8392]: rmmod ib_mthca
May 15 03:01:16 wren systemd-networkd[6716]: ib0: Lost carrier
May 15 03:01:16 wren sh[8392]: modprobe: WARNING: Module ib_cm is in use.
May 15 03:01:16 wren sh[8392]: rmmod ib_ipoib
May 15 03:01:16 wren sh[8392]: rmmod ib_umad
May 15 03:01:16 wren sh[8392]: rmmod rpcrdma
May 15 03:01:16 wren kernel: RPC: Unregistered rdma transport module.
May 15 03:01:16 wren kernel: RPC: Unregistered rdma backchannel transport module.
May 15 03:01:16 wren sh[8392]: rmmod rdma_cm
May 15 03:01:16 wren sh[8392]: rmmod ib_cm
May 15 03:01:16 wren sh[8392]: rmmod iw_cm
May 15 03:01:16 wren sh[8392]: rmmod ib_core
May 15 03:01:16 wren systemd[1]: Started Module un-load.
May 15 03:01:16 wren systemd[1]: Starting Preparing for hibernation. Dropping caches and syncing....
May 15 03:01:18 wren pre-hibernate.sh[8439]: Disk caches dropped.
May 15 03:01:18 wren kernel: pre-hibernate.s (8439): drop_caches: 3
May 15 03:01:18 wren pre-hibernate.sh[8439]: Cached writes synced to disks.
May 15 03:01:19 wren systemd[1]: Started Preparing for hibernation. Dropping caches and syncing..
May 15 03:01:19 wren systemd[1]: Starting Beeping just for fun...
May 15 03:01:19 wren systemd[1]: Started Beeping just for fun.
May 15 03:01:19 wren systemd[1]: Starting Hibernate...
May 15 03:01:19 wren kernel: PM: Hibernation mode set to 'shutdown'
As you can see, I also have a service that drops caches before hibernating. This is because I get a significant speedup when writing the hibernation image. systemd does not seem to drop caches automatically (why?).
After that comes another service I've made, which makes the PC speaker beep. This way I know that all the services have run before the actual hibernation starts.
After the beeps the screens turn off. They usually turn back on (with a frozen view of the "last seen" content before sleeping) while the kernel writes the image, and then the system powers off.
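For readers who want to reproduce the cache-dropping step, here is a minimal sketch of what such a helper could look like. It is a hypothetical reconstruction, not the actual pre-hibernate.sh: the function name pre_hibernate and the DROP_CACHES_FILE override are assumptions, and it syncs before dropping (the reverse of the order seen in the journal above) because the kernel documentation notes that dirty pages are not freed by drop_caches.

```shell
#!/bin/sh
# Hypothetical pre-hibernate helper; not the script from the journal above.
# DROP_CACHES_FILE is an assumption added so the sketch can be dry-run
# against a scratch file without root.
DROP_CACHES_FILE="${DROP_CACHES_FILE:-/proc/sys/vm/drop_caches}"

pre_hibernate() {
    # Flush dirty pages first so that more of the cache becomes freeable.
    sync
    echo "Cached writes synced to disks."
    # 3 = free the page cache plus dentries and inodes
    # (matches "drop_caches: 3" in the kernel message).
    echo 3 > "$DROP_CACHES_FILE" || return 1
    echo "Disk caches dropped."
}
```

On a real system the function has to run as root, since only root may write to /proc/sys/vm/drop_caches.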

I've used 'platform' mode before, but it had the exact same symptoms.
I also tried 'freeze' mode when suspending, but it failed at exactly the same spot: monitors off, fans spinning, LEDs lit.

This kind of random behaviour is infuriating. Worse still, I don't see any error messages.