Gentoo Forums
Predictable and weird computer hangs.
Verialneth
n00b


Joined: 27 Feb 2020
Posts: 2

Posted: Thu Feb 27, 2020 10:09 pm    Post subject: Predictable and weird computer hangs.

I don't know whether to post this as a bug. It is an issue I have had with Gentoo since I started using it. It is probably my fault, but I'm asking how to diagnose _why_ it happens, so that I can fix it and/or file a bug.

How to reproduce (unfortunately it works for me every time):

1. Play a YouTube video in Firefox while charging.
2. Unplug the charger and stop the video.
3. Try to restart the video. It doesn't work.

WHAT HAPPENS:
When you try to open a terminal, there is no "username@localhost ~ $" prompt. The terminal window is empty with the cursor at the top, and you can't type anything. When a hang is a userspace process's fault, I usually switch to a console tty, log in as root, and kill all of "username"'s processes. That doesn't work here: when I try to log in, I type "root" and then nothing happens. No "password" prompt appears. It is the same problem as with the terminal emulators. The only thing that "unblocks everything" (does _something_) is REISUB.

CAUSE :?:
I guess at least one root process has failed. I suspect an ACPI issue: when I unplug the charger, udevd (eudev) starts doing something, and then acpid starts running something, but I don't know how to confirm or rule this out.

SYSTEM PROFILE
It happened with the three previous kernel versions. I'm now on 4.19.86-gentoo and don't currently have time to update (I think the issue would persist). But if there is a high chance that it is a kernel bug, I will update to a 5.x version and verify. It also happened with different Firefox versions (currently 73.0).

The issue is present regardless of WM or graphics stack. I started using Gentoo with vanilla i3. I wanted to rule out X as the problem (it was a suid process then), so now I'm using vanilla swaywm (I like it, so I won't switch back anyway) with nothing suid. The problem persists, so I think X can be ruled out.

QUESTION
Have you ever faced something similar? If so, how did you diagnose the problem?
krinn
Watchman


Joined: 02 May 2003
Posts: 7368

Posted: Thu Feb 27, 2020 11:29 pm

1/ I would have X killed without my intervention, to see whether the system is still running or not:
a) with sshd
b) with "the way you launch X" & sleep 300 && kill -9 $(pidof X). You then have 5 minutes to reproduce your stuck scenario; after 5 minutes X either gets killed or not, and you know whether the whole system is frozen or just the UI.

2/ I would open a terminal with udevadm monitor running, to see what is going on, and dig into the scripts triggered by the last event before it froze.
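
Step 1b above can be sketched as a small shell helper. This is only a sketch under assumptions: `kill_after` is a made-up name, and it kills the process it started rather than `$(pidof X)`, which suits a compositor launched directly (such as sway). For a wrapper script like startx, the `kill -9 $(pidof X)` form above is the one to use.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): start a command in the
# background, wait DELAY seconds, then force-kill it if it is still alive.
# Usage: kill_after DELAY_SECONDS COMMAND [ARGS...]
kill_after() {
    delay=$1
    shift
    "$@" &                       # launch the session command
    pid=$!
    sleep "$delay"               # time window to reproduce the hang
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"           # still running: force-kill it
        wait "$pid" 2>/dev/null || true
        echo "killed"
    else
        wait "$pid" 2>/dev/null || true
        echo "exited"            # it ended on its own before the deadline
    fi
}
```

For example, `kill_after 300 sway` from a tty leaves 5 minutes to reproduce the hang. If the session disappears when the time is up, the system itself was not frozen, only the UI.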
Verialneth
n00b


Joined: 27 Feb 2020
Posts: 2

Posted: Sun Mar 08, 2020 4:01 pm

OK, during a hang I can kill most user processes, except those in uninterruptible sleep (waiting on disk access). It looks like a laptop-mode issue, though it also causes driver errors. It seems to be related to https://github.com/rickysarraf/laptop-mode-tools/issues/123 , https://github.com/rickysarraf/laptop-mode-tools/issues/129 , https://bugs.gentoo.org/689970 , https://forums.gentoo.org/viewtopic-t-1089722.html . I guess it is a common issue without an exact answer, besides "if you set <something>, the issue will probably be less common, idk why" or "you can't use feature <xyz> with hardware <abcd>, idk why". It doesn't happen just with YouTube. It happens in ~30% of attempts when I unplug the charger and try to do something in the browser (when I try to force a hang it doesn't hang; in all the other cases it does :P ). A recently caught hang produced this log:

Code:

Mar 8 12:57:45 gentoo kernel:  ? wait_woken+0x6a/0x6a
Mar 8 12:57:45 gentoo kernel:  ? __switch_to_asm+0x35/0x70
Mar 8 12:57:45 gentoo kernel:  ? lock_timer_base+0x4d/0x72
Mar 8 12:57:45 gentoo kernel:  ? try_to_del_timer_sync+0x50/0x6e
Mar 8 12:57:45 gentoo kernel:  kjournald2+0xec/0x218
Mar 8 12:57:45 gentoo kernel:  ? wait_woken+0x6a/0x6a
Mar 8 12:57:45 gentoo kernel:  ? commit_timeout+0x9/0x9
Mar 8 12:57:45 gentoo kernel:  kthread+0x110/0x118
Mar 8 12:57:45 gentoo kernel:  ? kthread_park+0x89/0x89
Mar 8 12:57:45 gentoo kernel:  ret_from_fork+0x35/0x40
Mar 8 12:57:45 gentoo kernel: INFO: task syslog-ng:31876 blocked for more than 120 seconds.
Mar 8 12:57:45 gentoo kernel:       Tainted: G     U            4.19.86-gentoo #27
Mar 8 12:57:45 gentoo kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 8 12:57:45 gentoo kernel: syslog-ng       D    0 31876   3926 0x00000000
Mar 8 12:57:45 gentoo kernel: Call Trace:
Mar 8 12:57:45 gentoo kernel:  ? __schedule+0x68b/0x6f6
Mar 8 12:57:45 gentoo kernel:  ? bit_wait+0x43/0x43
Mar 8 12:57:45 gentoo kernel:  schedule+0x65/0x6e
Mar 8 12:57:45 gentoo kernel:  io_schedule+0xd/0x2e
Mar 8 12:57:45 gentoo kernel:  bit_wait_io+0x8/0x43
Mar 8 12:57:45 gentoo kernel:  __wait_on_bit+0x45/0x73
Mar 8 12:57:45 gentoo kernel:  out_of_line_wait_on_bit+0x6c/0x86
Mar 8 12:57:45 gentoo kernel:  ? init_wait_var_entry+0x3b/0x3b
Mar 8 12:57:45 gentoo kernel:  do_get_write_access+0x236/0x360
Mar 8 12:57:45 gentoo kernel:  jbd2_journal_get_write_access+0x29/0x4d
Mar 8 12:57:45 gentoo kernel:  __ext4_journal_get_write_access+0x30/0x5d
Mar 8 12:57:45 gentoo kernel:  ? ext4_dirty_inode+0x3f/0x5b
Mar 8 12:57:45 gentoo kernel:  ext4_reserve_inode_write+0x54/0x91
Mar 8 12:57:45 gentoo kernel:  ext4_mark_inode_dirty+0x8f/0x1b1
Mar 8 12:57:45 gentoo kernel:  ? jbd2__journal_start+0xcb/0x1a7
Mar 8 12:57:45 gentoo kernel:  ext4_dirty_inode+0x3f/0x5b
Mar 8 12:57:45 gentoo kernel:  __mark_inode_dirty+0xbc/0x303
Mar 8 12:57:45 gentoo kernel:  generic_update_time+0xa1/0xa5
Mar 8 12:57:45 gentoo kernel:  file_update_time+0xcf/0x100
Mar 8 12:57:45 gentoo kernel:  ? __switch_to_asm+0x35/0x70
Mar 8 12:57:45 gentoo kernel:  __generic_file_write_iter+0x7b/0x173
Mar 8 12:57:45 gentoo kernel:  ? __switch_to_asm+0x41/0x70
Mar 8 12:57:45 gentoo kernel:  ? __switch_to_asm+0x35/0x70
Mar 8 12:57:45 gentoo kernel:  ext4_file_write_iter+0x26f/0x31e
Mar 8 12:57:45 gentoo kernel:  ? __switch_to_asm+0x35/0x70
Mar 8 12:57:45 gentoo kernel:  do_iter_readv_writev+0x110/0x146
Mar 8 12:57:45 gentoo kernel:  do_iter_write+0x86/0x15c
Mar 8 12:57:45 gentoo kernel:  vfs_writev+0x90/0xe2
Mar 8 12:57:45 gentoo kernel:  ? ep_poll+0x2d7/0x311
Mar 8 12:57:45 gentoo kernel:  ? wake_up_q+0x48/0x48
Mar 8 12:57:45 gentoo kernel:  do_writev+0x6b/0xe2
Mar 8 12:57:45 gentoo kernel:  do_syscall_64+0x57/0xf2
Mar 8 12:57:45 gentoo kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
Mar 8 12:57:45 gentoo kernel: RIP: 0033:0x7f7223ecc471
Mar 8 12:57:45 gentoo kernel: Code: Bad RIP value:
Mar 8 12:57:45 gentoo kernel: RSP: 002b:00007f72227b0980 EFLAGS: 00000293 ORIG_RAX:  000000000000014
Mar 8 12:57:45 gentoo kernel: RAX: ffffffffffffffda RBX: 000055609a4811e0 RCX: 00007f7223ecc471
Mar 8 12:57:45 gentoo kernel: RDX: 0000000000000001 RSI: 000055609a481288 RDI: 0000000000000011


(This one was different from usual, weirder.)
Normally (with the usual, annoying hang) it looks like this:

Code:

Mar  1 22:43:32 gentoo kernel: sd 0:0:0:0: [sda] Synchronizing SCSI cache
Mar  1 22:43:32 gentoo kernel: sd 1:0:0:0: [sdb] Synchronizing SCSI cache
Mar  1 22:43:32 gentoo kernel: sd 0:0:0:0: [sda] Stopping disk
Mar  1 22:43:32 gentoo kernel: ahci 0000:00:17.0: port does not support device sleep
Mar  1 22:43:32 gentoo kernel: sd 1:0:0:0: [sdb] Stopping disk
Mar  1 22:43:32 gentoo kernel: EXT4-fs (dm-2): re-mounted. Opts: discard,errors=remount-ro,commit=600
Mar  1 22:43:32 gentoo kernel: EXT4-fs (dm-3): re-mounted. Opts: discard,commit=600
Mar  1 22:43:32 gentoo kernel: sd 1:0:0:0: [sdb] Starting disk
Mar  1 22:43:32 gentoo kernel: EXT4-fs (dm-4): re-mounted. Opts: discard,commit=600
Mar  1 22:43:32 gentoo kernel: EXT4-fs (dm-5): re-mounted. Opts: commit=600
Mar  1 22:43:32 gentoo kernel: sd 0:0:0:0: [sda] Starting disk


SCSI cache on ATA disk :?:

Code:

# lshw -class disk
  *-disk
       description: SCSI Disk
       product: UD04GC8001G34237
       vendor: Xmore
       physical id: 0.0.0
       bus info: scsi@3:0.0.0
       logical name: /dev/sdc
       version: PMAP
       serial: 047E098130C0
       size: 3824MiB (4009MB)
       capabilities: removable
       configuration: ansiversion=4 logicalsectorsize=512 sectorsize=512
     *-medium
          physical id: 0
          logical name: /dev/sdc
          size: 3824MiB (4009MB)
          capabilities: gpt-1.00 partitioned partitioned:gpt
          configuration: guid=b11f3362-a11e-45c9-813d-c2dbd6191dfc
  *-disk
       description: ATA Disk
       product: TOSHIBA MQ04ABF1
       vendor: Toshiba
       physical id: 0.0.0
       bus info: scsi@0:0.0.0
       logical name: /dev/sda
       version: 0J
       serial: Z7MJC68ST
       size: 931GiB (1TB)
       capabilities: gpt-1.00 partitioned partitioned:gpt
       configuration: ansiversion=5 guid=0f06cb08-8519-4b27-9eb4-4d1f2e3ad0f0 logicalsectorsize=512 sectorsize=4096
  *-disk
       description: ATA Disk
       product: Micron_1100_MTFD
       physical id: 0.0.0
       bus info: scsi@1:0.0.0
       logical name: /dev/sdb
       version: U020
       serial: 170615D18AFA
       size: 476GiB (512GB)
       capabilities: gpt-1.00 partitioned partitioned:gpt
       configuration: ansiversion=5 guid=ab236b75-c25b-44ae-a402-0e390237add6 logicalsectorsize=512 sectorsize=512


sda is an HDD, sdb is an SSD, both LUKS+LVM+ext4; sdc is a USB stick with an EFI stub Linux kernel. udevadm monitor doesn't show anything unusual. During a hang I can't log in anywhere, but if I was logged in beforehand I can run some commands; for example, I can kill all processes in an interruptible state. I saw that laptop-mode was running, apparently in the middle of remounting the home partition (on sdb) rw, and it was in uninterruptible sleep. Because it didn't seem willing to finish, I rebooted after ~30 minutes.
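
The tasks stuck in uninterruptible sleep can be listed from a shell that was already open before the hang. A minimal sketch, assuming procps `ps`; the function names `filter_dstate` and `list_dstate` are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: keep the header row plus every row whose STAT
# column (second field) starts with D, i.e. uninterruptible sleep.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Hypothetical helper: list PID, state, kernel wait channel and command
# name of all tasks, filtered down to the D-state ones.
list_dstate() {
    ps -eo pid,stat,wchan:32,comm | filter_dstate
}
```

Running `list_dstate` during a hang should show entries like the laptop-mode remount described above. With SysRq enabled, `echo w > /proc/sysrq-trigger` as root additionally asks the kernel to dump blocked tasks to its log, readable with dmesg.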

EDIT:
I got a hang again, and this time even Ctrl+Alt+F12 didn't work 8O . The issue is exactly what I linked (one of the reports even mentions YouTube): it affects the same version of laptop-mode-tools (1.72.2-r1), which is the _newest_ in the Portage tree and needs a new maintainer (the version dates from 3 Feb 2018). The only difference is that, if it is a kernel bug as they report, it has not been fixed and persists across different kernel versions.
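
To check whether a later hang left the same kind of trace as the one earlier in this post, the log can be scanned for the kernel's hung-task lines. A sketch only; `hung_tasks` is a hypothetical helper name, and the log path in the usage note is an assumption that depends on the syslog-ng configuration:

```shell
#!/bin/sh
# Hypothetical helper: print "TASK PID SECONDS" for every kernel line of
# the form "INFO: task NAME:PID blocked for more than N seconds."
# Reads the files given as arguments, or stdin when none are given.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds.*/\1 \2 \3/p' "$@"
}
```

For example, `hung_tasks /var/log/messages` (path assumed) or `dmesg | hung_tasks` prints one line per report, which makes it easy to see which tasks were blocked.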
All times are GMT