Gentoo Forums
blank X screen, high udev load, help needed
bukulin
n00b


Joined: 15 Dec 2018
Posts: 5

Posted: Sat Dec 15, 2018 8:58 am    Post subject: blank X screen, high udev load, help needed

Hi all!

(This is my first post in this community, so please be forgiving if I'm doing something wrong.)

I ran into trouble with the nvidia drivers that I couldn't solve. After I updated nvidia-drivers from 390.87 to 410.78, and later to 415.18, Xorg stopped working entirely. After startx or Xorg there is only a blank screen, and I cannot switch to another terminal. The system is up and I can reach it over ssh, but there is no Xorg log, even when I start Xorg with the verbose log option. There are no suspicious lines in dmesg. The only observable symptom is a systemd-udev process at 100% CPU load that I cannot terminate, so the only way to restart the system is with the magic SysRq keys.

I tried to start Xorg under strace, but for some reason it prints only raw pointers for system call arguments instead of strings, so the only thing I could see in the output is that the Xorg process hung during an 'openat' call:
...
openat(AT_FDCWD, 0x7ffdb7e43110, O_RDONLY) = 9
fstat(9, 0x7ffdb7e42690) = 0
read(9, 0x55c50824eea0, 1024) = 626
close(9) = 0
stat(0x7ffdb7e43190, 0x7ffdb7e43030) = 0
openat(AT_FDCWD, 0x7ffdb7e432b0, O_RDWR
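
In strace output, a call that has not returned is printed without the trailing "= result", which is why the last openat line above is cut off. A minimal sketch for pulling such unfinished calls out of a saved trace; the function name unfinished_calls and the file name xorg.trace are made up for illustration:

```shell
# Print strace lines that show a system call with no return value,
# i.e. calls that were still blocked when the trace ended.
# Reads a saved trace on stdin. The name unfinished_calls is hypothetical.
unfinished_calls() {
    # A completed call looks like "name(args) = result";
    # keep only lines that contain "(" but no ") = ".
    awk '/\(/ && !/\) = /'
}

# Example (xorg.trace is an assumed file name, e.g. written by:
#   strace -o xorg.trace Xorg):
#   unfinished_calls < xorg.trace
```

This only recognizes the plain single-process format shown in this post; it does not say which path the pointer argument referred to.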

After syncing portage this morning I could no longer compile nvidia-drivers-390, so I'm stuck: the new drivers compile but give me no screen, while the old driver works but no longer compiles.


Some system information:

$ lspci
...
01:00.0 VGA compatible controller: NVIDIA Corporation GM107 [GeForce GTX 750 Ti] (rev a2)
...

$ uname -a
Linux XXX 4.14.83-gentoo #2 SMP Tue Dec 11 06:33:57 CET 2018 x86_64 Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz GenuineIntel GNU/Linux

$ qlist -Ivv sys-fs/udev
sys-fs/udev-239
sys-fs/udev-init-scripts-32

$ qlist -Ivv nvidia-drivers
x11-drivers/nvidia-drivers-410.78

$ dmesg | tail
[ 3.242039] EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
[ 3.352485] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
[ 3.446522] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
[ 3.560859] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null)
[ 4.192247] ip (4369) used greatest stack depth: 13144 bytes left
[ 4.602117] r8169 0000:03:00.0 enp3s0: link down
[ 4.602133] r8169 0000:03:00.0 enp3s0: link down
[ 4.602236] ip (4636) used greatest stack depth: 11616 bytes left
[ 4.653131] 8021q: 802.1Q VLAN Support v1.8
[ 6.826777] r8169 0000:03:00.0 enp3s0: link up


Any help or suggestions would be highly appreciated. Thank you in advance!


EDIT: I've just realized that I have high CPU load caused by systemd-udevd even if I don't start Xorg.

-- Response to self merged by NeddySeagoon --

This might be related:

https://forums.gentoo.org/viewtopic-t-1061632-start-0-postdays-0-postorder-asc-highlight-.html

But the solution given there (sleep 1.5) does not help.

Keeps the topic in the Unanswered Posts Search

EDIT 2: I replaced udev with eudev. With this I was able to figure out that the udev process hangs because of nvidia-smi.
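
A process that is hung like this and cannot be killed usually shows up in ps with a state starting with D (uninterruptible sleep). A minimal sketch for listing such processes, assuming the procps ps and its -eo output format; the function name d_state_procs is made up for illustration:

```shell
# Filter "PID STAT COMMAND" rows (as printed by: ps -eo pid,stat,comm)
# down to processes whose state starts with D (uninterruptible sleep).
# Prints "PID COMMAND" per match. The name d_state_procs is hypothetical.
d_state_procs() {
    # NR > 1 skips the header row printed by ps.
    awk 'NR > 1 && $2 ~ /^D/ { print $1, $3 }'
}

# Usage on a live system:
#   ps -eo pid,stat,comm | d_state_procs
```

For a PID found this way, /proc/<pid>/wchan may name the kernel function the process is sleeping in, if the kernel exposes it, which can hint at what it is waiting for.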
genterminl
Guru


Joined: 12 Feb 2005
Posts: 488
Location: Connecticut, USA

Posted: Sat Dec 15, 2018 11:02 pm

First, I suggest going to the nvidia site to be sure which driver version supports your card. I don't know the card/chipset numbering well enough to know if you now need to use one of the legacy drivers, but it's worth checking.

Second, why can't you compile the 390 driver? If it's still in portage, it should compile, so that may point to a different issue.

Other thoughts - when you say can't switch to other terminals, is the keyboard responsive at all? To anything other than the magic SysRq?

I don't use systemd at all, so I can't comment on that 100% cpu process, but are you saying that if you ssh in, typing reboot doesn't have any effect?

Have you looked at all the files in /var/log (and/or whatever logging you can get out of systemd) for any other strange things?
bukulin
n00b


Joined: 15 Dec 2018
Posts: 5

Posted: Sun Dec 16, 2018 7:06 am

Thank you for the suggestions.

I'll try to answer them in order:

1) Nvidia says that the latest stable version for my GPU is 410.78 and the latest beta is 415.13. The 410.78 version matches the one in portage.
2) Unfortunately I cannot tell you why the 390 driver wouldn't compile, because I synced portage again this morning and now it compiles.
3) I think the keyboard is responsive. I can only guess, because it has no Caps Lock or Num Lock LEDs. C-M-F2 and C-M-Backspace do not work. However, the system itself is responsive: I can ssh into it, and everything works well except X and the jammed udev process. The latter is caused by a jammed nvidia-smi process.
4) What do you use for /dev management? What is recommended nowadays? I set up this system about 10 years ago and have only moved it along as the hardware changed. Back in those days dbus and hald were mainstream. So eudev or udev? I tried both, and both jammed on the nvidia-smi call.
5) When I ssh in I can type reboot, but the reboot process hangs because it cannot terminate udev, since the nvidia-smi process is in 'D' (uninterruptible sleep) state. After a while the reboot process somehow tries to continue, but it stops at remounting / read-only because / is in use, I think by the udev process. At this point my only way to restart the system is magic SysRq.
6) I use openrc, not systemd; only udev is from systemd, and again, I've tried eudev as well. To answer your question, I've checked syslog, messages, dmesg and Xorg.0.log, and did a general ls -latr in /var/log. I think that should be enough, but if I missed something, please let me know.
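
When ssh still works, the same magic SysRq actions can also be requested without the keyboard by writing to /proc/sysrq-trigger as root. A minimal sketch of the usual sync, remount read-only, reboot sequence, assuming a kernel built with CONFIG_MAGIC_SYSRQ; the function name sysrq_reboot and its optional file argument (for a dry run) are made up for illustration:

```shell
# Request an emergency sync (s), read-only remount (u) and reboot (b)
# through the SysRq trigger file. Must run as root on a real system.
# sysrq_reboot and its argument are hypothetical; pass another file
# name to try it without touching the kernel. Each write overwrites
# the previous one, so a dry-run file ends up holding only the last key.
sysrq_reboot() {
    trigger=${1:-/proc/sysrq-trigger}
    for key in s u b; do
        printf '%s' "$key" > "$trigger"
        sleep 1   # give the sync/remount a moment before the next request
    done
}

# Dry run:  sysrq_reboot /tmp/sysrq.log
# For real: sysrq_reboot
```

Whether the read-only remount succeeds while a process is stuck in D state is not guaranteed; the sync step is what protects the data in that case.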

One thing to add:
Nouveau works, so I might go on with that. The 390 driver compiles again, so I can live with that too.

Thank you once again.
APolozov
Tux's lil' helper


Joined: 28 Sep 2006
Posts: 134
Location: Voronezh, Russia

Posted: Sun Dec 16, 2018 11:09 am

After upgrading nvidia-drivers from 396.54 to 415.18 I had the same "black screen" (a console with a login prompt) and a freeze when manually running "modprobe nvidia".
I was able to run my X server with nvidia-drivers-415.18 just by commenting out the last line, "#options nvidia NVreg_DeviceFileMode=432 NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=27 NVreg_ModifyDeviceFiles=1", in /etc/modprobe.d/nvidia.conf and rebooting.
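
The change described here amounts to commenting out the "options nvidia" line. A minimal sketch that prints a modprobe config with such lines commented out and leaves the original file untouched; the function name comment_nvidia_options is made up for illustration:

```shell
# Print FILE with every active "options nvidia ..." line commented out.
# Writes to stdout only, so the result can be reviewed before it
# replaces the real config. Lines for other modules (e.g. nvidia_drm)
# are left alone. comment_nvidia_options is a hypothetical name.
comment_nvidia_options() {
    sed 's/^[[:space:]]*options[[:space:]][[:space:]]*nvidia[[:space:]]/#&/' "$1"
}

# Usage:
#   comment_nvidia_options /etc/modprobe.d/nvidia.conf > /tmp/nvidia.conf.new
```

For reference, NVreg_DeviceFileMode=432 is the decimal form of octal mode 0660 (printf '%o\n' 432 prints 660).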
_________________
Excuse my bad English, I only study it.

I also connected to: velo36.ru openstreetmap.org
genterminl
Guru


Joined: 12 Feb 2005
Posts: 488
Location: Connecticut, USA

Posted: Sun Dec 16, 2018 5:48 pm

I also use openrc, not systemd, and I use eudev. The only other thing I can think of is what kernel are you using. Sometimes a newer kernel will not work with the latest nvidia drivers. However, I believe there is an appropriate warning if you try to install a driver that is not yet patched for the current kernel.
bukulin
n00b


Joined: 15 Dec 2018
Posts: 5

Posted: Tue Dec 18, 2018 7:22 am

genterminl wrote:
I also use openrc, not systemd, and I use eudev. The only other thing I can think of is what kernel are you using. Sometimes a newer kernel will not work with the latest nvidia drivers. However, I believe there is an appropriate warning if you try to install a driver that is not yet patched for the current kernel.


I use stable gentoo-sources 4.14.83. I suppose that should work.

Yet another thing to add: if I update from 390 to 415 and do NOT reboot, but just remove the old nvidia module and insert the new one, everything works well. So it might rather be a udev/modprobe/nvidia module options issue.
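
Swapping the module without a reboot means unloading the old nvidia modules and loading the new one while X is stopped. A minimal sketch that only prints the commands it would run, reading the loaded-module list in /proc/modules format; the function name nvidia_reload_plan is made up, and the set of helper modules (nvidia_drm, nvidia_modeset, nvidia_uvm) is an assumption about what the driver package built:

```shell
# Read a module list in /proc/modules format on stdin and print the
# rmmod/modprobe commands for reloading the nvidia driver.
# Nothing is executed. nvidia_reload_plan is a hypothetical name, and
# the helper module names below are assumed, not taken from the thread.
nvidia_reload_plan() {
    loaded=$(awk '{ print $1 }')
    # Unload dependants before the core nvidia module.
    for mod in nvidia_drm nvidia_modeset nvidia_uvm nvidia; do
        if printf '%s\n' "$loaded" | grep -qx "$mod"; then
            echo "rmmod $mod"
        fi
    done
    echo "modprobe nvidia"
}

# Usage (review the plan, then run the printed commands as root):
#   nvidia_reload_plan < /proc/modules
```

Printing a plan instead of running it keeps the sketch safe to try on a machine where an rmmod could hang.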
bukulin
n00b


Joined: 15 Dec 2018
Posts: 5

Posted: Tue Dec 18, 2018 7:41 am

APolozov wrote:
After upgrading nvidia-drivers from 396.54 to 415.18 I had the same "black screen" (a console with a login prompt) and a freeze when manually running "modprobe nvidia".
I was able to run my X server with nvidia-drivers-415.18 just by commenting out the last line, "#options nvidia NVreg_DeviceFileMode=432 NVreg_DeviceFileUID=0 NVreg_DeviceFileGID=27 NVreg_ModifyDeviceFiles=1", in /etc/modprobe.d/nvidia.conf and rebooting.


I've tried this trick with the module options, but unfortunately it did not help.
APolozov
Tux's lil' helper


Joined: 28 Sep 2006
Posts: 134
Location: Voronezh, Russia

Posted: Wed Dec 19, 2018 4:24 pm

bukulin wrote:
I've tried this trick with the module options, but unfortunately it did not help.

Maybe the reason is different hardware, or a different kernel version or kernel option?
My video chip is a GK107 (GeForce GT 640) and my kernel is 4.13.0-pf4 (pf-sources).
krinn
Watchman


Joined: 02 May 2003
Posts: 7071

Posted: Thu Dec 20, 2018 2:19 pm

See this thread and the bug report given by i4dnf in that post https://forums.gentoo.org/viewtopic-p-8274712.html#8274712
bukulin
n00b


Joined: 15 Dec 2018
Posts: 5

Posted: Fri Dec 21, 2018 6:35 am

krinn wrote:
See this thread and the bug report given by i4dnf in that post https://forums.gentoo.org/viewtopic-p-8274712.html#8274712


Thanks, that works. Hopefully nvidia will fix this in future releases of the driver.
All times are GMT