Gentoo Forums
[SOLVED] ksoftirqd kills GUI
cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Mon Feb 11, 2019 3:08 am    Post subject: [SOLVED] ksoftirqd kills GUI

I have a quad-core CPU, and about every five minutes ksoftirqd/0 pushes one core to 100%, rendering the GUI (in my case, KDE Plasma) completely unresponsive for about 10 seconds. The load average at that point, as measured by htop, isn't particularly high (between 2.0 and 4.0), but Plasma is nonetheless dead. I note that TIME+ as measured by htop is now 15:28.83 for ksoftirqd/0, 0:04.72 for ksoftirqd/1, 0:03.80 for ksoftirqd/2, and 0:04.31 for ksoftirqd/3.

All debugging help will be gratefully received.

[SOLVED] The main culprit was a defective video card. But it appears that dead NICs on the motherboard might also have had an effect. Here is an Nvidia Forum post with details. [/SOLVED]


Last edited by cfgauss on Sun Feb 24, 2019 1:53 am; edited 4 times in total

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Wed Feb 13, 2019 9:12 pm

Here's my cat /proc/interrupts where it appears that ksoftirqd/0 has not handled more interrupts than, say, ksoftirqd/1 or ksoftirqd/3. But TIME+ from htop is currently 9:15.22 for ksoftirqd/0, 0:01.76 for ksoftirqd/1 and 1:03.74 for ksoftirqd/3. Why is that?

I installed sys-apps/irqbalance to see if that would help. It may have somewhat but I still get occasional freezes.

Any explanation of how to better things will be gratefully received.
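
One possible explanation for the mismatch: /proc/interrupts counts hardware interrupts, while the ksoftirqd/N threads run deferred softirq work, which the kernel counts per CPU in /proc/softirqs. Below is a minimal Python sketch for comparing those counters. The function names and the sample numbers are illustrative assumptions, not output from the system discussed here.

```python
def parse_softirqs(text):
    """Parse /proc/softirqs-style text into {softirq_name: {cpu: count}}."""
    lines = [line for line in text.splitlines() if line.strip()]
    cpus = lines[0].split()  # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        counts = [int(field) for field in rest.split()]
        table[name.strip()] = dict(zip(cpus, counts))
    return table


def busiest_cpu(table, name):
    """Return the CPU with the highest count for one softirq type."""
    per_cpu = table[name]
    return max(per_cpu, key=per_cpu.get)


# Illustrative sample only (made-up numbers in the /proc/softirqs layout).
SAMPLE = """\
                    CPU0       CPU1       CPU2       CPU3
          HI:          5          0          0          1
       TIMER:     900000     120000     110000     130000
      NET_RX:     450000       2000       1500       3000
     TASKLET:     700000         10         12       9000
"""

if __name__ == "__main__":
    table = parse_softirqs(SAMPLE)
    for name, per_cpu in table.items():
        print(name, per_cpu, "busiest:", busiest_cpu(table, name))
```

On a live system the input would come from open("/proc/softirqs").read(). Taking two snapshots a few seconds apart during a freeze and subtracting them should show which softirq type is growing on the affected core.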

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 5773

Posted: Wed Feb 13, 2019 10:38 pm

You're using nvidia; it might be a good idea to say that up front. Are your drivers up to date?

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Wed Feb 13, 2019 11:35 pm

Ant P. wrote:
You're using nvidia; it might be a good idea to say that up front. Are your drivers up to date?

Yes. I have nvidia-drivers-390.87 which I believe is the latest driver for my GeForce GT 730 card.

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Thu Feb 14, 2019 4:38 am

IRQ 22 which services my nvidia card currently has the following distribution of interrupts per CPU:

CPU0 1254773
CPU1 0
CPU2 0
CPU3 755842

This is the time spent by each kernel thread (htop's TIME+):
ksoftirqd/0 21:43.92
ksoftirqd/1 00:03.44
ksoftirqd/2 00:04.19
ksoftirqd/3 01:05.55

Is it reasonable to suspect that my nvidia card/driver is flooding ksoftirqd/0 with interrupts, causing intermittent GUI freezes? If so, would reverting to a previous nvidia driver be a possible solution?

Thanks for suggesting nvidia as the culprit.
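
The two sets of figures above can be put side by side as per-CPU shares. This is a minimal Python sketch using only the numbers quoted in this post. The helper names are hypothetical, and the TIME+ parser assumes the plain M:SS.hh form shown here.

```python
def time_plus_to_seconds(text):
    """Convert an htop TIME+ value in M:SS.hh form (e.g. '21:43.92') to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + float(seconds)


def shares(values):
    """Return each key's fraction of the total."""
    total = sum(values.values())
    return {key: value / total for key, value in values.items()}


# Numbers quoted in the post above.
IRQ22_COUNTS = {"CPU0": 1254773, "CPU1": 0, "CPU2": 0, "CPU3": 755842}
KSOFTIRQD_TIME = {
    "CPU0": "21:43.92",
    "CPU1": "00:03.44",
    "CPU2": "00:04.19",
    "CPU3": "01:05.55",
}

irq_share = shares(IRQ22_COUNTS)
time_share = shares(
    {cpu: time_plus_to_seconds(t) for cpu, t in KSOFTIRQD_TIME.items()}
)

if __name__ == "__main__":
    for cpu in IRQ22_COUNTS:
        print(f"{cpu}: {irq_share[cpu]:6.1%} of IRQ 22, "
              f"{time_share[cpu]:6.1%} of ksoftirqd time")
```

CPU0 takes about 62% of the IRQ 22 interrupts but about 95% of the total ksoftirqd time, so the interrupt count alone may not account for the imbalance. The cost per interrupt on CPU0, or softirq work from another source, could also be involved.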


Last edited by cfgauss on Thu Feb 14, 2019 5:42 am; edited 1 time in total

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Thu Feb 14, 2019 5:07 am

I was wrong. The nvidia-drivers-410.93 supports my GeForce GT 730 according to nvidia. I'll install it and test.

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Thu Feb 14, 2019 2:05 pm

Here nvidia says that its 410.93 driver supports my GeForce GT 730 card but here nvidia says that only 390.xx will work. In fact, 410.93 emerges successfully but with a warning that it will not work with my video card and that I should mask >=x11-drivers/nvidia-drivers-391.0.0. And, of course, 410.93 does not work. Here nvidia states that the legacy driver series 390.xx will continue until the end of 2022.

So, I've reverted to 340.107 to see if I get the same ksoftirqd/0 freezes.

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 5773

Posted: Thu Feb 14, 2019 5:44 pm

It might be a threaded IRQs bug. Is there an option to tell the driver not to use those? IIRC the vanilla kernel has one but I don't know if it affects nvidia.ko.

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Sat Feb 16, 2019 12:59 am

Ant P. wrote:
It might be a threaded IRQs bug. Is there an option to tell the driver not to use those? IIRC the vanilla kernel has one but I don't know if it affects nvidia.ko.

No nvidia-drivers USE flags deal with irqs. gentoo-sources-4.20.8 appears to have one config option that deals with irqs and threads: CONFIG_IRQ_FORCED_THREADING=y. This is not modifiable through Kconfig but I could override it and simply edit .config.

Is this worth a test?

Ant P.
Watchman
Joined: 18 Apr 2009
Posts: 5773

Posted: Sat Feb 16, 2019 6:14 am

That only makes it possible (through threadirqs/nothreadirqs boot option), it doesn't turn it on by default. You could try turning that *on* and see if it helps, but it's unlikely.
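
To confirm whether a running kernel was actually booted with forced IRQ threading, one can look for the threadirqs token in /proc/cmdline. This is a minimal Python sketch; the function name and the sample command line are hypothetical.

```python
def forced_threading_requested(cmdline):
    """Return True if the kernel command line contains the threadirqs option."""
    # Split on whitespace so that only the exact token matches.
    return "threadirqs" in cmdline.split()


# Hypothetical command line for illustration; on a live system, read /proc/cmdline.
SAMPLE_CMDLINE = "BOOT_IMAGE=/vmlinuz root=/dev/sda2 ro threadirqs"

if __name__ == "__main__":
    print(forced_threading_requested(SAMPLE_CMDLINE))
```

On a live system the argument would be open("/proc/cmdline").read(). The check only reports what was requested at boot; it says nothing about whether a given driver's handler ended up threaded.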

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Sun Feb 17, 2019 1:30 am

Ant P. wrote:
That only makes it possible (through threadirqs/nothreadirqs boot option), it doesn't turn it on by default. You could try turning that *on* and see if it helps, but it's unlikely.

You're right. I have the same behavior with the threadirqs boot option. I suppose my only solution is a newer video card which will be able to use a non-legacy driver.

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Sun Feb 17, 2019 10:41 pm

cfgauss wrote:
Here nvidia says that its 410.93 driver supports my GeForce GT 730 card but here nvidia says that only 390.xx will work...

I've since learned that not all nvidia GeForce GT 730's have the same chip and, hence, need different drivers. This lists six PCI IDs for that card. Only one of them (mine, sadly) requires the legacy 390.87 driver.
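
Since the driver branch depends on the PCI device ID rather than the marketing name, it helps to read the ID off the card itself. The sketch below extracts vendor:device pairs from `lspci -nn` text in Python. The function name is hypothetical, and the sample line is illustrative only, not taken from the system in this thread.

```python
import re

# Matches the bracketed vendor:device pair that `lspci -nn` appends, e.g. [10de:1287].
PCI_ID = re.compile(r"\[([0-9a-f]{4}):([0-9a-f]{4})\]", re.IGNORECASE)


def pci_ids(lspci_output, vendor="10de"):
    """Return device IDs for one vendor (default 10de, NVIDIA) from `lspci -nn` text."""
    ids = []
    for line in lspci_output.splitlines():
        for found_vendor, device in PCI_ID.findall(line):
            if found_vendor.lower() == vendor:
                ids.append(device.lower())
    return ids


# Illustrative line only; not taken from the system discussed in this thread.
SAMPLE = (
    "01:00.0 VGA compatible controller [0300]: NVIDIA Corporation "
    "GK208B [GeForce GT 730] [10de:1287] (rev a1)"
)

if __name__ == "__main__":
    print(pci_ids(SAMPLE))
```

The resulting ID can then be compared against the supported-products list that nvidia publishes for each driver branch.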

cfgauss
Guru
Joined: 18 May 2005
Posts: 550
Location: USA

Posted: Wed Feb 20, 2019 11:26 pm

Many thanks to Ant P. for pointing out that the problem is my video card/driver throwing excessive IRQs. This is a post to the nvidia Linux forum confirming that my video card is dying/dead.

I bought a new one, which doesn't throw any errors in Xorg.0.log, and there is no trace of NVRM in /var/log/messages, so I assumed that my problem was solved. But I still get the GUI freezes when one of the four ksoftirqd threads pushes a core to 100%.

Is this a problem with the CPU or some other piece of hardware?
All times are GMT