Gentoo Forums
systemd-218: sockets not working, sometimes
mv
Watchman


Joined: 20 Apr 2005
Posts: 6281

Posted: Sun Dec 28, 2014 10:07 am    Post subject: systemd-218: sockets not working, sometimes

Hello,

Please do not turn this thread into yet another systemd flamewar. I am well aware that my question reveals one of the weaknesses of systemd which I had expected from the very beginning, but I would like to understand the technical reason for this particular issue:

After upgrading to systemd-218, when booting with systemd, it happens randomly that sockets are not working. It depends on the boot: after a fresh boot the problem occurs either always or never.

By "not working", I really mean not working: Sockets are created and visible, but whenever something is trying to write to a socket, this process just hangs and apparently never returns from the library call. There are no error messages or otherwise unexpected behaviour - only indefinite hangs.

This is not related to starting the process with systemd: Starting a process manually which creates a socket and reads from it yields the same result.

It might play a role that I use hardened-sources with grsecurity, but I also tried turning various chroot-security features on and off. Due to the unreproducible nature of the problem I am not really sure, but it seems that none of these features is really related: turning some of them off sometimes hid the problem for many boots, but eventually it returned.

My first question is: Does anybody else experience the same problem?
My second question: How is systemd able to cause such behaviour at all? Some broken cgroups handling again? How can I look for the cause?
And the last question is then of course: Why does this happen only sometimes, and why then either always or never until the next boot?
krinn
Watchman


Joined: 02 May 2003
Posts: 7071

Posted: Mon Dec 29, 2014 10:11 am

With all respect to others, I think you are the most competent Gentoo user when it comes to systemd.

So for any question you could ask, I'm afraid the only user who could answer it is you.

(I'm actually a bit surprised by "this process just hangs": I would expect the awesome supervision of systemd to have a watchdog, otherwise it is no better supervision than the classic pid file. Is it another "we don't use pids because they suck, we use sockets that suck the same"? OK, that's trolling.)
mv
Watchman


Joined: 20 Apr 2005
Posts: 6281

Posted: Mon Dec 29, 2014 2:05 pm

krinn wrote:
i would expect the supervision awesome systemd would have a watchdog

You can activate a watchdog, but this is unrelated to the problem: the problem is not how to kill the job, but that the job does not work.

I am not talking about processes supervised by systemd, I am talking about "ordinary" processes started "manually" (i.e. from a shell, either as a user or as root - it does not matter; of course, if I start them from systemd I get the same problem).

Since the problem is so unreproducible, I am not even sure whether systemd is the culprit: It could be that there is some kernel or grsecurity bug which just for some reason is triggered by systemd more often than by openrc (where the problem did not occur so far).

Once more: For testing, I have just started a server which opens a socket and listens on it, and a client which writes to that socket; these simple (ordinary user) programs already do not work. It is not clear to me in what way systemd is related to this. I just observe that this problem has not occurred so far with systemd-217 or with openrc, so I guess systemd does something which can trigger the problem.

Of course, once a "faulty" boot has happened (in which case the above test programs fail), a lot of other programs fail too: dhcpcd fails to report anything, and in fact it is not possible to establish any internet connection at all, since all programs accessing some port will just hang. But the test case shows that it is not the internet connection but already the plain socket access which fails.
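For reference, a test of the kind described above could look roughly like the following Python sketch. It is not the original test program, which was not posted. It assumes a Unix domain stream socket (the thread does not say which socket type was used), and the name `probe_unix_socket` and the 5 second default are made up for illustration. The timeouts are there so that a hang is reported instead of blocking the shell, on the assumption that the hang is an ordinary blocking wait that a socket timeout can interrupt.

```python
import os
import socket
import tempfile
import threading


def probe_unix_socket(timeout=5.0):
    """Return 'ok' if a message round-trips, else a short failure description.

    Hypothetical helper: one listener thread, one client, one message.
    """
    path = os.path.join(tempfile.mkdtemp(), "probe.sock")
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(1)
    server.settimeout(timeout)
    received = []

    def serve():
        # Accept one connection and read one message, recording any error.
        try:
            conn, _ = server.accept()
            with conn:
                conn.settimeout(timeout)
                received.append(conn.recv(64))
        except OSError as exc:  # includes socket timeouts
            received.append(exc)

    thread = threading.Thread(target=serve, daemon=True)
    thread.start()
    try:
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.settimeout(timeout)
            client.connect(path)
            client.sendall(b"ping")  # the write that reportedly hangs
    except OSError as exc:
        return f"client failed: {exc!r}"
    finally:
        # Give the listener a bounded amount of time, then clean up.
        thread.join(timeout + 1.0)
        server.close()
        os.unlink(path)
        os.rmdir(os.path.dirname(path))
    if received == [b"ping"]:
        return "ok"
    return f"server side: {received!r}"


if __name__ == "__main__":
    print(probe_unix_socket())
```

Running this once after each boot and logging the result next to the kernel and systemd versions would at least give a record of which boots are "faulty".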
hansb
n00b


Joined: 11 Jul 2016
Posts: 1

Posted: Mon Jul 11, 2016 4:23 pm    Post subject: Solution available?

Hello mv,
For months I have been searching for a problem in my software that exactly matches your error description:

Quote:
By "not working", I really mean not working: Sockets are created and visible, but whenever something is trying to write to a socket, this process just hangs and apparently never returns from the library call. There are no error messages or otherwise unexpected behaviour - only indefinite hangs.


My application uses a large number (500) of TCP/IP connections from one "data source" process to 500 destination processes. Each destination process only consumes the data. TCP/IP data transfers run well for some time, but after an unspecified period of time one of the 500 TCP/IP connections hangs. The thread writing to the socket is blocked in the system call forever. All output queues (netstat) are empty. So this is exactly the observation you described.

Have you found a solution to your problem in the meantime?
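One way to keep a writer thread from being blocked forever in such a setup is to send in non-blocking mode with an explicit deadline. The sketch below is only an illustration in Python, not the application's code; `send_with_deadline` and its parameters are hypothetical names. It does not explain why a connection stalls, it only turns a silent hang into an exception that can be logged per connection.

```python
import select
import socket
import time


def send_with_deadline(sock, data, deadline_s):
    """Send all of data, or raise TimeoutError if it cannot be written in time.

    Hypothetical helper: switches the socket to non-blocking mode and waits
    for writability with select() instead of blocking inside send().
    """
    sock.setblocking(False)
    view = memoryview(data)
    end = time.monotonic() + deadline_s
    while view:
        remaining = end - time.monotonic()
        if remaining <= 0:
            raise TimeoutError(f"{len(view)} bytes still unsent")
        _, writable, _ = select.select([], [sock], [], remaining)
        if not writable:
            continue  # the loop re-checks the deadline
        try:
            sent = sock.send(view)
        except BlockingIOError:
            continue  # buffer filled up again between select() and send()
        view = view[sent:]


if __name__ == "__main__":
    a, b = socket.socketpair()
    send_with_deadline(a, b"hello", 1.0)
    print(b.recv(16))
    try:
        # Nobody reads from b, so a large write cannot complete.
        send_with_deadline(a, bytes(8 * 1024 * 1024), 0.5)
    except TimeoutError as exc:
        print("stalled:", exc)
```

`select` is used here for brevity; with several hundred descriptors in one process, a poll- or epoll-based mechanism (for example Python's `selectors` module) may be the better fit.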
All times are GMT