Andres,
I believe this issue came up a while back. If I remember correctly, ser
children locked up when talking to a locked-up rtpproxy over the socket
interface. The "solution" was to communicate with rtpproxy over UDP on the
loopback interface instead: if rtpproxy stops answering, that specific call
fails, but the ser process does not lock up.
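
To illustrate the idea (this is not SER's actual code, just a rough Python
sketch of the general pattern): with UDP you can put a timeout on every wait
for a reply and give up after a few tries, so a dead peer costs you one failed
request instead of a stuck process. The function name send_command, the
payload, and the timeout/retry values are all made up for the example.

```python
import socket


def send_command(addr, payload, timeout=1.0, retries=3):
    """Send one datagram and wait for a reply, giving up after `retries` timeouts.

    Returns the reply bytes, or None if the peer never answered.
    (Hypothetical helper for illustration; not part of SER or rtpproxy.)
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Bound every wait so the caller can never block forever.
        sock.settimeout(timeout)
        for _ in range(retries):
            sock.sendto(payload, addr)
            try:
                reply, _peer = sock.recvfrom(4096)
                return reply
            except socket.timeout:
                # No answer within the timeout: resend and wait again.
                continue
    # Peer is unresponsive: fail this one request and let the caller move on.
    return None
```

Against a peer that has stopped answering, this returns None after roughly
timeout * retries seconds, and the caller can fail that one call and carry on.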
g-)
----- Original Message -----
From: "Andres" <andres(a)telesip.net>
To: <serusers(a)lists.iptel.org>
Sent: Tuesday, November 22, 2005 10:19 PM
Subject: [Serusers] SER Children Misbehaving
Today we had an incident where SER (0.9.4) children drained all the CPUs
of one of our servers.
Top Showed:
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
17925 root 25 0 5644 5644 3888 R 25.5 0.2 6:26 1 ser
17929 root 25 0 5672 5672 3880 R 24.7 0.2 6:48 0 ser
17928 root 25 0 5688 5688 3872 R 24.3 0.2 6:25 1 ser
17933 root 25 0 4540 4540 3740 R 22.8 0.2 6:00 0 ser
And ..
# ps -Al | grep ser
1 S 0 17901 1 0 85 0 - 14200 pause ? 00:00:00 ser
1 S 0 17916 17901 0 75 0 - 14200 pipe_w ? 00:00:00 ser
1 S 0 17917 17901 0 75 0 - 14418 schedu ? 00:00:22 ser
1 S 0 17918 17901 0 75 0 - 14422 schedu ? 00:00:23 ser
1 S 0 17919 17901 0 75 0 - 14423 schedu ? 00:00:24 ser
1 S 0 17920 17901 0 75 0 - 14447 schedu ? 00:00:22 ser
1 S 0 17921 17901 0 75 0 - 14421 schedu ? 00:00:22 ser
1 S 0 17922 17901 0 75 0 - 14424 schedu ? 00:00:22 ser
1 S 0 17923 17901 0 75 0 - 14428 schedu ? 00:00:21 ser
1 S 0 17924 17901 0 75 0 - 14424 schedu ? 00:00:22 ser
1 R 0 17925 17901 0 85 0 - 14448 - ? 00:06:22 ser
1 S 0 17926 17901 0 75 0 - 14457 schedu ? 00:00:49 ser
1 S 0 17927 17901 0 75 0 - 14453 schedu ? 00:00:50 ser
1 R 0 17928 17901 0 85 0 - 14477 - ? 00:06:20 ser
1 R 0 17929 17901 0 85 0 - 14455 - ? 00:06:44 ser
1 S 0 17930 17901 0 75 0 - 14452 schedu ? 00:00:50 ser
1 S 0 17931 17901 0 75 0 - 14448 schedu ? 00:00:50 ser
1 S 0 17932 17901 0 76 0 - 14448 schedu ? 00:00:49 ser
1 R 0 17933 17901 0 85 0 - 14235 - ? 00:05:55 ser
As you can see, it looks like 4 children dropped out of the scheduler. The
only suspicious thing is that RTPProxy became unresponsive around that
time. At least that's the only thing the log shows:
Nov 22 15:56:17 /usr/local/sbin/ser[17931]: ERROR: send_rtpp_command: timeout waiting reply from a RTP proxy
Any idea why these 4 children dropped out? Any hints on how to
troubleshoot this?
Thanks,
--
Andres
Network Admin
http://www.telesip.net