From the 5.1 TB transferred on the 'lo' interface (RTPEngine running on
same host as Kamailio):
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536 inet 127.0.0.1 netmask 255.0.0.0 inet6 ::1 prefixlen 128 scopeid 0x10<host> loop txqueuelen 1 (Local Loopback) RX packets 27694476978 bytes 5678003789723 (5.1 TiB) RX errors 0 dropped 0 overruns 0 frame 0 TX packets 27694476978 bytes 5678003789723 (5.1 TiB) TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
... I would surmise that this is some sort of loop / infinite loop condition, but I can't quite put my finger on what it is.
From an RTPEngine logging standpoint, the scenario begins looking rather
normally:
Aug 6 18:11:04 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: Received command 'offer' from 127.0.0.1:48525 Aug 6 18:11:04 gw1 rtpengine[25476]: NOTICE: [1074537188_44545952@20.20.20.20]: Creating new call Aug 6 18:11:04 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: offer time = 0.000865 sec Aug 6 18:11:04 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: Replying to 'offer' from 127.0.0.1:48525 Aug 6 18:11:14 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: Received command 'answer' from 127.0.0.1:58668 Aug 6 18:11:14 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: answer time = 0.000123 sec Aug 6 18:11:14 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: Replying to 'answer' from 127.0.0.1:58668 Aug 6 18:11:18 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20 port 61666]: Confirmed peer address as 2.2.2.2:38236 Aug 6 18:11:23 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: Received command 'offer' from 127.0.0.1:50995 Aug 6 18:11:23 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: offer time = 0.000082 sec Aug 6 18:11:23 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: Replying to 'offer' from 127.0.0.1:50995 Aug 6 18:11:23 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: Received command 'offer' from 127.0.0.1:47321 Aug 6 18:11:23 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: offer time = 0.000151 sec Aug 6 18:11:23 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20]: Replying to 'offer' from 127.0.0.1:47321 Aug 6 18:11:23 gw1 rtpengine[25476]: INFO: [1074537188_44545952@20.20.20.20 port 61666]: Switching local interface to 1.1.1.1:61666 Aug 6 18:11:23 gw1 rtpengine[25476]: ERR: [1074537188_44545952@20.20.20.20 port 61666]: Too many packets in UDP receive queue (more than 50), aborting loop. Dropped packets possible [... ad nauseum... ]
The only thing unusual about it from a SIP perspective is that it involves an SRTP endpoint and several reinvites which don't seem to be handled as one would expect:
Aug 6 18:11:04 gw1 /usr/local/sbin/kamailio[22070]: INFO: [R-INBOUND-ROUTE-BRANCH:1074537188_44545952@2.2.2.2] Relaying media through RTPEngine Aug 6 18:11:04 gw1 /usr/local/sbin/kamailio[22070]: INFO: [R-INBOUND-ROUTE-BRANCH:1074537188_44545952@2.2.2.2] -> Forcing Secure RTP (SRTP) profile in offer Aug 6 18:11:04 gw1 /usr/local/sbin/kamailio[22233]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '100 Trying' from 3.3.3.3:54246 Aug 6 18:11:04 gw1 /usr/local/sbin/kamailio[22233]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '180 Ringing' from 3.3.3.3:54246 Aug 6 18:11:14 gw1 /usr/local/sbin/kamailio[22258]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 3.3.3.3:54246 Aug 6 18:11:14 gw1 /usr/local/sbin/kamailio[22258]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:14 gw1 /usr/local/sbin/kamailio[22112]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 2.2.2.2:5060 Aug 6 18:11:23 gw1 /usr/local/sbin/kamailio[22263]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Re-invite received on rtpengine'd call -- updating RTPEngine Aug 6 18:11:23 gw1 /usr/local/sbin/kamailio[22068]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Re-invite received on rtpengine'd call -- updating RTPEngine Aug 6 18:11:23 gw1 /usr/local/sbin/kamailio[22074]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '100 Trying' from 1.1.1.1:5060 Aug 6 18:11:23 gw1 /usr/local/sbin/kamailio[22057]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '100 Trying' from 2.2.2.2:5060 Aug 6 18:11:23 gw1 /usr/local/sbin/kamailio[22070]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 2.2.2.2:5060 Aug 6 18:11:23 gw1 /usr/local/sbin/kamailio[22070]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22065]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 2.2.2.2:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22065]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22104]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 1.1.1.1:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22104]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22088]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 2.2.2.2:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22088]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22118]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 1.1.1.1:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22118]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22100]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 2.2.2.2:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22100]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22116]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 1.1.1.1:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22116]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22094]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] Reply '200 OK' from 1.1.1.1:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22094]: INFO: [R-MAIN-REPLY:1074537188_44545952@2.2.2.2] -> Reply contains SDP; diverting through RTPEngine Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22263]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 3.3.3.3:54246 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22090]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 1.1.1.1:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22263]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 3.3.3.3:54246 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22263]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 3.3.3.3:54246 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22074]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 1.1.1.1:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22068]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 1.1.1.1:5060 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22263]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 3.3.3.3:54246 Aug 6 18:11:27 gw1 /usr/local/sbin/kamailio[22098]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> Other sequential ACK received from 1.1.1.1:5060 Aug 6 18:11:58 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE (hangup) received from 3.3.3.3:54246 Aug 6 18:11:58 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE received - tearing down rtpengine session Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22104]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE (hangup) received from 1.1.1.1:5060 Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22104]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE received - tearing down rtpengine session Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE (hangup) received from 3.3.3.3:54246 Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE received - tearing down rtpengine session Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE (hangup) received from 3.3.3.3:54246 Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE received - tearing down rtpengine session Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE (hangup) received from 3.3.3.3:54246 Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE received - tearing down rtpengine session Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE (hangup) received from 3.3.3.3:54246 Aug 6 18:12:02 gw1 /usr/local/sbin/kamailio[22211]: INFO: [R-MAIN:1074537188_44545952@2.2.2.2] -> BYE received - tearing down rtpengine session
But while this signalling chronology is anomalous, it doesn't explain a continuous queue overflow of some kind which lasted several hours?
-- Alex
On Mon, Aug 06, 2018 at 08:59:21PM -0400, Alex Balashov wrote:
Hi,
We have an RTPEngine that occasionally finds itself in this state of CPU usage:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 25476 root 20 0 2929600 80820 1900 S 156.2 0.1 259:05.54 rtpengine
The journal is a dizzying stream of:
Aug 06 20:58:47 gw1 rtpengine[25476]: ERR: [1074537188_44545952 port 61666]: Too many packets in UDP receive queue (more than 50), aborting loop. Dropped packets possible
It eventually leads to increasing CPU usage and a degeneration of audio quality until RTPEngine is restarted.
It would be appreciated if someone could shed some light on the nature of this queue and what we can do about this problem. Is this referring to a netfilter packet forwarding queue, or is it a receive queue of the RTPengine user space control process? Can its size be increased or tweaked with kernel parameters?
Thanks,
-- Alex
-- Alex Balashov | Principal | Evariste Systems LLC
Tel: +1-706-510-6800 / +1-800-250-5920 (toll-free) Web: http://www.evaristesys.com/, http://www.csrpswitch.com/
Kamailio (SER) - Users Mailing List sr-users@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users