Hi everyone,

we have an installation with:
- 1 Kamailio instance with 2 interfaces, one with a public IP and one with a private IP (for local communication with the DB and Asterisk). We use the keepalived service for failover, so every interface has 2 IPs: the real one and the “virtual” one for keepalived
- some Asterisk instances for transcoding and billing, each with one interface with a private IP
- 2 RTP (rtpengine) instances, each with 2 interfaces, one with a public IP and one with a private IP
Occasionally a loop occurs on rtpengine, and we see many lines like this in the log:
“Too many packets in UDP receive queue (more than 50), aborting loop. Dropped packets possible”
The loop doesn't stop until all resources are exhausted, which freezes the machine. We don’t understand why these loops are generated.
We have 2 cases:
1) We send an INVITE to the CPE (a Bria on iOS). The CPE answers with a "100 Trying", a "180 Ringing" and, a few seconds later, sends us 3 "200 OK" packets with SDP like this:
SIP/2.0 200 OK
Via: SIP/2.0/UDP 178.250.x.y;branch=z9hG4bKb842.29952898c30d6d06b75b916618357ccd.2
Via: SIP/2.0/UDP 192.168.x.y:5090;rport=5090;branch=z9hG4bK7fd100fb
Record-Route: sip:178.250.x.y;r2=on;lr=on;ftag=as26b82ac7;did=cb5.0ce;nat=yes
Record-Route: sip:192.168.x.z;r2=on;lr=on;ftag=as26b82ac7;did=cb5.0ce;nat=yes
Contact: sip:6051018****@158.148.x.y:3180;rinstance=ffdff5e0142dfd2d
To: sip:6051018****@192.168.x.z;tag=29859d3c
From: sip:051438****@192.168.x.y;tag=as26b82ac7
Call-ID: 698200851849894c5874186163abe668@192.168.x.y:5090
CSeq: 102 INVITE
Allow: INVITE, ACK, CANCEL, BYE, REFER, INFO, NOTIFY, OPTIONS, UPDATE, PRACK, MESSAGE, SUBSCRIBE
Content-Type: application/sdp
Supported: replaces
User-Agent: Bria iOS release 3.9.7 stamp 38887.38893
Content-Length: 230
v=0
o=- 1085831961 3 IN IP4 100.81.231.195
s=Cpc session
c=IN IP4 100.81.231.195
t=0 0
m=audio 65454 RTP/AVP 18 101
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
Could the IP in the media part (100.81.231.195, in the CGNAT range 100.64.0.0/10) be the cause of the loop? If so, how can we prevent the loop?
2) In the second case the CPE is an Avaya. The Avaya sends us an INVITE (in this case too, the SDP contains a private IP):
INVITE sip:04818****@178.250.x.y SIP/2.0
From: sip:0432415453@185.55.x.y;tag=4500808-c0a8077d-13c4-50022-313081f-200d750c-313081f
To: sip:04818****@178.250.x.y
Call-ID: 44e8f20-c0a8077d-13c4-50022-313081f-31e99586-313081f
CSeq: 1 INVITE
Via: SIP/2.0/UDP 185.55.x.y:5060;rport;branch=z9hG4bK-313081f-257bc48-1472e4a5
Privacy: none
Max-Forwards: 70
User-Agent: OfficeServ 7200
Contact: sip:0432415453@185.55.z.y:5060;transport=udp
Allow: REGISTER,INVITE,ACK,BYE,REFER,NOTIFY,CANCEL,INFO,OPTIONS,PRACK,SUBSCRIBE,UPDATE
Supported: 100rel
Content-Type: application/sdp
Content-Length: 289
v=0
o=SAMSUNG_SIP_GATEWAY 39304265 0 IN IP4 185.55.x.y
s=SIP_CALL
c=IN IP4 192.168.7.126
t=0 0
m=audio 30012 RTP/AVP 18 8 0 101
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
We reply with a “183 Session Progress” and the Avaya sends us an “INFO” packet (DTMF), to which we reply with a “200 OK”; then, after another “INFO” packet, we reply with a “200 OK” with SDP. After that the Avaya replies with an ACK:
2017/08/02 10:17:32.101121 185.55.x.y:5060 -> 178.250.x.y:5060
ACK sip:04818****@178.250.x.y SIP/2.0
From: sip:0432415453@185.55.x.y;tag=4500808-c0a8077d-13c4-50022-313081f-200d750c-313081f
To: sip:04818****@178.250.x.y;tag=as783106e4
Call-ID: 44e8f20-c0a8077d-13c4-50022-313081f-31e99586-313081f
CSeq: 1 ACK
Via: SIP/2.0/UDP 185.55.x.y:5060;rport;branch=z9hG4bK-3130831-257ffea-84906e4
Max-Forwards: 70
User-Agent: OfficeServ 7200
Route: sip:178.250.x.y;lr=on;r2=on;ftag=4500808-c0a8077d-13c4-50022-313081f-200d750c-313081f;did=adc.2f61
Route: sip:192.168.x.y;lr=on;r2=on;ftag=4500808-c0a8077d-13c4-50022-313081f-200d750c-313081f;did=adc.2f61
Contact: sip:043241****@185.55.x.y:5060;transport=udp
Content-Length: 0

And after that, the chaos…
Has anyone had a problem like this? Can somebody help us prevent this loop?
In the attached images you can see:
[0] a call, our machines' IPs, an ACK from the public interface of our SBC to its private one, and then a lot of ACKs from the keepalived IP to the private IP of our SBC;
[1] one of these strange ACKs; they all differ in content but are similar in structure: multiple Record-Route / Via headers, with the overly long message truncated;
[2] "never-ending" BYEs.
After modifying the configuration by adding some "listen=IP_ADDR:PORT" lines (and of course excluding the keepalived IP) we had no more loops, but is this enough? Why did Kamailio generate the loops? Is there anything else we should review in our config? Any advice would help a lot. Thanks.
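
For reference, the "listen" change has roughly this shape. This is only a minimal sketch, not our actual configuration: the addresses and the port are placeholders, with 203.0.113.10 standing in for the real public IP and 192.168.0.10 for the real private IP.

```cfg
# Placeholder addresses, not the real ones from our setup.
# Bind explicitly to the real address of each interface.

# real public IP of the Kamailio host
listen=203.0.113.10:5060

# real private IP of the Kamailio host
listen=192.168.0.10:5060

# Deliberately no listen= line for the keepalived virtual IPs.
```

With explicit "listen" lines Kamailio binds only to the addresses listed, instead of to every address it finds on the interfaces, so the keepalived IPs are no longer among its own sockets.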
[0] https://goo.gl/rWh5db
[1] https://goo.gl/MNShq9
[2] https://goo.gl/6mxepG