Hi everyone,
we have an installation with:
- 1 kamailio instance - with 2 interfaces, one with a public and one with a
private IP (for local communication with the db and asterisk). We use the
keepalived service for failover, so every interface has 2 IPs: the
real one and the “virtual” one for the keepalived service
- some asterisk instances for transcoding and billing - with one
interface with private IP
- 2 RTP (rtpengine) instances - with 2 interfaces, one with public and
one with private IP
Occasionally a loop occurs on rtpengine, and we see a lot of
lines like this in the log:
“Too many packets in UDP receive queue (more than 50), aborting loop.
Dropped packets possible”
The loop doesn't stop until all resources are exhausted, and the
machine freezes.
We don’t understand the reason why these loops are generated.
We have 2 cases:
1) We send an INVITE to the CPE (a Bria on iOS).
The CPE answers with a "100 Trying", a "180 Ringing" and, a few seconds
later, sends us 3 "200 OK" packets with SDP like this:
SIP/2.0 200 OK
Via: SIP/2.0/UDP 178.250.x.y;branch=z9hG4bKb842.29952898c30d6d06b75b916618357ccd.2
Via: SIP/2.0/UDP 192.168.x.y:5090;rport=5090;branch=z9hG4bK7fd100fb
Record-Route: <sip:178.250.x.y;r2=on;lr=on;ftag=as26b82ac7;did=cb5.0ce;nat=yes>
Record-Route: <sip:192.168.x.z;r2=on;lr=on;ftag=as26b82ac7;did=cb5.0ce;nat=yes>
Contact: <sip:6051018****@158.148.x.y:3180;rinstance=ffdff5e0142dfd2d>
To: <sip:6051018****@192.168.x.z>;tag=29859d3c
From: <sip:051438****@192.168.x.y>;tag=as26b82ac7
Call-ID: 698200851849894c5874186163abe668@192.168.x.y:5090
CSeq: 102 INVITE
Allow: INVITE, ACK, CANCEL, BYE, REFER, INFO, NOTIFY, OPTIONS, UPDATE, PRACK, MESSAGE, SUBSCRIBE
Content-Type: application/sdp
Supported: replaces
User-Agent: Bria iOS release 3.9.7 stamp 38887.38893
Content-Length: 230
v=0
o=- 1085831961 3 IN IP4 100.81.231.195
s=Cpc session
c=IN IP4 100.81.231.195
t=0 0
m=audio 65454 RTP/AVP 18 101
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
Could the media IP in the SDP (100.81.231.195, which is inside the CGNAT
range 100.64.0.0/10) be the cause of the loop?
If that’s the case, how can we prevent the loop?
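To make the question concrete, here is a minimal Python sketch (not something taken from our setup) of the check we have in mind: pull the connection address out of the SDP and see whether it falls in a private (RFC 1918) or CGNAT range. The function names `sdp_connection_address` and `classify` are made up for illustration, and the sketch only looks at the first IPv4 "c=" line.

```python
import ipaddress

# Address ranges that are not reachable from the public Internet.
NON_ROUTABLE = {
    "rfc1918": [
        ipaddress.ip_network("10.0.0.0/8"),
        ipaddress.ip_network("172.16.0.0/12"),
        ipaddress.ip_network("192.168.0.0/16"),
    ],
    "cgnat": [ipaddress.ip_network("100.64.0.0/10")],
}


def sdp_connection_address(sdp):
    """Return the address of the first 'c=IN IP4 <addr>' line, or None."""
    for line in sdp.splitlines():
        parts = line.strip().split()
        if len(parts) == 3 and parts[0] == "c=IN" and parts[1] == "IP4":
            return parts[2]
    return None


def classify(addr):
    """Return 'rfc1918', 'cgnat' or 'public' for an IPv4 address string."""
    ip = ipaddress.ip_address(addr)
    for kind, networks in NON_ROUTABLE.items():
        if any(ip in net for net in networks):
            return kind
    return "public"
```

With the SDP above, `classify(sdp_connection_address(sdp))` would report the Bria address 100.81.231.195 as "cgnat"; the 192.168.7.126 address in the second case below would come back as "rfc1918".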
2) In the second case the CPE is an Avaya.
The Avaya sends us an INVITE. (In this case too, the SDP contains a
private IP.)
INVITE sip:04818****@178.250.x.y SIP/2.0
From: <sip:0432415453@185.55.x.y>;tag=4500808-c0a8077d-13c4-50022-313081f-200d750c-313081f
To: <sip:04818****@178.250.x.y>
Call-ID: 44e8f20-c0a8077d-13c4-50022-313081f-31e99586-313081f
CSeq: 1 INVITE
Via: SIP/2.0/UDP 185.55.x.y:5060;rport;branch=z9hG4bK-313081f-257bc48-1472e4a5
Privacy: none
Max-Forwards: 70
User-Agent: OfficeServ 7200
Contact: <sip:0432415453@185.55.z.y:5060;transport=udp>
Allow: REGISTER,INVITE,ACK,BYE,REFER,NOTIFY,CANCEL,INFO,OPTIONS,PRACK,SUBSCRIBE,UPDATE
Supported: 100rel
Content-Type: application/sdp
Content-Length: 289
v=0
o=SAMSUNG_SIP_GATEWAY 39304265 0 IN IP4 185.55.x.y
s=SIP_CALL
c=IN IP4 192.168.7.126
t=0 0
m=audio 30012 RTP/AVP 18 8 0 101
a=rtpmap:18 G729/8000
a=fmtp:18 annexb=no
a=rtpmap:8 PCMA/8000
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
a=sendrecv
We reply with a “183 Session Progress” and the Avaya sends us an “INFO”
packet (DTMF), to which we reply with a “200 OK”. Then, after another
“INFO” packet, we reply with a “200 OK” with SDP.
After that the Avaya replies with an ACK:
2017/08/02 10:17:32.101121 185.55.x.y:5060 -> 178.250.x.y:5060
ACK sip:04818****@178.250.x.y SIP/2.0
From: <sip:0432415453@185.55.x.y>;tag=4500808-c0a8077d-13c4-50022-313081f-200d750c-313081f
To: <sip:04818****@178.250.x.y>;tag=as783106e4
Call-ID: 44e8f20-c0a8077d-13c4-50022-313081f-31e99586-313081f
CSeq: 1 ACK
Via: SIP/2.0/UDP 185.55.x.y:5060;rport;branch=z9hG4bK-3130831-257ffea-84906e4
Max-Forwards: 70
User-Agent: OfficeServ 7200
Route: <sip:178.250.x.y;lr=on;r2=on;ftag=4500808-c0a8077d-13c4-50022-313081f-200d750c-313081f;did=adc.2f61>
Route: <sip:192.168.x.y;lr=on;r2=on;ftag=4500808-c0a8077d-13c4-50022-313081f-200d750c-313081f;did=adc.2f61>
Contact: <sip:043241****@185.55.x.y:5060;transport=udp>
Content-Length: 0
And after that the chaos…
Has anyone had a problem like this?
Can somebody help us prevent this loop?
In the attached images you can see:
[0] a call, our machines' IPs, an ACK from the public interface of our SBC
to its private one, and then a lot of ACKs from the keepalived IP to
the private IP of our SBC;
[1] one of these strange ACKs; they all differ in content but are
similar in structure: multiple Record-Route / Via headers, and the
message truncated because it is too long;
[2] a "never-ending" stream of BYEs.
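For anyone who wants to check a capture for the same pattern, below is a rough Python sketch of how the repeated Via entries in [1] could be spotted. It is only an illustration under some assumptions: the helper names `via_hosts` and `looped_hosts` are hypothetical, it expects one Via per header line (no comma-separated lists), and it handles IPv4 addresses or hostnames only.

```python
from collections import Counter


def via_hosts(sip_message):
    """Return the sent-by host of every Via header, in order of appearance."""
    hosts = []
    for line in sip_message.splitlines():
        name, sep, value = line.partition(":")
        # "v" is the compact form of the Via header name.
        if not sep or name.strip().lower() not in ("via", "v"):
            continue
        # A Via value looks like "SIP/2.0/UDP host[:port];params".
        fields = value.split()
        if len(fields) < 2:
            continue
        sent_by = fields[1].split(";", 1)[0]
        hosts.append(sent_by.split(":", 1)[0])
    return hosts


def looped_hosts(sip_message, local_addresses):
    """Return {host: count} for local addresses that appear in more than one Via."""
    counts = Counter(via_hosts(sip_message))
    return {h: n for h, n in counts.items() if h in local_addresses and n > 1}
```

Passing the real and keepalived IPs of the proxy as `local_addresses` should show which of them the message has crossed more than once; an empty result means no local address is repeated in the Via stack.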
After adding some explicit "listen=IP_ADDR:PORT" lines to the
configuration (and of course excluding the keepalived IP) we have had no
more loops, but is this enough?
Why did Kamailio generate the loops? Is there something else we should
review in our config?
Any advice would help a lot. Thanks.
[0] https://goo.gl/rWh5db
[1] https://goo.gl/MNShq9
[2] https://goo.gl/6mxepG