## Description
When one rtpengine instance becomes unavailable on a Kamailio instance with moderate call
traffic, even where there are multiple rtpengine instances configured for high
availability, all processes can become blocked and Kamailio stops responding to all
requests.
We produced this in a real world test, with two Kamailio servers running (20 child
processes) and two rtpengine servers running. We loaded the Kamailio servers with calls,
some answered, some ringing/busy/etc, and then made one rtpengine server drop all packets.
Both Kamailio servers were able to briefly detect that the rtpengine instance was
unavailable when it attempted to send an offer to it, but then as existing calls began to
complete, both Kamailio servers became congested attempting to send "delete"
commands to rtpengine. Both Kamailio servers then became unavailable and stopped
responding to all messages for 10 minutes, until we restored rtpengine.
As such, an rtpengine server that crashes can cause a complete Kamailio outage.
## Troubleshooting
By using the reproduction steps below, we were easily able to make Kamailio stop
responding to all incoming SIP messages when taking down an rtpengine instance.
The logs below show the sequence of events when this is reproduced for a single call.
Configuration:
- `rtpengine_tout_ms` was not configured and had the default value of 1000ms.
- `rtpengine_retr` was not configured and had the default value of 5.
- `queried_nodes_limit` was not configured and had the default value (of 30?).
- TM's `exec_time_check` was not configured and had the default value of 5000ms.
**A call is forwarded using t_relay and rtpengine_manage. rtpengine is killed mid-call.
The call ends with 486.**
When the t_on_failure route is called, the config makes a singular call to
rtpengine_manage, however what follows is 30 errors, all mentioning the same rtpengine
instance:
```
ERROR: rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for
command "delete" from RTPEngine <udp:192.168.X.X:2223>
```
Followed by the below:
```
ERROR: rtpengine [rtpengine.c:2729]: rtpp_function_call(): queried nodes limit reached
WARNING: tm [t_reply.c:1094]: run_failure_handlers(): failure route execution took too
long: 150153261 us
```
Each timeout is 5 seconds (5x retries of 1000ms), which corresponds to about 150 seconds
over 30 attempts. This matches the warning generated by tm.
In this case, there was about 5 minutes of time where a process was blocked. 150 seconds
for the child leg's failure route, and 150 seconds for the parent leg's failure
route. When this happens for many calls at once, congestion is guaranteed.
The 30 attempts to delete the call from rtpengine comes from the default value of
`queried_nodes_limit`. This behavior doesn't make sense to me -- it's a retry loop
that runs a retry loop.
## Reproduction
_tldr: kill rtpengine via iptables mid-call with the default rtpengine settings and
you'll see lots of errors._
You can reproduce this with one or multiple rtpengine instances configured in Kamailio.
It's easier with just one.
Standard setup, but importantly:
- forwarding a call using t_relay
- t_on_branch is configured which calls rtpengine_manage
- t_on_failure is configured which calls rtpengine_manage
To reproduce easily, we configured the destination server to send a 100, a 180, wait 20
seconds, then send a 486.
Place a number of calls to Kamailio (we used sipp) -- importantly, exceeding the number of
child processes configured. You can reproduce the issue with just one call, but this wont
make Kamailio unresponsive.
Once the calls are placed, and still in progress (ie before the 486 is returned), use
iptables on the rtpengine server to drop all traffic on its control port.
When the call(s) start receiving the 486, you will notice a flood of error messages
`timeout waiting reply for command "delete" from RTPEngine` for approximately 5
minutes per call. This blocks the process.
If it's now attempting to send the delete command to rtpengine for a number of calls
greater than your child process count: attempt to send any other SIP messages to Kamailio
and you will not receive a response.
## Log Messages
This is a log for a single call, forwarded using t_relay. All calls to rtpengine_manage
are immediately surrounded by info log messages "calling rtpengine_manage" and
"completed calling rtpengine_manage".
```
Feb 12 23:40:54 debian-kamailio kamailio[27715]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1}
<script>: new branch [0] to sip:0421XXXXXX@192.168.X.X:9060
Feb 12 23:40:54 debian-kamailio kamailio[27715]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1}
<script>: branch_route: calling rtpengine_manage
Feb 12 23:40:54 debian-kamailio kamailio[27715]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1}
<script>: branch_route: completed calling rtpengine_manage
Feb 12 23:40:54 debian-kamailio kamailio[27732]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1}
<script>: onreply_route: calling rtpengine_manage
Feb 12 23:40:54 debian-kamailio kamailio[27732]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1}
<script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27730]: INFO: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: new branch [0] to
sip:61421764018@192.168.X.X:9092;transport=udp
Feb 12 23:40:55 debian-kamailio kamailio[27730]: INFO: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: branch_route: calling
rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27730]: INFO: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: branch_route: completed calling
rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27717]: INFO: {2 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: onreply_route: calling
rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27717]: INFO: {2 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: onreply_route: completed calling
rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27714]: INFO: {2 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: onreply_route: calling
rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27714]: INFO: {2 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: onreply_route: completed calling
rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27735]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1}
<script>: onreply_route: calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27735]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1}
<script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:41:15 debian-kamailio kamailio[27713]: INFO: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: failure_route: calling
rtpengine_manage
Feb 12 23:41:20 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:41:25 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:41:30 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:41:35 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:41:40 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:41:45 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:41:50 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:41:55 debian-kamailio kamailio[27740]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1}
<script>: onreply_route: calling rtpengine_manage
Feb 12 23:41:55 debian-kamailio kamailio[27740]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1}
<script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:41:55 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:00 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:05 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:10 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:16 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:21 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:26 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:31 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:36 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:41 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:46 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:51 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:42:55 debian-kamailio kamailio[27734]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1}
<script>: onreply_route: calling rtpengine_manage
Feb 12 23:42:55 debian-kamailio kamailio[27734]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1}
<script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:42:56 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:01 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:06 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:11 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:16 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:21 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:26 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:31 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:36 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:41 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:46 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command():
timeout waiting reply for command "delete" from RTPEngine
<udp:192.168.X.X:2223>
Feb 12 23:43:46 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:2729]: rtpp_function_call():
queried nodes limit reached
Feb 12 23:43:46 debian-kamailio kamailio[27713]: INFO: {1 63577835 INVITE
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: failure_route: completed calling
rtpengine_manage
Feb 12 23:43:46 debian-kamailio kamailio[27713]: WARNING: tm [t_reply.c:1094]:
run_failure_handlers(): failure route execution took too long: 150153207 us
Feb 12 23:43:46 debian-kamailio kamailio[27736]: INFO: {1 63577835 ACK
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: ACK: calling rtpengine_manage
Feb 12 23:43:46 debian-kamailio kamailio[27736]: INFO: {1 63577835 ACK
720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: ACK: completed calling
rtpengine_manage
Feb 12 23:43:46 debian-kamailio kamailio[27738]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1}
<script>: failure_route: calling rtpengine_manage
Feb 12 23:43:51 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:56 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:01 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:06 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:11 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:16 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:21 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:26 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:31 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:36 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:41 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:46 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:51 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:56 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:01 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:06 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:11 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:16 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:21 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:26 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:31 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:36 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:41 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:46 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:51 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:56 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:01 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:06 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:11 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:16 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command
"delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:16 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1}
rtpengine [rtpengine.c:2729]: rtpp_function_call(): queried nodes limit reached
Feb 12 23:46:16 debian-kamailio kamailio[27738]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1}
<script>: failure_route: completed calling rtpengine_manage
Feb 12 23:46:16 debian-kamailio kamailio[27738]: WARNING: tm [t_reply.c:1094]:
run_failure_handlers(): failure route execution took too long: 150153261 us
```
### Possible Solutions
To minimize the issue, `rtpengine_tout_ms` and `rtpengine_retr` can be reduced. This
doesn't exactly the fix the problem, as you need reasonable values here.
Additionally, `queried_nodes_limit` can be reduced to limit the number of unnecessary
retries, at the expense of the functionality it gives you to initially select candidate
nodes.
**Actual solution:** I think that, when a call has already selected an rtpengine instance
successfully, and that's in the hash table, then `queried_nodes_limit` should not
apply. It should only attempt to send commands with a total number of attempts =
`rtpengine_retr`, not `rtpengine_retr * queried_nodes_limit`.
By changing the behavior as above, tweaking `rtpengine_tout_ms` and `rtpengine_retr`, and
increasing the number of child workers, an rtpengine becoming unavailable should cause
much less impactful blocking.
## Additional Information
**Kamailio version:** (attempted on 5.4.1 and latest master)
```
version: kamailio 5.4.1 (x86_64/linux)
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST,
DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY,
USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR,
USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535,
DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled on 23:22:15 Aug 15 2022 with gcc 8.3.0
```
```
version: kamailio 5.7.0-dev3 (x86_64/linux) 2384c3
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST,
DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC, F_MALLOC, TLSF_MALLOC,
DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER,
USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535,
DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 2384c3
compiled on 23:22:22 Feb 12 2023 with gcc 8.3.0
```
**Operating System**:
<!--
Details about the operating system, the type: Linux (e.g.,: Debian 8.4, Ubuntu 16.04,
CentOS 7.1, ...), MacOS, xBSD, Solaris, ...;
Kernel details (output of `lsb_release -a` and `uname -a`)
-->
```
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster
Linux debian-kamailio 4.19.0-20-amd64 #1 SMP Debian 4.19.235-1 (2022-03-17) x86_64
GNU/Linux
```
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/3370
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/issues/3370(a)github.com>