#### Pre-Submission Checklist
- [ ] Commit message has the format required by CONTRIBUTING guide
- [ ] Commits are split per component (core, individual modules, libs, utils, ...)
- [ ] Each component has a single commit (if not, squash them into one commit)
- [ ] No commits to README files for modules (changes must be done to docbook files
in `doc/` subfolder, the README file is autogenerated)
#### Type Of Change
- [x] Small bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds new functionality)
- [ ] Breaking change (fix or feature that would change existing functionality)
#### Checklist:
- [ ] PR should be backported to stable branches
- [ ] Tested changes locally
- [ ] Related to issue #XXXX (replace XXXX with an open issue number)
#### Description
Hide error messages from "which" to avoid polluting kamctl's output (e.g. when a JSON document is returned) in environments where STDOUT and STDERR may be combined into a single stream (e.g. containers).
Without this change, kamctl's output may end up containing error messages, resulting in invalid/unparsable data. This is caused by the stdout & stderr streams being combined into a single stream in some environments, which may be outside the end-user's control (e.g. external cloud provider).
You can view, comment on, or merge this pull request online at:
https://github.com/kamailio/kamailio/pull/3377
-- Commit Summary --
* kamctl: hide errors raised by "which"
-- File Changes --
M utils/kamctl/kamctl.base (2)
-- Patch Links --
https://github.com/kamailio/kamailio/pull/3377.patchhttps://github.com/kamailio/kamailio/pull/3377.diff
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/pull/3377
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/pull/3377(a)github.com>
Module: kamailio
Branch: 5.5
Commit: a9deaa503ff20a3384fe74e37093464c21885e44
URL: https://github.com/kamailio/kamailio/commit/a9deaa503ff20a3384fe74e37093464…
Author: Richard Fuchs <rfuchs(a)sipwise.com>
Committer: Henning Westerholt <hw(a)gilawa.com>
Date: 2023-02-16T11:10:10Z
rtpengine: avoid repeated queries to same node
If we know which node handles a particular call from the hash table,
only query that node once and then return error if it's dead, instead of
going into a pointless loop looking for other nodes to query which don't
exist.
closes #3370
(cherry picked from commit bd0ce126109f40f5217053f20e687b93b2268164)
(cherry picked from commit ea782be2cc29589f4dd55f0d91c11049b5280fe0)
---
Modified: src/modules/rtpengine/rtpengine.c
---
Diff: https://github.com/kamailio/kamailio/commit/a9deaa503ff20a3384fe74e37093464…
Patch: https://github.com/kamailio/kamailio/commit/a9deaa503ff20a3384fe74e37093464…
---
diff --git a/src/modules/rtpengine/rtpengine.c b/src/modules/rtpengine/rtpengine.c
index 62665205787..9cd3bd4061c 100644
--- a/src/modules/rtpengine/rtpengine.c
+++ b/src/modules/rtpengine/rtpengine.c
@@ -3320,6 +3320,12 @@ select_rtpp_node(str callid, str viabranch, int do_test, struct rtpp_node **quer
// lookup node
node = select_rtpp_node_old(callid, viabranch, do_test, op);
+ if (node && is_queried_node(node, queried_nodes_ptr, queried_nodes)) {
+ LM_ERR("rtpengine node for callid=%.*s is known (%.*s) but it has already been queried, therefore not returning it\n",
+ callid.len, callid.s, node->rn_url.len, node->rn_url.s);
+ return NULL;
+ }
+
// check node
if (!node || (node_in_set(node, active_rtpp_set) == 0)) {
// run the selection algorithm
Module: kamailio
Branch: 5.6
Commit: ea782be2cc29589f4dd55f0d91c11049b5280fe0
URL: https://github.com/kamailio/kamailio/commit/ea782be2cc29589f4dd55f0d91c1104…
Author: Richard Fuchs <rfuchs(a)sipwise.com>
Committer: Henning Westerholt <hw(a)gilawa.com>
Date: 2023-02-16T11:09:42Z
rtpengine: avoid repeated queries to same node
If we know which node handles a particular call from the hash table,
only query that node once and then return error if it's dead, instead of
going into a pointless loop looking for other nodes to query which don't
exist.
closes #3370
(cherry picked from commit bd0ce126109f40f5217053f20e687b93b2268164)
---
Modified: src/modules/rtpengine/rtpengine.c
---
Diff: https://github.com/kamailio/kamailio/commit/ea782be2cc29589f4dd55f0d91c1104…
Patch: https://github.com/kamailio/kamailio/commit/ea782be2cc29589f4dd55f0d91c1104…
---
diff --git a/src/modules/rtpengine/rtpengine.c b/src/modules/rtpengine/rtpengine.c
index ab6920c01d6..1b1f8b0ca92 100644
--- a/src/modules/rtpengine/rtpengine.c
+++ b/src/modules/rtpengine/rtpengine.c
@@ -3349,6 +3349,12 @@ select_rtpp_node(str callid, str viabranch, int do_test, struct rtpp_node **quer
// lookup node
node = select_rtpp_node_old(callid, viabranch, do_test, op);
+ if (node && is_queried_node(node, queried_nodes_ptr, queried_nodes)) {
+ LM_ERR("rtpengine node for callid=%.*s is known (%.*s) but it has already been queried, therefore not returning it\n",
+ callid.len, callid.s, node->rn_url.len, node->rn_url.s);
+ return NULL;
+ }
+
// check node
if (!node || (node_in_set(node, active_rtpp_set) == 0)) {
// run the selection algorithm
The file [th-test.c.txt](https://github.com/kamailio/kamailio/files/10608862/th-test.… contains the exact copy of the th_mask_encode() and th_mask_decode() functions from src/modules/topoh/th_mask.c . (except LM_ERR is removed and pkg_malloc/pkg_free are replaced with malloc/free).
When I run Kamilio under topoh and the websocket client calls, at some moment `sip:test-gnome-calls@bapha.be;gr=urn:uuid:f2f5a3cf-a0fb-0047-be50-740fb9bdc562;alias=87.118.146.153~60472~2>;+sip.instance="<urn:uuid:f2f5a3cf-a0fb-0047-be50-740fb9bdc562>";+org.linphone.specs="conference/1.0,ephemeral/1.1,groupchat/1.1,groupchat/1.2,lime"` is passed to th_mask_encode(). The output it `sip:127.3.4.84;line=sr-if7s1mg7i36PNf0AbdwPpfzlbqWEpLzsSGItpLwyN39ZMY4t1mCTSd6DNo4LWdIOpfpPp5FLpUQscX63Kd47W5EPWO6sNL90pLgoW5p-1fzlSdzO25n3KoIk1vHkWXptc5wOeopsWXi-eo9*`. This is put in a Contact: header. When that last string is passed to th_mask_decode(), e.g. because BYE or ACK have it as R-URI, the output is `sip:test-gnome-calls@bapha.be;gr=urn:uuid:f2f5a3cf-a0fb-0047-be50-740fb9bdc562;alias=87.118.1`. From the initial input the final output misses at least `46.153`.
The attached file contains under `b2` another input, which is passed ot th_mask_decode. This is the actual input Kamailio gets at runtime and passes to th_mask_decode. The result is not a string — callu.b.p...b.;.r...n:s..d..2lQ — where the dots represent unprintable bytes.
The error report is that th_mask_encode() produces output, which th_mask_decode() does not convert to its origin.
N.B.: The above problem happens, when a websocket client calls TCP-client with topoh-module. When a TLS-client calls TCP-client (no websockets involved), topoh does not make problems — BYE is delivered properly.
See also:
* https://lists.kamailio.org/mailman3/hyperkitty/list/sr-users@lists.kamailio…
* https://lists.kamailio.org/mailman3/hyperkitty/list/sr-users@lists.kamailio…
This might also be related:
* https://lists.kamailio.org/mailman3/hyperkitty/list/sr-users@lists.kamailio…
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/3358
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/issues/3358(a)github.com>
## Description
When one rtpengine instance becomes unavailable on a Kamailio instance with moderate call traffic, even where there are multiple rtpengine instances configured for high availability, all processes can become blocked and Kamailio stops responding to all requests.
We produced this in a real world test, with two Kamailio servers running (20 child processes) and two rtpengine servers running. We loaded the Kamailio servers with calls, some answered, some ringing/busy/etc, and then made one rtpengine server drop all packets. Both Kamailio servers were able to briefly detect that the rtpengine instance was unavailable when it attempted to send an offer to it, but then as existing calls began to complete, both Kamailio servers became congested attempting to send "delete" commands to rtpengine. Both Kamailio servers then became unavailable and stopped responding to all messages for 10 minutes, until we restored rtpengine.
As such, an rtpengine server that crashes can cause a complete Kamailio outage.
## Troubleshooting
By using the reproduction steps below, we were easily able to make Kamailio stop responding to all incoming SIP messages when taking down an rtpengine instance.
The logs below show the sequence of events when this is reproduced for a single call.
Configuration:
- `rtpengine_tout_ms` was not configured and had the default value of 1000ms.
- `rtpengine_retr` was not configured and had the default value of 5.
- `queried_nodes_limit` was not configured and had the default value (of 30?).
- TM's `exec_time_check` was not configured and had the default value of 5000ms.
**A call is forwarded using t_relay and rtpengine_manage. rtpengine is killed mid-call. The call ends with 486.**
When the t_on_failure route is called, the config makes a singular call to rtpengine_manage, however what follows is 30 errors, all mentioning the same rtpengine instance:
```
ERROR: rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
```
Followed by the below:
```
ERROR: rtpengine [rtpengine.c:2729]: rtpp_function_call(): queried nodes limit reached
WARNING: tm [t_reply.c:1094]: run_failure_handlers(): failure route execution took too long: 150153261 us
```
Each timeout is 5 seconds (5x retries of 1000ms), which corresponds to about 150 seconds over 30 attempts. This matches the warning generated by tm.
In this case, there was about 5 minutes of time where a process was blocked. 150 seconds for the child leg's failure route, and 150 seconds for the parent leg's failure route. When this happens for many calls at once, congestion is guaranteed.
The 30 attempts to delete the call from rtpengine comes from the default value of `queried_nodes_limit`. This behavior doesn't make sense to me -- it's a retry loop that runs a retry loop.
## Reproduction
_tldr: kill rtpengine via iptables mid-call with the default rtpengine settings and you'll see lots of errors._
You can reproduce this with one or multiple rtpengine instances configured in Kamailio. It's easier with just one.
Standard setup, but importantly:
- forwarding a call using t_relay
- t_on_branch is configured which calls rtpengine_manage
- t_on_failure is configured which calls rtpengine_manage
To reproduce easily, we configured the destination server to send a 100, a 180, wait 20 seconds, then send a 486.
Place a number of calls to Kamailio (we used sipp) -- importantly, exceeding the number of child processes configured. You can reproduce the issue with just one call, but this wont make Kamailio unresponsive.
Once the calls are placed, and still in progress (ie before the 486 is returned), use iptables on the rtpengine server to drop all traffic on its control port.
When the call(s) start receiving the 486, you will notice a flood of error messages `timeout waiting reply for command "delete" from RTPEngine` for approximately 5 minutes per call. This blocks the process.
If it's now attempting to send the delete command to rtpengine for a number of calls greater than your child process count: attempt to send any other SIP messages to Kamailio and you will not receive a response.
## Log Messages
This is a log for a single call, forwarded using t_relay. All calls to rtpengine_manage are immediately surrounded by info log messages "calling rtpengine_manage" and "completed calling rtpengine_manage".
```
Feb 12 23:40:54 debian-kamailio kamailio[27715]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1} <script>: new branch [0] to sip:0421XXXXXX@192.168.X.X:9060
Feb 12 23:40:54 debian-kamailio kamailio[27715]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1} <script>: branch_route: calling rtpengine_manage
Feb 12 23:40:54 debian-kamailio kamailio[27715]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1} <script>: branch_route: completed calling rtpengine_manage
Feb 12 23:40:54 debian-kamailio kamailio[27732]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1} <script>: onreply_route: calling rtpengine_manage
Feb 12 23:40:54 debian-kamailio kamailio[27732]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1} <script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27730]: INFO: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: new branch [0] to sip:61421764018@192.168.X.X:9092;transport=udp
Feb 12 23:40:55 debian-kamailio kamailio[27730]: INFO: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: branch_route: calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27730]: INFO: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: branch_route: completed calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27717]: INFO: {2 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: onreply_route: calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27717]: INFO: {2 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27714]: INFO: {2 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: onreply_route: calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27714]: INFO: {2 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27735]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1} <script>: onreply_route: calling rtpengine_manage
Feb 12 23:40:55 debian-kamailio kamailio[27735]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1} <script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:41:15 debian-kamailio kamailio[27713]: INFO: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: failure_route: calling rtpengine_manage
Feb 12 23:41:20 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:41:25 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:41:30 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:41:35 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:41:40 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:41:45 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:41:50 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:41:55 debian-kamailio kamailio[27740]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1} <script>: onreply_route: calling rtpengine_manage
Feb 12 23:41:55 debian-kamailio kamailio[27740]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1} <script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:41:55 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:00 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:05 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:10 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:16 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:21 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:26 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:31 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:36 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:41 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:46 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:51 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:42:55 debian-kamailio kamailio[27734]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1} <script>: onreply_route: calling rtpengine_manage
Feb 12 23:42:55 debian-kamailio kamailio[27734]: INFO: {2 1 INVITE 1-27864(a)127.0.1.1} <script>: onreply_route: completed calling rtpengine_manage
Feb 12 23:42:56 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:01 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:06 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:11 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:16 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:21 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:26 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:31 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:36 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:41 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:46 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:46 debian-kamailio kamailio[27713]: ERROR: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} rtpengine [rtpengine.c:2729]: rtpp_function_call(): queried nodes limit reached
Feb 12 23:43:46 debian-kamailio kamailio[27713]: INFO: {1 63577835 INVITE 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: failure_route: completed calling rtpengine_manage
Feb 12 23:43:46 debian-kamailio kamailio[27713]: WARNING: tm [t_reply.c:1094]: run_failure_handlers(): failure route execution took too long: 150153207 us
Feb 12 23:43:46 debian-kamailio kamailio[27736]: INFO: {1 63577835 ACK 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: ACK: calling rtpengine_manage
Feb 12 23:43:46 debian-kamailio kamailio[27736]: INFO: {1 63577835 ACK 720bf6d9-25fb-123c-cb91-80fa5d718101} <script>: ACK: completed calling rtpengine_manage
Feb 12 23:43:46 debian-kamailio kamailio[27738]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1} <script>: failure_route: calling rtpengine_manage
Feb 12 23:43:51 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:43:56 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:01 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:06 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:11 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:16 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:21 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:26 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:31 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:36 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:41 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:46 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:51 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:44:56 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:01 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:06 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:11 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:16 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:21 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:26 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:31 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:36 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:41 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:46 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:51 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:45:56 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:01 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:06 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:11 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:16 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:3129]: send_rtpp_command(): timeout waiting reply for command "delete" from RTPEngine <udp:192.168.X.X:2223>
Feb 12 23:46:16 debian-kamailio kamailio[27738]: ERROR: {1 1 INVITE 1-27864(a)127.0.1.1} rtpengine [rtpengine.c:2729]: rtpp_function_call(): queried nodes limit reached
Feb 12 23:46:16 debian-kamailio kamailio[27738]: INFO: {1 1 INVITE 1-27864(a)127.0.1.1} <script>: failure_route: completed calling rtpengine_manage
Feb 12 23:46:16 debian-kamailio kamailio[27738]: WARNING: tm [t_reply.c:1094]: run_failure_handlers(): failure route execution took too long: 150153261 us
```
### Possible Solutions
To minimize the issue, `rtpengine_tout_ms` and `rtpengine_retr` can be reduced. This doesn't exactly the fix the problem, as you need reasonable values here.
Additionally, `queried_nodes_limit` can be reduced to limit the number of unnecessary retries, at the expense of the functionality it gives you to initially select candidate nodes.
**Actual solution:** I think that, when a call has already selected an rtpengine instance successfully, and that's in the hash table, then `queried_nodes_limit` should not apply. It should only attempt to send commands with a total number of attempts = `rtpengine_retr`, not `rtpengine_retr * queried_nodes_limit`.
By changing the behavior as above, tweaking `rtpengine_tout_ms` and `rtpengine_retr`, and increasing the number of child workers, an rtpengine becoming unavailable should cause much less impactful blocking.
## Additional Information
**Kamailio version:** (attempted on 5.4.1 and latest master)
```
version: kamailio 5.4.1 (x86_64/linux)
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled on 23:22:15 Aug 15 2022 with gcc 8.3.0
```
```
version: kamailio 5.7.0-dev3 (x86_64/linux) 2384c3
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 2384c3
compiled on 23:22:22 Feb 12 2023 with gcc 8.3.0
```
**Operating System**:
<!--
Details about the operating system, the type: Linux (e.g.,: Debian 8.4, Ubuntu 16.04, CentOS 7.1, ...), MacOS, xBSD, Solaris, ...;
Kernel details (output of `lsb_release -a` and `uname -a`)
-->
```
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster
Linux debian-kamailio 4.19.0-20-amd64 #1 SMP Debian 4.19.235-1 (2022-03-17) x86_64 GNU/Linux
```
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/3370
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/issues/3370(a)github.com>
Module: kamailio
Branch: master
Commit: bd0ce126109f40f5217053f20e687b93b2268164
URL: https://github.com/kamailio/kamailio/commit/bd0ce126109f40f5217053f20e687b9…
Author: Richard Fuchs <rfuchs(a)sipwise.com>
Committer: Richard Fuchs <rfuchs(a)sipwise.com>
Date: 2023-02-15T08:49:01-05:00
rtpengine: avoid repeated queries to same node
If we know which node handles a particular call from the hash table,
only query that node once and then return error if it's dead, instead of
going into a pointless loop looking for other nodes to query which don't
exist.
closes #3370
---
Modified: src/modules/rtpengine/rtpengine.c
---
Diff: https://github.com/kamailio/kamailio/commit/bd0ce126109f40f5217053f20e687b9…
Patch: https://github.com/kamailio/kamailio/commit/bd0ce126109f40f5217053f20e687b9…
---
diff --git a/src/modules/rtpengine/rtpengine.c b/src/modules/rtpengine/rtpengine.c
index 5444b0d461..46640a5d3d 100644
--- a/src/modules/rtpengine/rtpengine.c
+++ b/src/modules/rtpengine/rtpengine.c
@@ -3378,6 +3378,12 @@ select_rtpp_node(str callid, str viabranch, int do_test, struct rtpp_node **quer
// lookup node
node = select_rtpp_node_old(callid, viabranch, do_test, op);
+ if (node && is_queried_node(node, queried_nodes_ptr, queried_nodes)) {
+ LM_ERR("rtpengine node for callid=%.*s is known (%.*s) but it has already been queried, therefore not returning it\n",
+ callid.len, callid.s, node->rn_url.len, node->rn_url.s);
+ return NULL;
+ }
+
// check node
if (!node || (node_in_set(node, active_rtpp_set) == 0)) {
// run the selection algorithm