Just to add some info
netstat -nlp
Active Internet connections (only servers)
Proto  Recv-Q    Send-Q  Local Address            Foreign Address  State  PID/Program name
...
udp    25167616  0       <local_interface>:5060   0.0.0.0:*               211759/kamailio
...
So I see a huge receive queue (Recv-Q) on the Kamailio UDP socket, and it is
not clearing.
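A minimal way to keep an eye on that queue (assuming the same UDP listener on
port 5060 as in the netstat output above):

# Print the UDP listeners with their Recv-Q every second;
# if the value never drops, the socket is not being read at all
watch -n 1 'ss -lunp | grep ":5060"'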
On Tue, Aug 29, 2023 at 14:29, Ihor Olkhovskyi <igorolhovskiy(a)gmail.com>
wrote:
Hello,
I've faced a somewhat strange issue, but first a bit of preface. I have
Kamailio as a proxy (TLS/WS <-> UDP) and a second Kamailio as a presence
server. At some point the presence server accepts around 5K PUBLISH requests
within 1 minute and sends around the same amount of NOTIFYs to the proxy
Kamailio.
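For reference, a rough way to confirm those rates on the proxy side is to
snapshot the core statistics (a sketch, assuming the kex statistics RPC is
loaded):

# Snapshot all counters (rcv_requests, fwd_requests, ...); take two
# snapshots a minute apart and diff them to get per-minute rates
kamcmd stats.get_statistics all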
Proxy is "transforming" protocol to TLS, but at sime point I'm starting to
get these type of errors
tm [../../core/forward.h:292]: msg_send_buffer(): tcp_send failed
tm [t_fwd.c:1588]: t_send_branch(): sending request on branch 0 failed
<script>: [RELAY] Relay to <sip:X.X.X.X:51571;transport=tls> failed!
tm [../../core/forward.h:292]: msg_send_buffer(): tcp_send failed
tm [t_fwd.c:1588]: t_send_branch(): sending request on branch 0 failed
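When these errors show up, the state of the TCP/TLS side can be checked with
the core RPCs (a sketch; the counters are worth comparing against
tcp_max_connections and tcp_wq_max below):

# Global TCP/TLS counters: opened connections vs. the configured maximums,
# plus (on recent versions) the bytes queued for async writes
kamcmd core.tcp_info
# Per-connection listing (state, source/destination, lifetime)
kamcmd core.tcp_list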
Some of those messages are 100% valid, as a client can simply go away. Some
are not, because I'm sure the client is alive and connected.
But the problem comes later. At some moment the proxy Kamailio just stops
accepting UDP traffic on this interface (where it also accepts all the
NOTIFYs); at the start of this "stopped accepting" state, Kamailio still
sends OPTIONS via DISPATCHER but is not able to receive the 200 OK.
Over TLS on the same interface all is OK. On the other (loopback) interface
UDP is being processed fine, so I don't suspect a limit on open files here.
Only a restart of the Kamailio proxy process helps in this case.
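A rough way to check whether the UDP receiver processes are simply stuck
before restarting (a sketch; the PID placeholder is one of the "udp receiver"
processes reported by the first command):

# List Kamailio processes and their roles (udp receiver, tcp receiver, ...)
kamcmd core.psx
# Attach to one UDP receiver and see whether it is blocked in a syscall
strace -p <udp_receiver_pid> -f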
I've tuned net.core.rmem_max and net.core.rmem_default to 25 MB, so in
theory the buffer should not be the issue.
Is there some internal "interface buffer" in Kamailio that is not freed upon
a failed send, or maybe I've missed something?
Kamailio 5.6.4
fork=yes
children=12
tcp_children=12
enable_tls=yes
tcp_accept_no_cl=yes
tcp_max_connections=63536
tls_max_connections=63536
tcp_accept_aliases=no
tcp_async=yes
tcp_connect_timeout=10
tcp_conn_wq_max=63536
tcp_crlf_ping=yes
tcp_delayed_ack=yes
tcp_fd_cache=yes
tcp_keepalive=yes
tcp_keepcnt=3
tcp_keepidle=30
tcp_keepintvl=10
tcp_linger2=30
tcp_rd_buf_size=80000
tcp_send_timeout=10
tcp_wq_blk_size=2100
tcp_wq_max=10485760
open_files_limit=63536
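To rule out the descriptor limit corresponding to open_files_limit above, the
effective limit and the current fd count of the running process can be
checked like this (a sketch; <kamailio_pid> is the main process PID):

# Effective open-files limit applied to the running process
grep "open files" /proc/<kamailio_pid>/limits
# Number of descriptors currently open
ls /proc/<kamailio_pid>/fd | wc -l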
Sysctl
# To increase the amount of memory available for socket input/output queues
net.ipv4.tcp_rmem = 4096 25165824 25165824
net.core.rmem_max = 25165824
net.core.rmem_default = 25165824
net.ipv4.tcp_wmem = 4096 65536 25165824
net.core.wmem_max = 25165824
net.core.wmem_default = 65536
net.core.optmem_max = 25165824
# To limit the maximum number of requests queued to a listen socket
net.core.somaxconn = 128
# Tells TCP to instead make decisions that would prefer lower latency.
net.ipv4.tcp_low_latency=1
# Optional (it will increase performance)
net.core.netdev_max_backlog = 1000
net.ipv4.tcp_max_syn_backlog = 128
# Flush the routing table to make changes happen instantly.
net.ipv4.route.flush=1
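To double-check that these values are actually applied to the Kamailio UDP
socket and not only set globally (a sketch):

# Kernel-wide defaults actually in effect
sysctl net.core.rmem_max net.core.rmem_default
# Per-socket memory details; in the skmem:(...) field, rb is the
# receive buffer size allocated for that socket
ss -lunpm | grep -A1 ":5060"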
--
Best regards,
Ihor (Igor)