When running a load test, Kamailio eventually becomes unresponsive and stops processing calls.
Kamailio is configured to use DMQ replication for dialog and usrloc. The dialog keepalive is also enabled.
From investigation, the problem happens faster and more easily when there is some network degradation causing packet loss and/or retransmissions, but even without any noticeable network issue the freeze eventually happens.
Run a simple load test that places calls at a rate of ~5 cps and keeps ~2000 calls connected at all times. A higher cps seems to make the problem easier to reproduce.
Adding network degradation to the environment makes the problem happen, but when running a tool such as SIPp for the load test, retransmissions can be forced by simply killing the SIPp instance that receives the calls, which then forces Kamailio to retransmit.
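For anyone sizing a similar test: the call rate and the number of connected calls together determine how long each call has to stay up. A rough sketch in Python, assuming steady state and Little's law (connected calls = call rate x hold time). The function name is made up for illustration and is not part of SIPp or Kamailio:

```python
def required_hold_time_s(concurrent_calls: float, calls_per_second: float) -> float:
    """Hold time per call (seconds) needed to sustain a target number of
    connected calls at a given call rate, assuming steady state.

    Little's law: concurrent_calls = calls_per_second * hold_time
    """
    if calls_per_second <= 0:
        raise ValueError("calls_per_second must be positive")
    return concurrent_calls / calls_per_second


if __name__ == "__main__":
    # Figures from the reproduction steps: ~5 cps, ~2000 connected calls.
    print(required_hold_time_s(2000, 5))
```

With the figures above this comes to roughly 400 seconds per call, so raising the cps while keeping ~2000 connected calls means each call is shorter.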
Output of kamctl trap:
gdb_kamailio_20201028_213030.txt
Locally generated requests show up in the log, but are not sent on the network.
Not found so far
kamailio -v
# kamailio -v
version: kamailio 5.4.2 (x86_64/linux) c3b91f
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: c3b91f
compiled on 13:50:37 Oct 27 2020 with gcc 4.8.5
# cat /etc/centos-release
CentOS Linux release 7.8.2003 (Core)
# uname -a
Linux hostname 3.10.0-1127.19.1.el7.x86_64 #1 SMP Tue Aug 25 17:23:54 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux