After upgrading from Kamailio 5.2.x, a high-volume Kamailio 5.4.4 instance randomly crashes with either a general protection or segfault error in siptrace.so during use of the sip_trace function in one of its child processes (which cascades to the parent crashing). This occurs about once every 36 hours on average, but we have not yet been able to correlate it with any particular event.
We are continuing to collect debug information and will update this ticket as more becomes available; for now, we are reporting the issue as observed.
The sip_trace function is applied as in this example snippet:
# ------- siptrace --------
modparam("siptrace", "hep_mode_on", 1)
modparam("siptrace", "hep_version", 3)
modparam("siptrace", "trace_to_database", 0)
modparam("siptrace", "trace_flag", 22)
modparam("siptrace", "trace_on", 1)
request_route {
    #....
    if ( is_method("INVITE") && !has_totag() ) {
        # Only start sip_trace on initial INVITE
        sip_trace("HEP_URL","$ci-MY_IP","d");
    }
    setflag(22);
    #...
}
We attempted packet collection with Homer v5 and Homer v7 and changed between HEP protocol v2 and v3.
We have not determined a means of reproducing this issue without simply letting the server run until a crash occurs. There are four almost identical servers all experiencing the same random crashing but not at the same time.
Our next troubleshooting step will be to simply comment out the sip_trace function call, but this effectively disables the siptrace module completely rather than addressing the underlying problem.
Retrieval of core dumps is still in progress. Debug logs should also be more readily available soon. There will be delays since these are high-volume production servers.
All of the servers have randomly crashed with log entries like the following example, regardless of the troubleshooting tactics tried to date:
kernel: traps: kamailio[7579] general protection ip:7fb1a64e2dbf sp:7ffc60f04180 error:0 in siptrace.so[7fb1a64b8000+4e000]
systemd: kamailio.service: main process exited, code=exited, status=1/FAILURE
systemd: Unit kamailio.service entered failed state.
systemd: kamailio.service failed.
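Until a core dump is available, the kernel trap line itself gives a starting point: subtracting the mapping base (the first number in the square brackets) from the faulting instruction pointer yields an offset inside siptrace.so. The following is a minimal Python sketch of that arithmetic, not part of Kamailio; the function name fault_offset is hypothetical, and it assumes the bracketed base is where the library's code is mapped from file offset 0.

```python
import re

# Pattern for the kernel "traps:" line format shown in the log entry above.
TRAP_RE = re.compile(
    r"ip:(?P<ip>[0-9a-f]+)\s+sp:(?P<sp>[0-9a-f]+)\s+error:(?P<err>[0-9a-f]+)"
    r"\s+in\s+(?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (object name, offset of the faulting ip within its mapping)."""
    m = TRAP_RE.search(line)
    if m is None:
        raise ValueError("not a kernel trap line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # Sanity check: the ip must fall inside the reported mapping.
    if not 0 <= offset < size:
        raise ValueError("ip lies outside the reported mapping")
    return m.group("obj"), offset


line = (
    "kernel: traps: kamailio[7579] general protection "
    "ip:7fb1a64e2dbf sp:7ffc60f04180 error:0 "
    "in siptrace.so[7fb1a64b8000+4e000]"
)
obj, offset = fault_offset(line)
print(f"{obj}+{offset:#x}")  # prints "siptrace.so+0x2adbf"
```

If the installed siptrace.so has matching debug symbols, an offset like this can be passed to a tool such as addr2line (with -e pointing at the module file) to get a first guess at the faulting source line. Whether it resolves usefully depends on the build, so it complements the core dump rather than replacing it.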
To date, we have found no SIP traffic that corresponds with the crashes.
To date, disabling the siptrace module seems to be the only solution.
kamailio -v
version: kamailio 5.4.4 (x86_64/linux) e16352
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: e16352
compiled on 15:56:46 Feb 15 2021 with gcc 4.8.5
Linux <hostname> 3.10.0-957.27.2.el7.x86_64 #1 SMP Mon Jul 29 17:46:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux