Description

We've random crashes (approx one occurrence per day) on tls_h_tcpconn_clean_f().

Troubleshooting

Reproduction

The crash appears sometimes on closing TLS session (it doesn't have to be gracefully) initiated by remote edge.

Debugging Data

(lldb) target create "/usr/local/sbin/kamailio" --core "/var/coredump/986/kamailio.51059.core"
Core file '/var/coredump/986/kamailio.51059.core' (x86_64) was loaded.
(lldb) bt
* thread #1, name = 'kamailio', stop reason = signal SIGSEGV
  * frame #0: 0x0000000800a13cd4 libc.so.7`__je_large_dalloc(tsdn=0x0000000800d1a090, extent=0x0000000000000000) at jemalloc_large.c:346
    frame #1: 0x0000000800a3acb7 libc.so.7`__free [inlined] rtree_leaf_elm_lookup(tsdn=0x0000000800d1a090, rtree=<unavailable>, rtree_ctx=<unavailable>, key=<unavailable>, dependent=true, init_missing=false) at rtree.h:337:6
    frame #2: 0x0000000800a3acb2 libc.so.7`__free [inlined] rtree_read(tsdn=0x0000000800d1a090, rtree=<unavailable>, rtree_ctx=<unavailable>, key=<unavailable>, dependent=true) at rtree.h:406
    frame #3: 0x0000000800a3acb2 libc.so.7`__free [inlined] rtree_extent_read(tsdn=0x0000000800d1a090, rtree=<unavailable>, rtree_ctx=<unavailable>, key=<unavailable>, dependent=true) at rtree.h:418
    frame #4: 0x0000000800a3acb2 libc.so.7`__free [inlined] iealloc(tsdn=0x0000000800d1a090, ptr=0x00000008235782f0) at jemalloc_internal_inlines_b.h:82
    frame #5: 0x0000000800a3acb2 libc.so.7`__free [inlined] arena_dalloc(tsdn=0x0000000800d1a090, ptr=0x00000008235782f0, tcache=0x0000000800cf4c80, alloc_ctx=<unavailable>, slow_path=false) at arena_inlines_b.h:236
    frame #6: 0x0000000800a3ac91 libc.so.7`__free [inlined] idalloctm(tsdn=0x0000000800d1a090, ptr=0x00000008235782f0, tcache=0x0000000800cf4c80, alloc_ctx=<unavailable>, is_internal=false, slow_path=false) at jemalloc_internal_inlines_c.h:118
    frame #7: 0x0000000800a3ac91 libc.so.7`__free [inlined] ifree(tsd=<unavailable>, ptr=0x00000008235782f0, tcache=0x0000000800cf4c80, slow_path=false) at jemalloc_jemalloc.c:2226
    frame #8: 0x0000000800a3ac91 libc.so.7`__free(ptr=<unavailable>) at jemalloc_jemalloc.c:2382
    frame #9: 0x0000000800d9ed0b libthr.so.3`_thr_rwlock_destroy(rwlock=<unavailable>) at thr_rwlock.c:136:3
    frame #10: 0x00000008027feba3 libcrypto.so.111`CRYPTO_THREAD_lock_free(lock=0x0000000802e87e20) at threads_pthread.c:107:5
    frame #11: 0x00000008027ec676 libcrypto.so.111`BIO_free(a=0x0000000802e88920) at bio_lib.c:136:5
    frame #12: 0x00000008027ed572 libcrypto.so.111`BIO_free_all(bio=0x0000000000000000) at bio_lib.c:691:9
    frame #13: 0x0000000802540d0b libssl.so.111`SSL_free(s=0x0000000802e799a8) at ssl_lib.c:1160:5
    frame #14: 0x0000000802482fdd tls.so`tls_h_tcpconn_clean_f(c=0x0000000802e93270) at tls_server.c:655:3
    frame #15: 0x0000000000618bd8 kamailio`_tcpconn_free(c=0x0000000802e93270) at tcp_main.c:1528:58
    frame #16: 0x0000000000659534 kamailio`tcpconn_put_destroy(tcpconn=0x0000000802e93270) at tcp_main.c:3269:3
    frame #17: 0x0000000000668966 kamailio`handle_tcp_child(tcp_c=0x00000008014dcf68, fd_i=-1) at tcp_main.c:3706:5
    frame #18: 0x000000000065b7e9 kamailio`handle_io(fm=0x00000008014e6110, ev=1, idx=-1) at tcp_main.c:4523:8
    frame #19: 0x0000000000644502 kamailio`io_wait_loop_kqueue(h=0x00000000008dbf50, t=5, repeat=0) at io_wait.h:1187:9
    frame #20: 0x000000000063cb62 kamailio`tcp_main_loop at tcp_main.c:4824:5
    frame #21: 0x00000000002e973a kamailio`main_loop at main.c:1779:5
    frame #22: 0x00000000002f7650 kamailio`main(argc=7, argv=0x00007fffffffec68) at main.c:2856:6
    frame #23: 0x00000000002d1af0 kamailio`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1.c:76:7

All backtraces are differ from other on frames downed SSL_free() (frame #13 on the backtrace).

Log Messages

17:40:34.305319 51059 - - DEBUG: <core> [core/io_wait.h:600]: io_watch_del(): DBG: io_watch_del (0x8dbf50, 55, -1, 0x0) fd_no=46 called
17:40:34.30536351059 - - DEBUG: <core> [core/tcp_main.c:4457]: handle_tcpconn_ev(): sending to child, events 11
17:40:34.305382 51059 - - DEBUG: <core> [core/tcp_main.c:4130]: send2child(): selected tcp worker idx:5 proc:30 pid:51056 for activity on [tls:91.217.xx.yy:5070], 0x802e93270
17:40:34.305430 51059 - - DEBUG: <core> [core/io_wait.h:600]: io_watch_del(): DBG: io_watch_del (0x8dbf50, 56, -1, 0x0) fd_no=45 called
17:40:34.305446 51059 - - DEBUG: <core> [core/tcp_main.c:4457]: handle_tcpconn_ev(): sending to child, events 11
17:40:34.305461 51059 - - DEBUG: <core> [core/tcp_main.c:4130]: send2child(): selected tcp worker idx:6 proc:31 pid:51057 for activity on [tls:91.217.xx.yy:5070], 0x802ea7430
17:40:34.305678 51059 - - DEBUG: <core> [core/tcp_main.c:3560]: handle_tcp_child(): reader response= 802e93270, -1 from 5 
17:40:34.305700 51059 - - DEBUG: tls [tls_server.c:683]: tls_h_tcpconn_close_f(): Closing SSL connection 0x802e9b3a8
17:40:35.024013 51026 - - ALERT: <core> [main.c:777]: handle_sigs(): child process 51059 exited by a signal 11

Additional Information

kamailio 5.4.3 (x86_64/freebsd) e19ae3
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, select, kqueue.
id: e19ae3 
compiled on 14:45:07 Feb 11 2021 with cc FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2)
FreeBSD 12.2-STABLE r369029
OpenSSL 1.1.1i-freebsd  8 Dec 2020


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.