### Description
We get core dumps when stopping the lb service if TLS is enabled.
### Troubleshooting
#### Debugging Data
```
Core was generated by `/usr/sbin/kamailio -f /etc/kamailio/lb/kamailio.cfg -P /var/run/kamailio/kamail'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ossl_init_thread_stop (locals=0x7faaf4442d58) at ../crypto/init.c:332
332     ../crypto/init.c: No such file or directory.
(gdb) bt full
#0  ossl_init_thread_stop (locals=0x7faaf4442d58) at ../crypto/init.c:332
No locals.
#1  0x00007faaf8d15234 in OPENSSL_cleanup () at ../crypto/init.c:400
        currhandler = <optimized out>
        lasthandler = <optimized out>
#2  0x00007fab014bb910 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#3  0x00007fab014bb96a in exit () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#4  0x0000556782094615 in handle_sigs () at main.c:698
        chld = <optimized out>
        chld_status = 0
        memlog = <optimized out>
        __func__ = "handle_sigs"
#5  0x000055678209aa05 in main_loop () at main.c:1747
```
#### Log Messages
```
lb[8618]: WARNING: tls [tls_init.c:704]: init_tls_h(): tls: openssl bug #1491 (crash/mem leaks on low memory) workaround enabled (on low memory tls operations will fail preemptively) with free memory thresholds 7340032 and 3670016 bytes
```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
This is NGCP version based on 4.4.6

kamailio -v
version: kamailio 4.4.6 (x86_64/linux) becbde
flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: becbde
compiled with gcc 6.3.0
```
* **Operating System**:
```
Debian stretch
Linux spce 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) x86_64 GNU/Linux
libssl1.1 version 1.1.0f-3
```
Backporting [0] and [1] didn't help.

[0] https://github.com/kamailio/kamailio/commit/e7c03ce6ce61119fbf5cb9f41b7abcd4...
[1] https://github.com/kamailio/kamailio/commit/76efc9b7a1489007f9ff431e730ce4e8...
FTR, https://rt.openssl.org/Ticket/Display.html?id=1491 has been closed with:

> This is reported against 0.9.8; please open a new ticket if still a problem with current releases.

Downloaded the openssl source:
```
#0  ossl_init_thread_stop (locals=0x7faaf4442d58) at ../crypto/init.c:332
332         if (locals->async) {
(gdb) p locals
$1 = (struct thread_local_inits_st *) 0x7faaf4442d58
(gdb) p *locals
Cannot access memory at address 0x7faaf4442d58
```
The link from previous comment goes to a page asking for authentication.
The crash happens inside exit() function, due to a callback registered by libssl.
Can you check whether the version of libssl you run has the following patch from about 1 month ago:
* https://github.com/openssl/openssl/commit/4b4bc00a00456e6d5cc8b2a26489ae905c...
Looks a bit related, and now the branch 1.1.0 on libssl has just a check for a null pointer:
* https://github.com/openssl/openssl/blob/OpenSSL_1_1_0-stable/crypto/init.c#L...
This is what I have:
```
static struct thread_local_inits_st *ossl_init_get_thread_local(int alloc)
{
    struct thread_local_inits_st *local =
        CRYPTO_THREAD_get_local(&threadstopkey);

    if (local == NULL && alloc) {
        local = OPENSSL_zalloc(sizeof *local);
        CRYPTO_THREAD_set_local(&threadstopkey, local);
    }
    if (!alloc) {
        CRYPTO_THREAD_set_local(&threadstopkey, NULL);
    }

    return local;
}
```
> The link from previous comment goes to a page asking for authentication.

guest:guest
https://rt.openssl.org/Ticket/Display.html?id=1491&user=guest&pass=g...
So there seem to be some fixes in branch 1.1.0 after the release of 1.1.0f; your code no longer matches their git branch.
Their new patch might be related, so maybe you can recompile the latest libssl from their 1.1.0 stable branch.
Otherwise, although I am not really familiar with libssl internals, it doesn't look related to the upper-layer application at all, only to some internal/local state that libssl tries to clean up on exit. It is not even in the cleanup of the tls context/connections: that callback seems to be executed because it is registered with `atexit()`.
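To illustrate the `atexit()` mechanism described here, below is a minimal sketch in plain C with no OpenSSL involved. The names `library_cleanup`, `application_cleanup` and `run_exit_handlers_demo` are hypothetical and exist only for this example. A child process registers two handlers and calls `exit()`; the handlers then run inside `exit()`, in reverse order of registration, which is why a crash in such a handler shows up beneath `exit()` in the backtrace.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Write end of the pipe the handlers report through (hypothetical helper). */
static int report_fd = -1;

/* Stands in for a cleanup hook that a library registers on its own. */
static void library_cleanup(void)
{
    if (write(report_fd, "L", 1) != 1) { /* nothing useful to do on failure */ }
}

/* Stands in for a cleanup hook registered by the application. */
static void application_cleanup(void)
{
    if (write(report_fd, "A", 1) != 1) { /* nothing useful to do on failure */ }
}

/*
 * Fork a child that registers both handlers and then calls exit().
 * The parent collects what the handlers wrote into `out` (NUL-terminated)
 * and returns the number of bytes, or -1 on error.
 */
static int run_exit_handlers_demo(char *out, size_t out_len)
{
    int fds[2];
    size_t used = 0;
    ssize_t n;
    pid_t pid;

    if (out_len == 0 || pipe(fds) != 0)
        return -1;

    fflush(NULL);                      /* avoid duplicating buffered stdio output */
    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        atexit(library_cleanup);       /* registered first, runs last */
        atexit(application_cleanup);   /* registered last, runs first */
        exit(0);                       /* both handlers run inside this call */
    }

    close(fds[1]);
    while (used + 1 < out_len &&
           (n = read(fds[0], out + used, out_len - 1 - used)) > 0)
        used += (size_t)n;
    out[used] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (int)used;
}
```

Calling `run_exit_handlers_demo(buf, sizeof buf)` with a small `char buf[8]` should leave `"AL"` in the buffer: the application handler runs first and the library handler last, both after the program has asked to exit.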
The issue on openssl rt seems to be from 2007:
```
Created: | Thu Feb 22 09:48:25 2007
```
Not sure why it was dealt with in 2016, but I think it is not relevant here at all.
FTR, tried with openssl/openssl@4b4bc00 applied: same core.
Can you give the new backtrace to match against the source code? Also, provide `info locals` and `list` from frame 0.
Btw, did you just use that patch or recompiled all from latest branch 1.1.0 for openssl?
just that patch
And the rest from my previous comment?
```
Core was generated by `/usr/sbin/kamailio -f /etc/kamailio/lb/kamailio.cfg -P /var/run/kamailio/kamail'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ossl_init_thread_stop (locals=0x7f8d615fcd58) at ../crypto/init.c:335

warning: Source file is more recent than executable.
335         if (locals->async) {
(gdb) info locals
No locals.
(gdb) l
330     {
331         /* Can't do much about this */
332         if (locals == NULL)
333             return;
334
335         if (locals->async) {
336     #ifdef OPENSSL_INIT_DEBUG
337             fprintf(stderr, "OPENSSL_INIT: ossl_init_thread_stop: "
338                             "ASYNC_cleanup_thread()\n");
339     #endif
(gdb) p locals
$1 = (struct thread_local_inits_st *) 0x7f8d615fcd58
(gdb) p *locals
Cannot access memory at address 0x7f8d615fcd58
(gdb)
```
I'm going to try with the latest version of branch 1.1.0.
Same result with latest openssl version of 1.1.0 stable
```
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/kamailio -f /etc/kamailio/lb/kamailio.cfg -P /var/run/kamailio/kamail'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ossl_init_thread_stop (locals=0x7fdcaca303f8) at ../crypto/init.c:335

warning: Source file is more recent than executable.
335         if (locals->async) {
(gdb) info locals
No locals.
(gdb) l
330     {
331         /* Can't do much about this */
332         if (locals == NULL)
333             return;
334
335         if (locals->async) {
336     #ifdef OPENSSL_INIT_DEBUG
337             fprintf(stderr, "OPENSSL_INIT: ossl_init_thread_stop: "
338                             "ASYNC_cleanup_thread()\n");
339     #endif
(gdb) p locals
$1 = (struct thread_local_inits_st *) 0x7fdcaca303f8
(gdb) p *locals
Cannot access memory at address 0x7fdcaca303f8
```
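The traces show a pointer that is non-NULL but whose target cannot be read, so the `locals == NULL` guard is passed and the next line faults. They do not show why the memory is gone. The following is only a sketch of that general failure mode, with hypothetical names (`demo_locals`, `demo_thread_stop`, `stale_pointer_demo`) and an `mmap`/`munmap` pair standing in for whatever invalidated the memory in the real process.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1              /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for per-thread state kept by a library. */
struct demo_locals {
    int async;
};

/* Same shape as the guarded code in the listing: NULL is handled, nothing else. */
static int demo_thread_stop(struct demo_locals *locals)
{
    if (locals == NULL)
        return 0;
    return locals->async;              /* faults if the memory is no longer mapped */
}

/*
 * Run demo_thread_stop() in a child process on a pointer whose backing
 * page has been unmapped. Returns the signal that killed the child,
 * 0 if it exited normally, or -1 on setup errors.
 */
static int stale_pointer_demo(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        long page = sysconf(_SC_PAGESIZE);
        void *mem;

        setrlimit(RLIMIT_CORE, &no_core);   /* keep the demo from leaving a core file */
        mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED)
            _exit(2);
        munmap(mem, (size_t)page);          /* pointer stays non-NULL, memory is gone */
        _exit(demo_thread_stop(mem) ? 1 : 0);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the child is expected to die with `SIGSEGV` even though the NULL check is in place, which is consistent with the added guard in `ossl_init_thread_stop` not changing the outcome here.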
I think I may be seeing the same thing on Fedora 26, openssl-1.1.0f-7.fc26.x86_64, and Kamailio 5.0 branch @ 2d1dc7c
```
Core was generated by `/usr/sbin/kamailio -m 256 -M 8 -P /run/kamailio/kamailio.pid'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  ossl_init_thread_stop (locals=0x7f22ca6c0418) at crypto/init.c:332
332         if (locals->async) {
(gdb) info locals
No locals.
(gdb) bt full
#0  ossl_init_thread_stop (locals=0x7f22ca6c0418) at crypto/init.c:332
No locals.
#1  0x00007f22db819f2d in OPENSSL_cleanup () at crypto/init.c:400
        currhandler = <optimized out>
        lasthandler = <optimized out>
#2  0x00007f22dc879c38 in __run_exit_handlers (status=0, listp=0x7f22dcc0a5b8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true, run_dtors=run_dtors@entry=true) at exit.c:83
        atfct = <optimized out>
        onfct = <optimized out>
        cxafct = <optimized out>
        f = <optimized out>
#3  0x00007f22dc879c8a in __GI_exit (status=<optimized out>) at exit.c:105
No locals.
#4  0x000000000041a2e1 in handle_sigs () at main.c:699
        chld = 4291806
        chld_status = 5359628
        any_chld_stopped = 0
        memlog = 6
        __func__ = "handle_sigs"
#5  0x0000000000423f42 in main_loop () at main.c:1758
        i = 2
        pid = 1671
        si = 0x0
        si_desc = "udp receiver child=1 sock=[2603:300A:134:50E0:0:0:0:3]:5060\000\000\000\000\000P\213ʈ\377\177\000\000\220?0\334"\177\000\000Ћʈ\377\177\000\000\321.\335\333"\177\000\000\360\206\060\334"\177\000\000\000\000\000\000\001\000\000\000X\216\033\334"\177\000\000huz\312"\177\000"
        nrprocs = 2
        woneinit = 1
        __func__ = "main_loop"
#6  0x0000000000429b83 in main (argc=7, argv=0x7fff88ca8e88) at main.c:2646
        cfg_stream = 0x297b010
        c = -1
        r = 0
        tmp = 0x7fff88caaee9 ""
        tmp_len = 0
        port = 0
        proto = 0
        options = 0x72a500 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
        ret = -1
        seed = 3909515647
        rfd = 4
        debug_save = 0
        debug_flag = 0
        dont_fork_cnt = 0
        n_lst = 0x7f22dd2558d0
        p = 0x7fff88ca8ce0 "\377\377\377\377"
        st = {st_dev = 21, st_ino = 977, st_nlink = 2, st_mode = 16872, st_uid = 986,
          st_gid = 983, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1500938501, tv_nsec = 19585566}, st_mtim = {tv_sec = 1500938501, tv_nsec = 19585566}, st_ctim = {tv_sec = 1500938501, tv_nsec = 19585566}, __glibc_reserved = {0, 0, 0}}
        __func__ = "main"
(gdb) info locals
cfg_stream = 0x297b010
c = -1
r = 0
tmp = 0x7fff88caaee9 ""
tmp_len = 0
port = 0
proto = 0
options = 0x72a500 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
ret = -1
seed = 3909515647
rfd = 4
debug_save = 0
debug_flag = 0
dont_fork_cnt = 0
n_lst = 0x7f22dd2558d0
p = 0x7fff88ca8ce0 "\377\377\377\377"
st = {st_dev = 21, st_ino = 977, st_nlink = 2, st_mode = 16872, st_uid = 986, st_gid = 983, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1500938501, tv_nsec = 19585566}, st_mtim = {
    tv_sec = 1500938501, tv_nsec = 19585566}, st_ctim = {tv_sec = 1500938501, tv_nsec = 19585566}, __glibc_reserved = {0, 0, 0}}
__func__ = "main"
```
Does it happen every time when stopping kamailio?
I guess I have to try to reproduce it in order to dig in more.
Can the issue [1172](https://github.com/kamailio/kamailio/issues/1172) be related to this?
@miconda yes, it crashes every time on stop.
Also you may find this upstream bugreport to debian useful: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=870018
@joelsdc after looking closer this morning, yes in my case I do subjectively think #1172 might be related. I did have an occurrence of Kamailio stopping processing traffic. In my case these were incoming MESSAGE requests. I don't have further details as I'm now at work. This is Fedora 26, openssl-1.1.0f-7.fc26.x86_64
@miconda yes, in my case this crashes every time on stop as well.
@linuxmaniac, @apogrebennyk -- is libssl-dbg package missing in debian stretch? I can't find it in order to get the debug symbols...
It seems to core dump at shut down even with a very basic config and no sip traffic (just start and then stop):
```
debug=2
log_stderror=no
memdbg=5
memlog=5
log_facility=LOG_LOCAL0
children=1
enable_tls=yes

loadmodule "tls.so"
modparam("tls", "config", "/tmp/kamailio-dev/etc/kamailio/tls.cfg")

request_route {
    ;
}
```
It is either a buffer overflow in kamailio's tls module initialization, or something not done right for libssl 1.1.0 (initialization of libssl from kamailio, or inside libssl itself).
Should be fixed by the commit referenced above. Pushed in 5.0 branch as well.
Reopen if still an issue.
Closed #1189.
Thanks @miconda, would you backport this also to the 4.4 branch?
@miconda re:

> is libssl-dbg package missing in debian stretch? I can't find it in order to get the debug symbols...

Be aware of the new name/location for debug symbol packages in Debian Stretch and later: https://www.debian.org/releases/stretch/i386/release-notes/ch-whats-new.en.h...
@apogrebennyk - pushed to 4.4 branch.
@taurus-forever - thanks, I didn't know they are in a separate apt repo.
@miconda I think I still have this issue.
```
kamctl version
/usr/sbin/kamctl 5.0.0
```
```
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.3 LTS
Release:        16.04
Codename:       xenial
```
Installed ssl libs:
```
root@serv:~# apt list --installed | grep ssl

WARNING: apt does not have a stable CLI interface. Use with caution in scripts.

erlang-ssl/xenial-updates,xenial-security,now 1:18.3-dfsg-1ubuntu3.1 amd64 [installed,automatic]
libevent-openssl-2.0-5/xenial-updates,xenial-security,now 2.0.21-stable-2ubuntu0.16.04.1 amd64 [installed,automatic]
libgnutls-openssl27/xenial-updates,now 3.4.10-4ubuntu1.4 amd64 [installed]
libnet-ssleay-perl/xenial,now 1.72-1build1 amd64 [installed,automatic]
libssl-dev/xenial,now 1.1.0g-2.1+ubuntu16.04.1+deb.sury.org+1 amd64 [installed]
libssl-doc/xenial,xenial,now 1.1.0g-2.1+ubuntu16.04.1+deb.sury.org+1 all [installed,automatic]
libssl1.0.0/xenial-updates,xenial-security,now 1.0.2g-1ubuntu4.10 amd64 [installed]
libssl1.0.2/now 1.0.2l-0~ubuntu16.04.1+deb.sury.org+1 amd64 [installed,local]
libssl1.1/xenial,now 1.1.0g-2.1+ubuntu16.04.1+deb.sury.org+1 amd64 [installed,automatic]
openssl/xenial,now 1.1.0g-2.1+ubuntu16.04.1+deb.sury.org+1 amd64 [installed]
python-openssl/now 17.0.0-0+certbot~xenial+1 all [installed,local]
ssl-cert/xenial,xenial,now 1.0.37 all [installed,automatic]
ssldump/xenial,now 0.9b3-4.1ubuntu1 amd64 [installed]
```
In kamailio.log after kamailio starts:
```
Feb 15 21:29:43 serv /usr/sbin/kamailio[9604]: NOTICE: <core> [main.c:699]: handle_sigs(): Thank you for flying kamailio!!!
Feb 15 21:29:43 serv /usr/sbin/kamailio[20955]: WARNING: tls [tls_init.c:778]: init_tls_h(): openssl bug #1491 (crash/mem leaks on low memory) workaround enabled (on low memory tls operations will fail preemptively) with free memory thresholds 17301504 and 8650752 bytes
```
@lgg that has nothing to do with this issue. This is about a crash on stop.
@linuxmaniac so it's okay to have this in log file?