### Description When working atop the 5.4 branch patched with `--no-atexit` option from https://github.com/kamailio/kamailio/issues/2616#issuecomment-771811762 applied, I've been testing the use of http_async_query to pass out MESSAGE requests to selected destinations to an API via HTTPS. I experienced the following core dumps during the time where a higher number (may 10 or so) of these async queries may have been happening around the same time. I am not sure however how to interpret these and they may have had nothing to do with #2616.
Of note for process `23694`, the logs spewed `INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table` thousands of times, spiking CPU until Kamailio finally quit with sig 11.
#### Debugging Data For core dump of process `2193`:
``` #0 0x00007f367f64ea23 in free_cell_helper (dead_cell=0x7f3683d34b48, silent=0, fname=0x7f367f770902 "timer.c", fline=643) at h_table.c:186 b = 0x7f367f64c26d <futex_release+29> "\211E\374\203}\374\002\017\224\300\017\266\300H\205\300t6H\213E\350H\203\354\bj" i = 1 rpl = 0x7ffd51f7fd40 tt = 0x7ffd51f7fd60 foo = 0x0 cbs = 0x0 cbs_tmp = 0x7f3680154a20 __func__ = "free_cell_helper" #1 0x00007f367f71b738 in wait_handler (ti=2051374262, wait_tl=0x7f3683d34bd0, data=0x7f3683d34b48) at timer.c:643 p_cell = 0x7f3683d34b48 ret = 2143519712 unlinked = 0 rcount = 1 __func__ = "wait_handler" #2 0x00000000006ae985 in timer_list_expire (t=2051374262, h=0x7f367fc7c4c0, slow_l=0x7f367fc7e398, slow_mark=16842) at core/timer.c:857 tl = 0x7f3683d34bd0 ret = 32566 #3 0x00000000006aee48 in timer_handler () at core/timer.c:922 saved_ticks = 2051374262 run_slow_timer = 0 i = 458 __func__ = "timer_handler" #4 0x00000000006af30a in timer_main () at core/timer.c:961 No locals. #5 0x000000000042c59c in main_loop () at main.c:1769 i = 2 pid = 0 si = 0x0 si_desc = "udp receiver child=1 sock=[2603:300a:134:50e0::3]:5060\000\000\300\261ʈ6\177\000\000\360\005G\210\066\177\000\000Z\262ʈ6\177\000\000\060\000\000\000\060\000\000\000\b\001\370Q\375\177\000\000 \000\370Q\375\177\000\000\000\035\064\371\364\245\242.\350\301\201\000\000\000\000\000\270\251\206\210\066\177\000" nrprocs = 2 woneinit = 1 __func__ = "main_loop" #6 0x0000000000435d22 in main (argc=8, argv=0x7ffd51f80648) at main.c:2876 cfg_stream = 0x19e5120 c = -1 r = 0 tmp = 0x7ffd51f81ee5 "" tmp_len = 896 port = 896 proto = 896 ahost = 0x0 aport = 0 options = 0x7f3de8 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 504719050 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 2 n_lst = 0x0 p = 0xc2 <error: Cannot access memory at address 0xc2> st = {st_dev = 26, st_ino = 1205, st_nlink = 2, st_mode = 16872, st_uid = 985, st_gid = 983, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1612794079, tv_nsec = 14695130}, st_mtim = {tv_sec = 1612794079, tv_nsec = 14695130}, st_ctim = {tv_sec = 1612794079, tv_nsec = 14695130}, __glibc_reserved = {0, 0, 0}} tbuf = "@\002\370Q\375\177\000\000\000\000\000\000\000\000\000\000@\002\370Q\375\177", '\000' <repeats 18 times>, "\260\067\361\210\066\177\000\000\350\357\363\210\066\177\000\000\b\005\364\210\066\177\000\000\340\071\361\210\066\177\000\000\025\257\362\210\066\177\000\000$V\273\210\066\177\000\000\354k\363\210\066\177\000\000\336k\363\210\066\177\000\000\205\317c\t\000\000\000\000\300C\360\210\066\177\000\000߯\362\210\066\177\000\000\000\000\000\000\254\202\226\006\334@\273\210\066\177\000\000\000\000\000\000\000\000\000\000\300C\360\210\066\177\000\000\001\000\000\000\000\000\000\000\005\215.\201\267e\325\000\240\001\364\210\066\177\000\000\370\377\377\377\377\377\377\377\240\001\364\210\066\177\000\000R"... option_index = 11 long_options = {{name = 0x7f62a6 "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x7f1523 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x7f62ab "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x7f62b1 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x7f62b7 "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x7f62c0 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, { name = 0x7f62ca "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x7f62d4 "loadmodule", has_arg = 1, flag = 0x0, val = 1029}, {name = 0x7f62df "modparam", has_arg = 1, flag = 0x0, val = 1030}, {name = 0x7f62e8 "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {name = 0x7f62f3 "debug", has_arg = 1, flag = 0x0, val = 1032}, {name = 0x7f62f9 "no-atexit", has_arg = 0, flag = 0x0, val = 1033}, { name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main" b = 0x7f367f64c26d <futex_release+29> "\211E\374\203}\374\002\017\224\300\017\266\300H\205\300t6H\213E\350H\203\354\bj" i = 1 rpl = 0x7ffd51f7fd40 tt = 0x7ffd51f7fd60 foo = 0x0 cbs = 0x0 cbs_tmp = 0x7f3680154a20 __func__ = "free_cell_helper" 181 /* It is safer to release the shm memory lock 182 * otherwise the release function must to be aware of 183 * the lock state (Miklos) 184 */ 185 shm_global_unlock(); 186 cbs_tmp->release(cbs_tmp->param); 187 shm_global_lock(); 188 } 189 shm_free_unsafe(cbs_tmp); 190 }
```
For core dump of process `23694`: ``` #0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:96 No locals. #1 0x00007f83502f94f8 in __vfprintf_internal (s=s@entry=0x143c6d0, format=format@entry=0x7f8345c7ba08 "%s: %.*s%s%s%shandle %p returned error %d: %s\n", ap=ap@entry=0x7ffc27d97d10, mode_flags=mode_flags@entry=0) at vfprintf-internal.c:1647 len = <optimized out> step0_jumps = {0, 1757, 1653, 3485, 3381, 4069, 2733, 2893, 3685, 1813, 4405, 4565, 3589, 4549, 4493, 2845, 4285, 3989, 3285, 3053, 1173, 1397, 2053, 1973, 1933, 757, 3781, 549, 553, 4181} space = <optimized out> is_short = <optimized out> use_outdigits = 0 outc = <optimized out> step1_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 1813, 4405, 4565, 3589, 4549, 4493, 2845, 4285, 3989, 3285, 3053, 1173, 1397, 2053, 1973, 1933, 757, 3781, 549, 553, 0} group = 0 prec = -1 step2_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4405, 4565, 3589, 4549, 4493, 2845, 4285, 3989, 3285, 3053, 1173, 1397, 2053, 1973, 1933, 757, 3781, 549, 553, 0} string = 0xb8 <error: Cannot access memory at address 0xb8> left = 0 is_long_double = <optimized out> width = 0 signed_number = <optimized out> step3a_jumps = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4309, 0, 0, 0, 4493, 2845, 4285, 3989, 3285, 0, 0, 0, 0, 1973, 0, 0, 0, 0, 0, 0} alt = <optimized out> showsign = 0 is_long = 0 is_char = <optimized out> pad = <optimized out> step3b_jumps = {0 <repeats 11 times>, 3589, 0, 0, 4493, 2845, 4285, 3989, 3285, 3053, 1173, 1397, 2053, 1973, 1933, 757, 3781, 0, 0, 0} step4_jumps = {0 <repeats 14 times>, 4493, 2845, 4285, 3989, 3285, 3053, 1173, 1397, 2053, 1973, 1933, 757, 3781, 0, 0, 0} args_value = <optimized out> is_negative = <optimized out> number = {longlong = <optimized out>, word = <optimized out>} base = <optimized out> the_arg = {pa_wchar = 1698128436 L'\x65376234', pa_int = 1698128436, pa_long_int = 4048843148828828212, pa_long_long_int = 4048843148828828212, pa_u_int = 1698128436, pa_u_long_int = 4048843148828828212, pa_u_long_long_int = 4048843148828828212, pa_double = 4.8137781308181096e-38, pa_long_double = <invalid float value>, pa_float128 = 4.25777496576548470393424945528503408e-3201, pa_string = 0x3830616465376234 <error: Cannot access memory at address 0x3830616465376234>, pa_wstring = 0x3830616465376234 <error: Cannot access memory at address 0x3830616465376234>, pa_pointer = 0x3830616465376234, pa_user = 0x3830616465376234} spec = 115 's' _buffer = {__routine = 0x7ffc27d977b0, __arg = 0x7f8345c75daf, __canceltype = 668563760, __prev = 0x7f835030a3ba <__vsnprintf_internal+170>} _avail = <optimized out> thousands_sep = 0x0 grouping = 0xffffffffffffffff <error: Cannot access memory at address 0xffffffffffffffff> done = 106 f = 0x7f8345c7ba34 "s\n" lead_str_end = 0x7f8345c7ba08 "%s: %.*s%s%s%shandle %p returned error %d: %s\n" end_of_spec = <optimized out> work_buffer = "`z\331'\374\177\000\000nz\331'\374\177\000\000sz\331'\374\177\000\000`z\331'\374\177\000\000sz\331'\374\177", '\000' <repeats 46 times>, "\200\000\000\000\000\027\021ю\203w\026", '\000' <repeats 24 times>, "\220PcO\203\177\000\000\377\377\377\377\377\377\377\377", '\000' <repeats 16 times>, "\035GGO\203\177\000\000\377\377\377\377\374\177\000\000v\037CO\203\177\000\000H\241(P\203\177\000\000 \020EP\203\177\000\000\000\000\000\000\000\000\000\000u"... workend = 0x7ffc27d97bb8 "" ap_save = {{gp_offset = 16, fp_offset = 48, overflow_arg_area = 0x7ffc27d97df0, reg_save_area = 0x7ffc27d97d30}} nspecs_done = 7 save_errno = 11 readonly_format = 0 do_longlong_number = <optimized out> __result = <optimized out> #2 0x00007f8350386da3 in __vsyslog_internal (pri=<optimized out>, fmt=0x7f8345c7ba08 "%s: %.*s%s%s%shandle %p returned error %d: %s\n", ap=0x7ffc27d97d10, mode_flags=0) at ../misc/syslog.c:233 now_tm = {tm_sec = 28, tm_min = 59, tm_hour = 12, tm_mday = 8, tm_mon = 1, tm_year = 121, tm_wday = 1, tm_yday = 38, tm_isdst = 0, tm_gmtoff = -21600, tm_zone = 0x143ead0 "CST"} now = 1612810768 fd = <optimized out> f = 0x143c6d0 buf = 0x0 bufsize = 0 msgoff = 20 saved_errno = <optimized out> failbuf = "\017\000\000\000\000\000\000\000\230\323PK\203\177\000\000\000\000\000\000\000\000\000\000D}\331", <incomplete sequence \374> clarg = {buf = <optimized out>, oldaction = <optimized out>} #3 0x00007f835038725a in __syslog (pri=<optimized out>, fmt=<optimized out>) at ../misc/syslog.c:117 ap = {{gp_offset = 48, fp_offset = 48, overflow_arg_area = 0x7ffc27d97e18, reg_save_area = 0x7ffc27d97d30}} #4 0x00007f8345c714bf in check_multi_info (g=0x7f834afb1358) at http_multi.c:583 __llevel = -1 eff_url = 0x7f834b1fd168 "https://messinet.com/api/kamailio/message" msg = 0x7f834b50d400 msgs_left = 0 easy = 0x7f834b50d398 res = CURLE_WRITE_ERROR cell = 0x0 __func__ = "check_multi_info" #5 0x00007f8345c67d73 in event_cb (fd=15, kind=2, userp=0x7f834b7eda08) at http_multi.c:145 g = 0x7f834afb1358 rc = CURLM_OK easy = 0x7f834b7eda08 cell = 0x7f834b27ce90 __func__ = "event_cb" action = 1 #6 0x00007f8345b65c4a in event_persist_closure (ev=<optimized out>, base=0x1612440) at event.c:1580 evcb_callback = <optimized out> evcb_fd = <optimized out> evcb_res = 2 evcb_arg = 0x7f834b7eda08 evcb_callback = <optimized out> evcb_fd = <optimized out> evcb_res = <optimized out> evcb_arg = <optimized out> __func__ = {<optimized out> <repeats 22 times>} run_at = {tv_sec = <optimized out>, tv_usec = <optimized out>} relative_to = {tv_sec = <optimized out>, tv_usec = <optimized out>} delay = {tv_sec = <optimized out>, tv_usec = <optimized out>} now = {tv_sec = <optimized out>, tv_usec = <optimized out>} usec_mask = <optimized out> #7 event_process_active_single_queue (base=base@entry=0x1612440, activeq=0x16087e0, max_to_process=max_to_process@entry=2147483647, endtime=endtime@entry=0x0) at event.c:1639 ev = <optimized out> evcb_selfcb = <optimized out> evcb = <optimized out> count = 1 __func__ = "event_process_active_single_queue" #8 0x00007f8345b675d7 in event_process_active (base=0x1612440) at event.c:1738 activeq = <optimized out> i = 0 c = 0 tv = {tv_sec = 16697, tv_usec = 733811} maxcb = 2147483647 endtime = 0x0 limit_after_prio = 2147483647 activeq = <optimized out> i = <optimized out> c = <optimized out> endtime = <optimized out> tv = {tv_sec = <optimized out>, tv_usec = <optimized out>} maxcb = <optimized out> limit_after_prio = <optimized out> done = <optimized out> #9 event_base_loop (base=0x1612440, flags=0) at event.c:1961 n = <optimized out> evsel = 0x7f8345b97c80 <epollops> tv = {tv_sec = 0, tv_usec = 0} tv_p = <optimized out> res = <optimized out> done = 0 retval = 0 __func__ = "event_base_loop" #10 0x00007f8345c4550b in async_http_run_worker (worker=0x7f8347831378) at async_http.c:92 No locals. #11 0x00007f8345c5a278 in child_init (rank=0) at http_async_client_mod.c:368 pid = 0 i = 0 __func__ = "child_init" #12 0x000000000062148b in init_mod_child (m=0x7f834fb539d8, rank=0) at core/sr_module.c:780 __func__ = "init_mod_child" #13 0x00000000006210fa in init_mod_child (m=0x7f834fb545b8, rank=0) at core/sr_module.c:776 __func__ = "init_mod_child" #14 0x00000000006210fa in init_mod_child (m=0x7f834fb54d58, rank=0) at core/sr_module.c:776 __func__ = "init_mod_child" #15 0x00000000006210fa in init_mod_child (m=0x7f834fb55128, rank=0) at core/sr_module.c:776 __func__ = "init_mod_child" #16 0x00000000006210fa in init_mod_child (m=0x7f834fb555e0, rank=0) at core/sr_module.c:776 __func__ = "init_mod_child" #17 0x00000000006210fa in init_mod_child (m=0x7f834fb55ae8, rank=0) at core/sr_module.c:776 __func__ = "init_mod_child" #18 0x0000000000621c4a in init_child (rank=0) at core/sr_module.c:825 ret = 1341536656 type = 0x83d00b "PROC_MAIN" __func__ = "init_child" #19 0x000000000042c8a2 in main_loop () at main.c:1779 i = 2 pid = 23675 si = 0x0 si_desc = "udp receiver child=1 sock=[2603:300a:134:50e0::3]:5060\000\000\300q8P\203\177\000\000\360ŴO\203\177\000\000Zr8P\203\177\000\000\060\000\000\000\060\000\000\000\270\210\331'\374\177\000\000Ї\331'\374\177\000\000\000\027\021ю\203w\026\350\301\201\000\000\000\000\000\270i\364O\203\177\000" nrprocs = 2 woneinit = 1 __func__ = "main_loop" #20 0x0000000000435d22 in main (argc=8, argv=0x7ffc27d98df8) at main.c:2876 cfg_stream = 0x1337120 c = -1 r = 0 tmp = 0x7ffc27d9aee4 "" tmp_len = 896 port = 896 proto = 896 ahost = 0x0 aport = 0 options = 0x7f3de8 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:" ret = -1 seed = 3012900158 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 2 n_lst = 0x0 p = 0xc2 <error: Cannot access memory at address 0xc2> st = {st_dev = 26, st_ino = 1205, st_nlink = 2, st_mode = 16872, st_uid = 985, st_gid = 983, __pad0 = 0, st_rdev = 0, st_size = 40, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1612794079, tv_nsec = 14695130}, st_mtim = {tv_sec = 1612810438, tv_nsec = 182128841}, st_ctim = {tv_sec = 1612810438, tv_nsec = 182128841}, __glibc_reserved = {0, 0, 0}} tbuf = "\360\211\331'\374\177\000\000\000\000\000\000\000\000\000\000\360\211\331'\374\177", '\000' <repeats 18 times>, "\260\367^P\203\177\000\000\350\257aP\203\177\000\000\b\305aP\203\177\000\000\340\371^P\203\177\000\000\025o`P\203\177\000\000$\026)P\203\177\000\000\354+aP\203\177\000\000\336+aP\203\177\000\000\205\317c\t\000\000\000\000\300\003^P\203\177\000\000\337o`P\203\177\000\000\000\000\000\000\254\202\226\006\334\000)P\203\177\000\000\000\000\000\000\000\000\000\000\300\003^P\203\177\000\000\001\000\000\000\000\000\000\000\360`\207\272ޟ\325\000\240\301aP\203\177\000\000\370\377\377\377\377\377\377\377\240\301aP\203\177\000\000R"... option_index = 11 long_options = {{name = 0x7f62a6 "help", has_arg = 0, flag = 0x0, val = 104}, {name = 0x7f1523 "version", has_arg = 0, flag = 0x0, val = 118}, {name = 0x7f62ab "alias", has_arg = 1, flag = 0x0, val = 1024}, {name = 0x7f62b1 "subst", has_arg = 1, flag = 0x0, val = 1025}, {name = 0x7f62b7 "substdef", has_arg = 1, flag = 0x0, val = 1026}, {name = 0x7f62c0 "substdefs", has_arg = 1, flag = 0x0, val = 1027}, {name = 0x7f62ca "server-id", has_arg = 1, flag = 0x0, val = 1028}, {name = 0x7f62d4 "loadmodule", has_arg = 1, flag = 0x0, val = 1029}, {name = 0x7f62df "modparam", has_arg = 1, flag = 0x0, val = 1030}, {name = 0x7f62e8 "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {name = 0x7f62f3 "debug", has_arg = 1, flag = 0x0, val = 1032}, {name = 0x7f62f9 "no-atexit", has_arg = 0, flag = 0x0, val = 1033}, {name = 0x0, has_arg = 0, flag = 0x0, val = 0}} __func__ = "main" No locals. 91 92 .p2align 4 93 L(cros_page_boundary): 94 andl $(VEC_SIZE - 1), %ecx 95 andq $-VEC_SIZE, %rdi 96 VPCMPEQ (%rdi), %ymm0, %ymm1 97 vpmovmskb %ymm1, %eax 98 /* Remove the leading bytes. */ 99 sarl %cl, %eax 100 testl %eax, %eax
```
#### Log Messages Log for process `2193`: ``` Feb 08 12:53:31 kamailio[2216]: [HTTP_REPLY] MESSAGE via HTTP_API succeeded: 202, HTTP/2 202 ^M date: Mon, 08 Feb 2021 18:53:31 GMT^M server: Apache/2.4.46^M strict-transport-security: max-age=15552000; includeSubDomains^M content-length: 46^M vary: User-Agent^M content-type: application/json; charset=utf-8^M ^M {"result": {"code": 202, "title": "Accepted"}} Feb 08 12:53:31 kamailio[2216]: [HTTP_REPLY] MESSAGE via HTTP_API succeeded: 202, HTTP/2 202 ^M date: Mon, 08 Feb 2021 18:53:31 GMT^M server: Apache/2.4.46^M strict-transport-security: max-age=15552000; includeSubDomains^M content-length: 46^M vary: User-Agent^M content-type: application/json; charset=utf-8^M ^M {"result": {"code": 202, "title": "Accepted"}}
Feb 08 12:53:55 kamailio[2216]: [HTTP_REPLY] MESSAGE via HTTP_API succeeded: 202, HTTP/2 202 ^M date: Mon, 08 Feb 2021 18:53:55 GMT^M server: Apache/2.4.46^M strict-transport-security: max-age=15552000; includeSubDomains^M content-length: 46^M vary: User-Agent^M content-type: application/json; charset=utf-8^M ^M {"result": {"code": 202, "title": "Accepted"}} Feb 08 12:53:58 kamailio[1957]: ALERT: <core> [main.c:784]: handle_sigs(): child process 2193 exited by a signal 11 Feb 08 12:53:58 kamailio[2227]: CRITICAL: <core> [core/pass_fd.c:277]: receive_fd(): EOF on 34 Feb 08 12:53:58 kamailio[1957]: ALERT: <core> [main.c:787]: handle_sigs(): core was generated Feb 08 12:53:58 kamailio[1957]: INFO: <core> [main.c:810]: handle_sigs(): terminating due to SIGCHLD Feb 08 12:53:58 kamailio[2168]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2165]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2166]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2170]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2172]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2180]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2190]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2174]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2187]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2177]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2221]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2227]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2184]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2195]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2216]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2224]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2219]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[2206]: INFO: <core> [main.c:873]: sig_usr(): signal 15 received Feb 08 12:53:58 kamailio[1957]: INFO: <core> [core/sctp_core.c:53]: sctp_core_destroy(): SCTP API not initialized Feb 08 12:53:58 systemd[1]: kamailio.service: Main process exited, code=exited, status=1/FAILURE Feb 08 12:53:58 systemd[1]: kamailio.service: Failed with result 'exit-code'. Feb 08 12:53:58 systemd[1]: kamailio.service: Consumed 10.011s CPU time. Feb 08 12:54:28 systemd[1]: kamailio.service: Scheduled restart job, restart counter is at 1. Feb 08 12:54:28 systemd[1]: Stopped Kamailio SIP Server. ``` Log for process `23694`: ``` Feb 08 12:59:09 kamailio[23694]: [HTTP_REPLY] MESSAGE via HTTP_API succeeded: 202, HTTP/2 202 ^M date: Mon, 08 Feb 2021 18:59:09 GMT^M server: Apache/2.4.46^M strict-transport-security: max-age=15552000; includeSubDomains^M content-length: 46^M vary: User-Agent^M content-type: application/json; charset=utf-8^M ^M {"result": {"code": 202, "title": "Accepted"}} Feb 08 12:59:09 kamailio[23694]: [HTTP_REPLY] MESSAGE via HTTP_API succeeded: 202, HTTP/2 202 ^M date: Mon, 08 Feb 2021 18:59:09 GMT^M server: Apache/2.4.46^M strict-transport-security: max-age=15552000; includeSubDomains^M content-length: 46^M vary: User-Agent^M content-type: application/json; charset=utf-8^M ^M {"result": {"code": 202, "title": "Accepted"}}
Feb 08 12:59:13 kamailio[23694]: [HTTP_REPLY] MESSAGE via HTTP_API succeeded: 202, HTTP/2 202 ^M date: Mon, 08 Feb 2021 18:59:13 GMT^M server: Apache/2.4.46^M strict-transport-security: max-age=15552000; includeSubDomains^M content-length: 46^M vary: User-Agent^M content-type: application/json; charset=utf-8^M ^M {"result": {"code": 202, "title": "Accepted"}} Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:13 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table ... ... Feb 08 12:59:14 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:14 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:14 kamailio[23694]: INFO: http_async_client [http_multi.c:87]: event_cb(): Cell for handler 0x7f834b7eda08 not found in table Feb 08 12:59:28 kamailio[23694]: ERROR: http_async_client [http_multi.c:253]: write_cb(): Cell for handler 0x7f834b50d398 not found in table Feb 08 12:59:29 kamailio[23709]: CRITICAL: <core> [core/pass_fd.c:277]: receive_fd(): EOF on 43 Feb 08 12:59:29 kamailio[23633]: ALERT: <core> [main.c:784]: handle_sigs(): child process 23694 exited by a signal 11 Feb 08 12:59:29 kamailio[23633]: ALERT: <core> [main.c:787]: handle_sigs(): core was generated Feb 08 12:59:29 systemd[1]: kamailio.service: Main process exited, code=exited, status=1/FAILURE Feb 08 12:59:29 systemd[1]: kamailio.service: Failed with result 'exit-code'. Feb 08 12:59:29 systemd[1]: kamailio.service: Consumed 11.523s CPU time. Feb 08 12:59:59 systemd[1]: kamailio.service: Scheduled restart job, restart counter is at 2. Feb 08 12:59:59 systemd[1]: Stopped Kamailio SIP Server. ```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
``` version: kamailio 5.4.3-6.gitdbe358d2d0.fc33.1 (x86_64/linux) e19ae3 flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: e19ae3 compiled on 00:00:00 Feb 5 2021 with gcc 10.2.1 ```
* **Operating System**:
``` Fedora 33 Linux 5.10.13-200.fc33.x86_64 #1 SMP Thu Feb 4 14:54:51 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux ```
Another using http_async_client -- seems like much of this is related to libevent [gdb.203123.txt](https://github.com/kamailio/kamailio/files/5983969/gdb.203123.txt)
I've switched over to http_client for now.
I've have similar crashes on syslog(). But I don't use HTTP(S).
### Troubleshooting
#### Debugging Data
``` (lldb) target create "kamailio" --core "/var/coredump/986/kamailio.4087.core" Core file '/var/coredump/986/kamailio.4087.core' (x86_64) was loaded. (lldb) bt all * thread #1, name = 'kamailio', stop reason = signal SIGSEGV * frame #0: 0x00000008009fa56d libc.so.7`__je_tcache_bin_flush_small [inlined] extent_arena_get(extent=0x0000000000000000) at extent_inlines.h:39:43 frame #1: 0x00000008009fa56d libc.so.7`__je_tcache_bin_flush_small(tsd=<unavailable>, tcache=<unavailable>, tbin=0x0000000800d1a2a8, binind=<unavailable>, rem=33) at jemalloc_tcache.c:123 frame #2: 0x00000008009fa17c libc.so.7`__je_tcache_event_hard(tsd=<unavailable>, tcache=0x0000000800d1a250) at jemalloc_tcache.c:54:4 frame #3: 0x0000000800a37934 libc.so.7`__malloc [inlined] arena_malloc(tsdn=0x0000000800d1a090, arena=0x0000000000000000, size=<unavailable>, ind=<unavailable>, zero=false, tcache=<unavailable>, slow_path=false) at arena_inlines_b.h:0 frame #4: 0x0000000800a378a7 libc.so.7`__malloc [inlined] iallocztm(tsdn=0x0000000800d1a090, size=<unavailable>, ind=<unavailable>, zero=false, tcache=<unavailable>, is_internal=false, arena=0x0000000000000000, slow_path=false) at jemalloc_internal_inlines_c.h:53 frame #5: 0x0000000800a378a7 libc.so.7`__malloc [inlined] imalloc_no_sample(sopts=<unavailable>, dopts=<unavailable>, tsd=<unavailable>, size=<unavailable>, usize=<unavailable>, ind=<unavailable>) at jemalloc_jemalloc.c:1713 frame #6: 0x0000000800a378a7 libc.so.7`__malloc [inlined] imalloc_body(sopts=<unavailable>, dopts=<unavailable>, tsd=<unavailable>) at jemalloc_jemalloc.c:1909 frame #7: 0x0000000800a378a7 libc.so.7`__malloc [inlined] imalloc(sopts=<unavailable>, dopts=<unavailable>) at jemalloc_jemalloc.c:2009 frame #8: 0x0000000800a3768b libc.so.7`__malloc(size=<unavailable>) at jemalloc_jemalloc.c:2042 frame #9: 0x0000000800acaf34 libc.so.7`__smakebuf(fp=0x0000000800d06010) at makebuf.c:73:11 frame #10: 0x0000000800acae39 libc.so.7`__swsetup(fp=0x0000000800d06010) at wsetup.c:82:3 frame #11: 0x0000000800a67b53 libc.so.7`__vfprintf(fp=0x0000000800d06010, locale=0x0000000800ade698, fmt0="", ap=0x00007fffffff8cf0) at vfprintf.c:462:6 frame #12: 0x0000000800a678c5 libc.so.7`vfprintf_l(fp=0x0000000800d06010, locale=0x0000000800ade698, fmt0="", ap=0x00007fffffff8cf0) at vfprintf.c:285:9 frame #13: 0x0000000800a6f113 libc.so.7`fprintf(fp=<unavailable>, fmt=<unavailable>) at fprintf.c:57:8 frame #14: 0x0000000800aa8f01 libc.so.7`vsyslog [inlined] vsyslog1(pri=135, fmt="", ap=0x00007fffffff9c50) at syslog.c:173:8 frame #15: 0x0000000800aa8e4c libc.so.7`vsyslog(pri=<unavailable>, fmt="", ap=0x00007fffffff9c50) at syslog.c:363 frame #16: 0x0000000800aa8dcd libc.so.7`syslog(pri=<unavailable>, fmt=<unavailable>) at syslog.c:129:2 frame #17: 0x00000000006655dc kamailio`handle_tcp_child(tcp_c=0x00000008014dcf08, fd_i=-1) at tcp_main.c:3559:2 frame #18: 0x000000000065b7e9 kamailio`handle_io(fm=0x00000008014e6080, ev=1, idx=-1) at tcp_main.c:4523:8 frame #19: 0x0000000000644502 kamailio`io_wait_loop_kqueue(h=0x00000000008dbf50, t=5, repeat=0) at io_wait.h:1187:9 frame #20: 0x000000000063cb62 kamailio`tcp_main_loop at tcp_main.c:4824:5 frame #21: 0x00000000002e973a kamailio`main_loop at main.c:1779:5 frame #22: 0x00000000002f7650 kamailio`main(argc=7, argv=0x00007fffffffec70) at main.c:2856:6 frame #23: 0x00000000002d1af0 kamailio`_start(ap=<unavailable>, cleanup=<unavailable>) at crt1.c:76:7 ```
#### Log Messages
``` 03:20:21.349512 4087 - - DEBUG: <core> [core/tcp_main.c:4130]: send2child(): selected tcp worker idx:1 proc:26 pid:4080 for activity on [tcp:91.217.xx.yy:5060], 0x802e70718 03:20:27.364691 4087 - - DEBUG: <core> [core/tcp_main.c:3560]: handle_tcp_child(): reader response= 802e70718, 1 from 1 03:20:27.364727 4087 - - DEBUG: <core> [core/io_wait.h:375]: io_watch_add(): DBG: io_watch_add(0x8dbf50, 53, 2, 0x802e70718), fd_no=43 03:20:27.364746 4087 - - DEBUG: <core> [core/tcp_main.c:3687]: handle_tcp_child(): CONN_RELEASE 0x802e70718 refcnt= 1 03:20:29.740878 4062 - - DEBUG: {1 1001 SUBSCRIBE ceccb6ad-add35f9c@10.10.10.251} <core> [core/socket_info.c:646]: grep_sock_info(): checking if host==us: 13==13 && [78.37.aa.bb] == [91.217.xx.yy] 03:20:51.496077 4087 - - DEBUG: <core> [core/io_wait.h:600]: io_watch_del(): DBG: io_watch_del (0x8dbf50, 53, -1, 0x0) fd_no=44 called 03:20:51.496127 4087 - - DEBUG: <core> [core/tcp_main.c:4457]: handle_tcpconn_ev(): sending to child, events 1 03:20:51.496147 4087 - - DEBUG: <core> [core/tcp_main.c:4130]: send2child(): selected tcp worker idx:2 proc:27 pid:4081 for activity on [tcp:91.217.xx.yy:5060], 0x802e70718 03:20:58.190139 4054 - - ALERT: <core> [main.c:777]: handle_sigs(): child process 4087 exited by a signal 11 ```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
``` kamailio 5.4.3 (x86_64/freebsd) e19ae3 flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, select, kqueue. id: e19ae3 compiled on 14:45:07 Feb 11 2021 with cc FreeBSD clang version 10.0.1 (git@github.com:llvm/llvm-project.git llvmorg-10.0.1-0-gef32c611aa2) ```
* **Operating System**:
``` FreeBSD 12.2-STABLE r369029 OpenSSL 1.1.1i-freebsd 8 Dec 2020 ```
We see the same issue on both 5.3 and 5.4:
``` version: kamailio 5.3.8 (x86_64/linux) flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: unknown compiled with gcc 8.3.0 ```
``` version: kamailio 5.4.4 (x86_64/linux) flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: unknown compiled with gcc 8.3.0 ``` ` Linux sbc-stateless-ig-wn7p 4.19.0-14-cloud-amd64 #1 SMP Debian 4.19.171-2 (2021-01-30) x86_64 GNU/Linux`
We suspect that the issue only happens with http2. @amessina I can see that the server you connect to is also using http2. Looking at the implementation of http_async_client it doesn't seem to handle http2 properly. If you understood the implementation correctly these is an association between the Curl_easy and the fd, when registering in libevents. That does not work in http2, where several requests are multiplexed over the same TLS connection.
A quick fix could be to disable ALPN `curl_easy_setopt(curl_, CURLOPT_SSL_ENABLE_ALPN, 0L);`.
Thank you for the report. I'll try to have a look on it in the next couple of days.
I can confirm that patching http_async_client module by adding `curl_easy_setopt(cell->easy, CURLOPT_SSL_ENABLE_ALPN, 0L);` solved our problems. Now we don't see any spinning Kamailio processes anymore.
Thank you @bjokax, you are correct -- I was not aware of the implementation detail related to http2, though it's promising that your `curl_easy_setopt(cell->easy, CURLOPT_SSL_ENABLE_ALPN, 0L);` may resolve the issue.
I nicer solution would be to fully implement support for http2!
Hi, althought I have not been able to reproduce the crash, I think that the problem, as highlighted by @bjokax, is the not correct handling of HTTP/2 multiplexing. The module was written before HTTP/2 was out so it has never been considered. Support for it would require some important refactoring of the module, I'll start seeing if I can work on it for the next major release. Meanwhile we can disable multiplexing at all (this can also be backported to current stable branches while a refactor to support multiplexing wouldn't). I've pushed such fix in this branch https://github.com/kamailio/kamailio/tree/grumvalski/disable_http2_multiplex.... @amessina @bjokax : can you try it? Since I've not been able for now to reproduce the crash I don't know if it solves yours. Thank you.
@grumvalski: have you checked if also makes sense to set the flag suggested by @bjokax CURLOPT_SSL_ENABLE_ALPN ?
* https://curl.se/libcurl/c/CURLOPT_SSL_ENABLE_ALPN.html
Wondering if there is any benefit of allowing protocol negotiation in Kamailio case.
Moreover, maybe we have to set it to http_client module - @oej: any opinion on http_client module?
We have now tested https://github.com/kamailio/kamailio/tree/grumvalski/disable_http2_multiplex... and it seems to solve our problems!
@miconda I would ague that this is a better solution than disabling TLS ALPN. Now we can continue use http2, which is a more efficient protocol than http 1.1, but just disable the multiplexing, until we have the full refactoring in place.
@miconda: in this case we should also disable NPN negotiation: https://curl.se/libcurl/c/CURLOPT_SSL_ENABLE_NPN.html, or even disabling HTTP/2 completely setting CURLOPT_HTTP_VERSION (current curl default value is CURLOPT_HTTP_VERSION_2TLS). I didn't want to be so extreme :) @bjokax, thanks for testing it.
@grumvalski I think the fix in your branch is good enough! It is the multiplexing that is the problem, not http2.
@grumvalski, @bjokax: just wanted to ensure flags listed around here are evaluated to see if worth setting them, not necessary for fixing the issue, but from the perspective of what Kamailio needs/expects from this function/module. I am fine to just go with what is considered better, then probably push it to the http_client module.
@miconda: not sure that http_client is affected too, being the problem in http_async_client the fact tha a socket is bound to a specific easy handler and an event is attached to it via libevent. But I think @oej is better qualified to answer :)
If there is agreement I'll merge this one and backport it to stable branches.
@miconda I don't think it is needed to modify any of the flags except the one controlling multiplexing.
@grumvalski I had a quick look at the http_client module, and don't think it is affected by this issue. It uses curl_easy_perform(), which is a blocking call.
I switched to using http_client module and have not had any crashes since February, so it that module is affected, it's in a different way.
Closing this one, commit was pushed to stable branches.
Closed #2632.