### Description
I'm using several http_async_client functions in series in my config. I found one setup in which Kamailio stops with a segmentation fault after just one call. I can reproduce it with my specific config, but I can't pin down the cause. I do have a core file.
### Troubleshooting
#### Reproduction
```
gdb kamailio core
GNU gdb (Debian 7.7.1+dfsg-5) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
http://www.gnu.org/software/gdb/bugs/.
Find the GDB manual and other documentation resources online at:
http://www.gnu.org/software/gdb/documentation/.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from kamailio...done.
[New LWP 8187]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `kamailio -f /etc/kamailio/kamailio.cfg'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f85580c360a in t_continue () from /usr/lib/x86_64-linux-gnu/kamailio/modules/tm.so
(gdb) bt full
#0  0x00007f85580c360a in t_continue () from /usr/lib/x86_64-linux-gnu/kamailio/modules/tm.so
No symbol table info available.
#1  0x00007f8555ed3848 in async_http_cb () from /usr/lib/x86_64-linux-gnu/kamailio/modules/http_async_client.so
No symbol table info available.
#2  0x00007f8555ecda72 in check_multi_info () from /usr/lib/x86_64-linux-gnu/kamailio/modules/http_async_client.so
No symbol table info available.
#3  0x00007f8555ec56a4 in event_cb () from /usr/lib/x86_64-linux-gnu/kamailio/modules/http_async_client.so
No symbol table info available.
#4  0x00007f8555a043dc in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
No symbol table info available.
#5  0x00007f8555ed0c3d in async_http_run_worker () from /usr/lib/x86_64-linux-gnu/kamailio/modules/http_async_client.so
No symbol table info available.
#6  0x00007f8555eba86e in ?? () from /usr/lib/x86_64-linux-gnu/kamailio/modules/http_async_client.so
No symbol table info available.
#7  0x00000000004f204c in init_mod_child (m=0x7f855a6195c8, rank=0) at core/sr_module.c:921
        __FUNCTION__ = "init_mod_child"
#8  0x00000000004f1d6a in init_mod_child (m=0x7f855a61a1e0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#9  0x00000000004f1d6a in init_mod_child (m=0x7f855a61a850, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#10 0x00000000004f1d6a in init_mod_child (m=0x7f855a61adb0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#11 0x00000000004f1d6a in init_mod_child (m=0x7f855a61b420, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#12 0x00000000004f1d6a in init_mod_child (m=0x7f855a61b8d0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#13 0x00000000004f1d6a in init_mod_child (m=0x7f855a61bfe8, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#14 0x00000000004f1d6a in init_mod_child (m=0x7f855a61c440, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#15 0x00000000004f1d6a in init_mod_child (m=0x7f855a61c7e8, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#16 0x00000000004f1d6a in init_mod_child (m=0x7f855a61cb88, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#17 0x00000000004f1d6a in init_mod_child (m=0x7f855a61cfc8, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#18 0x00000000004f1d6a in init_mod_child (m=0x7f855a61d370, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#19 0x00000000004f1d6a in init_mod_child (m=0x7f855a61d780, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#20 0x00000000004f1d6a in init_mod_child (m=0x7f855a61dbc8, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#21 0x00000000004f1d6a in init_mod_child (m=0x7f855a61e380, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
---Type <return> to continue, or q <return> to quit---
#22 0x00000000004f1d6a in init_mod_child (m=0x7f855a61ec00, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#23 0x00000000004f1d6a in init_mod_child (m=0x7f855a621608, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#24 0x00000000004f1d6a in init_mod_child (m=0x7f855a622290, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#25 0x00000000004f1d6a in init_mod_child (m=0x7f855a622660, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#26 0x00000000004f1d6a in init_mod_child (m=0x7f855a62c5d0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#27 0x00000000004f1d6a in init_mod_child (m=0x7f855a62dae0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#28 0x00000000004f237a in init_child (rank=0) at core/sr_module.c:947
No locals.
#29 0x00000000004231bb in main_loop () at main.c:1700
        i = 4
        pid = 8165
        si = 0x0
        si_desc = "udp receiver child=3 sock=185.165.211.54:5063\000\000\000OX\350T\205\177\000\000\000Y\350T\205\177\000\000\260\031B\201\000\000\000\000\000\000\000\000\000\000\350\000\000\000t\000\000\000\000\000\030"}Z\205\177\000\000\000\000\000\000\000\000\000\000\001\000\000\000\002\000\000\000\002\000\000\000\374\177\000\000\277\201e\000\000\000\000"
        nrprocs = 4
        woneinit = 1
        __FUNCTION__ = "main_loop"
#30 0x0000000000429a61 in main (argc=3, argv=0x7ffc81421d18) at main.c:2639
        cfg_stream = 0xf55010
        c = -1
        r = 0
        tmp = 0x7f855b58173d <_dl_lookup_symbol_x+349> "\203\370"
        tmp_len = 0
        port = 1
        proto = 32645
        options = 0x72b1b0 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
        ret = -1
        seed = 1992621074
        rfd = 4
        debug_save = 0
        debug_flag = 0
        dont_fork_cnt = 0
        n_lst = 0x7ffc81421bd0
        p = 0x7ffc81421d38 "0>B\201\374\177"
        st = {st_dev = 47, st_ino = 29, st_nlink = 2, st_mode = 16877, st_uid = 104, st_gid = 110, __pad0 = 0, st_rdev = 0, st_size = 60, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1488206724, tv_nsec = 372702711}, st_mtim = {tv_sec = 1491326162, tv_nsec = 837249928}, st_ctim = {tv_sec = 1491326162, tv_nsec = 837249928}, __glibc_reserved = {0, 0, 0}}
        __FUNCTION__ = "main"
(gdb) info locals
No symbol table info available.
(gdb) list
1838    int proto;
1839    char *options;
1840    int ret;
1841    unsigned int seed;
1842    int rfd;
1843    int debug_save, debug_flag;
1844    int dont_fork_cnt;
1845    struct name_lst* n_lst;
1846    char *p;
1847    struct stat st = {0};
```
#### Log Messages
#### SIP Traffic
### Possible Solutions
### Additional Information
kamailio 5.0.0 (x86_64/linux)
* **Operating System**: debian on proxmox 4.4.35-1-pve
Could you please describe the procedure to reproduce the segfault and, if possible, the relevant part of your config? Thanks.
Also, please install the kamailio-dbg package (which has the debug symbols) and grab the backtrace again; it will be more useful.
Using kamailio-dbg:
```
Reading symbols from /usr/sbin/kamailio...Reading symbols from /usr/lib/debug/.build-id/fe/4d6c322f76df685bbec9adafde99fc43c0bc6a.debug...done.
done.
[New LWP 11891]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/sbin/kamailio -f /etc/kamailio/kamailio.cfg'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f226e50760a in t_continue (hash_index=1984, label=122428216, route=0x7f2270bb15f0) at t_suspend.c:411
411     t_suspend.c: No such file or directory.
(gdb) bt full
#0  0x00007f226e50760a in t_continue (hash_index=1984, label=122428216, route=0x7f2270bb15f0) at t_suspend.c:411
        t = 0x7f2262786938
        faked_req = 0x7f226e4763a1 <t_lookup_ident+962>
        faked_req_len = 0
        cancel_data = {cancel_bitmap = 0, reason = {cause = 0, u = {text = {s = 0x0, len = 2}, e2e_cancel = 0x0, packed_hdrs = {s = 0x0, len = 2}}}}
        branch = 0
        uac = 0x0
        ret = 32765
        cb_type = 3
        msg_status = 32765
        last_uac_status = 1646426056
        reply_status = 4
        do_put_on_wait = 1
        hdr = 0xffffffff00000011
        prev = 0x0
        tmp = 0x0
        route_type_bk = 32546
        __FUNCTION__ = "t_continue"
#1  0x00007f226c317848 in async_http_cb (reply=0x7f22627d9d28, param=0x7f22626d16f8) at async_http.c:217
        aq = 0x7f22626d16f8
        act = 0x7f2270bb15f0
        tindex = 1984
        tlabel = 122428216
        t = 0x7f2262786938
        p = 0x0
        newbuf = {s = 0x0, len = 0}
        fmsg = 0x26c3860
        __FUNCTION__ = "async_http_cb"
#2  0x00007f226c311a72 in check_multi_info (g=0x7f22626ac5a8) at http_multi.c:573
        eff_url = 0x26cd640 "https://109.68.161.209:9443/customers/cdr/"
        msg = 0x26c3880
        msgs_left = 0
        easy = 0x26c3860
        res = CURLE_OK
        cell = 0x7f226279b068
        __FUNCTION__ = "check_multi_info"
#3  0x00007f226c3096a4 in event_cb (fd=11, kind=2, userp=0x26c3860) at http_multi.c:145
        g = 0x7f22626ac5a8
        rc = CURLM_OK
        easy = 0x26c3860
        cell = 0x7f226279b068
        __FUNCTION__ = "event_cb"
        action = 1
#4  0x00007f226be483dc in event_base_loop () from /usr/lib/x86_64-linux-gnu/libevent-2.0.so.5
No symbol table info available.
#5  0x00007f226c314c3d in async_http_run_worker (worker=0x7f2262418430) at async_http.c:86
No locals.
#6  0x00007f226c2fe86e in child_init (rank=0) at http_async_client_mod.c:367
---Type <return> to continue, or q <return> to quit---
        pid = 0
        i = 0
        __FUNCTION__ = "child_init"
#7  0x000000000053d8c2 in init_mod_child (m=0x7f2270a5d5c8, rank=0) at core/sr_module.c:921
        __FUNCTION__ = "init_mod_child"
#8  0x000000000053d5e0 in init_mod_child (m=0x7f2270a5e1e0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#9  0x000000000053d5e0 in init_mod_child (m=0x7f2270a5e850, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#10 0x000000000053d5e0 in init_mod_child (m=0x7f2270a5edb0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#11 0x000000000053d5e0 in init_mod_child (m=0x7f2270a5f420, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#12 0x000000000053d5e0 in init_mod_child (m=0x7f2270a5f8d0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#13 0x000000000053d5e0 in init_mod_child (m=0x7f2270a5ffe8, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#14 0x000000000053d5e0 in init_mod_child (m=0x7f2270a60440, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#15 0x000000000053d5e0 in init_mod_child (m=0x7f2270a607e8, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#16 0x000000000053d5e0 in init_mod_child (m=0x7f2270a60b88, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#17 0x000000000053d5e0 in init_mod_child (m=0x7f2270a60fc8, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#18 0x000000000053d5e0 in init_mod_child (m=0x7f2270a61370, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#19 0x000000000053d5e0 in init_mod_child (m=0x7f2270a61780, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#20 0x000000000053d5e0 in init_mod_child (m=0x7f2270a61bc8, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#21 0x000000000053d5e0 in init_mod_child (m=0x7f2270a62380, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#22 0x000000000053d5e0 in init_mod_child (m=0x7f2270a62c00, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#23 0x000000000053d5e0 in init_mod_child (m=0x7f2270a65608, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#24 0x000000000053d5e0 in init_mod_child (m=0x7f2270a66290, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#25 0x000000000053d5e0 in init_mod_child (m=0x7f2270a66660, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#26 0x000000000053d5e0 in init_mod_child (m=0x7f2270a705d0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#27 0x000000000053d5e0 in init_mod_child (m=0x7f2270a71ae0, rank=0) at core/sr_module.c:918
        __FUNCTION__ = "init_mod_child"
#28 0x000000000053dbf0 in init_child (rank=0) at core/sr_module.c:947
No locals.
#29 0x000000000042357c in main_loop () at main.c:1700
---Type <return> to continue, or q <return> to quit---
        i = 4
        pid = 11878
        si = 0x0
        si_desc = "udp receiver child=3 sock=185.165.211.54:5063\000\000\000\361\240u\000\000\000\000\000\000 \266\034$LU\250O\021\000\020\000\000\000\000a\021\230b\000\000\000\000\360xA\000\000\000\000\000\000\213\314Q\375\177", '\000' <repeats 18 times>, "\060\210\314Q\375\177\000\000\035\322^\000\000\000\000"
        nrprocs = 4
        woneinit = 1
        __FUNCTION__ = "main_loop"
#30 0x0000000000429f71 in main (argc=3, argv=0x7ffd51cc8b08) at main.c:2639
        cfg_stream = 0x25de010
        c = -1
        r = 0
        tmp = 0x7f22719c573d <_dl_lookup_symbol_x+349> "\203\370"
        tmp_len = 1897947560
        port = 32546
        proto = 1372359104
        options = 0x737490 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
        ret = -1
        seed = 444768467
        rfd = 4
        debug_save = 0
        debug_flag = 0
        dont_fork_cnt = 0
        n_lst = 0x0
        p = 0x1 <error: Cannot access memory at address 0x1>
        st = {st_dev = 47, st_ino = 29, st_nlink = 2, st_mode = 16877, st_uid = 104, st_gid = 110, __pad0 = 0, st_rdev = 0, st_size = 60, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1488206724, tv_nsec = 372702711}, st_mtim = {tv_sec = 1491386032, tv_nsec = 103114082}, st_ctim = {tv_sec = 1491386032, tv_nsec = 103114082}, __glibc_reserved = {0, 0, 0}}
        __FUNCTION__ = "main"
(gdb)
(gdb) info locals
t = 0x7f2262786938
faked_req = 0x7f226e4763a1 <t_lookup_ident+962>
faked_req_len = 0
cancel_data = {cancel_bitmap = 0, reason = {cause = 0, u = {text = {s = 0x0, len = 2}, e2e_cancel = 0x0, packed_hdrs = {s = 0x0, len = 2}}}}
branch = 0
uac = 0x0
ret = 32765
cb_type = 3
msg_status = 32765
last_uac_status = 1646426056
reply_status = 4
do_put_on_wait = 1
hdr = 0xffffffff00000011
prev = 0x0
tmp = 0x0
route_type_bk = 32546
__FUNCTION__ = "t_continue"
(gdb) list
406     in t_suspend.c
(gdb)
```
The relevant part in the config which makes it crash is:
```
$http_req(all) = $null;        # reset the parameters
$http_req(timeout) = 100;      # 100 ms
$http_req(method) = "POST";
if ($rs=~"^[4-6][0-9][0-9]") {
    $http_req(body) = "{'call_id': '" + $ci + "', 'from_ip': '" + $si + "', 'event': 'notanswered','disposition': '" + $rr + "'}";
    http_async_query("https://<myserver>/customers/", "HTTP_REPLY");
}
```
I have http_async_query for INVITEs, answers, ringing, etc. as well, and those all work fine, but on 4xx, 5xx and 6xx replies it segfaults.
With kamailio-dbg, we do see "411 t_suspend.c: No such file or directory."
@davyvdm please use the proper format (insert code) when pasting gdb output or github gets confused
@linuxmaniac thank you for the remark ;) updated it
@davyvdm - your config snippet is from a reply_route (executed when handling a SIP response)?
@miconda that is correct, when I put the snippet elsewhere (not in reply_route), there is no issue...
@davyvdm ok, I wanted to clarify. Maybe Federico ( @grumvalski ) can say whether the function is expected to run for SIP replies, and investigate whether it behaves as expected in this case.
I'm sorry, I've been quite busy over the last few days. I will try to reproduce/investigate this in the next few days. The function should indeed work in this scenario. @davyvdm: to be clear, are you calling http_async_query in a reply route or in a failure route?
The issue is in the tm module. While processing the resumed reply, since this is the final reply for the transaction, we delete the reply after sending it out. This causes the crash when, at the end of t_continue, we try to access the branch's reply. I've opened a PR, https://github.com/kamailio/kamailio/pull/1063, with a fix. Also, why do you need to suspend the reply in this case? Wouldn't it be fine not to suspend the transaction? See $http_req(suspend) http://www.kamailio.org/docs/modules/devel/modules/http_async_client.html#id....
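For illustration, here is a minimal sketch of the non-suspending variant, based on the snippet earlier in the thread. It assumes that `$http_req(suspend)` behaves as described in the module documentation linked above and that the xlog module is loaded; the shortened body and the contents of `route[HTTP_REPLY]` are illustrative assumptions, not the reporter's actual config.

```
# Sketch only: send the notification without suspending the transaction.
if ($rs=~"^[4-6][0-9][0-9]") {
    $http_req(all) = $null;        # reset the parameters
    $http_req(suspend) = 0;        # do not suspend the transaction for this query
    $http_req(timeout) = 100;      # 100 ms
    $http_req(method) = "POST";
    # Body shortened for illustration; build it as in the original snippet.
    $http_req(body) = "{'call_id': '" + $ci + "', 'event': 'notanswered'}";
    http_async_query("https://<myserver>/customers/", "HTTP_REPLY");
}

# Hypothetical callback: with suspend = 0 the SIP reply is not held back,
# so this route only inspects the HTTP result.
route[HTTP_REPLY] {
    if ($http_ok) {
        xlog("L_INFO", "HTTP notification returned $http_rs\n");
    } else {
        xlog("L_ERR", "HTTP notification failed: $http_err\n");
    }
}
```

Since the SIP reply is forwarded without waiting for the HTTP answer in this variant, it only fits cases where the HTTP result does not need to influence how the reply is handled.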
@davyvdm - have you been able to test with patches from PR #1063? Is all going fine with them?
@davyvdm: any feedback?
Closed #1056.
Closing, reopen if PR #1063 didn't fix it.