When doing a dispatcher reload with kamcmd kamailio crashes.
``` # kamailio -v version: kamailio 5.0.0-dev6 (x86_64/freebsd) 276ccc flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, select, kqueue. id: 276ccc compiled on 17:39:46 Sep 29 2016 with clang 3.4 ```
``` #0 0x00000008098d32c3 in ds_avl_destroy (node_ptr=0x809ed2b40) at dispatch.c:3046 3046 if(dest->uri.s!=NULL) [New Thread 80286b400 (LWP 100148/<unknown>)] (gdb) bt #0 0x00000008098d32c3 in ds_avl_destroy (node_ptr=0x809ed2b40) at dispatch.c:3046 #1 0x00000008098d95cb in ds_destroy_list () at dispatch.c:1000 #2 0x0000000809911ead in destroy () at dispatcher.c:593 #3 0x0000000000508345 in destroy_modules () at sr_module.c:811 #4 0x0000000000517ff7 in cleanup (show_status=1) at main.c:526 #5 0x000000000052011c in shutdown_children (sig=15, show_status=1) at main.c:668 #6 0x000000000051f094 in handle_sigs () at main.c:760 #7 0x0000000000530859 in main_loop () at main.c:1748 #8 0x000000000053b530 in main (argc=15, argv=0x7fffffffeb38) at main.c:2632 Current language: auto; currently minimal ```
``` Nov 3 10:10:17 <local0.info> kamailio[43234]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43233]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43232]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43231]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43230]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43228]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43227]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43226]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43225]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43224]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.info> kamailio[43229]: INFO: <core> [main.c:811]: sig_usr(): signal 15 received Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43233 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43234 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43232 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43231 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43230 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43229 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43228 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43227 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43226 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43225 exited normally, status=0 Nov 3 10:10:17 <local0.alert> kamailio[43223]: ALERT: <core> [main.c:738]: handle_sigs(): child process 43224 exited normally, status=0 Nov 3 10:10:17 <local0.info> kamailio[43223]: INFO: <core> [main.c:756]: handle_sigs(): terminating due to SIGCHLD ```
This happens every time I do reload.
Hi Rick,
there should be a second core-dump for the crash? The BT shows the shutdown, not where the actual crash happened. As I cannot reproduce the issue (works on my systems with the same version you're running), can you shed a bit more light about your configuration and your data?
Thanks, Carsten
Sorry there is no other core-dump created. Sorry I forgot to tell about the system, it is kamailio build from git-source on a FreeBSD 10.3-RELEASE-p7 machine.
I attached gdb to the ctl process and got the following: ``` (gdb) s Single stepping until exit from function _kevent, which has no line number information. 0x0000000800fa4664 in __sysctl () from /lib/libc.so.7 (gdb) s Single stepping until exit from function __sysctl, which has no line number information.
Program received signal SIGSEGV, Segmentation fault. 0x00000008098ce9c8 in add_dest2list (id=2, uri= {s = 0x80299d05a "sip:192.168.1.1", len = 16}, flags=0, priority=0, attrs=0x7fffffff4ed0, list_idx=0, setn=0x7fffffff4f04) at dispatch.c:434 434 if(dp0->priority > dp->priority) Current language: auto; currently minimal ```
The dispatcher information is from database.
Output of 'bt full' is more useful. Also get: 'frame 0' + 'info locals'.
Ok this is what I did, hope that is correct.
``` gdb /usr/local/sbin/kamailio /var/run/kamailio/kamailio.core ```
``` (gdb) bt full #0 0x00000008098d32c3 in ds_avl_destroy (node_ptr=0x809ed2b40) at dispatch.c:3046 node = (ds_set_t *) 0x809ee72d8 i = 2 dest = (ds_dest_t *) 0x1 #1 0x00000008098d95cb in ds_destroy_list () at dispatch.c:1000 No locals. #2 0x0000000809911ead in destroy () at dispatcher.c:593 No locals. #3 0x0000000000508345 in destroy_modules () at sr_module.c:811 t = (struct sr_module *) 0x80186fa50 foo = (struct sr_module *) 0x80186f5a0 #4 0x0000000000517ff7 in cleanup (show_status=1) at main.c:526 memlog = 0 #5 0x000000000052011c in shutdown_children (sig=15, show_status=1) at main.c:668 No locals. #6 0x000000000051f094 in handle_sigs () at main.c:760 chld = 0 chld_status = 0 memlog = 32767 #7 0x0000000000530859 in main_loop () at main.c:1748 i = 8 pid = 43479 si = (struct socket_info *) 0x0 si_desc = "udp receiver child=7 sock=5.159.56.124:5060\000\b\000\000\000\001", '\0' <repeats 15 times>, "\020����\177\000\000\230����\177\000\000\030����\177\000\000\017\000\000\000\000\000\000\000P����\177\000\000��x\000\000\000\000\000U�\232\000\000\000\000\000\000\000\000\000\004\000\000" nrprocs = 8 woneinit = 1 #8 0x000000000053b530 in main (argc=15, argv=0x7fffffffeb18) at main.c:2632 cfg_stream = (FILE *) 0x801206170 c = -1 r = 0 tmp = 0x7fffffffede9 "" tmp_len = 32767 port = -5224 proto = 32767 options = 0x99b412 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:" ret = -1 seed = 1396112342 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = (struct name_lst *) 0xf p = 0x7fffffffeb18 "�����\177" st = {st_dev = 90, st_ino = 402609, st_mode = 16832, st_nlink = 2, st_uid = 986, st_gid = 986, st_rdev = 826095, st_atim = {tv_sec = 1462371141, tv_nsec = 922319000}, st_mtim = { tv_sec = 1478165947, tv_nsec = 81501000}, st_ctim = {tv_sec = 1478165947, tv_nsec = 81501000}, st_size = 512, st_blocks = 8, st_blksize = 32768, st_flags = 0, st_gen = 0, st_lspare = 0, st_birthtim = {tv_sec = 1462371141, tv_nsec = 922274000}} Current language: auto; currently minimal ```
``` (gdb) frame 0 #0 0x00000008098d32c3 in ds_avl_destroy (node_ptr=0x809ed2b40) at dispatch.c:3046 3046 if(dest->uri.s!=NULL) ```
``` (gdb) info locals node = (ds_set_t *) 0x809ee72d8 i = 2 dest = (ds_dest_t *) 0x1 ```
It is needed from the ctl process core file, not from the shutdown core file. You have to enable core file per pid, otherwise it may get overwritten.
Ok now I got two core files this one is the second one:
``` (gdb) bt full #0 0x00000008098ce9c8 in add_dest2list (id=2, uri={s = 0x80299c05a "sip:5.159.56.120", len = 16}, flags=0, priority=0, attrs=0x7fffffff4ed0, list_idx=1, setn=0x7fffffff4f04) at dispatch.c:434 dp = (ds_dest_t *) 0x809ee7810 sp = (ds_set_t *) 0x809ed33d8 dp0 = (ds_dest_t *) 0xc0c0c0c0 dp1 = (ds_dest_t *) 0x0 #1 0x00000008098d7854 in ds_load_db () at dispatch.c:959 i = 0 id = 2 nr_rows = 5 setn = 1 flags = 0 priority = 0 nrcols = 5 dest_errs = 0 uri = {s = 0x80299c05a "sip:5.159.56.120", len = 16} attrs = {s = 0x80299c06f "", len = 0} res = (db1_res_t *) 0x801a214b8 values = (db_val_t *) 0x801a1e4b8 rows = (db_row_t *) 0x801a1e400 query_cols = {0x809b279f8, 0x809b27a08, 0x809b27a18, 0x809b27a28, 0x809b27a38} #2 0x00000008098d904c in ds_reload_db () at dispatch.c:870 ret = 43520061 #3 0x0000000809912a20 in dispatcher_rpc_reload (rpc=0x804c01c48, ctx=0x7fffffff53e0) at dispatcher.c:1177 No locals. #4 0x00000008049bc475 in process_rpc_req (buf=0x802981034 "�\003\024\016F.�\221\022dispatcher.reload", size=27, bytes_needed=0x7fffffff5948, sh=0x7fffffff58c0, saved_state=0x802991038) at binrpc_run.c:675 err = 0 val = {name = {s = 0x802982000 "", len = -43824}, type = 1, u = {strval = {s = 0x80298103d "dispatcher.reload", len = 17}, fval = 1.6997468094767194e-313, intval = 43520061, end = 43520061}} rpc_e = (rpc_export_t *) 0x801ab6578 f_ctx = {in = {ctx = {tlen = 20, cookie = 239480522, type = 0, flags = 1, offset = 20, in_struct = 0, in_array = 0}, s = 0x80298104f "", end = 0x80298104f "", record_no = 0, in_struct = 0}, out = { pkt = {body = 0x802992000 "", end = 0x80299a000 "", crt = 0x802992000 ""}, structs = {next = 0x7fffffff5430, prev = 0x7fffffff5430}}, send_h = 0x7fffffff58c0, method = 0x80298103d "dispatcher.reload", gc = 0x0, replied = 0, err_code = 0, err_phrase = {s = 0x0, len = 0}} ctx = (struct binrpc_parse_ctx *) 0x7fffffff53e0 #5 0x00000008049d7a99 in handle_stream_read (s_c=0x802981000, idx=-1) at io_listener.c:511 bytes_free = 65535 bytes_read = 27 bytes_needed = 0 bytes_processed = 27 r = (struct stream_req *) 0x802981020 sh = {fd = 11, type = 0, from = {sa_in = {s = {sa_len = 186 '�', sa_family = 61 '=', sa_data = "k\000\000\000\000\000\000\020\230\002\b\000\000"}, sin = {sin_len = 186 '�', sin_family = 61 '=', sin_port = 107, sin_addr = {s_addr = 0}, sin_zero = "\000\020\230\002\b\000\000"}, sin6 = {sin6_len = 186 '�', sin6_family = 61 '=', sin6_port = 107, sin6_flowinfo = 0, sin6_addr = {__u6_addr = { __u6_addr8 = "\000\020\230\002\b\000\000\000\v\000\000\000\020\000\000", __u6_addr16 = {4096, 664, 8, 0, 11, 0, 16, 0}, __u6_addr32 = {43520000, 8, 11, 16}}}, sin6_scope_id = 272}}, sa_un = { sun_len = 186 '�', sun_family = 61 '=', sun_path = "k\000\000\000\000\000\000\020\230\002\b\000\000\000\v\000\000\000\020\000\000\000\020\001", '\0' <repeats 14 times>, "��������\033\000\000\000\000\000\000\000\220��\001\b\000\000\000\234>k\000\000\000\000\000 _\177\000\000\000\000\000\210(�\t\b\000\000\000@Y���\177\000\000\f�\235\004\b\000\000\000\000"}}, from_len = 0} #6 0x00000008049d2b35 in handle_io (fm=0x801ac1c78, events=1, idx=-1) at io_listener.c:706 ret = 8 #7 0x00000008049cf66c in io_wait_loop_kqueue (h=0x804c11d80, t=10, repeat=0) at io_wait.h:1182 n = 1 r = 0 tspec = {tv_sec = 10, tv_nsec = 0} fm = (struct fd_map *) 0x801ac1c78 orig_changes = 0 apply_changes = 0 revents = 1 #8 0x00000008049c83b2 in io_listen_loop (fd_no=1, cs_lst=0x80283d700) at io_listener.c:293 max_fd_no = 199 poll_err = 0x0 poll_method = 6 cs = (struct ctrl_socket *) 0x0 type = 2 #9 0x00000008049a6f12 in mod_child (rank=0) at ctl.c:320 rpc_handler = 1 pid = 0 cs = (struct ctrl_socket *) 0x801a33ff8 #10 0x0000000000508a93 in init_mod_child (m=0x8018587e8, rank=0) at sr_module.c:921 No locals. #11 0x00000000005084ba in init_mod_child (m=0x801858e48, rank=0) at sr_module.c:918 No locals. #12 0x00000000005084ba in init_mod_child (m=0x80185a758, rank=0) at sr_module.c:918 No locals. #13 0x00000000005084ba in init_mod_child (m=0x80185ade0, rank=0) at sr_module.c:918 No locals. #14 0x00000000005084ba in init_mod_child (m=0x80185b5b0, rank=0) at sr_module.c:918 No locals. #15 0x00000000005084ba in init_mod_child (m=0x80185bdb0, rank=0) at sr_module.c:918 No locals. #16 0x00000000005084ba in init_mod_child (m=0x80185c510, rank=0) at sr_module.c:918 No locals. #17 0x00000000005084ba in init_mod_child (m=0x80185e008, rank=0) at sr_module.c:918 No locals. #18 0x00000000005084ba in init_mod_child (m=0x80185ee20, rank=0) at sr_module.c:918 No locals. #19 0x00000000005084ba in init_mod_child (m=0x80185f458, rank=0) at sr_module.c:918 No locals. #20 0x00000000005084ba in init_mod_child (m=0x80185f970, rank=0) at sr_module.c:918 No locals. #21 0x00000000005084ba in init_mod_child (m=0x80185fde0, rank=0) at sr_module.c:918 No locals. #22 0x00000000005084ba in init_mod_child (m=0x801860a60, rank=0) at sr_module.c:918 No locals. #23 0x00000000005084ba in init_mod_child (m=0x801860e30, rank=0) at sr_module.c:918 No locals. #24 0x00000000005084ba in init_mod_child (m=0x8018617f8, rank=0) at sr_module.c:918 No locals. #25 0x00000000005084ba in init_mod_child (m=0x801868820, rank=0) at sr_module.c:918 No locals. #26 0x00000000005084ba in init_mod_child (m=0x801869668, rank=0) at sr_module.c:918 No locals. #27 0x00000000005084ba in init_mod_child (m=0x80186a180, rank=0) at sr_module.c:918 No locals. #28 0x00000000005084ba in init_mod_child (m=0x80186aa70, rank=0) at sr_module.c:918 No locals. #29 0x00000000005084ba in init_mod_child (m=0x80186aea0, rank=0) at sr_module.c:918 No locals. #30 0x00000000005084ba in init_mod_child (m=0x80186b2a0, rank=0) at sr_module.c:918 No locals. #31 0x00000000005084ba in init_mod_child (m=0x80186b668, rank=0) at sr_module.c:918 No locals. #32 0x00000000005084ba in init_mod_child (m=0x80186bea8, rank=0) at sr_module.c:918 No locals. #33 0x00000000005084ba in init_mod_child (m=0x80186c658, rank=0) at sr_module.c:918 No locals. #34 0x00000000005084ba in init_mod_child (m=0x80186ced0, rank=0) at sr_module.c:918 No locals. #35 0x00000000005084ba in init_mod_child (m=0x80186d440, rank=0) at sr_module.c:918 No locals. #36 0x00000000005084ba in init_mod_child (m=0x80186dad8, rank=0) at sr_module.c:918 No locals. #37 0x00000000005084ba in init_mod_child (m=0x80186e140, rank=0) at sr_module.c:918 No locals. #38 0x00000000005084ba in init_mod_child (m=0x80186e690, rank=0) at sr_module.c:918 No locals. #39 0x00000000005084ba in init_mod_child (m=0x80186eb38, rank=0) at sr_module.c:918 No locals. #40 0x00000000005084ba in init_mod_child (m=0x80186f1a0, rank=0) at sr_module.c:918 No locals. #41 0x00000000005084ba in init_mod_child (m=0x80186f5a0, rank=0) at sr_module.c:918 No locals. #42 0x00000000005084ba in init_mod_child (m=0x80186fa50, rank=0) at sr_module.c:918 No locals. #43 0x000000000050846a in init_child (rank=0) at sr_module.c:947 No locals. #44 0x000000000052f032 in main_loop () at main.c:1697 i = 8 pid = 44543 si = (struct socket_info *) 0x0 si_desc = "udp receiver child=7 sock=5.159.56.124:5060\000\b\000\000\000\001", '\0' <repeats 15 times>, "\020����\177\000\000\230����\177\000\000\030����\177\000\000\017\000\000\000\000\000\000\000P����\177\000\000��x\000\000\000\000\000U�\232\000\000\000\000\000\000\000\000\000\004\000\000" nrprocs = 8 woneinit = 1 #45 0x000000000053b530 in main (argc=15, argv=0x7fffffffeb18) at main.c:2632 cfg_stream = (FILE *) 0x801206170 c = -1 r = 0 tmp = 0x7fffffffede9 "" tmp_len = 32767 port = -5224 proto = 32767 options = 0x99b412 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:" ret = -1 seed = 1833853928 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = (struct name_lst *) 0xf p = 0x7fffffffeb18 "�����\177" st = {st_dev = 90, st_ino = 402609, st_mode = 16832, st_nlink = 2, st_uid = 986, st_gid = 986, st_rdev = 826095, st_atim = {tv_sec = 1462371141, tv_nsec = 922319000}, st_mtim = {tv_sec = 1478177288, tv_nsec = 435051000}, st_ctim = {tv_sec = 1478177288, tv_nsec = 435051000}, st_size = 512, st_blocks = 8, st_blksize = 32768, st_flags = 0, st_gen = 0, st_lspare = 0, st_birthtim = {tv_sec = 1462371141, tv_nsec = 922274000}} ```
The dp0 pointer ends up in an invalid value:
``` dp0 = (ds_dest_t *) 0xc0c0c0c0 ```
Likely a buffer overflow or a mis-shifting of pointer values.
Can you grab the latest master branch and try again? I pushed some memset inits in dispatcher, likely not to solve the issue, but it's better to work on the last code base to hunt the problem.
Also, run kamailio with '-x qm' -- it will use the memory manager that is more friendly for debugging such issues.
While I was try to test I am getting this with latest master:
``` # kamailio -c loading modules under config path: /usr/local/lib64/kamailio/modules/ 0(67414) ERROR: <core> [sr_module.c:582]: load_module(): could not open module </usr/local/lib64/kamailio/modules/alias_db.so>: /usr/local/lib64/kamailio/modules/alias_db.so: Undefined symbol "set_alias_to_pvar" 0(67414) : <core> [cfg.y:3407]: yyerror_at(): parse error in config file /usr/local/etc/kamailio/kamailio-modules.cfg, line 86, column 12-24: failed to load module 0(67414) ERROR: <core> [modparam.c:152]: set_mod_param_regex(): No module matching <alias_db> found 0(67414) : <core> [cfg.y:3410]: yyerror_at(): parse error in config file /usr/local/etc/kamailio/kamailio-modules.cfg, line 224, column 128: Can't set module parameter 0(67414) ERROR: <core> [modparam.c:152]: set_mod_param_regex(): No module matching <alias_db> found 0(67414) : <core> [cfg.y:3410]: yyerror_at(): parse error in config file /usr/local/etc/kamailio/kamailio-modules.cfg, line 225, column 37: Can't set module parameter 0(67414) ERROR: <core> [cfg.y:3270]: yyparse(): cfg. parser: failed to find command alias_db_lookup (params 1) 0(67414) : <core> [cfg.y:3410]: yyerror_at(): parse error in config file /usr/local/etc/kamailio/kamailio-routing.cfg, line 679, column 36: unknown command, missing loadmodule?
ERROR: bad config file (8 errors) 0(67414) INFO: <core> [sctp_core.c:53]: sctp_core_destroy(): SCTP API not initialized ```
the `inline int set_alias_to_pvar(struct sip_msg* _msg, str *alias, int no, void *p)` doesn't work for me with clang, when changed to `static inline int set_alias_to_pvar(struct sip_msg* _msg, str *alias, int no, void *p)` it works
With latest right now I didn't get a crash.
If you get other issues not related to dispatcher, open a different item here, otherwise is not easy to track what's wrong on a topic.
Closed #846.
I removed the inline from the alias_db function, pushed the change to the master.
Given that the dispatcher is not exposing the issue, I am closing this one. If something new appears, then maybe is better to open a new item in the bug tracker.