### Description
Kamailio segfaults on startup. After a bit of digging, it looks like the crash is triggered when Kamailio tries to publish something related to an expired dialog that is present in the database:
```
kamailio=# select * from dialog;
id               | 90
hash_entry       | 3720
hash_id          | 11295
callid           | msowdcchfzrlapg@localhost
from_uri         | sip:101@example.voismart.com
from_tag         | msfvz
to_uri           | sip:*9001*0039123456@example.voismart.com
to_tag           | HBQj442BFrH0N
caller_cseq      | 131
callee_cseq      | 0
caller_route_set | sip:172.23.42.1;lr=on;ftag=msfvz;nat=yes
callee_route_set |
caller_contact   | sip:101_example_voismart_com@172.23.42.1:5065;alias=192.168.1.201~32878~1
callee_contact   | sip:*9001*0039123456@172.23.42.211:5060;transport=udp
caller_sock      | udp:172.23.42.3:5060
callee_sock      | udp:172.23.42.3:5060
state            | 4
start_time       | 1706715542
timeout          | 1706801943
sflags           | 0
iflags           | 1
toroute_name     |
req_uri          | sip:*9001*0039123456@example.voismart.com
xdata            |
(1 row)
```
### Troubleshooting
#### Debugging Data
```
(gdb) bt full
#0  0x00007f193de1bc93 in free_str_list_all (del_current=0x2e656c706d617865) at /usr/local/src/pkg/src/modules/pua_dialoginfo/pua_dialoginfo.c:1109
        del_next = <optimized out>
        __func__ = "free_str_list_all"
#1  0x00007f193de1ebc6 in free_dlginfo_cell (param=0x7f1942265dc8) at /usr/local/src/pkg/src/modules/pua_dialoginfo/pua_dialoginfo.c:1079
        cell = 0x7f1942265dc8
        cell = <optimized out>
        __func__ = <optimized out>
#2  free_dlginfo_cell (param=0x7f1942265dc8) at /usr/local/src/pkg/src/modules/pua_dialoginfo/pua_dialoginfo.c:1070
        cell = 0x0
        __func__ = "free_dlginfo_cell"
#3  0x00007f1940b7120b in destroy_dlg_callbacks_list (cb=0x0) at /usr/local/src/pkg/src/modules/dialog/dlg_cb.c:74
        cb_t = 0x7f1942266050
        __func__ = "destroy_dlg_callbacks_list"
#4  0x00007f1940ba5229 in destroy_dlg (dlg=0x7f1941b1a8a0) at /usr/local/src/pkg/src/modules/dialog/dlg_hash.c:369
        ret = <optimized out>
        var = <optimized out>
        __func__ = "destroy_dlg"
#5  0x00007f1940ba6b08 in destroy_dlg_table () at /usr/local/src/pkg/src/modules/dialog/dlg_hash.c:436
        dlg = 0x0
        l_dlg = <optimized out>
        i = 3720
        __func__ = <optimized out>
#6  destroy_dlg_table () at /usr/local/src/pkg/src/modules/dialog/dlg_hash.c:422
        dlg = <optimized out>
        l_dlg = <optimized out>
        i = <optimized out>
        __func__ = "destroy_dlg_table"
#7  0x00007f1940b57b06 in mod_destroy () at /usr/local/src/pkg/src/modules/dialog/dialog.c:871
No locals.
#8  0x000055dd96a6d1e2 in destroy_modules () at core/sr_module.c:872
        t = 0x7f1949d9d5f0
        foo = 0x7f1949d94710
        __func__ = "destroy_modules"
#9  0x000055dd96864d01 in cleanup (show_status=1) at /usr/local/src/pkg/src/main.c:573
        memlog = <optimized out>
        __func__ = "cleanup"
#10 0x000055dd96c92a4d in shutdown_children.constprop.0 (show_status=show_status@entry=1, sig=15) at /usr/local/src/pkg/src/main.c:721
        __func__ = <optimized out>
#11 0x000055dd9685fcd5 in handle_sigs () at /usr/local/src/pkg/src/main.c:821
        chld = <optimized out>
        chld_status = 139
        any_chld_stopped = <optimized out>
        memlog = <optimized out>
        __func__ = "handle_sigs"
#12 0x000055dd96868aa4 in main_loop () at /usr/local/src/pkg/src/main.c:1988
        i = <optimized out>
        pid = <optimized out>
        si = <optimized out>
        si_desc = "udp receiver child=31 sock=172.23.42.3:9999\000\335U\000\000\000\260\312I\031\177\000\000\003\000\000\000\377\377\377\377\004\220\312\226\335U\000\000\004\000\000\000\031\177\000\000\003\000\000\000\000\000\000\000\t\000\000\000\000\000\000\000Tue Feb \000\023&\252U\314\342\207:34 2024\005\000\000\000\000\000\000"
        nrprocs = <optimized out>
        woneinit = 1
        error = <optimized out>
        __func__ = "main_loop"
#13 0x000055dd96859ffc in main (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/pkg/src/main.c:3212
        cfg_stream = <optimized out>
        c = <optimized out>
        r = <optimized out>
        tmp = 0x7ffe9d9b7db4 ""
        tmp_len = 1274056464
        port = 32537
        proto = 1275323104
        ahost = 0x0
        aport = 0
        options = 0x55dd96cac300 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
        ret = -1
        seed = 3377142042
        rfd = <optimized out>
        debug_save = <optimized out>
        debug_flag = <optimized out>
        dont_fork_cnt = <optimized out>
        n_lst = <optimized out>
        p = <optimized out>
        st = {st_dev = 194, st_ino = 570312884, st_nlink = 2, st_mode = 16872, st_uid = 101, st_gid = 101, __pad0 = 0, st_rdev = 0,
          st_size = 6, st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1707216241, tv_nsec = 46870706},
          st_mtim = {tv_sec = 1707216227, tv_nsec = 573144090}, st_ctim = {tv_sec = 1707216227, tv_nsec = 573144090},
          __glibc_reserved = {0, 0, 0}}
        tbuf = '\000' <repeats 16 times>, '/' <repeats 16 times>, "\230\r", '\000' <repeats 14 times>, "`", '\000' <repeats 15 times>, "\001", '\000' <repeats 143 times>...
        option_index = 12
        __func__ = "main"
        long_options = {{name = 0x55dd96cabfe2 "help", has_arg = 0, flag = 0x0, val = 104},
          {name = 0x55dd96cb3dc8 "version", has_arg = 0, flag = 0x0, val = 118},
          {name = 0x55dd96cc3731 "alias", has_arg = 1, flag = 0x0, val = 1024},
          {name = 0x55dd96cabfe7 "subst", has_arg = 1, flag = 0x0, val = 1025},
          {name = 0x55dd96cabfed "substdef", has_arg = 1, flag = 0x0, val = 1026},
          {name = 0x55dd96cabff6 "substdefs", has_arg = 1, flag = 0x0, val = 1027},
          {name = 0x55dd96cac000 "server-id", has_arg = 1, flag = 0x0, val = 1028},
          {name = 0x55dd96cac00a "loadmodule", has_arg = 1, flag = 0x0, val = 1029},
          {name = 0x55dd96cac015 "modparam", has_arg = 1, flag = 0x0, val = 1030},
          {name = 0x55dd96cac01e "log-engine", has_arg = 1, flag = 0x0, val = 1031},
          {name = 0x55dd96cb3ee5 "debug", has_arg = 1, flag = 0x0, val = 1032},
          {name = 0x55dd96cac029 "cfg-print", has_arg = 0, flag = 0x0, val = 1033},
          {name = 0x55dd96cac033 "atexit", has_arg = 1, flag = 0x0, val = 1034},
          {name = 0x55dd96cac03a "all-errors", has_arg = 0, flag = 0x0, val = 1035},
          {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
(gdb) info locals
del_next = <optimized out>
__func__ = "free_str_list_all"
(gdb) list
1104    in /usr/local/src/pkg/src/modules/pua_dialoginfo/pua_dialoginfo.c
```
#### Log Messages
```
65(75) DEBUG: dialog [dlg_timer.c:232]: get_expired_dlgs(): start with tl=0x7f1941b1a900 tl->prev=0x7f1941a084d8 tl->next=0x7f1941a084d8 (45536602) at 45536603 and end with end=0x7f1941a084d8 end->prev=0x7f1941b1a900 end->next=0x7f1941b1a900
65(75) DEBUG: dialog [dlg_timer.c:237]: get_expired_dlgs(): getting tl=0x7f1941b1a900 tl->prev=0x7f1941a084d8 tl->next=0x7f1941a084d8 with 45536602
65(75) DEBUG: dialog [dlg_timer.c:243]: get_expired_dlgs(): end with tl=0x7f1941a084d8 tl->prev=0x7f1941b1a900 tl->next=0x7f1941b1a900 and d_timer->first.next->prev=(nil)
65(75) DEBUG: dialog [dlg_timer.c:280]: dlg_timer_routine(): tl=0x7f1941b1a900 next=(nil)
65(75) DEBUG: dialog [dlg_cb.c:267]: run_dlg_callbacks(): dialog=0x7f1941b1a8a0, type=64
65(75) DEBUG: pua_dialoginfo [pua_dialoginfo.c:395]: __dialog_sendpublish(): dialog over, from=sip:101@example.voismart.com
65(75) DEBUG: pua_dialoginfo [dialog_publish.c:410]: dialog_publish_multi(): CALLING dialog_publish for URI
65(75) DEBUG: <core> [core/parser/parse_uri.c:1389]: parse_uri(): bad uri, state 0 parsed: <> (4) / <> (28)
65(75) ERROR: pua_dialoginfo [dialog_publish.c:303]: dialog_publish(): failed to parse the PUBLISH R-URI
123(133) CRITICAL: <core> [core/pass_fd.c:281]: receive_fd(): EOF on 133
123(133) DEBUG: <core> [core/tcp_main.c:3955]: handle_ser_child(): dead child 65, pid 75 (shutting down?)
123(133) DEBUG: <core> [core/io_wait.h:597]: io_watch_del(): DBG: io_watch_del (0x55dd96e1fba0, 133, -1, 0x0) fd_no=157 called
0(1) ALERT: <core> [main.c:791]: handle_sigs(): child process 75 exited by a signal 11
0(1) ALERT: <core> [main.c:795]: handle_sigs(): core was generated
0(1) INFO: <core> [main.c:818]: handle_sigs(): terminating due to SIGCHLD
2(12) INFO: <core> [main.c:874]: sig_usr(): signal 15 received
3(13) INFO: <core> [main.c:874]: sig_usr(): signal 15 received
```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 5.7.4 (x86_64/linux) dc393e
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: dc393e
compiled with gcc 11.4.0
```
* **Operating System**:
```
Linux c7a9e0ef942e 6.6.13-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Sat Jan 20 18:03:28 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
```
Running in a Docker container based on Ubuntu Jammy which builds kamailio from git tag 5.7.4.
The core file from which you took the attached backtrace was generated at shutdown. It is not the one that caused the crash but a side effect of it, as the structures might already be compromised at that phase. In practice, the backtrace above does not reflect the cause of the real crash. You have to enable one core file per PID and get the backtraces from all core files.
From the log messages, there is a hint that the R-URI is invalid; maybe you do some operations in the config that break it. Either way, it would be good to get this caught: a crash should not happen even with an invalid R-URI produced by the config.
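For collecting backtraces from every core file, a small helper along these lines may save some typing. It is only a sketch: it assumes per-PID core files are already being written (for example via the Linux `kernel.core_uses_pid` sysctl or a `%p` in `kernel.core_pattern`) into a single directory, and the function name, the `core*` file pattern and the paths are hypothetical. It only builds the `gdb` command lines, using gdb's `-batch` and `-ex` options with the same `bt full`, `info locals` and `list` commands requested by the issue template.

```python
from pathlib import Path


def gdb_backtrace_commands(binary, core_dir, pattern="core*"):
    """Build one non-interactive gdb command line per core file.

    `binary`, `core_dir` and `pattern` are placeholders for illustration;
    adjust them to wherever the system actually writes its core files.
    """
    commands = []
    for core in sorted(Path(core_dir).glob(pattern)):
        if not core.is_file():
            continue  # skip directories that happen to match the pattern
        commands.append([
            "gdb", "-batch",       # run the commands below, then exit
            "-ex", "bt full",      # full backtrace with local variables
            "-ex", "info locals",  # locals of the innermost frame
            "-ex", "list",         # source listing at the crash location
            str(binary), str(core),
        ])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` and the output saved next to the corresponding core file, so that the backtrace of the process that crashed first can be told apart from the ones generated during shutdown.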
Thanks for the feedback and sorry for the delay.
I reproduced the crash today by importing the dump of the db I made when the crash originally occurred.
I got 3 core dumps. Two have the backtrace as above, with `free_str_list_all` at the top of the call stack, thus neither of those is (if I understand correctly) the process from which the problem originated.
The third coredump has the following backtrace:
```
(gdb) bt full
#0  0x00007ffa4c365840 in dialog_publish_multi (state=state@entry=0x7ffa4c36c267 "terminated", ruris=0x2e656c706d617865, entity=entity@entry=0x7ffd855c6440, peer=peer@entry=0x7ffd855c6450, callid=callid@entry=0x7ffa5077cdf0, initiator=initiator@entry=1, lifetime=10, localtag=0x0, remotetag=0x0, localtarget=0x7ffa5077ce20, remotetarget=0x7ffd855c6430, do_pubruri_localcheck=0, uuid=0x7ffa5077ce50) at /usr/local/src/pkg/src/modules/pua_dialoginfo/dialog_publish.c:410
        __llevel = 3
        __func__ = "dialog_publish_multi"
#1  0x00007ffa4c366f70 in __dialog_sendpublish (dlg=<optimized out>, type=<optimized out>, _params=<optimized out>) at /usr/local/src/pkg/src/modules/pua_dialoginfo/pua_dialoginfo.c:402
        tag = {s = 0x0, len = 0}
        uri = {
          s = 0x7ffa5077ce7c "sip:*9001*0039123456@example.voismart.commsowdcchfzrlapg@localhostmsfvzsip:*9001*0039123456@example.voismart.comsip:101_example_voismart_com@172.23.42.1:5065;alias=192.168.1.201~32878~1padi-1-65d7243e"..., len = 41}
        identity_local = {
          s = 0x7ffa5077ce60 "sip:101@example.voismart.comsip:*9001*0039123456@example.voismart.commsowdcchfzrlapg@localhostmsfvzsip:*9001*0039123456@example.voismart.comsip:101_example_voismart_com@172.23.42.1:5065;alias=192.168."..., len = 28}
        target = {s = 0x0, len = 0}
        dlginfo = 0x7ffa5077cdc8
        request = 0x0
        __func__ = "__dialog_sendpublish"
#2  0x00007ffa4f0b5867 in run_dlg_callbacks (type=64, dlg=0x7ffa500318a0, req=<optimized out>, rpl=<optimized out>, dir=<optimized out>, dlg_data=<optimized out>) at /usr/local/src/pkg/src/modules/dialog/dlg_cb.c:269
        cb = 0x7ffa5077d050
        __func__ = "run_dlg_callbacks"
#3  0x00007ffa4f115df7 in dlg_bye_all (dlg=0x7ffa500318a0, hdrs=0x0) at /usr/local/src/pkg/src/modules/dialog/dlg_req_within.c:858
        all_hdrs = {s = 0x0, len = 0}
        ret = <optimized out>
        __func__ = "dlg_bye_all"
#4  0x00007ffa4f0e6669 in dlg_ontimeout (tl=0x7ffa50031900) at /usr/local/src/pkg/src/modules/dialog/dlg_handlers.c:1670
        dlg = 0x7ffa500318a0
        new_state = 22098
        old_state = 72604856
        unref = 22098
        fmsg = <optimized out>
        timeout_cb = 0x0
        keng = <optimized out>
        evname = {s = 0x4b <error: Cannot access memory at address 0x4b>, len = 72378941}
        __func__ = "dlg_ontimeout"
#5  0x00007ffa4f109460 in dlg_timer_routine (ticks=<optimized out>, attr=<optimized out>) at /usr/local/src/pkg/src/modules/dialog/dlg_timer.c:281
        tl = 0x0
        ctl = 0x7ffa50031900
        __func__ = "dlg_timer_routine"
#6  0x000056520433e01b in compat_old_handler (ti=<optimized out>, tl=<optimized out>, data=<optimized out>) at core/timer.c:980
        t = <optimized out>
#7  0x0000565204353c29 in slow_timer_main () at core/timer.c:1103
        n = <optimized out>
        ret = <optimized out>
        tl = 0x7ffa4ff1f428
        i = <optimized out>
        __func__ = "slow_timer_main"
#8  0x00005652040c2fec in main_loop () at /usr/local/src/pkg/src/main.c:1490
        i = <optimized out>
        pid = <optimized out>
        si = <optimized out>
        si_desc = "udp receiver child=31 sock=172.23.42.2:9999\000RV\000\000\000 \034X\372\177\000\000\003\000\000\000\377\377\377\377\004`P\004RV\000\000\004\000\000\000\372\177\000\000\003\000\000\000\000\000\000\000\t\000\000\000\000\000\000\000Thu Feb \000qv1\335c\376\356:54 2024\005\000\000\000\000\000\000"
        nrprocs = <optimized out>
        woneinit = 1
        error = <optimized out>
        __func__ = "main_loop"
#9  0x00005652040b6ffc in main (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/pkg/src/main.c:3212
        cfg_stream = <optimized out>
        c = <optimized out>
        r = <optimized out>
        tmp = 0x7ffd855c8db4 ""
        tmp_len = 0
        port = 32765
        proto = -2057539472
        ahost = 0x0
        aport = 0
        options = 0x565204509300 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
        ret = -1
        seed = 1294483928
        rfd = <optimized out>
        debug_save = <optimized out>
        debug_flag = <optimized out>
        dont_fork_cnt = <optimized out>
        n_lst = <optimized out>
        p = <optimized out>
        st = {st_dev = 198, st_ino = 827616256, st_nlink = 1, st_mode = 16872, st_uid = 101, st_gid = 101, __pad0 = 0, st_rdev = 0, st_size = 6,
          st_blksize = 4096, st_blocks = 0, st_atim = {tv_sec = 1708594589, tv_nsec = 533404421},
          st_mtim = {tv_sec = 1708598142, tv_nsec = 117863927}, st_ctim = {tv_sec = 1708598142, tv_nsec = 117863927},
          __glibc_reserved = {0, 0, 0}}
        tbuf = '\000' <repeats 32 times>, '/' <repeats 16 times>, "\230\r", '\000' <repeats 14 times>, "`", '\000' <repeats 15 times>, "\001", '\000' <repeats 145 times>...
        option_index = 12
        __func__ = "main"
        long_options = {{name = 0x565204508fe2 "help", has_arg = 0, flag = 0x0, val = 104},
          {name = 0x565204510dc8 "version", has_arg = 0, flag = 0x0, val = 118},
          {name = 0x565204520731 "alias", has_arg = 1, flag = 0x0, val = 1024},
          {name = 0x565204508fe7 "subst", has_arg = 1, flag = 0x0, val = 1025},
          {name = 0x565204508fed "substdef", has_arg = 1, flag = 0x0, val = 1026},
          {name = 0x565204508ff6 "substdefs", has_arg = 1, flag = 0x0, val = 1027},
          {name = 0x565204509000 "server-id", has_arg = 1, flag = 0x0, val = 1028},
          {name = 0x56520450900a "loadmodule", has_arg = 1, flag = 0x0, val = 1029},
          {name = 0x565204509015 "modparam", has_arg = 1, flag = 0x0, val = 1030},
          {name = 0x56520450901e "log-engine", has_arg = 1, flag = 0x0, val = 1031},
          {name = 0x565204510ee5 "debug", has_arg = 1, flag = 0x0, val = 1032},
          {name = 0x565204509029 "cfg-print", has_arg = 0, flag = 0x0, val = 1033},
          {name = 0x565204509033 "atexit", has_arg = 1, flag = 0x0, val = 1034},
          {name = 0x56520450903a "all-errors", has_arg = 0, flag = 0x0, val = 1035},
          {name = 0x0, has_arg = 0, flag = 0x0, val = 0}}
(gdb) info locals
__llevel = 3
__func__ = "dialog_publish_multi"
(gdb) list
405     in /usr/local/src/pkg/src/modules/pua_dialoginfo/dialog_publish.c
```
Let me know if there is anything else I can provide.
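One detail in the backtraces may be worth a closer look: `del_current` in `free_str_list_all` and `ruris` in `dialog_publish_multi` both hold the value `0x2e656c706d617865`, which does not look like the other heap addresses in the dumps. The sketch below (the function name is made up for illustration, and it assumes a little-endian x86_64 layout, as reported by `kamailio -v`) reinterprets a pointer value as raw bytes and returns them as text when every byte is printable ASCII.

```python
def pointer_as_text(value, width=8):
    """Return the pointer's bytes as ASCII text, or None if not printable.

    Assumes a little-endian machine: the least significant byte of the
    pointer is the first byte in memory.
    """
    raw = value.to_bytes(width, byteorder="little")
    if all(0x20 <= b < 0x7F for b in raw):
        return raw.decode("ascii")
    return None


print(pointer_as_text(0x2E656C706D617865))  # example.
print(pointer_as_text(0x7F1942265DC8))      # None (a plausible heap address)
```

The suspicious value decodes to `example.`, the start of the domain `example.voismart.com` used in the dialog's URIs. This suggests, without proving it, that the list pointer was overwritten with (or read from memory holding) URI string data, which would be consistent with the `bad uri` parse error in the log.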
Thanks for providing the additional information. Does it crash only with the DB entry listed above? If not, maybe you can provide the DB dump somewhere for download (with sensitive data removed, of course)?
Thanks for the response and sorry for the delay. I just found the dump I had taken and verified that it still produces the same behavior (I'm currently on `version: kamailio 5.7.4 (x86_64/linux) ed9d7b`).
Dump attached here [kamdb-dump.txt](https://github.com/kamailio/kamailio/files/15092499/kamdb-dump.txt).
Hey @gianluca-nitti,
Do you have any special config for the modules pua, dialog and pua_dialoginfo that you can provide? I am probably missing something when trying to replicate it with the provided dump and default configs.
Thanks,
Hey again, I managed to reproduce it on 5.7.4 (also on 5.7.5), but it seems to be fixed already in 5.8.1 and master.
Can you maybe verify that this is the case?
@xkaraman I confirm it no longer happens after updating to 5.8.1. Thanks!
Closed #3743 as completed.