### Description

Hi,

Kamailio crashed 11 times in a row with no visible cause. The only thing I noticed is that malloc code shows up in every core dump.
### Troubleshooting
#### Reproduction

The crashes occurred at random, 11 in a row.
#### Debugging Data
```
Program terminated with signal 11, Segmentation fault.
#0  0x0000000000622796 in qm_status (qm=0x7fcbe89b4000) at mem/q_malloc.c:788
788                 f!=&(qm->free_hash[h].head); f=f->u.nxt_free, i++, j++){
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.5-7.el6_0.x86_64 db4-4.7.25-18.el6_4.x86_64 elfutils-libelf-0.152-1.el6.x86_64 glibc-2.12-1.132.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.6.x86_64 libacl-2.2.49-6.el6.x86_64 libattr-2.4.44-7.el6.x86_64 libcap-2.16-5.5.el6.x86_64 libcom_err-1.41.12-18.el6.x86_64 libgcc-4.4.7-11.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 lm_sensors-libs-3.1.1-17.el6.x86_64 lua-5.1.4-4.1.el6.x86_64 mysql-libs-5.1.73-3.el6_5.x86_64 net-snmp-libs-5.5-50.el6_6.1.x86_64 nspr-4.10.0-1.el6.x86_64 nss-3.15.1-15.el6.x86_64 nss-softokn-freebl-3.14.3-9.el6.x86_64 nss-util-3.15.1-3.el6.x86_64 openssl-1.0.1e-30.el6_6.4.x86_64 pcre-7.8-6.el6.x86_64 perl-libs-5.10.1-136.el6.x86_64 popt-1.13-7.el6.x86_64 rpm-libs-4.8.0-37.el6.x86_64 tcp_wrappers-libs-7.6-57.el6.x86_64 xz-libs-4.999.9-0.3.beta.20091007git.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt full
#0  0x0000000000622796 in qm_status (qm=0x7fcbe89b4000) at mem/q_malloc.c:788
        f = 0x69746163696c7070
        i = 304
        j = 3
        h = 10
        unused = 0
        memlog = 5
        mem_summary = 3
        __FUNCTION__ = "qm_status"
#1  0x000000000061a758 in qm_debug_frag (qm=0x7fcbe89b4000, f=0x7fcbf83fa268) at mem/q_malloc.c:150
        __FUNCTION__ = "qm_debug_frag"
#2  0x000000000061cd90 in qm_free (qm=0x7fcbe89b4000, p=0x7fcbf83fa298, file=0x7fcc0f8d336d "tm: h_table.c", func=0x7fcc0f8d3648 "free_cell", line=186) at mem/q_malloc.c:468
        f = 0x7fcbf83fa268
        size = 40
        next = 0x400
        prev = 0x7ffffdc3b7a0
        __FUNCTION__ = "qm_free"
#3  0x00007fcc0f814c9d in free_cell (dead_cell=0x7fcbf83ea0e8) at h_table.c:186
        b = 0x7fcbf83fa298 "INVITE sip:0034126498563@proxy SIP/2.0\r\nRecord-Route: sip:192.168.1.5;lr;did=d4d.7fa\r\nCSeq: 1 INVITE\r\nCall-ID: SD7jt2a01-b81723f229af11e61f26a15065\r\nFrom: <sip:012854697@192.168.1."...
        i = 0
        rpl = 0x0
        tt = 0x7fcbe8ad0f18
        foo = 0x7ffffdc3b870
        cbs = 0x0
        cbs_tmp = 0x7fcbf83f2308
        __FUNCTION__ = "free_cell"
#4  0x00007fcc0f859a1c in wait_handler (ti=269273136, wait_tl=0x7fcbf83ea168, data=0x7fcbf83ea0e8) at timer.c:675
        p_cell = 0x7fcbf83ea0e8
        ret = 1
#5  0x00000000005fd647 in timer_list_expire (t=269273136, h=0x7fcbe8a2d908, slow_l=0x7fcbe8a2fb38, slow_mark=15874) at timer.c:888
        tl = 0x7fcbf83ea168
        ret = 269273136
#6  0x00000000005fda8f in timer_handler () at timer.c:953
        saved_ticks = 269273136
        run_slow_timer = 0
        i = 514
        __FUNCTION__ = "timer_handler"
#7  0x00000000005fdefd in timer_main () at timer.c:992
No locals.
#8  0x00000000004a77e1 in main_loop () at main.c:1700
        i = 8
        pid = 0
        si = 0x0
        si_desc = "udp receiver child=7 sock=192.168.1.5:5060\000\177\000\000\200\272\303\375\377\177\000\000\023{N\000\000\000\000\000к\303\375\377\177\000\000\004\000\000\000\000\000\000\000`TA\000\000\000\000\000(\225\236\350\313\177", '\000' <repeats 14 times>, "\001\000\000\000к\303\375\377\177\000\000\266{N\000\000\000\000"
        nrprocs = 8
        __FUNCTION__ = "main_loop"
#9  0x00000000004acfa6 in main (argc=7, argv=0x7ffffdc3bd48) at main.c:2581
        cfg_stream = 0x13b8010
        c = -1
        r = 0
        tmp = 0x7ffffdc3cf70 ""
        tmp_len = 32767
        port = -37503970
        proto = 0
        options = 0x703718 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:"
        ret = -1
        seed = 2884067974
        rfd = 4
        debug_save = 0
        debug_flag = 0
        dont_fork_cnt = 0
        n_lst = 0x40d134
        p = 0xc2 <Address 0xc2 out of bounds>
        __FUNCTION__ = "main"
(gdb) info locals
f = 0x69746163696c7070
i = 304
j = 3
h = 10
unused = 0
memlog = 5
mem_summary = 3
__FUNCTION__ = "qm_status"
(gdb) list
783             LOG_(DEFAULT_FACILITY, memlog, "qm_status: ",
784                             "dumping free list stats :\n");
785             for(h=0,i=0;h<QM_HASH_SIZE;h++){
786                     unused=0;
787                     for (f=qm->free_hash[h].head.u.nxt_free,j=0;
788                             f!=&(qm->free_hash[h].head); f=f->u.nxt_free, i++, j++){
789                                     if (!FRAG_WAS_USED(f)){
790                                                     unused++;
791     #ifdef DBG_QM_MALLOC
792                                                     LOG_(DEFAULT_FACILITY, memlog, "qm_status: ",
(gdb) quit
```
#### Log Messages
```
Mar 9 10:33:49 kamserv /usr/local/sbin/kamailio[2945]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7f064ea52480 (address 0x7f064ea524b0) beginning overwritten(646e756f622d6575)!
Mar 9 10:33:52 kamserv abrtd: Directory 'ccpp-2017-03-09-10:33:49-2945' creation detected
Mar 9 10:33:52 kamserv abrt[3793]: Saved core dump of pid 2945 (/usr/local/sbin/kamailio) to /var/spool/abrt/ccpp-2017-03-09-10:33:49-2945 (608370688 bytes)
Mar 9 10:33:52 kamserv abrtd: Executable '/usr/local/sbin/kamailio' doesn't belong to any package and ProcessUnpackaged is set to 'no'
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2908]: ALERT: <core> [main.c:784]: handle_sigs(): child process 2945 exited by a signal 6
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2908]: ALERT: <core> [main.c:787]: handle_sigs(): core was generated
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2908]: INFO: <core> [main.c:799]: handle_sigs(): terminating due to SIGCHLD
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2953]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2955]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2947]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2941]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2939]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2935]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2927]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2943]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2929]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2937]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2949]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2931]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv /usr/local/sbin/kamailio[2933]: INFO: <core> [main.c:850]: sig_usr(): signal 15 received
Mar 9 10:33:52 kamserv abrtd: 'post-create' on '/var/spool/abrt/ccpp-2017-03-09-10:33:49-2945' exited with 1
Mar 9 10:33:52 kamserv abrtd: Deleting problem directory '/var/spool/abrt/ccpp-2017-03-09-10:33:49-2945'
```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 4.2.8 (x86_64/linux) 4507b8
flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 64MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 4507b8
compiled on 16:19:58 Oct 26 2016 with gcc 4.4.7
```
* **Operating System**:
```
Linux kamserv 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
```
Do you have any custom modules running there?
Hi, no custom modules are running.
The logs suggest a buffer overflow. Tracking it down requires finding the previously allocated memory chunk that wrote over the beginning of the one being freed, i.e. the chunk located just before it in memory.
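A cheap first check is to read the reported values as text, since data that overflows a buffer in a SIP proxy is often part of a message or header. The sketch below assumes a little-endian x86_64 host (as the `kamailio -v` output indicates) and decodes the 64-bit value from the `beginning overwritten(...)` log message and the corrupted `f` pointer from frame #0 of the backtrace. The helper name `word_to_ascii` is made up for illustration.

```python
def word_to_ascii(word: int) -> str:
    """Return the 8 bytes of a 64-bit value in memory order (little-endian) as text."""
    raw = word.to_bytes(8, byteorder="little")
    # Replace non-printable bytes with '.' so binary data stays readable.
    return "".join(chr(b) if 0x20 <= b < 0x7F else "." for b in raw)


# Value reported by qm_debug_frag() in the log message.
print(word_to_ascii(0x646E756F622D6575))  # ue-bound
# Corrupted free-list pointer `f` from frame #0 of the backtrace.
print(word_to_ascii(0x69746163696C7070))  # pplicati
```

Both values decode to printable ASCII, which is consistent with string data having been written past the end of a buffer rather than with random corruption. Which string they come from is not something the values alone can establish.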
I assume it is not easy to give me access to the system so that I can investigate the core file with gdb. Are you familiar with gdb and C pointers?
Yes, I'm familiar with gdb and C pointers. If you give me the commands to run, I can post the results.
Thanks in advance
Did you get any log messages in syslog that contain the `qm_status` string?
Regarding the troubleshooting with gdb, the idea is to print all chunks of memory if you didn't get the `qm_status` log messages.
The wiki has some gdb scripts at:
* https://www.kamailio.org/wiki/tutorials/troubleshooting/memory#using_gdb
However, they target pkg (private) memory because they use `mem_block`. The issue here seems to be in shm (shared) memory, so you have to use `shm_block` instead of `mem_block`.
The target is to find the fragment before the one that has the beginning overwritten, listed in the log message:
```
Mar 9 10:33:49 kamserv /usr/local/sbin/kamailio[2945]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7f064ea52480 (address 0x7f064ea524b0) beginning overwritten(646e756f622d6575)!
```
You can probably adjust the gdb scripts to first print only the fragment addresses, and then print the content of the fragment just before the overwritten one.
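Once the fragment addresses have been printed from gdb, picking the fragment just before the overwritten one is a predecessor lookup in a sorted list. This is a minimal sketch, assuming the addresses were saved one per line in hex. Only `0x7f064ea52480` comes from the log message; the other sample addresses are invented for illustration, and `previous_fragment` is a hypothetical helper.

```python
from bisect import bisect_left


def previous_fragment(addresses, target):
    """Return the highest fragment address strictly below target, or None."""
    ordered = sorted(addresses)
    idx = bisect_left(ordered, target)
    return ordered[idx - 1] if idx > 0 else None


# Hypothetical list of fragment addresses as they might be dumped from gdb;
# only 0x7f064ea52480 is taken from the log message.
dump = """
0x7f064ea52100
0x7f064ea52300
0x7f064ea52480
0x7f064ea52600
"""
frags = [int(line, 16) for line in dump.split()]

prev = previous_fragment(frags, 0x7F064EA52480)
print(hex(prev))  # 0x7f064ea52300
```

The fragment at the returned address is the candidate to inspect next in gdb, since its payload ends right where the overwritten fragment header begins.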
Did you get the chance to investigate the corefile as per previous comment?
Please reopen when you have more data to follow up.
Closed #1026.