Messages from Kamailio:
/usr/sbin/kamailio[26653]: : <core> [mem/q_malloc.c:468]: qm_free(): BUG: qm_free: freeing already freed pointer (0x7ff0bf995ea8), called from <core>: mem/shm_mem.c: sh_realloc(88), first free <core>: mem/shm_mem.c: sh_realloc(88) - aborting
/usr/sbin/kamailio[26631]: ALERT: <core> [main.c:775]: handle_sigs(): child process 26653 exited by a signal 6
/usr/sbin/kamailio[26631]: ALERT: <core> [main.c:778]: handle_sigs(): core was generated
/usr/sbin/kamailio[26631]: INFO: <core> [main.c:790]: handle_sigs(): INFO: terminating due to SIGCHLD
Core was generated by `/usr/sbin/kamailio -P /var/run/kamailio.pid -m 128 -M 8 -u kamailio -g kamailio'.
(gdb) bt
#0 0x000000395c032925 in raise () from /lib64/libc.so.6
#1 0x000000395c034105 in abort () from /lib64/libc.so.6
#2 0x00000000005486e0 in qm_free (qm=0x7ff0befe1000, p=0x7ff0bf995ea8, file=0x61c530 "<core>: mem/shm_mem.c", func=0x61cccc "sh_realloc", line=88) at mem/q_malloc.c:470
#3 0x000000000054e70e in sh_realloc (p=0x7ff0bf995ea8, size=1011) at mem/shm_mem.c:88
#4 0x000000000054e8ad in _shm_resize (p=0x7ff0bf995ea8, s=1011, file=0x7ff0cc2cfb4b "tm: t_reply.c", func=0x7ff0cc2d24d1 "relay_reply", line=1954) at mem/shm_mem.c:114
#5 0x00007ff0cc29a0ca in relay_reply (t=0x7ff0bf96f370, p_msg=0x7ff0cd567828, branch=0, msg_status=180, cancel_data=0x7fff2689c160, do_put_on_wait=1) at t_reply.c:1953
#6 0x00007ff0cc29c935 in reply_received (p_msg=0x7ff0cd567828) at t_reply.c:2496
#7 0x000000000045d66f in do_forward_reply (msg=0x7ff0cd567828, mode=0) at forward.c:777
#8 0x000000000045df30 in forward_reply (msg=0x7ff0cd567828) at forward.c:860
#9 0x00000000004a558f in receive_msg (buf=0x9235e0 "SIP/2.0 180 Ringing\r\nVia: SIP/2.0/UDP 129.240.254.5;branch=z9hG4bK5308.7264ce2b6f84567135ea5d9fdac037e8.0\r\nVia: SIP/2.0/UDP 129.240.254.6;rport=5060;branch=z9hG4bK5308.90aa6ae9c3f480d0ba108044389e9387"..., len=937, rcv_info=0x7fff2689c4e0) at receive.c:273
#10 0x000000000053c394 in udp_rcv_loop () at udp_server.c:536
#11 0x000000000046d263 in main_loop () at main.c:1617
#12 0x000000000047030b in main (argc=21, argv=0x7fff2689c818) at main.c:2533
# /usr/sbin/kamailio -V
version: kamailio 4.1.0 (x86_64/linux)
flags: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 4MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled on 15:29:28 Jan 7 2014 with gcc 4.4.7
Git rev d75bc3b69151a9d0391309e6bb51784f3b6b9a83.
Have the core file available if needed.
Øyvind
Are you using async module? Anything else significant in your cfg file?
sr-dev mailing list sr-dev@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
Okay, I see it is happening during a resize of shm memory. Can you re-create it on demand? Is it load related?
On Tue, Feb 4, 2014 at 6:34 PM, Øyvind Kolbu oyvind.kolbu@usit.uio.no wrote:
On 04.02.2014 17:30, Jason Penton wrote:
Are you using async module? Anything else significant in your cfg file?
No async module. Depends on what you call significant.. We use dialog, db_postgres, ldap, siptrace and other pretty normal stuff.
-- Øyvind
Okay, I see it is happening during a resize of shm memory. Can you re-create it on demand? Is it load related?
Cannot re-create it on demand, but this is the second time in two weeks that something like this has happened. From our graphs it happened just after a peak in CPU usage, and the machine might actually have been swapping slightly. We could add some more memory to the machine to see if the problem goes away.
Hello,
this happens due to an abort() that is executed only when MEMDBG is set (memory debugging is turned on). Its main purpose is to spot double frees.
For production, either MEMDBG should not be set or you should set the mem_safety global parameter. You will get the log message, but the application keeps running.
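The two behaviours described above can be illustrated with a small, self-contained C sketch. This is not Kamailio's q_malloc code: the fixed-size pool, the functions sketch_alloc(), sketch_free() and sketch_set_safety(), and the safety_on flag are all hypothetical names chosen for illustration. It only shows the general idea of marking a block as freed, detecting a second free of the same block, and then either aborting (debug behaviour) or logging and continuing (safety behaviour).

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not Kamailio's q_malloc code. */

#define POOL_SLOTS   8
#define SLOT_PAYLOAD 64

struct slot {
    int used;                 /* 1 while allocated, 0 when free */
    const char *freed_by;     /* label of the caller that freed it last */
    unsigned char payload[SLOT_PAYLOAD];
};

static struct slot pool[POOL_SLOTS];
static int safety_on = 0;     /* stand-in for a mem_safety-like switch */

void sketch_set_safety(int on)
{
    safety_on = on;
}

/* Hand out the first unused slot, or NULL when the pool is exhausted. */
void *sketch_alloc(void)
{
    for (int i = 0; i < POOL_SLOTS; i++) {
        if (!pool[i].used) {
            pool[i].used = 1;
            pool[i].freed_by = NULL;
            return pool[i].payload;
        }
    }
    return NULL;
}

/*
 * Free a pointer previously returned by sketch_alloc().
 * Returns 0 on a normal free, 1 when a double free was detected and ignored.
 */
int sketch_free(void *p, const char *caller)
{
    /* Step back from the payload to the slot header that tracks its state. */
    struct slot *s = (struct slot *)((unsigned char *)p
                                     - offsetof(struct slot, payload));

    if (!s->used) {
        fprintf(stderr,
                "BUG: freeing already freed pointer (%p), "
                "called from %s, first free %s\n",
                p, caller, s->freed_by ? s->freed_by : "(unknown)");
        if (!safety_on)
            abort();          /* debug behaviour: stop and leave a core */
        return 1;             /* safety behaviour: log and carry on */
    }

    s->used = 0;
    s->freed_by = caller;
    return 0;
}
```

With the flag enabled via sketch_set_safety(1), a second sketch_free() on the same pointer only logs and returns 1; with the flag left at 0, the same call ends in abort(), which is the kind of signal 6 exit shown in the logs at the start of this thread.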
I also recommend upgrading to 4.1.1; there were some fixes that affect this case as well.
On the other hand, I will try to see what situation could lead to the two frees once I get a chance (I am still in the middle of a travel period for a while).
Cheers, Daniel
OK, will try a git revision after 4.1.1. I'm keeping the core file in case you want it or a full backtrace.
Both MEMDBG and mem_safety are at their defaults. Neither our config nor our build changes them.
Indeed, I checked 4.1 and MEMDBG is set for the branch. You can set this in the config:
mem_safety=1
Cheers, Daniel
OK, will try it! Seems like the default should be switched, to better cope with bugs in code on production systems.
Another server crashed with the same error message just before I saw your advice ;/
On 05.02.2014 09:08, Daniel-Constantin Mierla wrote:
Indeed, I checked 4.1 and MEMDBG is set for the branch. You can set this in the config:
mem_safety=1
Forgot to set mem_safety on one server after upgrading to a post-4.1.1 git revision, and it crashed again:
(gdb) bt
#0 0x000000395c032925 in raise () from /lib64/libc.so.6
#1 0x000000395c034105 in abort () from /lib64/libc.so.6
#2 0x00000000005486e0 in qm_free (qm=0x7f6ec2dba000, p=0x7f6ec314ac68, file=0x61c530 "<core>: mem/shm_mem.c", func=0x61cccc "sh_realloc", line=88) at mem/q_malloc.c:470
#3 0x000000000054e70e in sh_realloc (p=0x7f6ec314ac68, size=720) at mem/shm_mem.c:88
#4 0x000000000054e8ad in _shm_resize (p=0x7f6ec314ac68, s=720, file=0x7f6ecfe249cb "tm: t_reply.c", func=0x7f6ecfe272b1 "relay_reply", line=1949) at mem/shm_mem.c:114
#5 0x00007f6ecfdeef4b in relay_reply (t=0x7f6ec40aab28, p_msg=0x7f6ed130c5a0, branch=3, msg_status=180, cancel_data=0x7fff1a264320, do_put_on_wait=1) at t_reply.c:1948
#6 0x00007f6ecfdf17b6 in reply_received (p_msg=0x7f6ed130c5a0) at t_reply.c:2491
#7 0x000000000045d66f in do_forward_reply (msg=0x7f6ed130c5a0, mode=0) at forward.c:777
#8 0x000000000045df30 in forward_reply (msg=0x7f6ed130c5a0) at forward.c:860
#9 0x00000000004a558f in receive_msg (buf=0x9235e0 "SIP/2.0 180 Ringing\r\nVia: SIP/2.0/UDP 129.240.254.5;branch=z9hG4bK823a.f082d80b4a3abcb7dbdfb45c4218c5bb.3\r\nVia: SIP/2.0/UDP 129.240.254.71:5060;branch=z9hG4bK236fae70;rport=5060\r\nFrom: "56435" <sip:56"..., len=646, rcv_info=0x7fff1a2646a0) at receive.c:273
#10 0x000000000053c394 in udp_rcv_loop () at udp_server.c:536
#11 0x000000000046d263 in main_loop () at main.c:1617
#12 0x000000000047030b in main (argc=21, argv=0x7fff1a2649d8) at main.c:2533
# /usr/sbin/kamailio -V
version: kamailio 4.1.1 (x86_64/linux)
flags: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 4MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled on 17:13:06 Jan 29 2014 with gcc 4.4.7
gitrev d0e32ab598ec13756da96dbc3651aaae72bfd92b