It looks like the kamailio-debuginfo RPM was from an older version of Kamailio. I'm not able to reproduce the core file anymore. Could somebody please be so kind as to explain why?
Secondly:
Mar 13 18:12:57 ricvmf-fusion01 kam-scscf[13524]: WARNING: tm [t_lookup.c:1536]: t_unref(): WARNING: script writer didn't release transaction
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: update_pcontact(): no more shm_mem
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_registrar_pcscf [save.c:208]: update_contacts(): failed to update pcscf contact
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: update_pcontact(): no more shm_mem
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_registrar_pcscf [save.c:208]: update_contacts(): failed to update pcscf contact
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: update_pcontact(): no more shm_mem
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_registrar_pcscf [save.c:208]: update_contacts(): failed to update pcscf contact
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: update_pcontact(): no more shm_mem
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_registrar_pcscf [save.c:208]: update_contacts(): failed to update pcscf contact
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13657]: ERROR: <core> [tcp_main.c:4237]: handle_tcpconn_ev(): connect 10.67.64.29:1305 failed
Mar 13 18:12:57 ricvmf-fusion01 kam-scscf[13524]: INFO: ims_registrar_scscf [cxdx_sar.c:79]: create_return_code(): created AVP successfully : [saa_return_code] - [1]
Mar 13 18:12:57 ricvmf-fusion01 kam-scscf[13524]: WARNING: tm [t_lookup.c:1536]: t_unref(): WARNING: script writer didn't release transaction
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: update_pcontact(): no more shm_mem
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_registrar_pcscf [save.c:208]: update_contacts(): failed to update pcscf contact
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: update_pcontact(): no more shm_mem
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_registrar_pcscf [save.c:208]: update_contacts(): failed to update pcscf contact
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: update_pcontact(): no more shm_mem
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_registrar_pcscf [save.c:208]: update_contacts(): failed to update pcscf contact
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: update_pcontact(): no more shm_mem
Mar 13 18:12:57 ricvmf-fusion01 kam-pcscf[13639]: ERROR: ims_registrar_pcscf [save.c:208]: update_contacts(): failed to update pcscf contact
As far as I remember, there was a configuration parameter called "mem=XXX", but I don't see it in the devel cookbook anymore. Any idea what replaced it?
@Hugh: we'll have beer on me once you get here ;)
On 03/13/2014 05:47 PM, Hugh Waite wrote:
Dan,
There are two cores because of a crash in one process followed by a crash when the other processes are trying to shut down.
What's interesting is that the bt doesn't show useful pointers. If you have installed from RPMs make sure the kamailio-debuginfo is from the same build as the other RPMs.
Also, do the logs say anything? There should be a log entry from the kernel for the segfault/signal that says which module crashed (e.g. registrar.so) and possibly (hopefully) an error message just before that.
Hugh
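Hugh's suggestion to look for the kernel's segfault entry could be automated roughly as below. This is a minimal sketch, assuming the x86 Linux log format "name[pid]: segfault at ... ip ... sp ... error ... in object[base+size]"; the format can differ between kernel versions and architectures. The helper name find_segfaults and the sample segfault line are made up for illustration and are not taken from Daniel's logs.

```python
import re

# Assumed kernel message layout (x86 Linux); not guaranteed on every kernel.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<obj>[^\[\s]+)\[)?"
)


def find_segfaults(lines):
    """Return (process name, pid, faulting object or None) per segfault line."""
    results = []
    for line in lines:
        match = SEGFAULT_RE.search(line)
        if match:
            results.append(
                (match.group("comm"), int(match.group("pid")), match.group("obj"))
            )
    return results


# Hypothetical input: the first line is an invented example of the kernel
# format, the second is an ordinary application log line that must be skipped.
sample_lines = [
    "kernel: kamailio[4242]: segfault at 8 ip 00007f0000001234 "
    "sp 00007ffd00001000 error 4 in registrar.so[7f0000000000+20000]",
    "kam-pcscf[13639]: ERROR: ims_usrloc_pcscf [udomain.c:400]: "
    "update_pcontact(): no more shm_mem",
]

for comm, pid, obj in find_segfaults(sample_lines):
    print(comm, pid, obj)
```

Note that a process ending in abort(), as in the qm_free() backtrace further down, dies with SIGABRT, so it would probably not leave a "segfault at" line at all; only the original crash would.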
On 13/03/2014 19:53, Daniel Ciprus wrote: Jason,
I've tried multiple combinations for the pattern, but I'm only getting 2 core files.
Details:
~]# cat /proc/sys/kernel/core_pattern
/tmp/core.%e.sig%s.%p
~]# lsb_release -a
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch:graphics-4.0-amd64:graphics-4.0-noarch:printing-4.0-amd64:printing-4.0-noarch
Distributor ID: RedHatEnterpriseServer
Description: Red Hat Enterprise Linux Server release 6.5 (Santiago)
Release: 6.5
Codename: Santiago
(gdb) bt
#0  0x00000000005350b0 in ?? ()
#1  0x000000000053542a in ?? ()
#2  0x00000000005356c7 in timer_main ()
#3  0x000000000046d572 in main_loop ()
#4  0x000000000047030b in main ()
(gdb) bt full
#0  0x00000000005350b0 in ?? ()
No symbol table info available.
#1  0x000000000053542a in ?? ()
No symbol table info available.
#2  0x00000000005356c7 in timer_main ()
No symbol table info available.
#3  0x000000000046d572 in main_loop ()
No symbol table info available.
#4  0x000000000047030b in main ()
No symbol table info available.
(gdb)
(gdb) bt
#0  0x00000031ba432925 in raise () from /lib64/libc.so.6
#1  0x00000031ba434105 in abort () from /lib64/libc.so.6
#2  0x0000000000546750 in ?? ()
#3  0x000000000054853a in qm_free ()
#4  0x00007f23d98f87de in free_local_ack_unsafe (lack=0x7f23d3319d70) at uac.c:600
#5  0x00007f23d988ea57 in free_cell (dead_cell=0x7f23d3319a70) at h_table.c:217
#6  0x00007f23d988f2ee in free_hash_table () at h_table.c:441
#7  0x00007f23d98a2fca in tm_shutdown () at t_funcs.c:122
#8  0x00000000004f7c7a in destroy_modules ()
#9  0x0000000000466e63 in cleanup ()
#10 0x0000000000467f65 in ?? ()
#11 0x0000000000469679 in handle_sigs ()
#12 0x000000000046db19 in main_loop ()
#13 0x000000000047030b in main ()
(gdb) bt full
#0  0x00000031ba432925 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00000031ba434105 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000000000546750 in ?? ()
No symbol table info available.
#3  0x000000000054853a in qm_free ()
No symbol table info available.
#4  0x00007f23d98f87de in free_local_ack_unsafe (lack=0x7f23d3319d70) at uac.c:600
        __FUNCTION__ = "free_local_ack_unsafe"
#5  0x00007f23d988ea57 in free_cell (dead_cell=0x7f23d3319a70) at h_table.c:217
        b = 0x0
        i = 0
        rpl = 0x0
        tt = 0x0
        foo = 0x2fd3221000
        cbs = 0x0
        cbs_tmp = 0x7f23d35386b8
        __FUNCTION__ = "free_cell"
#6  0x00007f23d988f2ee in free_hash_table () at h_table.c:441
        p_cell = 0x7f23d3319a70
        tmp_cell = 0x7f23d353dca0
        i = 580
        __FUNCTION__ = "free_hash_table"
#7  0x00007f23d98a2fca in tm_shutdown () at t_funcs.c:122
        __FUNCTION__ = "tm_shutdown"
#8  0x00000000004f7c7a in destroy_modules ()
No symbol table info available.
#9  0x0000000000466e63 in cleanup ()
No symbol table info available.
#10 0x0000000000467f65 in ?? ()
No symbol table info available.
#11 0x0000000000469679 in handle_sigs ()
No symbol table info available.
#12 0x000000000046db19 in main_loop ()
No symbol table info available.
#13 0x000000000047030b in main ()
No symbol table info available.
(gdb)
On 03/13/2014 02:58 PM, Jason Penton wrote:
I don't think these cores indicate the real crash. I'd like to get some more detail on what actually happened. Daniel, can you re-create it? Keep in mind that if the core dump configuration on your box does not name cores according to process ID or timestamp, one core will overwrite the other. As a result, you will never see the core that is the root cause.
Which OS are you running?
If it is Linux, I use the following in /etc/sysctl.conf:
kernel.core_pattern=/tmp/core.%e.%p.%h.%t
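To illustrate why the pattern matters, here is a rough sketch of how the kernel's core_pattern specifiers expand into file names. It covers only the specifiers that appear in this thread (%e, %p, %s, %h, %t, plus %%) as described in core(5); expand_core_pattern is a hypothetical helper, not part of any tool, and it ignores kernel.core_uses_pid, which can append the PID when the pattern has no %p.

```python
import socket
import time


def expand_core_pattern(pattern, exe, pid, sig, hostname=None, timestamp=None):
    """Expand a subset of core(5) specifiers: %e, %p, %s, %h, %t and %%."""
    values = {
        "e": exe,                       # executable name
        "p": str(pid),                  # PID of the dumped process
        "s": str(sig),                  # number of the signal causing the dump
        "h": hostname if hostname is not None else socket.gethostname(),
        "t": str(int(timestamp if timestamp is not None else time.time())),
        "%": "%",                       # literal percent sign
    }
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            spec = pattern[i + 1]
            # Specifiers outside this subset are left untouched in this sketch.
            out.append(values.get(spec, "%" + spec))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Two processes of the same binary crashing (PIDs are illustrative only).
for pid in (13524, 13639):
    print(expand_core_pattern("/tmp/core.%e", "kamailio", pid, 6))
    print(expand_core_pattern("/tmp/core.%e.%p.%h.%t", "kamailio", pid, 6))
```

With the first pattern both crashes expand to the same path, so the later core replaces the earlier one; with %p (and %t) in the pattern each process gets its own file.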
On Thu, Mar 13, 2014 at 8:45 PM, Carsten Bock <carsten@ng-voice.com> wrote:
It looks a little bit like a "double free".
You could try to disable the call to "abort()" in case this happens:
mem_safety=1
See: http://www.kamailio.org/wiki/cookbooks/devel/core#mem_safety
Kind regards, Carsten
2014-03-13 17:22 GMT+01:00 Daniel Ciprus <daniel.ciprus@acision.com>:
There are no more core files on the filesystem :-(
On 03/13/2014 12:18 PM, Jason Penton wrote:
I'm afraid this is also not the correct core. Can you check the timestamp on the cores? Can you re-create the crash and send me the correct core?
On Thu, Mar 13, 2014 at 5:36 PM, Daniel Ciprus <daniel.ciprus@acision.com> wrote:
So I cleaned up my junkyard and I got 2 core files:
(gdb) bt
#0  0x00000000005350b0 in ?? ()
#1  0x000000000053542a in ?? ()
#2  0x00000000005356c7 in timer_main ()
#3  0x000000000046d572 in main_loop ()
#4  0x000000000047030b in main ()
(gdb) bt full
#0  0x00000000005350b0 in ?? ()
No symbol table info available.
#1  0x000000000053542a in ?? ()
No symbol table info available.
#2  0x00000000005356c7 in timer_main ()
No symbol table info available.
#3  0x000000000046d572 in main_loop ()
No symbol table info available.
#4  0x000000000047030b in main ()
No symbol table info available.
(gdb)
(gdb) bt full
#0  0x00000031ba432925 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00000031ba434105 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000000000546750 in ?? ()
No symbol table info available.
#3  0x000000000054853a in qm_free ()
No symbol table info available.
#4  0x00007f5bf7d5a7de in free_local_ack_unsafe (lack=0x7f5bf1894528) at uac.c:600
        __FUNCTION__ = "free_local_ack_unsafe"
#5  0x00007f5bf7cf0a57 in free_cell (dead_cell=0x7f5bf1894228) at h_table.c:217
        b = 0x0
        i = 0
        rpl = 0x0
        tt = 0x0
        foo = 0x2ff1683000
        cbs = 0x0
        cbs_tmp = 0x7f5bf198e508
        __FUNCTION__ = "free_cell"
#6  0x00007f5bf7cf12ee in free_hash_table () at h_table.c:441
        p_cell = 0x7f5bf1894228
        tmp_cell = 0x7f5bf1894228
        i = 3533
        __FUNCTION__ = "free_hash_table"
#7  0x00007f5bf7d04fca in tm_shutdown () at t_funcs.c:122
        __FUNCTION__ = "tm_shutdown"
#8  0x00000000004f7c7a in destroy_modules ()
No symbol table info available.
#9  0x0000000000466e63 in cleanup ()
No symbol table info available.
#10 0x0000000000467f65 in ?? ()
No symbol table info available.
#11 0x0000000000469679 in handle_sigs ()
No symbol table info available.
#12 0x000000000046db19 in main_loop ()
No symbol table info available.
#13 0x000000000047030b in main ()
No symbol table info available.
(gdb)
On 03/13/2014 11:18 AM, Jason Penton wrote:
Hi Daniel,
This is the wrong core file. This is the one created on shutdown of Kamailio. Can you do a bt on the other core file that you probably have?
Cheers Jason
On Thu, Mar 13, 2014 at 5:05 PM, Daniel Ciprus <daniel.ciprus@acision.com> wrote:
Folks,
This is happening during the registration on SCSCF.
Server:: kamailio (4.2.0-dev2 (x86_64/linux))
Build:: mi_core.c compiled on 10:01:09 Mar 13 2014 with gcc 4.4.6
Flags:: STATS: Off, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
GIT:: unknown
Now:: Thu Mar 13 11:04:47 2014
Up since:: Thu Mar 13 10:58:12 2014
Up time:: 395 [sec]
(gdb) bt
#0  0x00000031ba432925 in raise () from /lib64/libc.so.6
#1  0x00000031ba434105 in abort () from /lib64/libc.so.6
#2  0x0000000000546750 in ?? ()
#3  0x000000000054853a in qm_free ()
#4  0x00007fb4def5b7de in free_local_ack_unsafe (lack=0x7fb4d8b31728) at uac.c:600
#5  0x00007fb4deef1a57 in free_cell (dead_cell=0x7fb4d8b31428) at h_table.c:217
#6  0x00007fb4deef22ee in free_hash_table () at h_table.c:441
#7  0x00007fb4def05fca in tm_shutdown () at t_funcs.c:122
#8  0x00000000004f7c7a in destroy_modules ()
#9  0x0000000000466e63 in cleanup ()
#10 0x0000000000467f65 in ?? ()
#11 0x0000000000469679 in handle_sigs ()
#12 0x000000000046db19 in main_loop ()
#13 0x000000000047030b in main ()
(gdb) bt full
#0  0x00000031ba432925 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x00000031ba434105 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000000000546750 in ?? ()
No symbol table info available.
#3  0x000000000054853a in qm_free ()
No symbol table info available.
#4  0x00007fb4def5b7de in free_local_ack_unsafe (lack=0x7fb4d8b31728) at uac.c:600
        __FUNCTION__ = "free_local_ack_unsafe"
#5  0x00007fb4deef1a57 in free_cell (dead_cell=0x7fb4d8b31428) at h_table.c:217
        b = 0x0
        i = 0
        rpl = 0x0
        tt = 0x0
        foo = 0x2fd8a8b000
        cbs = 0x0
        cbs_tmp = 0x7fb4d8d9c9e0
        __FUNCTION__ = "free_cell"
#6  0x00007fb4deef22ee in free_hash_table () at h_table.c:441
        p_cell = 0x7fb4d8b31428
        tmp_cell = 0x7fb4d8b31428
        i = 11517
        __FUNCTION__ = "free_hash_table"
#7  0x00007fb4def05fca in tm_shutdown () at t_funcs.c:122
        __FUNCTION__ = "tm_shutdown"
#8  0x00000000004f7c7a in destroy_modules ()
No symbol table info available.
#9  0x0000000000466e63 in cleanup ()
No symbol table info available.
#10 0x0000000000467f65 in ?? ()
No symbol table info available.
#11 0x0000000000469679 in handle_sigs ()
No symbol table info available.
#12 0x000000000046db19 in main_loop ()
No symbol table info available.
#13 0x000000000047030b in main ()
No symbol table info available.
(gdb)
-- Daniel Ciprus Integration engineer http://www.acision.com
9954 Mayland Dr Suite 3100 Richmond, VA 23233 USA T: +1 804 762 5601 E: daniel.ciprus@acision.com
This e-mail and any attachment is for authorised use by the intended recipient(s) only. It may contain proprietary material, confidential information and/or be subject to legal privilege. It should not be copied, disclosed to, retained or used by, any other party. If you are not an intended recipient then please promptly delete this e-mail and any attachment and all copies and inform the sender. Thank you for understanding.
sr-dev mailing list sr-dev@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
-- Carsten Bock CEO (Geschäftsführer)
ng-voice GmbH Schomburgstr. 80 D-22767 Hamburg / Germany
http://www.ng-voice.com mailto:carsten@ng-voice.com
Office +49 40 34927219 Fax +49 40 34927220
Registered office: Hamburg. Court of registration: Amtsgericht Hamburg, HRB 120189. Managing director: Carsten Bock. VAT ID: DE279344284
Our mandatory commercial disclosures can be found here: http://www.ng-voice.com/imprint/
-- Hugh Waite Principal Design Engineer Crocodile RCS Ltd.