Hi all,
We got a core dump in the dialog module. We are using Kamailio 3.1.2. Below you can find a full backtrace from the dump and the Kamailio compilation options. If you need further information, please don't hesitate to ask. I can't pin down the situation in which it is generated because this server is under quite a high load.
Thanks in advance. Antón
(gdb) bt full
#0  unref_dlg (dlg=0x7f08a9f67da8, cnt=1) at dlg_hash.c:598
        d_entry = (struct dlg_entry *) 0x7f10304b8b68
#1  0x00007f08ce92fa02 in run_trans_callbacks_internal (cb_lst=0x7f08aa203e98, type=32768, trans=0x7f08aa203e28, params=0x7fff49059a10) at t_hooks.c:290
        cbp = (struct tm_callback *) 0x7f08a9f6e7e0
        backup_from = (avp_list_t *) 0x8b3330
        backup_to = (avp_list_t *) 0x8b3338
        backup_dom_from = (avp_list_t *) 0x8b3340
        backup_dom_to = (avp_list_t *) 0x8b3348
        backup_uri_from = (avp_list_t *) 0x8b3320
        backup_uri_to = (avp_list_t *) 0x8b3328
#2  0x00007f08ce92fc56 in run_trans_callbacks (type=32768, trans=<value optimized out>, req=0x1, rpl=0x7f10304b8b68, code=-868566200) at t_hooks.c:317
        params = {req = 0x0, rpl = 0x0, param = 0x7f08a9f6e7f0, code = 0, flags = 0, branch = 0, t_rbuf = 0x0, dst = 0x0, send_buf = {s = 0x0, len = 0}}
#3  0x00007f08ce915b36 in free_cell (dead_cell=0x7f08aa203e28) at h_table.c:136
        b = <value optimized out>
        i = <value optimized out>
        rpl = <value optimized out>
        tt = <value optimized out>
        foo = <value optimized out>
        cbs = <value optimized out>
        __FUNCTION__ = "free_cell"
#4  0x00007f08ce9319f1 in wait_handler (ti=<value optimized out>, wait_tl=<value optimized out>, data=<value optimized out>) at timer.c:645
        p_cell = (struct cell *) 0x7f08aa203e28
#5  0x0000000000513d8f in timer_main () at timer.c:894
No locals.
#6  0x000000000046501b in main_loop () at main.c:1618
        i = 4
        pid = <value optimized out>
        si = (struct socket_info *) 0x0
        si_desc = "udp receiver child=3 sock=XXX.XXX.XXX.XX:XXXX\000\000\000\210�\231\000\000\000\000\000\031", '\0' <repeats 15 times>, "\001\000\000\000\000\000\000\000�\215\213", '\0' <repeats 13 times>, "\004", '\0' <repeats 15 times>, "\b\236\005I�\177\000\000\227%J\000\000\000\000"
#7  0x0000000000467873 in main (argc=<value optimized out>, argv=0x7fff49059e08) at main.c:2398
        cfg_stream = (FILE *) 0x12e1010
        c = <value optimized out>
        r = <value optimized out>
        tmp = 0x7fff4905ae90 ""
        tmp_len = 32520
        port = <value optimized out>
        proto = <value optimized out>
        ret = <value optimized out>
        seed = 1235801225
        rfd = 4
        debug_save = <value optimized out>
        debug_flag = 0
        dont_fork_cnt = 0
        n_lst = <value optimized out>
        p = <value optimized out>
(gdb) quit

kamailio2:/var/kamailio# kamailio -V
version: kamailio 3.1.2 (x86_64/linux) eb24c1-dirty
flags: STATS: Off, USE_IPV6, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: eb24c1 -dirty
compiled on 09:35:52 Apr 28 2011 with gcc 4.3.2
Hey,
On 12.05.2011 12:37, Anton Roman wrote:
We got a core dump in the dialog module. We are using Kamailio 3.1.2. Below you can find a full backtrace from the dump and the Kamailio compilation options. If you need further information, please don't hesitate to ask. I can't pin down the situation in which it is generated because this server is under quite a high load.
The call path seems to be like this:
transaction timer fires -> tm module walking through callback list finds unref_dlg() -> tm module calls unref_dlg() -> boom.
I wonder why unref_dlg() was registered as a tm callback in the first place -- the dialog module shouldn't do that. Are you using any custom modules that would possibly do such registrations?
As to the reason for the segfault, the dialog structure or hash table may already be gone when unref_dlg() is called. Can you go to stack frame #0 and tell us the value of each of the following data structures (use "p <data structure>" in gdb):
*dlg
d_table
d_table->entries
Cheers,
--Timo
Hello,
my answer is inline:
2011/5/12 Timo Reimann timo.reimann@1und1.de
Hey,
On 12.05.2011 12:37, Anton Roman wrote:
We got a core dump in the dialog module. We are using Kamailio 3.1.2. Below you can find a full backtrace from the dump and the Kamailio compilation options. If you need further information, please don't hesitate to ask. I can't pin down the situation in which it is generated because this server is under quite a high load.
The call path seems to be like this:
transaction timer fires -> tm module walking through callback list finds unref_dlg() -> tm module calls unref_dlg() -> boom.
I wonder why unref_dlg() was registered as a tm callback in the first place -- the dialog module shouldn't do that. Are you using any custom modules that would possibly do such registrations?
No, we aren't. We got the code of all the modules directly from the git repository.
As to the reason for the segfault, the dialog structure or hash table may already be gone when unref_dlg() is called. Can you go to stack frame #0 and tell us the value of each of the following data structures (use "p <data structure>" in gdb):
*dlg
d_table
d_table->entries
Here you have:
(gdb) p *dlg
$1 = {ref = 793790803, next = 0xa0d4b4f20303032, prev = 0x504953203a616956,
  h_id = 808333871, h_entry = 1346655535, state = 775174432,
  lifetime = 841888562, start_ts = 892219952, dflags = 808794678,
  sflags = 1648046134, toroute = 1668178290,
  toroute_name = {s = 0x62344768397a3d68 <Address 0x62344768397a3d68 out of bounds>, len = 946221643},
  from_rr_nb = 1886534457,
  tl = {next = 0x72460a0d30363035, prev = 0x6f6e4122203a6d6f, timeout = 1869445486},
  callid = {s = 0x6f6e613a7069733c <Address 0x6f6e613a7069733c out of bounds>, len = 1869445486},
  from_uri = {s = 0x3230322e33322e34 <Address 0x3230322e33322e34 out of bounds>, len = 1043739950},
  to_uri = {s = 0x396637643173613d <Address 0x396637643173613d out of bounds>, len = 221656933},
  req_uri = {s = 0x34333a7069733c20 <Address 0x34333a7069733c20 out of bounds>, len = 925972025},
  tag = {{s = 0x33322e3539314030 <Address 0x33322e3539314030 out of bounds>, len = 942747189},
         {s = 0x743b3e303630353a <Address 0x743b3e303630353a out of bounds>, len = 1178429281}},
  cseq = {{s = 0x364134322d344434 <Address 0x364134322d344434 out of bounds>, len = 1631848973},
          {s = 0x203932202c697246 <Address 0x203932202c697246 out of bounds>, len = 544236883}},
  route_set = {{s = 0x343a30313a333020 <Address 0x343a30313a333020 out of bounds>, len = 1296506937},
               {s = 0x203a44492d6c6c61 <Address 0x203a44492d6c6c61 out of bounds>, len = 1630549808}},
  contact = {{s = 0x6639633663313634 <Address 0x6639633663313634 out of bounds>, len = 858808881},
             {s = 0x6464363632663631 <Address 0x6464363632663631 out of bounds>, len = 775174464}},
  bind_addr = {0x530a0d36352e3230, 0x43203a7265767265},
  cbs = {first = 0x5049532d6f637369, types = 1702125895},
  profile_links = 0x782e32312d534f49}
(gdb) p d_table
$2 = (struct dlg_table *) 0x7f08a9e38ee0
(gdb) p d_table->entries
$3 = (struct dlg_entry *) 0x7f08a9e38f00
(gdb) info f
Stack level 0, frame at 0x7fff49059980:
 rip = 0x7f08cc18fa39 in unref_dlg (dlg_hash.c:598); saved rip 0x7f08ce92fa02
 called by frame at 0x7fff49059a10
 source language c.
 Arglist at 0x7fff490598a8, args: dlg=0x7f08a9f67da8, cnt=1
 Locals at 0x7fff490598a8, Previous frame's sp is 0x7fff49059980
 Saved registers:
  rbx at 0x7fff49059948, rbp at 0x7fff49059950, r12 at 0x7fff49059958,
  r13 at 0x7fff49059960, r14 at 0x7fff49059968, r15 at 0x7fff49059970,
  rip at 0x7fff49059978
(gdb)
Cheers,
--Timo
Thank you very much,
Regards, Antón
Hey Anton,
On 12.05.2011 15:55, Anton Roman wrote:
my answer is inline:
2011/5/12 Timo Reimann <timo.reimann@1und1.de>
As to the reason for the segfault, the dialog structure or hash table may already be gone when unref_dlg() is called. Can you go to stack frame #0 and tell us the value of each of the following data structures (use "p <data structure>" in gdb):
*dlg
d_table
d_table->entries
Here you have:
(gdb) p *dlg
$1 = {ref = 793790803, next = 0xa0d4b4f20303032, prev = 0x504953203a616956,
  h_id = 808333871, h_entry = 1346655535, state = 775174432,
  lifetime = 841888562, start_ts = 892219952, dflags = 808794678,
  sflags = 1648046134, toroute = 1668178290,
  toroute_name = {s = 0x62344768397a3d68 <Address 0x62344768397a3d68 out of bounds>, len = 946221643},
  from_rr_nb = 1886534457,
  tl = {next = 0x72460a0d30363035, prev = 0x6f6e4122203a6d6f, timeout = 1869445486},
  callid = {s = 0x6f6e613a7069733c <Address 0x6f6e613a7069733c out of bounds>, len = 1869445486},
  from_uri = {s = 0x3230322e33322e34 <Address 0x3230322e33322e34 out of bounds>, len = 1043739950},
  to_uri = {
[...]
As I suspected, your dialog seems to be stale already: the reference count is 793790803, and the Call-ID is supposed to be roughly 2 billion characters long. That's what I call unique. :)
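One way to see why these values are implausible is to look at the bytes they occupy in memory. The sketch below is not part of the original analysis; it takes a few fields copied from the `p *dlg` output above and prints them as raw bytes, assuming the little-endian layout of the x86_64 build reported by `kamailio -V`. The helper name `decode_le` is made up for this illustration.

```python
def decode_le(value: int, size: int) -> bytes:
    """Return the bytes an integer of `size` bytes occupies on a little-endian CPU."""
    return value.to_bytes(size, "little")


# (field name, value as printed by gdb, field size in bytes)
fields = [
    ("ref", 793790803, 4),
    ("next", 0x0A0D4B4F20303032, 8),
    ("prev", 0x504953203A616956, 8),
    ("callid.s", 0x6F6E613A7069733C, 8),
    ("callid.len", 1869445486, 4),
]

for name, value, size in fields:
    print(f"{name:<11} {decode_le(value, size)!r}")

# ref         b'SIP/'
# next        b'200 OK\r\n'
# prev        b'Via: SIP'
# callid.s    b'<sip:ano'
# callid.len  b'nymo'
```

The fields read as fragments of a SIP reply ("SIP/", "200 OK", "Via: SIP"), which suggests that the memory the dlg pointer refers to no longer holds a dialog structure and has been reused for message text. This fits the suspicion that the dialog was already gone when unref_dlg() was called, but it does not by itself show who released it.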
I could ask you for more details on the dump but it'd probably be easiest if I could take a direct (gdb-)look at it. Would you mind sending it to me in private (i.e., no CC to the mailing list) to the address I am writing from?
Cheers,
--Timo
Hey,
On 13.05.2011 11:11, Timo Reimann wrote:
[...]
As I suspected, your dialog seems to be stale already: the reference count is 793790803, and the Call-ID is supposed to be roughly 2 billion characters long. That's what I call unique. :)
I could ask you for more details on the dump but it'd probably be easiest if I could take a direct (gdb-)look at it. Would you mind sending it to me in private (i.e., no CC to the mailing list) to the address I am writing from?
I (and Marius -- credits!) dug through your core dump and found a few curiosities. Before I bug you with the details, let me just say this: there might be something wrong with the dialog reference counter that determines when a dialog is to be removed from the hash table. In fact, your call stack indicates that an unreference operation was attempted on a hash table entry which looks empty:
(gdb) frame 0
#0  unref_dlg (dlg=0x7f08a9f67da8, cnt=1) at dlg_hash.c:598
598         dlg_lock( d_table, d_entry);
(gdb) p *d_table->entries
$53 = {first = 0x0, last = 0x0, next_id = 1124074261, lock_idx = 0}
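To make the suspected failure mode concrete, here is a toy model of a reference-counted dialog table. It is only a sketch under assumptions: the classes `Dialog` and `DialogTable` and the function `run` are hypothetical and do not mirror the dialog module's code, and whether the real callback path takes its own reference is exactly what is in question. The model shows how a callback that fires late ends up unreferencing a dialog that has already left the table when nobody counted the callback's reference.

```python
class Dialog:
    def __init__(self, call_id):
        self.call_id = call_id
        self.ref = 0


class DialogTable:
    """Hypothetical stand-in for a dialog hash table with reference counting."""

    def __init__(self):
        self.entries = {}

    def link(self, dlg):
        self.entries[dlg.call_id] = dlg
        dlg.ref += 1  # the table itself holds one reference

    def ref(self, dlg, cnt=1):
        dlg.ref += cnt

    def unref(self, dlg, cnt=1):
        # Python can detect the stale case; C code would touch freed memory here.
        if self.entries.get(dlg.call_id) is not dlg:
            raise LookupError("unref on a dialog that already left the table")
        dlg.ref -= cnt
        if dlg.ref <= 0:
            del self.entries[dlg.call_id]  # last reference gone: remove dialog


def run(take_ref_for_callback):
    table = DialogTable()
    dlg = Dialog("call-1")
    table.link(dlg)

    callbacks = []
    if take_ref_for_callback:
        table.ref(dlg)  # account for the pointer stored in the callback
    callbacks.append(lambda: table.unref(dlg))

    table.unref(dlg)  # dialog terminates, the table drops its reference

    try:
        for cb in callbacks:  # the transaction is destroyed later
            cb()
        return "ok"
    except LookupError:
        return "stale unref"


print(run(take_ref_for_callback=True))   # ok
print(run(take_ref_for_callback=False))  # stale unref
```

In the balanced case every stored pointer is matched by one counted reference, so the last unref is also the one that removes the dialog. In the unbalanced case the dialog disappears while a callback still points at it, which is the kind of situation the empty table entry above hints at.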
Looking through the mailing-list archive, I noticed you brought attention to another reference counter-related bug which Daniel provided a fix for with commit 2c28a251a. Since you reported that no more issues appeared with that fixed version, I just backported the patch into 3.1. However, I can see from your core dump that you are not using a Kamailio version that includes the fix.
Before we continue with any bug hunting, could you try a version of Kamailio that comes with Daniel's "safer unref of terminated dialogs" patch? This can be a copy of the master branch or a recent copy of the 3.1 git branch. I'd suggest the latter so we can ensure that no bleeding-edge features added to the dialog module distort our analysis.
Thanks and
Cheers,
--Timo
Hi,
Yes, you're totally right. We got the core on another server, and I thought the fix was included in the code we compiled on this server, but it wasn't. My fault.
A very recent copy of the 3.1 git branch is now running, with Daniel's patch included. I'll keep you informed, but it should be fine.
Thanks, and sorry for the misunderstanding,
Regards, Anton
2011/5/13 Timo Reimann timo.reimann@1und1.de
[...]