Hello,

My answers are inline:

2011/5/12 Timo Reimann <timo.reimann@1und1.de>
Hey,


On 12.05.2011 12:37, Anton Roman wrote:
> We got a core dump in the dialog module. We are using Kamailio 3.1.2.
> Below you can find a full backtrace from the dump and the Kamailio
> compilation options. If you need further information, please don't
> hesitate to ask. I can't pinpoint the situation in which it occurs
> because this server is under quite a high load.

The call path seems to be like this:

transaction timer fires -> tm module walking through callback list finds
unref_dlg() -> tm module calls unref_dlg() -> boom.

I wonder why unref_dlg() was registered as a tm callback in the first
place -- the dialog module shouldn't do that. Are you using any custom
modules that would possibly do such registrations?

No, we aren't. We got the code of all the modules directly from the git repository. 


As to the reason of the segfault, the dialog structure or hash table may
already be gone when unref_dlg() is called. Can you go to stack #0 and
tell us what the value of each of the following data structures is (use
"p <data structure>" in gdb):

*dlg
d_table
d_table->entries

Here they are:

(gdb) p *dlg
$1 = {ref = 793790803, next = 0xa0d4b4f20303032, prev = 0x504953203a616956, h_id = 808333871, h_entry = 1346655535, state = 775174432,
  lifetime = 841888562, start_ts = 892219952, dflags = 808794678, sflags = 1648046134, toroute = 1668178290, toroute_name = {
    s = 0x62344768397a3d68 <Address 0x62344768397a3d68 out of bounds>, len = 946221643}, from_rr_nb = 1886534457, tl = {
    next = 0x72460a0d30363035, prev = 0x6f6e4122203a6d6f, timeout = 1869445486}, callid = {
    s = 0x6f6e613a7069733c <Address 0x6f6e613a7069733c out of bounds>, len = 1869445486}, from_uri = {
    s = 0x3230322e33322e34 <Address 0x3230322e33322e34 out of bounds>, len = 1043739950}, to_uri = {
    s = 0x396637643173613d <Address 0x396637643173613d out of bounds>, len = 221656933}, req_uri = {
    s = 0x34333a7069733c20 <Address 0x34333a7069733c20 out of bounds>, len = 925972025}, tag = {{
    s = 0x33322e3539314030 <Address 0x33322e3539314030 out of bounds>, len = 942747189}, {
    s = 0x743b3e303630353a <Address 0x743b3e303630353a out of bounds>, len = 1178429281}}, cseq = {{
    s = 0x364134322d344434 <Address 0x364134322d344434 out of bounds>, len = 1631848973}, {
    s = 0x203932202c697246 <Address 0x203932202c697246 out of bounds>, len = 544236883}}, route_set = {{
    s = 0x343a30313a333020 <Address 0x343a30313a333020 out of bounds>, len = 1296506937}, {
    s = 0x203a44492d6c6c61 <Address 0x203a44492d6c6c61 out of bounds>, len = 1630549808}}, contact = {{
    s = 0x6639633663313634 <Address 0x6639633663313634 out of bounds>, len = 858808881}, {
    s = 0x6464363632663631 <Address 0x6464363632663631 out of bounds>, len = 775174464}}, bind_addr = {0x530a0d36352e3230,
    0x43203a7265767265}, cbs = {first = 0x5049532d6f637369, types = 1702125895}, profile_links = 0x782e32312d534f49}
(gdb) p d_table
$2 = (struct dlg_table *) 0x7f08a9e38ee0
(gdb) p d_table->entries
$3 = (struct dlg_entry *) 0x7f08a9e38f00
(gdb) info f
Stack level 0, frame at 0x7fff49059980:
 rip = 0x7f08cc18fa39 in unref_dlg (dlg_hash.c:598); saved rip 0x7f08ce92fa02
 called by frame at 0x7fff49059a10
 source language c.
 Arglist at 0x7fff490598a8, args: dlg=0x7f08a9f67da8, cnt=1
 Locals at 0x7fff490598a8, Previous frame's sp is 0x7fff49059980
 Saved registers:
  rbx at 0x7fff49059948, rbp at 0x7fff49059950, r12 at 0x7fff49059958, r13 at 0x7fff49059960, r14 at 0x7fff49059968, r15 at 0x7fff49059970,
  rip at 0x7fff49059978
(gdb)
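
A possible reading of the values above, offered as a sketch rather than a confirmed diagnosis: several fields of *dlg look like printable ASCII when their bytes are read in little-endian order (this is x86_64). That would suggest the memory at 0x7f08a9f67da8 no longer holds a dialog but text from a SIP message, which fits the idea that the dialog was already gone. The helper decode_le below is hypothetical (it is not part of gdb or Kamailio); it only reinterprets a few of the printed values, assuming ref, h_id and h_entry are 4 bytes wide and the pointers 8 bytes wide.

```python
def decode_le(value, size):
    """Return the bytes of an integer, read little-endian, as ASCII text."""
    return value.to_bytes(size, "little").decode("ascii", errors="replace")


# (field name, value printed by gdb, assumed width in bytes)
FIELDS = [
    ("ref", 793790803, 4),
    ("next", 0x0A0D4B4F20303032, 8),
    ("prev", 0x504953203A616956, 8),
    ("h_id", 808333871, 4),
    ("h_entry", 1346655535, 4),
]

if __name__ == "__main__":
    for name, value, size in FIELDS:
        # repr() keeps CR/LF visible as \r\n
        print(f"{name:8s} {decode_le(value, size)!r}")
```

Read in struct order, these five fields give "SIP/", "200 OK\r\n", "Via: SIP", "/2.0" and "/UDP", which looks like the start of a SIP 200 OK reply. The four bytes between ref and next would be alignment padding that gdb does not print.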



Cheers,

--Timo

 
Thank you very much,

Regards,
Antón
 


> (gdb) bt full
> #0  unref_dlg (dlg=0x7f08a9f67da8, cnt=1) at dlg_hash.c:598
>     d_entry = (struct dlg_entry *) 0x7f10304b8b68
> #1  0x00007f08ce92fa02 in run_trans_callbacks_internal
> (cb_lst=0x7f08aa203e98, type=32768, trans=0x7f08aa203e28,
> params=0x7fff49059a10)
>     at t_hooks.c:290
>     cbp = (struct tm_callback *) 0x7f08a9f6e7e0
>     backup_from = (avp_list_t *) 0x8b3330
>     backup_to = (avp_list_t *) 0x8b3338
>     backup_dom_from = (avp_list_t *) 0x8b3340
>     backup_dom_to = (avp_list_t *) 0x8b3348
>     backup_uri_from = (avp_list_t *) 0x8b3320
>     backup_uri_to = (avp_list_t *) 0x8b3328
> #2  0x00007f08ce92fc56 in run_trans_callbacks (type=32768, trans=<value
> optimized out>, req=0x1, rpl=0x7f10304b8b68, code=-868566200)
>     at t_hooks.c:317
>     params = {req = 0x0, rpl = 0x0, param = 0x7f08a9f6e7f0, code = 0,
> flags = 0, branch = 0, t_rbuf = 0x0, dst = 0x0, send_buf = {
>     s = 0x0, len = 0}}
> #3  0x00007f08ce915b36 in free_cell (dead_cell=0x7f08aa203e28) at
> h_table.c:136
>     b = <value optimized out>
>     i = <value optimized out>
>     rpl = <value optimized out>
>     tt = <value optimized out>
>     foo = <value optimized out>
>     cbs = <value optimized out>
>     __FUNCTION__ = "free_cell"
> #4  0x00007f08ce9319f1 in wait_handler (ti=<value optimized out>,
> wait_tl=<value optimized out>, data=<value optimized out>) at timer.c:645
>     p_cell = (struct cell *) 0x7f08aa203e28
> #5  0x0000000000513d8f in timer_main () at timer.c:894
> No locals.
> #6  0x000000000046501b in main_loop () at main.c:1618
>     i = 4
>     pid = <value optimized out>
>     si = (struct socket_info *) 0x0
>     si_desc = "udp receiver child=3
> sock=XXX.XXX.XXX.XX:XXXX\000\000\000\210�\231\000\000\000\000\000\031",
> '\0' <repeats 15 times>, "\001\000\000\000\000\000\000\000�\215\213",
> '\0' <repeats 13 times>, "\004", '\0' <repeats 15 times>,
> "\b\236\005I�\177\000\000\227%J\000\000\000\000"
> #7  0x0000000000467873 in main (argc=<value optimized out>,
> argv=0x7fff49059e08) at main.c:2398
>     cfg_stream = (FILE *) 0x12e1010
>     c = <value optimized out>
>     r = <value optimized out>
>     tmp = 0x7fff4905ae90 ""
>     tmp_len = 32520
>     port = <value optimized out>
>     proto = <value optimized out>
>     ret = <value optimized out>
>     seed = 1235801225
>     rfd = 4
>     debug_save = <value optimized out>
>     debug_flag = 0
>     dont_fork_cnt = 0
>     n_lst = <value optimized out>
>     p = <value optimized out>
> (gdb)
> (gdb) quit
> kamailio2:/var/kamailio# kamailio -V
> version: kamailio 3.1.2 (x86_64/linux) eb24c1-dirty
> flags: STATS: Off, USE_IPV6, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS,
> DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC,
> DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE,
> USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
> ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16,
> MAX_URI_SIZE 1024, BUF_SIZE 65535, PKG_SIZE 8MB
> poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
> id: eb24c1 -dirty
> compiled on 09:35:52 Apr 28 2011 with gcc 4.3.2
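
A cross-check on the backtrace, again only a sketch: frame #0 shows d_entry = 0x7f10304b8b68, while gdb reports d_table->entries = 0x7f08a9e38f00 and dlg->h_entry = 1346655535. Assuming unref_dlg() derives d_entry as &(d_table->entries[dlg->h_entry]), the distance between the two addresses should be an exact multiple of h_entry. The snippet below checks that arithmetic; entry_stride is a hypothetical helper, and the element size it returns is inferred from the division, not read from the Kamailio headers.

```python
ENTRIES = 0x7F08A9E38F00  # d_table->entries, from "p d_table->entries"
D_ENTRY = 0x7F10304B8B68  # d_entry local in frame #0 of the backtrace
H_ENTRY = 1346655535      # dlg->h_entry, from "p *dlg"


def entry_stride(entries, d_entry, index):
    """Return the element size implied by d_entry == entries + index * size,
    or None if the offset is not an exact multiple of index."""
    size, remainder = divmod(d_entry - entries, index)
    return size if remainder == 0 else None


if __name__ == "__main__":
    print(entry_stride(ENTRIES, D_ENTRY, H_ENTRY))
```

With these inputs the division is exact and yields 24 bytes per entry. That is consistent with d_entry having been computed from the garbage h_entry value, so the address lands far outside the hash table and the access at dlg_hash.c:598 faults.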