On Thu, Apr 22, 2010 at 4:47 PM, Daniel-Constantin Mierla <miconda@gmail.com> wrote:

Hi Timo,

thanks for troubleshooting. I committed the patch that moves setting of bind_addr before any error case in populate_leg_info(). I backported to kamailio_3.0 branch as well.

Kelvin, can you get the latest git version of branch kamailio_3.0 and test?

Thanks,
Daniel

On 4/22/10 1:21 AM, Timo Reimann wrote:

Hello,

Kelvin Chua wrote:

(gdb) bt
#0 0x00002ab61b62779a in update_dialog_dbinfo (cell=0x2ab61c9100f8) at
dlg_db_handler.c:501

This corresponds to

SET_STR_VALUE(values+8, cell->bind_addr[DLG_CALLEE_LEG]->sock_str);

so presumably sip-router crashes when it tries to access the sock_str
of the callee's bound address...

#1 0x00002ab61b628ea8 in dlg_onreply (t=0x7d5228, type=<value optimized
out>, param=<value optimized out>) at dlg_handlers.c:361
#2 0x00002ab617965505 in run_trans_callbacks_internal
(cb_lst=0x2ab61c938830, type=128, trans=0x2ab61c9387c0,

(gdb) print cell
$1 = (struct dlg_cell *) 0x2ab61c9100f8

(gdb) print *cell
0}}, bind_addr = {0x88c580, 0x0},
cbs = {first = 0x0, types = 0}, profile_links = 0x0}

... as supported by the fact that bind_addr's second field
(DLG_CALLEE_LEG) is 0.

Why does the segfault happen?

Let's trace the code path: The initial error message

"bad sip message or missing Contact hdr"

occurred in dlg_handlers.c, line 218, which makes this piece of code's
surrounding function "populate_leg_info" return prematurely (by means of
"goto error0"). Specifically, this implies that the code at the end of
the function on line 272

dlg->bind_addr[leg] = msg->rcv.bind_address;

isn't carried out anymore, leaving the callee's bound address associated
with the given dialog unassigned. (This happens to be the only occasion
where the bound address is assigned.) Instead, execution drops back to
the "dlg_onreply" function and proceeds to line 361, thereby calling the
database update function:

update_dialog_dbinfo(dlg);

which directly leads to the segfaulting code location.

AFAICS, the callee's bound address is the only possibly null pointer
that "update_dialog_dbinfo" dereferences in the dialog data, so one way
to prevent the segfault is to move the bound address assignment in
"populate_leg_info" ahead of any code in that function that can fail.
This should make sure that some accessible bound address is stored in
any case.
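
To illustrate the reordering, here is a minimal sketch rather than the actual sip-router code. The struct layouts, the has_contact flag, the "_old"/"_new" function names and the callee_sock_str() helper are all simplified assumptions made up for this example; only the names dlg_cell, bind_addr, sock_str, DLG_CALLEE_LEG and the "goto error0" pattern come from the code discussed above.

```c
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the real sip-router types. */
enum { DLG_CALLER_LEG = 0, DLG_CALLEE_LEG = 1 };

struct socket_info { const char *sock_str; };

struct sip_msg {
    struct socket_info *bind_address; /* simplified from msg->rcv.bind_address */
    int has_contact;                  /* assumption: models the Contact hdr check */
};

struct dlg_cell { struct socket_info *bind_addr[2]; };

/* Old ordering: the assignment is the last step, so the early
 * "goto error0" skips it and bind_addr[leg] stays NULL. */
int populate_leg_info_old(struct dlg_cell *dlg, const struct sip_msg *msg, int leg)
{
    if (!msg->has_contact)
        goto error0;
    dlg->bind_addr[leg] = msg->bind_address;
    return 0;
error0:
    return -1;
}

/* New ordering: the assignment comes before any check that can fail,
 * so it is carried out even when the function returns an error. */
int populate_leg_info_new(struct dlg_cell *dlg, const struct sip_msg *msg, int leg)
{
    dlg->bind_addr[leg] = msg->bind_address;
    if (!msg->has_contact)
        goto error0;
    return 0;
error0:
    return -1;
}

/* Stand-in for the read done by the database update code. It returns
 * NULL instead of dereferencing a missing bound address. */
const char *callee_sock_str(const struct dlg_cell *dlg)
{
    const struct socket_info *si = dlg->bind_addr[DLG_CALLEE_LEG];
    return si ? si->sock_str : NULL;
}
```

With a message that lacks a Contact header, both variants return -1, but only the reordered one leaves a usable bound address behind for the callee leg. The NULL check in callee_sock_str() is an extra safeguard added for this sketch and is not part of the change discussed here.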

Cheers,

--Timo

--
Daniel-Constantin Mierla * http://www.asipto.com/ * http://twitter.com/miconda * http://www.linkedin.com/in/danielconstantinmierla