Hello,

On 8/16/13 4:17 PM, Jason Penton wrote:
Hi,

It seems to me that there is a possible race condition in the dialog module that can potentially cause a segfault. Let me explain my thinking.

In dlg_timer_routine we call get_expired_dlgs (line 4). This function returns a list of dlg_tl's that have expired. This code is executed in the timer process, and if a dialog is terminated (user hangs up) at the same moment the expiry timer fires, I can't see anything that prevents the dialog from being destroyed between lines 4 and 5 below. That would ultimately result in a segfault in the later lines (5 onwards) or in the specific timer_hdl callback function, where the dialog is retrieved from the dlg_tl using some pointer arithmetic.

1. void dlg_timer_routine(unsigned int ticks , void * attr)
2. {
3. struct dlg_tl *tl, *ctl;

4. tl = get_expired_dlgs( ticks );

5. while (tl) {
6. ctl = tl;
7. tl = tl->next;
8. ctl->next = NULL;
9. LM_DBG("tl=%p next=%p\n", ctl, tl);
10. timer_hdl( ctl );
11. }
12. }
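
To make the pointer arithmetic concern concrete, here is a minimal sketch of how a dialog can be recovered from its timer link when the link is embedded in the dialog structure. The struct layouts and the helper name dlg_from_tl are simplified assumptions for illustration, not copied from the module. The point is that the dlg_tl lives inside the dialog's own memory, so once the dialog is freed, both ctl and the recovered dialog pointer are dangling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the module's structures. */
struct dlg_tl {
    struct dlg_tl *next;
    unsigned int timeout;
};

struct dlg_cell {
    int ref;             /* reference counter */
    unsigned int h_id;   /* some dialog payload */
    struct dlg_tl tl;    /* timer link embedded in the dialog itself */
};

/* Recover the enclosing dialog from a pointer to its embedded timer link
 * (hypothetical helper name). */
static struct dlg_cell *dlg_from_tl(struct dlg_tl *tl)
{
    return (struct dlg_cell *)((char *)tl - offsetof(struct dlg_cell, tl));
}

/* Returns 1 if going tl -> dialog gives back the original dialog. */
int dlg_from_tl_roundtrip(void)
{
    struct dlg_cell dlg = { 1, 42, { NULL, 0 } };
    struct dlg_cell *back = dlg_from_tl(&dlg.tl);

    assert(back == &dlg);
    return back == &dlg && back->h_id == 42;
}
```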

I would imagine we should look at incrementing the ref count for every dlg that goes into the tl, then unref'ing it when it is removed or when it fires (but at a quick glance it looks like there could be a few locking issues with this solution).

IIRC, there was a counter that is incremented to keep the structure alive while it is in the timer list and decremented when it is removed. Isn't that how it works?
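
As a rough illustration of the scheme I have in mind: a single-process sketch with made-up names (dlg_ref, dlg_unref, timer_insert, timer_fire_all), not the actual dialog module code. It deliberately ignores locking; in the real module the counter updates and list changes would have to happen under the appropriate locks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified structures; not the module's real layout. */
struct dlg_tl { struct dlg_tl *next; };

struct dlg_cell {
    int ref;            /* reference counter */
    struct dlg_tl tl;   /* embedded timer link */
};

static int freed_count = 0;             /* counts destroyed dialogs */
static struct dlg_tl *timer_list = NULL;

static void dlg_ref(struct dlg_cell *d) { d->ref++; }

/* Drop one reference; destroy the dialog when the last one goes away. */
static void dlg_unref(struct dlg_cell *d)
{
    assert(d->ref > 0);
    if (--d->ref == 0) {
        free(d);
        freed_count++;
    }
}

/* The timer list takes its own reference on insert. */
static void timer_insert(struct dlg_cell *d)
{
    dlg_ref(d);
    d->tl.next = timer_list;
    timer_list = &d->tl;
}

/* Same loop shape as the timer routine; the list's reference is
 * released only after the handler would have run. */
static void timer_fire_all(void)
{
    struct dlg_tl *tl = timer_list;
    timer_list = NULL;
    while (tl) {
        struct dlg_tl *ctl = tl;
        tl = tl->next;
        ctl->next = NULL;
        struct dlg_cell *d = (struct dlg_cell *)
            ((char *)ctl - offsetof(struct dlg_cell, tl));
        /* the timeout handler would use d here; it is still valid */
        dlg_unref(d);
    }
}

/* Returns 1 if the dialog survives a hang-up while still in the timer
 * list and is destroyed only once the timer has fired. */
int refcount_demo(void)
{
    int before = freed_count;
    struct dlg_cell *d = calloc(1, sizeof(*d));
    if (!d) return -1;

    d->ref = 1;          /* reference held by the owner (e.g. hash table) */
    timer_insert(d);     /* ref == 2 */
    dlg_unref(d);        /* user hangs up: ref == 1, dialog not freed */
    int alive = (freed_count == before);
    timer_fire_all();    /* ref == 0, dialog freed now */
    return alive && freed_count == before + 1;
}
```

With a counter like this, a hang-up that races with the expiry only drops the owner's reference, so the structure stays valid until the timer side has released its own.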

Cheers,
Daniel
-- 
Daniel-Constantin Mierla - http://www.asipto.com
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda