Hi,
It seems to me that there is a possible race condition in the dialog module
that could cause a segfault. Let me explain my thinking.
In dlg_timer_routine we call get_expired_dlgs (line 4), which returns a
list of dlg_tl's that have expired. This code runs in the timer process.
If a dialog is terminated (the user hangs up) at the same moment the expiry
timer fires, I can't see anything that prevents the dialog from being
destroyed between lines 4 and 5 below. That would ultimately cause a
segfault in the later lines (5 onwards), or in the specific timer_hdl
callback function, where the dialog is retrieved using some pointer
arithmetic.
1.  void dlg_timer_routine(unsigned int ticks, void *attr)
2.  {
3.      struct dlg_tl *tl, *ctl;
4.      tl = get_expired_dlgs(ticks);
5.      while (tl) {
6.          ctl = tl;
7.          tl = tl->next;
8.          ctl->next = NULL;
9.          LM_DBG("tl=%p next=%p\n", ctl, tl);
10.         timer_hdl(ctl);
11.     }
12. }
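To make the "pointer arithmetic" point concrete, here is a minimal sketch of how a callback can recover the enclosing dialog from an embedded timer link. The struct layouts and the helper name dlg_from_tl are simplified assumptions for illustration, not the module's actual definitions. If the dialog has already been freed, the address computed this way points into released memory, which is where the crash would come from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the module's structures. */
struct dlg_tl {
    struct dlg_tl *next;
    struct dlg_tl *prev;
    unsigned int timeout;
};

struct dlg_cell {
    int ref;            /* reference count */
    unsigned int h_id;  /* some dialog field */
    struct dlg_tl tl;   /* timer link embedded in the dialog */
};

/* Recover the enclosing dialog by subtracting the offset of the
 * embedded tl member from the tl pointer (hypothetical helper). */
static struct dlg_cell *dlg_from_tl(struct dlg_tl *tl)
{
    assert(tl != NULL);
    return (struct dlg_cell *)((char *)tl - offsetof(struct dlg_cell, tl));
}
```

The result is only valid while the dialog that contains the tl is still allocated; nothing in the pointer itself says whether that is still true.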
I would imagine we should increment the ref count of every dlg that goes
into the tl, then unref it when it is removed or when the timer fires
(though at a quick glance it looks like this solution could have a few
locking issues of its own).
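A rough sketch of that ownership rule, under stated assumptions: it is a single-threaded model, so the locking that the real code would need around the list and the counter is deliberately left out, and every name here (dlg_new, timer_insert, timer_remove, timer_take_expired, timer_fire, the dlg back pointer) is hypothetical rather than the module's API. The idea is only that the timer list holds its own reference, and whoever unlinks the entry inherits the duty to drop it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct dlg_cell;

struct dlg_tl {
    struct dlg_tl *next;
    struct dlg_cell *dlg;  /* hypothetical back pointer, for brevity */
    int linked;            /* 1 while the entry sits in the timer list */
};

struct dlg_cell {
    int ref;               /* real code would protect this with a lock */
    struct dlg_tl tl;
};

static int freed_count = 0;  /* lets a caller observe when a dialog is freed */

static struct dlg_cell *dlg_new(void)
{
    struct dlg_cell *d = calloc(1, sizeof(*d));
    assert(d != NULL);
    d->ref = 1;            /* reference held by the dialog table */
    d->tl.dlg = d;
    return d;
}

static void dlg_ref(struct dlg_cell *d) { d->ref++; }

static void dlg_unref(struct dlg_cell *d)
{
    assert(d->ref > 0);
    if (--d->ref == 0) {
        freed_count++;
        free(d);
    }
}

/* The timer list takes its own reference on insert. */
static void timer_insert(struct dlg_tl **head, struct dlg_cell *d)
{
    dlg_ref(d);
    d->tl.next = *head;
    d->tl.linked = 1;
    *head = &d->tl;
}

/* Hang-up path: drop the list's reference only if we did the unlinking.
 * Returns 0 when the timer process already took the entry. */
static int timer_remove(struct dlg_tl **head, struct dlg_cell *d)
{
    struct dlg_tl **pp;
    if (!d->tl.linked)
        return 0;
    for (pp = head; *pp; pp = &(*pp)->next) {
        if (*pp == &d->tl) {
            *pp = d->tl.next;
            d->tl.next = NULL;
            d->tl.linked = 0;
            dlg_unref(d);
            return 1;
        }
    }
    return 0;
}

/* Timer path: detach everything (all entries treated as expired here);
 * the references travel with the detached list. */
static struct dlg_tl *timer_take_expired(struct dlg_tl **head)
{
    struct dlg_tl *list = *head, *tl;
    for (tl = list; tl; tl = tl->next)
        tl->linked = 0;
    *head = NULL;
    return list;
}

/* Run the expiry handling for one entry, then drop the list's reference. */
static void timer_fire(struct dlg_tl *tl)
{
    struct dlg_cell *d = tl->dlg;
    tl->next = NULL;
    /* the expiry callback would run here, with d guaranteed alive */
    dlg_unref(d);
}
```

With this arrangement a hang-up that arrives after timer_take_expired only drops the table's reference, so the dialog stays allocated until timer_fire releases the last one. In a multi-process setup, the linked flag, the list and the counter would all have to be updated under suitable locks, which is presumably where the locking issues mentioned above come in.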
Cheers
Jason