Hey Iñaki,
Iñaki Baz Castillo wrote:
2010/7/12 Timo Reimann timo.reimann@1und1.de:
Oh, and by the way: Kamailio does deny sending out BYE requests for calls not in the "confirmed" state as of now. However, this check is done in the tm module called by the dialog module, and the latter passes to the former a dialog state of DLG_CONFIRMED *always*.
Opsss, and this is why it gives an error when there is no remote-target (no Contact in the UAS's responses yet).
Yep, exactly.
Done with the forensics. This is why the leak happens for early dialogs:
(1) dlg_end_dlg triggers generation of a BYE request for the caller, which succeeds because the caller's Contact address was stored during processing of the initial INVITE.
(2) The reference counter is increased by one and is supposed to be decremented again when the caller's BYE response is processed (in bye_reply_cb()).
(3) The BYE request is sent to the caller. However, the caller replies with "481 Call Leg/Transaction Does Not Exist" because the request is missing the callee's To tag, which the dialog module does not store before the call's transition to the confirmed state.
(4) The caller's 481 response is handled in bye_reply_cb(). On reception of a final response, this function runs the proxy-initiated BYE request (not the response) through the state machine and decreases the reference counter *only if the new dialog state is "terminated" after completion of the state machine*. However, BYE requests in the early state do not trigger state transitions, because the dialog module cannot tell yet whether a single branch or the entire call was terminated. So the "early" state is maintained and, in consequence, the reference counter is *not decremented*.
(5) Generation of a BYE request for the callee is triggered. However, it fails because no callee Contact is available. The reference counter is not touched in this case either.
(6) When the UAC sends out a CANCEL request, the call is torn down. However, the dialog structure is not deleted because the reference counter cannot drop below one.
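To make the sequence easier to follow, here is a toy model of it in Python. This is only a sketch of the reference counting as described in the steps, not the dialog module's actual code: apart from bye_reply_cb, every name (Dialog, send_bye_to_caller, send_bye_to_callee, on_cancel) is made up for illustration, and the assumption that the dialog starts with a single reference is taken from step (6).

```python
from dataclasses import dataclass
from typing import Optional

EARLY, CONFIRMED, TERMINATED = "early", "confirmed", "terminated"


@dataclass
class Dialog:
    state: str = EARLY
    ref: int = 1                          # assumed: one base reference
    callee_contact: Optional[str] = None  # unknown before the callee answers


def next_state_on_bye(dlg: Dialog) -> str:
    # A BYE request only moves a confirmed dialog to "terminated";
    # in the early state no transition takes place.
    if dlg.state == CONFIRMED:
        dlg.state = TERMINATED
    return dlg.state


def send_bye_to_caller(dlg: Dialog) -> None:
    # Steps (1) and (2): the BYE can be built, one reference is taken.
    dlg.ref += 1


def bye_reply_cb(dlg: Dialog) -> None:
    # Step (4): the reference is released only if the state machine
    # ends up in "terminated".
    if next_state_on_bye(dlg) == TERMINATED:
        dlg.ref -= 1


def send_bye_to_callee(dlg: Dialog) -> bool:
    # Step (5): without a callee Contact no BYE can be generated,
    # and the counter stays untouched.
    if dlg.callee_contact is None:
        return False
    dlg.ref += 1
    return True


def on_cancel(dlg: Dialog) -> bool:
    # Step (6): the call is torn down and the base reference dropped.
    # Returns True only if the dialog structure could be freed.
    dlg.state = TERMINATED
    dlg.ref -= 1
    return dlg.ref == 0
```

Walking an early dialog through send_bye_to_caller(), bye_reply_cb(), send_bye_to_callee() and on_cancel() leaves the counter at one in this model, so the structure is never freed.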
The bottom line is that the dialog module leaks because it cannot handle in-early-dialog BYE requests properly. The way to deal with this at the moment is to deny proxy-initiated dialog termination unless the call is in the "confirmed" state, because that's the only state in which it may succeed. I say *may* because a flawed or crashed UA that cannot process such a BYE properly will reply with a non-2xx response or not at all, respectively. In that case the proxy should IMHO consider its termination attempt to have failed and, consequently, not delete the dialog. Even worse, you could have a situation where only one of the call participants shuts down the session while the other doesn't, leading to a weird kind of one-sided dialog. To avoid this, I believe the proxy should have UA response codes affect the state machinery and not rely on its BYE requests only.
However, the latter is something to be done in a future dialog module. To prevent any kind of leakage in the current implementation, my proposed fix would be to
(1) deny proxy-initiated dialog termination unless the call is in the "confirmed" state and (2) always decrement the reference counter during BYE response handling.
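In toy-model form, the two changes would look roughly like this. Again, this is only a Python sketch under my own assumptions and not a patch: Dialog and end_dialog are hypothetical names, bye_reply_cb is reduced to the reference handling, and treating any status code of 200 or above as a final response is the usual SIP convention.

```python
from dataclasses import dataclass

EARLY, CONFIRMED, TERMINATED = "early", "confirmed", "terminated"


@dataclass
class Dialog:
    state: str = EARLY
    ref: int = 1  # assumed: one base reference


def end_dialog(dlg: Dialog) -> bool:
    # Change (1): refuse proxy-initiated termination unless confirmed.
    if dlg.state != CONFIRMED:
        return False
    dlg.ref += 1  # held until the BYE response arrives
    return True


def bye_reply_cb(dlg: Dialog, status_code: int) -> None:
    # Provisional responses are ignored; only final ones count.
    if status_code < 200:
        return
    # The BYE request is still run through the state machine ...
    if dlg.state == CONFIRMED:
        dlg.state = TERMINATED
    # ... but change (2): the reference is released in every case.
    dlg.ref -= 1
```

With this, an early dialog is never touched by end_dialog(), and a confirmed one gets its extra reference back no matter which final response the UA sends.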
Let me know what you think of this.
Cheers,
--Timo