2010/7/12 Timo Reimann timo.reimann@1und1.de:
Done with forensics. This is why the leak happens for early dialogs:
(1) dlg_end_dlg triggers generation of a BYE request for the caller, which succeeds because the caller's Contact address was stored during processing of the initial INVITE.
(2) The reference counter is increased by one and is supposed to be decremented again when the caller's BYE response is processed (in bye_reply_cb()).
(3) The BYE request is sent to the caller. However, the caller replies with "481 Call Leg/Transaction Does Not Exist" because the request is missing the callee's To tag, which the dialog module does not store before the call transitions to the confirmed state.
(4) The caller's 481 response is handled by the dialog module in bye_reply_cb(). On reception of a final response, this function basically runs the proxy-initiated BYE request (not the response) through the state machine and decreases the reference counter *if the new dialog state is "terminated" after completion of the state machine*. However, BYE requests in the early state do not trigger state transitions, because the dialog module cannot yet tell whether a single branch or the entire call was terminated. The "early" state is therefore maintained and, in consequence, the reference counter is *not decremented*.
(5) Generation of a BYE request for the callee is triggered. However, it fails because no callee Contact is available. The reference counter is not touched in this case either.
(6) When the UAC sends out a CANCEL request, the call is torn down. However, the dialog structure is not deleted because the reference counter cannot drop lower than one.
Great analysis.
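The sequence in steps (1) to (6) can be condensed into a toy model of the reference counting. This is only a sketch: the struct, the state names and every function below are hypothetical stand-ins, not the dialog module's real definitions, and the final decrement simply assumes that call teardown releases the one reference the dialog started with.

```c
/* Hypothetical, simplified model of the leak; not the module's real code. */
enum dlg_state { DLG_EARLY, DLG_CONFIRMED, DLG_TERMINATED };

struct dlg {
    enum dlg_state state;
    int ref;                 /* reference counter */
    int has_callee_contact;  /* only known once the call is confirmed */
};

/* Toy state machine: a BYE in the early state causes no transition. */
static enum dlg_state next_state_on_bye(enum dlg_state s)
{
    return (s == DLG_CONFIRMED) ? DLG_TERMINATED : s;
}

/* Steps (1), (2) and (5): a BYE can only be built if the Contact is known;
 * a BYE that is sent takes one reference. Returns 1 if the BYE was sent. */
static int send_bye(struct dlg *d, int contact_known)
{
    if (!contact_known)
        return 0;            /* step (5): nothing sent, counter untouched */
    d->ref++;
    return 1;
}

/* Step (4): behaviour as described for bye_reply_cb(). The reference is
 * released only if the state machine ends up in "terminated". */
static void bye_reply_current(struct dlg *d)
{
    d->state = next_state_on_bye(d->state);
    if (d->state == DLG_TERMINATED)
        d->ref--;
}

/* Runs the whole sequence and returns the references left after teardown. */
int simulate_leak(enum dlg_state initial)
{
    struct dlg d = { initial, 1, initial == DLG_CONFIRMED };

    if (send_bye(&d, 1))                     /* caller leg, steps (1)-(4) */
        bye_reply_current(&d);
    if (send_bye(&d, d.has_callee_contact))  /* callee leg, step (5) */
        bye_reply_current(&d);

    d.ref--;  /* step (6): teardown releases the initial reference */
    return d.ref;
}
```

In this model simulate_leak(DLG_EARLY) returns 1, which is the reference that keeps the dialog structure alive, while simulate_leak(DLG_CONFIRMED) returns 0.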
However, the latter is something to be done in a future dialog module. To prevent any kind of leakage in the current implementation, my proposed fix would be to
(1) Deny proxy-initiated dialog termination unless the call is in the "confirmed" state and
It makes sense, as terminating an early dialog would involve terminating an existing INVITE transaction, which would require a 408 to the UAC and a CANCEL/BYE to the UAS. Perhaps a future feature ;)
(2) always decrement the reference counter during BYE response handling.
So, never rely on the response to the BYE and never expect that the BYE would get a response, am I right? If so, I fully agree. A BYE initiated by the proxy should always work, except when it is sent at the same time as an in-dialog request from an endpoint, in which case the CSeq of the BYE could be too small. To prevent that, the BYE generated by the proxy should use a CSeq value 5-10 times greater than the last CSeq value in the dialog (for each side). And once the BYE is sent, update the dialog status without waiting for the response. Do you agree?
Thanks a lot!
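For reference, the two-part fix proposed above could look roughly like the following. This is a minimal sketch under stated assumptions: the struct, the state names and the functions end_dlg(), bye_reply_fixed() and simulate_fixed() are all hypothetical and only model the reference counting, not the real dialog module code. The CSeq suggestion is left out of the sketch.

```c
/* Hypothetical, simplified model of the proposed fix; not the module's real code. */
enum dlg_state { DLG_EARLY, DLG_CONFIRMED, DLG_TERMINATED };

struct dlg {
    enum dlg_state state;
    int ref;  /* reference counter */
};

/* Fix (1): deny proxy-initiated termination unless the call is confirmed.
 * On success, one reference is taken per BYE (caller and callee leg).
 * Returns 0 if the BYEs were sent, -1 if termination was refused. */
int end_dlg(struct dlg *d)
{
    if (d->state != DLG_CONFIRMED)
        return -1;
    d->ref += 2;
    return 0;
}

/* Fix (2): on any final reply to a proxy-initiated BYE, always release the
 * reference, whatever state the dialog ends up in. */
void bye_reply_fixed(struct dlg *d)
{
    if (d->state == DLG_CONFIRMED)
        d->state = DLG_TERMINATED;
    d->ref--;  /* unconditional */
}

/* Runs the sequence and returns the references left after teardown. */
int simulate_fixed(enum dlg_state initial)
{
    struct dlg d = { initial, 1 };

    if (end_dlg(&d) == 0) {
        bye_reply_fixed(&d);  /* reply from the caller */
        bye_reply_fixed(&d);  /* reply from the callee */
    }

    d.ref--;  /* assumed: teardown releases the initial reference */
    return d.ref;
}
```

With both changes in place, simulate_fixed() returns 0 for DLG_EARLY as well as DLG_CONFIRMED: the early case never takes a reference, and the confirmed case gives back every reference it took.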