At 10:07 16/02/2007, samuel wrote:
It appears that there are two things mixed together. The 100 is sent upstream whereas the problem appears downstream. Not sending 100 downstream doesn't solve the race conditions located downstream.
The problem is that SER doesn't delay sending the CANCEL; it just gives up on it. The issue is being bug-tracked now; there is no ETA yet.
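For illustration only, here is a rough sketch of what "delaying" a CANCEL could look like on a single downstream branch. It follows the RFC 3261 section 9.1 rule that a CANCEL must wait for a provisional response and should not be sent once a final response has arrived. The `Branch` class and its fields are hypothetical names made up for this sketch; this is not SER's code or data model.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """One downstream INVITE branch (hypothetical model, not SER's structures)."""
    got_provisional: bool = False
    got_final: bool = False
    cancel_pending: bool = False
    cancel_sent: bool = False
    sent: list = field(default_factory=list)  # names of messages sent downstream

    def _send_cancel(self):
        # Send at most one CANCEL per branch.
        if not self.cancel_sent:
            self.cancel_sent = True
            self.sent.append("CANCEL")

    def request_cancel(self):
        """Called when a CANCEL arrives from upstream."""
        if self.got_final:
            return  # too late: the final reply is already here
        if self.got_provisional:
            self._send_cancel()
        else:
            # No 1xx seen yet: remember the CANCEL instead of dropping it.
            self.cancel_pending = True

    def on_response(self, status):
        """Called for every reply received on this branch."""
        if 100 <= status < 200:
            self.got_provisional = True
            if self.cancel_pending:
                self.cancel_pending = False
                self._send_cancel()
        else:
            # Final reply: a deferred CANCEL is now pointless.
            self.got_final = True
            self.cancel_pending = False
```

With this model, a CANCEL requested before any 1xx is held back and goes out when the first provisional arrives; if a final reply shows up first, the deferred CANCEL is simply discarded and the caller has to deal with the result.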
Thank you for reporting!
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
Jiri Kuthan wrote:
As a side observation regarding the protocol's fault tolerance / race proneness: it would seem there's no way for a stateful proxy to tell a callee (1) whose provisional reply got lost (or was not sent in the first place, as provisionals are a MUST only for stateful proxies), but (2) whose final answer was received, that the call was canceled in the meantime, even though the proxy has the clear picture (and is the only one that can have it) at the moment it sees the final reply from the callee (a callee which will, however, eventually time out waiting for the ACK).
Bogdan.
At 16:56 19/02/2007, Bogdan Pintea wrote:
As a side observation regarding the protocol's fault tolerance / race proneness: it would seem there's no way for a stateful proxy to tell a callee (1) whose provisional reply got lost (or was not sent in the first place, as provisionals are a MUST only for stateful proxies), but (2) whose final answer was received, that the call was canceled in the meantime, even though the proxy has the clear picture (and is the only one that can have it) at the moment it sees the final reply from the callee;
Hi Bogdan,
Just thinking aloud about how best to narrow down the problem. I think there are two: one (2) is a race condition. This occurs even if no provisional answer is lost -- it is just that the CANCEL meets the 200 on the net and the client then has to sort it out.
The other problem (1) is a kind of amplifier, in that it widens the race-condition window because provisional answers are not delivered reliably.
So I think that the solution is the same ... put the burden on the UAC. It is then the UAC's responsibility to terminate a too-late-cancelled call (even if "too late" is actually caused by reliability issues).
RFC3261 "If the INVITE results in 2xx final response(s) to the INVITE, this means that a UAS accepted the invitation while the CANCEL was in progress. The UAC MAY continue with the sessions established by any 2xx responses, or MAY terminate them with BYE."
So I guess the harm is sustainable, even though it indeed doesn't look too nice.
(callee which will, however, time out, eventually, waiting for the ACK).
How come? The UAS keeps retransmitting 200s until the ACK comes. The ACK should come rather early. It is then the UAC's choice whether or not to BYE the too-late-cancelled call.
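As a minimal sketch of that UAC-side behaviour (assuming a toy model that only records message names; `CancellingUac` and its methods are hypothetical, not any real stack's API): the 2xx is always ACKed so the UAS stops retransmitting, and if a CANCEL had already been sent, the UAC takes the "MAY terminate with BYE" option.

```python
class CancellingUac:
    """Toy UAC for one INVITE; records the names of the messages it sends."""

    def __init__(self):
        self.cancel_sent = False
        self.sent = []

    def cancel(self):
        """User gave up on the call while the INVITE was still pending."""
        self.cancel_sent = True
        self.sent.append("CANCEL")

    def on_invite_response(self, status):
        """Handle a final response to the INVITE."""
        if 200 <= status < 300:
            # Always ACK a 2xx, otherwise the UAS keeps retransmitting it.
            self.sent.append("ACK")
            if self.cancel_sent:
                # The CANCEL lost the race: tear the dialog down explicitly
                # rather than leaving a "ghost call" on the callee's side.
                self.sent.append("BYE")
        elif status >= 300:
            # Non-2xx finals (e.g. 487 after a successful CANCEL) are ACKed too.
            self.sent.append("ACK")
```

A UAC that cancelled and then sees a 200 ends up sending CANCEL, ACK, BYE in that order; one that never cancelled just sends the ACK and keeps the call.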
-jiri
Hi Jiri,
Jiri Kuthan wrote:
Yeah, you're right, my "there is no way" was too categorical. The callee is left at the mercy of the two "MAY"s above, and the existence of "ghost calls" would indicate that neither of the two is chosen, while not really breaking the specs... (it's not like "must either send 2xx or BYE"). It only takes a responsible caller client to make sure that the "tried-to-cancel-but-actually-failed, even-though-in-a-successful-manner" dialog eventually gets brought down, rather than being left to time out on the callee's side.
I think I even saw this scenario described somewhere as a capacity-starvation attack on PSTN GWs.
But the 'generic finding' was that the proxy knows what's going on and could, theoretically, put a quick and predictable end to the callee being left in the lurch, but it can't (as it's just a proxy; maybe yet another, not so solid, argument for B2BUAs in the network's core...).
Bogdan.