Hi Sean,
Yes, t_check() sets T to NULL if no transaction is matched, but the
reply_received() function (which calls t_check()) will, if T was set to NULL,
go to the "not_found" label and set T to T_UNDEFINED.
Do you agree with this? If so, we can start working on adding some more
debug logs to see where the problem is.
Regards,
Bogdan
Sean O'Donnell wrote:
Hi all,
I’m using openser as a call distributor/proxy between a soft-switch/SBC and a
voicemail platform. I’m seeing a problem with openser in that it sometimes
cancels an in-progress call (fr_inv_timer firing) because it didn’t match the
200/OK with the call.
After some investigation, I noticed that this was happening after a missing ACK
on a previous call caused the voicemail platform to retransmit 200/OK responses
beyond the TM wt_timer expiration, which in turn left several openser child
processes (those that received a 200 after wt_timer expiration) in a state such
that they might not properly match transactions on subsequent calls.
My setup:
I have openser 1.2.0 operating on a linux box with two network interfaces, with
one interface (call it the outside interface) taking incoming calls from the
soft-switch, and the other (inside) connected to the VM platform. I have
openser configured to use both interfaces (see config below) and the TM wt_timer
set to 5 seconds (default). As this is a voicemail system, all of the call
traffic is inbound from the soft-switch. Given the traffic flow, for the most
part the openser child processes servicing the inside interface are handling
responses (180,183,200) from the VM platform.
Call scenario:
When an INVITE arrives from the soft-switch, openser forwards it to the VM
platform. The VM platform responds with a 180 and then a 200. I've noticed
several instances where the soft-switch did not respond with an ACK. This
caused the VM platform to retransmit the 200 several times over a 10 second
period. These were absorbed correctly by openser for the duration of wt_timer.
After the timer expired, however, each openser child process that received a
retransmitted 200 logged something like this:
4(2715) DEBUG: t_reply_matching: hash 45870 label 727647196 branch 0
4(2715) DEBUG: t_reply_matching: no matching transaction exists
4(2715) DEBUG: t_reply_matching: failure to match a transaction
4(2715) DEBUG: t_check: end=(nil)
When I look at the TM code, the static variable T in t_lookup.c is now NULL for
this child process.
On a subsequent inbound call, the INVITE is passed to the VM correctly, and the
180 transaction matches (causing the fr_inv_timer to be armed). If the 200 is
read by child proc 2715, I see:
4(2715) DEBUG: t_check: start=(nil)
4(2715) DEBUG: t_check: T previously sought and not found
The 200 is forwarded back to the soft-switch, which responds with an ACK. Both
end-points think the call is up, but since openser never matched the 200 with
the call, the fr_inv_timer fires and cancels the call. Basically, child proc
2715 won’t match any transaction after this unless it happens to process a
request.
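
The sequence above can be modelled with a small sketch. This is not the TM code: the class ChildProcess, its methods, and the reset_on_miss flag are hypothetical names used only for illustration. It assumes that a lookup is performed only while the cached value is T_UNDEFINED, and that handling a request clears the cache. The flag switches between the behaviour seen in the logs (a stale NULL is left behind) and the reset on the not-found path described in Bogdan's reply.

```python
# Toy model of the per-process transaction cache discussed in this thread.
# NOT the openser TM source: every name below is hypothetical.

T_UNDEFINED = object()  # sentinel: no lookup has been done for this message
T_NULL = None           # lookup was done and nothing matched


class ChildProcess:
    def __init__(self, table, reset_on_miss):
        self.table = table                  # shared set of live transaction keys
        self.reset_on_miss = reset_on_miss  # True models a reset on "not found"
        self.T = T_UNDEFINED                # per-process cached lookup result

    def t_check(self, key):
        # Assumption: the lookup is skipped unless the cache is T_UNDEFINED.
        if self.T is T_UNDEFINED:
            self.T = key if key in self.table else T_NULL
        return self.T

    def reply_received(self, key):
        matched = self.t_check(key) is not T_NULL
        # Simplification: a matched reply always leaves the cache clean.
        if matched or self.reset_on_miss:
            self.T = T_UNDEFINED
        return matched

    def request_received(self):
        # Assumption: handling a request clears the cached value.
        self.T = T_UNDEFINED


def scenario(reset_on_miss):
    table = set()
    child = ChildProcess(table, reset_on_miss)
    late = child.reply_received("call-1")  # 200 retransmitted after wt_timer
    table.add("call-2")                    # a new INVITE creates a transaction
    nxt = child.reply_received("call-2")   # 200 for the new call
    return late, nxt


print(scenario(reset_on_miss=False))  # stale NULL left behind
print(scenario(reset_on_miss=True))   # cache reset on the not-found path
```

In this model the first call prints (False, False): the 200 for the new call is never looked up, which matches the "T previously sought and not found" trace. The second prints (False, True), which is what one would expect if T really is set back to T_UNDEFINED on every miss.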
I think this problem is made worse by the fact that I’m using two network
interfaces, and that the openser children on the inside interface handle (for
the most part) only responses. This problem was touched on here:
http://lists.openser.org/pipermail/users/2007-November/014188.html but I
didn’t see any follow-up. Also, I’ve checked openser 1.2.3 and 1.3.1 for
fixes, but I don’t think this has been addressed.
I think I have a workaround, which is to raise the wt_timer to something like
15 seconds, but I was wondering if there is any scenario in which leaving
T=NULL is desirable.
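
To see why a longer wt_timer might help, here is a rough timing sketch. It assumes the VM platform follows the RFC 3261 schedule for 2xx retransmissions (first interval T1 = 0.5 s, doubling up to T2 = 4 s), that it gives up after the 10 seconds observed above, and that the wait timer starts when the first 200 is relayed. None of this is confirmed for this setup, and the function names are made up for the example.

```python
def retransmit_offsets(t1=0.5, t2=4.0, give_up=10.0):
    """Seconds after the first 200 at which each retransmission is sent."""
    offsets, now, interval = [], 0.0, t1
    while now + interval <= give_up:
        now += interval
        offsets.append(now)
        interval = min(interval * 2, t2)  # double the interval, capped at T2
    return offsets


def late_replies(wt_timer, **kw):
    """Retransmissions sent after the assumed wait timer has expired."""
    return [t for t in retransmit_offsets(**kw) if t > wt_timer]


print(retransmit_offsets())        # all retransmissions within 10 s
print(late_replies(wt_timer=5))    # those that miss a 5 s wait timer
print(late_replies(wt_timer=15))   # those that miss a 15 s wait timer
```

Under these assumptions a 15 second wt_timer covers every retransmission in the 10 second window, while a 5 second timer does not. The traces suggest more late 200s than this model produces, so the platform's real intervals presumably differ; the give_up, t1 and t2 parameters can be adjusted to match what is seen on the wire.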
Thanks in advance
Sean