Hello again,
This might turn out to be a bug, but I'm posting here first because my
config is fairly complicated and it might not be the sanest it could be.
The problem manifests as follows: with the sole change being the enabling
of track_cseq_updates for the dialog module, Kamailio fails to process the
100 Trying it receives after the authentication initiated by the uac_auth()
function. Without the track_cseq_updates option enabled, the call proceeds
successfully, albeit with the CSeq of the auth-carrying INVITE having the
same value as the original.
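For reference, the toggle described above would typically look like the following sketch. It assumes the dialog module is already loaded and otherwise configured elsewhere in the config; nothing else about the module setup is known from this post.

```
# Assumption: loadmodule "dialog.so" and the remaining dialog parameters
# are already present elsewhere in the configuration.
# 1 enables CSeq tracking (the failing case described here);
# 0, the module default, is the case where the call proceeds.
modparam("dialog", "track_cseq_updates", 1)
```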
Here's a simplified version of the configuration routes involved in this:
request_route {
    ...
    route(DISPATCH_PROVIDER);
    ...
}
route[DISPATCH_PROVIDER] {
    if ( !ds_select_dst("10", "8") ) {
        t_send_reply("503", "Downstream carrier unavailable");
        exit;
    }
    t_on_branch_failure("DISPATCH_PROVIDER_FAILOVER");
    t_on_branch("PROVIDER_BRANCH");
    t_on_failure("DISPATCH_PROVIDER_FAILURE");
    route(RELAY);
    exit;
}
failure_route[DISPATCH_PROVIDER_FAILURE] {
    if (t_is_canceled()) {
        exit;
    }
    if ( t_check_status("401|407") ) {
        $avp(arealm) = "authrealm.com";
        $avp(auser) = "authusername";
        $avp(apass) = "verys3kr1t";
        uac_auth();
        t_on_branch("PROVIDER_BRANCH");
        t_on_branch_failure("DISPATCH_PROVIDER_FAILOVER");
        t_on_failure("DISPATCH_PROVIDER_FAILURE");
        // I'll explain why this is here later in this e-mail
        $du = "sip:" + $T_rpl($si) + ":" + $T_rpl($sp);
        t_relay();
        exit;
    }
}
branch_route[PROVIDER_BRANCH] {
    uac_replace_from("$dlg_var(my_new_from)");
    uac_replace_to("$dlg_var(my_new_to)");
    $rU = "my new RURI user";
}
event_route[tm:branch-failure:DISPATCH_PROVIDER_FAILOVER] {
    if (t_is_canceled()) {
        exit;
    }
    if ( t_check_status("401|407") ) {
        return;
    # next DST - only for 5xx or local timeout
    } else if ( t_check_status("5[0-9][0-9]")
            || (t_branch_timeout() && !t_branch_replied()) ) {
        if ( ds_next_dst() ) {
            t_on_branch("PROVIDER_BRANCH");
            t_on_branch_failure("DISPATCH_PROVIDER_FAILOVER");
            t_on_failure("DISPATCH_PROVIDER_FAILURE");
            route(RELAY);
            exit;
        } else {
            xlog("L_NOTICE", "--- SCRIPT_DISPATCH_PROVIDER_FAILOVER: Failed to route request to PROVIDER! Giving Up. Negative response will be sent upstream.\n");
        }
    }
}
One will rightfully wonder why I use both branch-failure routes and a
failure_route to handle things. It's not obvious from the above because
the relevant parts are removed for brevity, but I'm generally using branch
failure event routes to try other destinations in the same destination set,
and failure routes to additionally try other downstream "providers" in case
all entries in the current destination set have failed. I don't think these
actions/routes are relevant, because the problem manifests without any of
these failover mechanisms (switching over to the next provider) engaging.
Regarding the line
$du = "sip:" + $T_rpl($si) + ":" + $T_rpl($sp);
in the failure_route, which might seem a little odd: I added it because
without it, uac_auth() always sends the auth-carrying INVITE to the
first destination in the destination set, even if ds_next_dst() has been
called from the branch-failure event route. Unfortunately I haven't been
able to determine whether this actually helped, because I received a
5xx error from the downstream peer only once before the "fix" and haven't
been able to reproduce it since. But that's a separate issue, unrelated to
the one I'm asking about here.
So what happens with track_cseq_updates enabled is this: after uac_auth()
successfully sends the authenticated INVITE and the peer starts sending
provisional responses, Kamailio doesn't seem to acknowledge them. Instead,
it retransmits the auth INVITE as if a firewall were preventing
the 1xx responses from being admitted. Then the timeout kicks in and
the configuration script behaves as if t_branch_timeout() returned true
(it calls ds_next_dst()).
So I was wondering if this is to be expected with this configuration, or if
this should be reported as a bug. Thanks!
Best regards,
George