New subject: [Serusers] ReTransmission of CANCELs on 0.9.7

26 Oct 2009


As you may recall, this is a problem where frequently (well,
virtually all of the time) when SER passes a BYE or CANCEL
message through, the device receiving it from SER rejects it,
because the Via: header line that SER added to the BYE or
CANCEL message doesn't match the one that SER added to the
INVITE message sent earlier.

I brought this up a couple of times on the list previously.
As there was no response on this point at all, I have
done some digging and have a simplified example and an analysis
of what SER appears to be doing wrong.

First, here is an example of the problem occurring.  Here
are the key header lines from the INVITE and a BYE message of
a sample call:

INVITE INPUT TO SER

From: sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515
To: sip:3719110000@10.20.30.40;user=phone
Call-ID: 16305237000a0285
CSeq: 252558264 INVITE

SER EMITTED THIS INVITE (to called party)

Via: SIP/2.0/UDP 66.77.88.99;branch=z9hG4bKe47c.7be033a8ca37f0e8a1c6bfe63b043efe.0
From: sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515
To: sip:3719110000@10.20.30.40;user=phone
Call-ID: 16305237000a0285
CSeq: 252558264 INVITE

BYE INPUT TO SER (from calling party)

Call-ID: 16305237000a0285
From: sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515
To: sip:3719110000@10.20.30.40;user=phone;tag=15358744
CSeq: 252558265 BYE

SER EMITTED THIS BYE (to called party)

Via: SIP/2.0/UDP 66.77.88.99;branch=z9hG4bKf47c.c79c6922afd022e9b4cdf00110f17fd5.0
Call-ID: 16305237000a0285
From: sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515
To: sip:3719110000@10.20.30.40;user=phone;tag=15358744
CSeq: 252558265 BYE

The called switch rejects the BYE with a 481 because the topmost Via:
in the BYE does not match the one from the INVITE.

Here are the two Vias added by SER that do not match:

INVITE Via: SIP/2.0/UDP 66.77.88.99;branch=z9hG4bKe47c.7be033a8ca37f0e8a1c6bfe63b043efe.0
BYE    Via: SIP/2.0/UDP 66.77.88.99;branch=z9hG4bKf47c.c79c6922afd022e9b4cdf00110f17fd5.0

A review of how SER constructs the branch=nnnn value reveals where
the differences come from:

#1. The e47c vs f47c difference is caused by the CSeq: number changing
between the INVITE and BYE messages.  The CSeq: value is supposed
to be different in the INVITE and BYE per RFC 3261, so the CSeq:
should not be used in generating the branch=nnnn value.
In this example, the BYE just happened to get the very next
sequence number, but it will always be a value other than
the sequence number seen in the INVITE.

#2. The 7be033a8ca37f0e8a1c6bfe63b043efe vs c79c6922afd022e9b4cdf00110f17fd5
difference is caused by the ;tag=15358744 added to the BYE To: header line.
At minimum, any tags present on the To: and From: headers should not be
used in generating the branch=nnnn value because of the handling of tags
in called-party-BYE situations.  Also, there is no requirement
in RFC 3261 that either the To: or the From: header line be used in
generating the branch=nnnn value at all.

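To make the comparison above easier to follow, here is a small Python sketch that splits the two branch values from the example into their dot-separated parts. The three-part reading (hash index, digest, branch number) and the field names are my own labels, inferred from the example; they are not taken from the SER source.

```python
MAGIC = "z9hG4bK"  # RFC 3261 magic cookie that starts every compliant branch

def split_branch(branch):
    """Split a branch of the form <cookie><index>.<digest>.<number>.

    The field names are illustrative labels, not SER identifiers.
    """
    if not branch.startswith(MAGIC):
        raise ValueError("branch does not start with the RFC 3261 cookie")
    hash_index, digest, number = branch[len(MAGIC):].split(".")
    return hash_index, digest, number

# Branch values copied from the INVITE and BYE that SER emitted above.
invite = "z9hG4bKe47c.7be033a8ca37f0e8a1c6bfe63b043efe.0"
bye = "z9hG4bKf47c.c79c6922afd022e9b4cdf00110f17fd5.0"

labels = ("hash index", "digest", "branch number")
for label, a, b in zip(labels, split_branch(invite), split_branch(bye)):
    verdict = "same" if a == b else "DIFFERENT"
    print(f"{label:14} {a} vs {b}: {verdict}")
```

Only the trailing branch number agrees between the two messages; the first two parts are the ones discussed in #1 and #2.
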
Based on what I am seeing, it appears that neither the CSeq: nor the
complete To: or From: header lines should ever have been used in
constructing a branch=nnnn value, because of the adverse effects
on CANCEL and BYE messages.

Four things appear to be key to causing the branch=nnnn computation
to generate non-compliant results in this example, as well as in the
case of a CANCEL message sent prior to the completion of the
INVITE.  (There could be other scenarios.)

#1. The CSeq: value of the BYE will be different from that of the INVITE
(per RFC 3261).  The CSeq: header line should never be used in
generating the hash for a branch=nnnn value, as it will prevent
the branch in the BYE from matching the one in the INVITE.

#2. If the called party generated the BYE message, the tags on the
To: and From: header lines will be exchanged (also per RFC 3261),
so the entire content of the To: and From: header lines cannot be used
for generating a hash for a branch=nnnn value that is expected
to match earlier messages that had a branch=nnnn value computed
prior to the tag swap.

#3. If a tag was previously absent on the To:/From:, a BYE from
either party will add one.  Again, the entire To: and From:
header lines cannot be used for generating a hash for a
branch=nnnn value that matches earlier messages prior to the
addition of any tag.

#4. Any deviation in the From: and To: header lines between INVITE
and BYE/CANCEL will yield a different branch=nnnn value, even
though the parameters of the From: and To: header lines may still
be perfectly legal and technically identical.  Such a deviation
may be the result of a sloppy SIP implementation in the calling
equipment, but that isn't valid grounds for not handling the message
correctly.  Examples include alternate number formats, as in
To: sip:8885551212@66.77.88.99:5060   (seen in INVITE)
versus To: sip:8885551212@66.77.88.99:5060 (seen in CANCEL
from the same calling device), or differences in the amount of
whitespace (including trailing whitespace), which is supposed
to be ignored in SIP messages, but the MD5 generator doesn't
appear to know that it should ignore whitespace.

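The effect of the four items above can be reproduced with a few lines of Python. This is only a sketch: `digest_of` is a hypothetical helper that hashes raw header text byte for byte, and the choice of fields follows the analysis in this message rather than the actual SER code.

```python
import hashlib

def digest_of(*fields):
    """MD5 over the raw text of the given fields (illustrative helper)."""
    md5 = hashlib.md5()
    for field in fields:
        md5.update(field.encode("ascii"))
    return md5.hexdigest()

# To: value from the INVITE in the example, then three legal variations.
to_invite = "sip:3719110000@10.20.30.40;user=phone"
to_tagged = to_invite + ";tag=15358744"   # tag added by the time of the BYE
to_padded = to_invite + "   "             # trailing whitespace only

baseline = digest_of(to_invite, "252558264")
variants = {
    "tag added": digest_of(to_tagged, "252558264"),
    "trailing whitespace": digest_of(to_padded, "252558264"),
    "next CSeq number": digest_of(to_invite, "252558265"),
}

for name, value in variants.items():
    verdict = "matches" if value == baseline else "differs from"
    print(f"{name}: {verdict} the INVITE digest")
```

Each variation changes the digest, so any branch derived from it can no longer match the value computed for the INVITE.
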
Any one or a combination of these items (and possibly others)
will cause the current branch=nnnn generation code in SER
to generate a different branch=nnnn value, and a receiving
system that follows RFC 3261 will reject the BYE or
CANCEL message that has passed through SER.  This is
non-compliant behavior.

For BYE, you only get into this situation if you are using
Record-Route to force the BYE message to pass through SER so
that it can tear down the call (unforce_rtp_proxy() and
similar actions).  For a CANCEL message, however, anybody can run
into this problem if syn_branch=0.  So, while this defect in the
branch=nnnn generation doesn't affect everybody, it still
appears to be wrong.

Is there by chance a compliant version of char_msg_val()
that doesn't have these issues?  That is, one that doesn't
use the CSeq: header line, and if it uses the From: or To:
header line at all, computes using only the called/calling
digits, and not anything else that may appear on those
header lines.  I also don't understand why this is being
MD5'ed.  The RFC only says that the branch must be unique,
and there are lots of cheaper hash algorithms out there.

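As a rough illustration of what I am asking for, here is a Python sketch of a key built only from the Call-ID and the called/calling digits. The names `user_part` and `stable_key` are hypothetical, nothing here is SER code, and sorting the two parties so that a BYE from the called party yields the same key is my own assumption. MD5 is kept only for familiarity; any hash could be substituted.

```python
import hashlib
import re

def user_part(header_value):
    """Return the user part (the digits) of the first SIP URI in a header."""
    match = re.search(r"sips?:([^@;>\s]+)@", header_value)
    if match is None:
        raise ValueError(f"no SIP URI user part in {header_value!r}")
    return match.group(1)

def stable_key(call_id, from_value, to_value):
    """Hash only fields that stay fixed for the life of the call.

    CSeq, tags, URI parameters and whitespace are deliberately ignored.
    The parties are sorted so that swapping From: and To: (assumed to
    happen when the called party sends the BYE) does not change the key.
    """
    parties = sorted((user_part(from_value), user_part(to_value)))
    material = "|".join([call_id.strip(), *parties])
    return hashlib.md5(material.encode("ascii")).hexdigest()

call_id = "16305237000a0285"
caller = "sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515"
callee = "sip:3719110000@10.20.30.40;user=phone"

invite_key = stable_key(call_id, caller, callee)
bye_key = stable_key(call_id, caller, callee + ";tag=15358744   ")
reverse_bye_key = stable_key(call_id, callee + ";tag=15358744", caller)

print(invite_key == bye_key == reverse_bye_key)
```

With the header values from the example, the INVITE, the calling-party BYE and a called-party BYE all produce the same key, while a different Call-ID produces a different one.
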
Thanks for reading this far, and thanks in advance to
anyone who has fixed this or can come up with a fix.

[Serusers] Via: branch tags from SER for CANCELs/BYEs frequently don't match value in the INVITE