As you may recall, the problem is that virtually every time SER passes a BYE or CANCEL message through, the device receiving it from SER rejects it, because the Via: header line that SER added to the BYE or CANCEL message doesn't match the one that SER added to the INVITE message sent earlier.
I have brought this up on the list a couple of times before. As there was no response on this point at all, I have done some digging and now have a simplified example and an analysis of what SER appears to be doing wrong.
First, here is an example of the problem occurring. Here are key header lines from the INVITE and a BYE message of a sample call:
INVITE INPUT TO SER
  From: sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515
  To: sip:3719110000@10.20.30.40;user=phone
  Call-ID: 16305237000a0285
  CSeq: 252558264 INVITE

SER EMITTED THIS INVITE (to called party)
  Via: SIP/2.0/UDP 66.77.88.99;branch=z9hG4bKe47c.7be033a8ca37f0e8a1c6bfe63b043efe.0
  From: sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515
  To: sip:3719110000@10.20.30.40;user=phone
  Call-ID: 16305237000a0285
  CSeq: 252558264 INVITE

BYE INPUT TO SER (from calling party)
  Call-ID: 16305237000a0285
  From: sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515
  To: sip:3719110000@10.20.30.40;user=phone;tag=15358744
  CSeq: 252558265 BYE

SER EMITTED THIS BYE (to called party)
  Via: SIP/2.0/UDP 66.77.88.99;branch=z9hG4bKf47c.c79c6922afd022e9b4cdf00110f17fd5.0
  Call-ID: 16305237000a0285
  From: sip:9715551212@10.11.12.13;isup-oli=62;user=phone;tag=000a0285+1+9c6d0174+ff665515
  To: sip:3719110000@10.20.30.40;user=phone;tag=15358744
  CSeq: 252558265 BYE

The called switch rejects the BYE with a 481 because the topmost Via: in the BYE does not match the one from the INVITE.
Here are the two Via: lines added by SER that do not match:
  INVITE Via: SIP/2.0/UDP 66.77.88.99;branch=z9hG4bKe47c.7be033a8ca37f0e8a1c6bfe63b043efe.0
  BYE    Via: SIP/2.0/UDP 66.77.88.99;branch=z9hG4bKf47c.c79c6922afd022e9b4cdf00110f17fd5.0
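For anyone who wants to see exactly which parts of the two values differ, here is a minimal Python sketch. It assumes only what the two values above suggest: that after the RFC 3261 "z9hG4bK" cookie the branch consists of dot-separated fields. The helper names split_branch and differing_fields are made up for this illustration and are not part of SER.

```python
# Branch values copied from the two Via: lines quoted above.
INVITE_BRANCH = "z9hG4bKe47c.7be033a8ca37f0e8a1c6bfe63b043efe.0"
BYE_BRANCH = "z9hG4bKf47c.c79c6922afd022e9b4cdf00110f17fd5.0"

MAGIC_COOKIE = "z9hG4bK"  # branch prefix defined by RFC 3261


def split_branch(branch):
    """Strip the magic cookie and split the remainder on dots."""
    if not branch.startswith(MAGIC_COOKIE):
        raise ValueError("branch lacks the RFC 3261 magic cookie")
    return branch[len(MAGIC_COOKIE):].split(".")


def differing_fields(a, b):
    """Return (index, a_field, b_field) for every field that differs."""
    pairs = zip(split_branch(a), split_branch(b))
    return [(i, x, y) for i, (x, y) in enumerate(pairs) if x != y]


diffs = differing_fields(INVITE_BRANCH, BYE_BRANCH)
for index, invite_field, bye_field in diffs:
    print(f"field {index}: INVITE={invite_field} BYE={bye_field}")
```

The first two fields differ and the trailing ".0" is the same in both, which is why the analysis below looks at two separate causes.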
A review of how SER constructs the branch=nnnn value reveals where the differences are coming from:
#1. The e47c vs f47c difference is caused by the Cseq: number changing between the INVITE and BYE messages. The Cseq: value is supposed to be different in the INVITE and BYE per RFC 3261, so the Cseq: should not be used in generating the branch=nnnn value. In this example, the BYE just happened to get the very next sequence number, but it will always be a value other than the sequence number seen in the INVITE.
#2. The 7be033a8ca37f0e8a1c6bfe63b043efe vs c79c6922afd022e9b4cdf00110f17fd5 difference is caused by the ;tag=15358744 added to the BYE To: header line. At minimum, any tags present on the To: and From: headers should not be used in generating the branch=nnnn value because of the handling of tags in called-party-BYE situations. Also, there is no requirement in RFC 3261 that either the To: or the From: header line be used in generating the branch=nnnn value at all.
Based on what I am seeing, it appears that neither the Cseq: nor the complete To: or From: header lines should ever have been used in constructing a branch=nnnn value, because of the adverse effects on CANCEL and BYE messages.
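To illustrate how sensitive such a hash is, here is a minimal Python sketch using the header values from the sample call. It assumes, purely for illustration, that the hash is an MD5 over the concatenated From:, To:, Call-ID: and Cseq: number. The function toy_branch_hash is a hypothetical stand-in, not SER's char_msg_val(), so it will not reproduce the hex strings shown above.

```python
import hashlib


def toy_branch_hash(from_hdr, to_hdr, call_id, cseq_num):
    """MD5 over concatenated header values (hypothetical, not SER's code)."""
    data = "".join([from_hdr, to_hdr, call_id, cseq_num])
    return hashlib.md5(data.encode("ascii")).hexdigest()


# Header values taken from the sample call above.
FROM = ("sip:9715551212@10.11.12.13;isup-oli=62;user=phone;"
        "tag=000a0285+1+9c6d0174+ff665515")
TO_INVITE = "sip:3719110000@10.20.30.40;user=phone"
TO_BYE = TO_INVITE + ";tag=15358744"   # To: as it appears in the BYE
CALL_ID = "16305237000a0285"

invite_hash = toy_branch_hash(FROM, TO_INVITE, CALL_ID, "252558264")
# Change only the Cseq: number, as any BYE must.
cseq_changed = toy_branch_hash(FROM, TO_INVITE, CALL_ID, "252558265")
# Change only the To: tag, as any BYE in an established call will.
tag_added = toy_branch_hash(FROM, TO_BYE, CALL_ID, "252558264")

print("Cseq change alters hash:", cseq_changed != invite_hash)
print("To-tag change alters hash:", tag_added != invite_hash)
```

Either change on its own is enough to produce a different digest, so any value derived from these inputs cannot stay constant from the INVITE to the BYE.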
Four things appear to be key to causing the branch=nnnn computation to generate non-compliant results in this example, as well as the case of a CANCEL message sent prior to the completion of the INVITE. (There could be other scenarios.)
#1. The Cseq: value of the BYE will be different from that of the INVITE (per RFC 3261). The Cseq: header line should never be used in generating the hash for a branch=nnnn value, as it will prevent the branch in the BYE from matching the one in the INVITE.
#2. If the called party generated the BYE message, the tags on the To: and From: header lines will be exchanged (also per RFC 3261), so the entire content of the To: and From: header lines cannot be used for generating a hash for a branch=nnnn value that is expected to match earlier messages that had a branch=nnnn value computed prior to the tag swap.
#3. If a tag was previously absent on the To:/From:, a BYE from either party will add one. Again, the entire To: and From: header lines cannot be used for generating a hash for a branch=nnnn value that matches earlier messages prior to the addition of any tag.
#4. Any deviation in the From: and To: header lines between INVITE and BYE/CANCEL will yield a different branch=nnnn value, even though the parameters of the From: and To: header lines may still be perfectly legal and technically identical. Such a deviation may be the result of a sloppy SIP implementation in the calling equipment, but that is not valid grounds for failing to handle the message correctly. Examples include alternate number formats, as in To: sip:8885551212@66.77.88.99:5060 (seen in INVITE) versus To: sip:8885551212@66.77.88.99:5060 (seen in CANCEL from the same calling device), or differences in the amount of whitespace (including trailing whitespace), which is supposed to be ignored in SIP messages, but the MD5 generator doesn't appear to know that it should ignore it.
Any one of these items, or a combination of them (and possibly others), will cause the current branch=nnnn generation code in SER to generate a different branch=nnnn value, and a receiving system that follows RFC 3261 will reject the BYE or CANCEL message that has passed through SER. This is non-compliant behavior.
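The whitespace case from item #4 is easy to reproduce. The Python sketch below hashes the same To: value with and without trailing whitespace, then again after a simple normalization. The trailing spaces are a made-up example, and normalize_ws is a hypothetical helper that only collapses whitespace; it is not a full SIP-aware header comparison and is not taken from SER.

```python
import hashlib


def md5_hex(text):
    """Hex MD5 digest of an ASCII string."""
    return hashlib.md5(text.encode("ascii")).hexdigest()


def normalize_ws(value):
    """Collapse runs of whitespace and strip both ends (hypothetical helper)."""
    return " ".join(value.split())


to_in_invite = "sip:8885551212@66.77.88.99:5060"
to_in_cancel = "sip:8885551212@66.77.88.99:5060  "  # assumed trailing spaces

raw_match = md5_hex(to_in_invite) == md5_hex(to_in_cancel)
normalized_match = (md5_hex(normalize_ws(to_in_invite))
                    == md5_hex(normalize_ws(to_in_cancel)))

print("raw digests match:", raw_match)
print("normalized digests match:", normalized_match)
```

Hashing the raw bytes treats the two values as different, while hashing after normalization does not, which is the distinction item #4 is pointing at.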
For BYE, you only get into this situation if you are using Record-Route to force the BYE message to pass through SER so that it can tear down the call (unforce_rtp_proxy() and similar actions), but for a CANCEL message, anybody can run into this problem, if syn_branch=0. So, while this defect in the branch=nnnn generation doesn't affect everybody, it still appears to be wrong.
Does a compliant version of char_msg_val() exist, by any chance, that doesn't have these issues? That is, one that doesn't use the Cseq: header line and, if it uses the From: or To: header line at all, computes using only the called/calling digits and not anything else that may appear on those header lines. I also don't understand why this is being MD5'ed. The RFC only says that the branch must be unique, and there are lots of cheaper hash algorithms out there.
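To make the request concrete, here is a rough Python sketch of the kind of input selection I mean. It is only an illustration under stated assumptions: the names user_part and stable_branch are hypothetical, CRC32 is just one example of a cheaper hash, and the code handles only bare sip: URIs with a user part. It says nothing about whether CRC32 gives enough uniqueness across unrelated calls, and it is not how SER works today.

```python
import zlib

MAGIC_COOKIE = "z9hG4bK"  # branch prefix defined by RFC 3261


def user_part(header_value):
    """Return the digits/user between 'sip:' and '@' (hypothetical helper)."""
    start = header_value.find("sip:")
    if start < 0:
        raise ValueError("no sip: URI found")
    rest = header_value[start + len("sip:"):]
    at = rest.find("@")
    if at < 0:
        raise ValueError("URI has no user part")
    return rest[:at]


def stable_branch(from_hdr, to_hdr, call_id):
    """Branch from Call-ID plus user parts only; ignores Cseq:, tags, params."""
    # Sort the two user parts so a From:/To: swap does not change the key.
    users = sorted([user_part(from_hdr), user_part(to_hdr)])
    key = "|".join([call_id] + users)
    checksum = zlib.crc32(key.encode("ascii")) & 0xFFFFFFFF
    return f"{MAGIC_COOKIE}{checksum:08x}"


FROM = ("sip:9715551212@10.11.12.13;isup-oli=62;user=phone;"
        "tag=000a0285+1+9c6d0174+ff665515")
TO_INVITE = "sip:3719110000@10.20.30.40;user=phone"
TO_BYE = "sip:3719110000@10.20.30.40;user=phone;tag=15358744"
CALL_ID = "16305237000a0285"

invite_branch = stable_branch(FROM, TO_INVITE, CALL_ID)
bye_branch = stable_branch(FROM, TO_BYE, CALL_ID)
swapped_branch = stable_branch(TO_BYE, FROM, CALL_ID)  # called-party BYE

print(invite_branch == bye_branch == swapped_branch)
```

With these inputs the value is the same for the INVITE, a calling-party BYE, and a called-party BYE, because nothing that changes between those messages feeds the hash.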
Thanks for reading this far and thanks in advance for anyone who has fixed this or can come up with a fix.