On Dec 04, 2009 at 22:35, Frank Durda IV frank.durda@hypercube-llc.com wrote:
That's the problem with a discussion that bounces back and forth for weeks or months before you finally just climb under the hood, add hundreds of printfs/xlogs and figure out what it is doing wrong yourself and then have to figure out a way to make it right or at least closer to what the RFC and real-life equipment claim is right. Once that was done, the muddle of earlier theories went away.
Just for future reference: for bugs it's better to report them on the bug tracker or send a message to serdev or sr-dev@sip-router.org (serdev@lists.iptel.org was merged with openser-dev into sr-dev, serdev is now an alias for sr-dev). If you want to send a message to the -dev list you must be subscribed to it.
Anyway, I have fixed the problem myself from all appearances, and the INVITE/CANCEL mismatch issue is closed here.
Note that you should leave the cseq number in the hash computation. Otherwise you might have problems with older sip implementations (rfc2543 did not mandate the uniqueness of branch, hence hashing only on callid, via and uri would produce the same value for all messages in the same dialog).
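To illustrate the point about the cseq number, here is a minimal sketch. It is not the SER code: FNV-1a stands in for MD5 only to keep the example short, and branch_key() and its parameters are names made up for this example. It assumes the key is built from callid, via and uri, with the numeric part of CSeq optionally mixed in.

```c
#include <stddef.h>
#include <stdint.h>

/* FNV-1a step over one string. This is a stand-in for the real hash
 * (MD5 when syn_branch==0); it is used here only for brevity. */
static uint32_t fnv1a_add(uint32_t h, const char *s)
{
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    /* Mix in a field separator so "ab"+"c" differs from "a"+"bc". */
    h ^= 0xffu;
    h *= 16777619u;
    return h;
}

/* Hypothetical helper: derive a key from the message fields.
 * Pass cseq_num == NULL to leave the CSeq number out of the hash. */
uint32_t branch_key(const char *callid, const char *via,
                    const char *ruri, const char *cseq_num)
{
    uint32_t h = 2166136261u; /* FNV offset basis */

    h = fnv1a_add(h, callid);
    h = fnv1a_add(h, via);
    h = fnv1a_add(h, ruri);
    if (cseq_num != NULL)
        h = fnv1a_add(h, cseq_num);
    return h;
}
```

Without the cseq number, two different requests in the same dialog get the same key. With it, they differ, while an INVITE and its CANCEL (which carry the same CSeq number) still map to the same value.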
- the MD5 performance problem you are worried about has been addressed.
I admit we have answered by classifying this as a non-problem; still, this is our best knowledge of the matter, based on quite detailed profiling. Do you have any numbers supporting that this is a real problem?
The root problem was that the branch value that was computed for the INVITE wasn't the answer that was computed for the CANCEL. How slowly or how quickly that was done was not the main problem, although I do see enormous waste in using md5 for generating a hash that could have been devised from a hundred simpler algorithms. It's just a branch tag! It doesn't have to defeat NSA cryptographic analysis. time_t in hex down to the usec would easily exceed the uniqueness requirement and be a lot smaller and is usually right there in an integer, ready for saving a copy of and/or using.
The branch tag must be unique for each transaction and it must also be the _same_ for the same transaction (e.g. retransmissions and so on). This means that we cannot use time_t, unless we keep transaction state (and remember it). Now, if syn_branch==1 (default), in stateful mode we use 2 integers to generate the branch number and no md5 or other hash (somewhat equivalent to using time_t). We only use md5 if syn_branch==0. The syn_branch==0 mode, by definition (it must be reboot-safe, which means it must generate the same branch even after a reboot), must use some way of deriving a unique branch from the message => it must use some sort of unique hash.
We could add more hash algorithms and even make them a runtime config option, but they must have a very low probability of collisions.
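As a rough sketch of the syn_branch==1 idea described above: the branch is built from two integers remembered in the transaction state, so a retransmission gets the same value with no hashing at all. The struct, its field names and the output layout are assumptions made up for this example and not the actual tm format; only the "z9hG4bK" magic cookie comes from RFC 3261.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical per-transaction state; field names are illustrative only. */
struct txn_state {
    unsigned int hash_index; /* first integer kept with the transaction  */
    unsigned int label;      /* second integer kept with the transaction */
};

/* Write a branch value built from the two stored integers into buf.
 * Returns the length written, or -1 if buf is too small.
 * The "cookie + hex.hex" layout is an assumption for this sketch. */
int make_syn_branch(char *buf, size_t len, const struct txn_state *t)
{
    int n = snprintf(buf, len, "z9hG4bK%x.%x", t->hash_index, t->label);

    if (n < 0 || (size_t)n >= len)
        return -1;
    return n;
}
```

Because the value depends only on stored state, it is cheap and stable across retransmissions, but it cannot be regenerated after a reboot. That is the case where syn_branch==0 has to derive the branch from the message itself.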
[...]
So all those things were obviously done in the name of speed, but then what does SER do on every message to generate a measly branch tag? Why, it uses an intentionally slow and complex computational algorithm (MD5) when such intensive number crunching isn't needed. Something of a dual-personality there when it comes to wanting to be fast or not caring about speed.
Because most people use it with syn_branch==1 where no MD5 is computed and ser is not that optimized for syn_branch==0 and nobody complained. Another reason is that MD5 is guaranteed to have an extremely low probability of collisions (although for branch computations probably MD4 will be enough).
[...]
By the way, with maybe one exception, I plan to post all the improvements/fixes I have made to SER (most of which are in and around NAThelper and rtpproxy), and maybe they will be rolled into the main tree or maybe not, but at least they will be available to others and might help them avoid some headaches we have had to endure.
Thanks!
- I cannot possibly comment on interoperability of "high-dollar SBCs and switches" to the general extent you are implying. I'm worried you can be dramatically disappointed if you tie all your expectations to dollars. I only know with certainty that in the specific cases we have encountered I cannot possibly assert that "high-dollar" and " brands" means knowing what an INVITE should look like. In fact, we have been using SER to fix INVITEs from high-dollar brands so that they look the way they are supposed to. Which is a double-edged sword, as frequently turning a message into something that A likes makes it hard to swallow for B. Unless you are in a single-vendor environment, the likelihood of having to address interop issues is, say, higher than noticeable.
My point here is that when the SER maintainers or active respondents get feedback on these lists that these well-respected devices do not behave well with SER on specific points and that these devices rebel against coding shortcuts (like INVITE!=CANCEL or syn_branch=1), the response here should be to address the problem and devise a fix, not to tell me or some other messenger how something else, ANYTHING else should change but not SER. I think it unlikely that I or anybody else on this list could get the maker of an SBC that costs $250,000 per box to change their device to overlook a point in the RFC that uses the word MUST three times. When you get caught not being compatible, undertake the work to be compatible.
RFC3261 9.1 (Canceling a Request / Client Behaviour): "... The Request-URI, Call-ID, To, the numeric part of CSeq, and From header fields in the CANCEL request MUST be identical to those in the request being cancelled, including tags."
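The requirement quoted above can be sketched as a simple field-by-field check. This is only an illustration: struct sip_req_ids and cancel_fields_match() are hypothetical names, not SER code, and a byte-wise strcmp() is assumed to be an acceptable stand-in for proper SIP header and URI comparison.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical holder for the fields RFC 3261 9.1 says must be identical. */
struct sip_req_ids {
    const char *ruri;        /* Request-URI                   */
    const char *callid;      /* Call-ID                       */
    const char *to;          /* To value, including any tag   */
    const char *from;        /* From value, including its tag */
    unsigned long cseq_num;  /* numeric part of CSeq only     */
};

/* Returns true if the CANCEL carries the same identifying fields as the
 * INVITE. The CSeq method is deliberately not compared, since it differs
 * (INVITE vs. CANCEL) by definition. */
bool cancel_fields_match(const struct sip_req_ids *invite,
                         const struct sip_req_ids *cancel)
{
    return strcmp(invite->ruri, cancel->ruri) == 0
        && strcmp(invite->callid, cancel->callid) == 0
        && strcmp(invite->to, cancel->to) == 0
        && strcmp(invite->from, cancel->from) == 0
        && invite->cseq_num == cancel->cseq_num;
}
```

Any value that has to be the same for the INVITE and its CANCEL can only depend on fields like these; mixing in anything that differs between the two requests makes them diverge.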
We do have a lot of workarounds for various devices and we will introduce new ones if needed, but in general if you say X is broken, nobody will fix it unless it clearly breaks the RFC, there is a common request or a device that's common enough. A developer also has to see your message (which was in fact the problem in this case).
I would also like to add that we have had very few interoperability problems so far, and in the last 7 years ser has been at at least one SIPit (sip interop. testing event) per year.
That being said, I'll update the code in question to use the to and from tags instead of the whole header, but it will make it only into new versions (at least until it has been heavily tested).
- I agree that lack of parameter passing is a shortcoming. I agree the documentation is suboptimal. I'm very thankful to all participants who spend their valuable time and return SER's value by their contributions, but there is no "central contribution control" that would allow someone to cause the participants to address your particular wishes.
I agree, and to those participants who provided constructive suggestions over the past two years, I thanked them publicly and privately, and do again now in case they missed it. To the developers et al., well, to be honest I haven't seen much of them. I mean, I count 70 times that Jiri has posted here in this group from Feb 2008 thru Oct 2009. (I posted a higher number over the same period.) Anyway, I know that each developer put a lot of effort into writing this piece or that part of SER at some point in the past. I also realize that people get other jobs, get families, run out of the spare time needed to do stuff like this, and so software and documentation fall into the marginally maintained category. Maybe that is what I'm seeing here, but I don't know.
No, what you're seeing is developers reading serusers only occasionally. In my case I have a hard time keeping up with email and I prefer to concentrate on sr-dev and on development. ser does improve, and there was a lot of documentation work going on and a lot of documentation coming from kamailio/openser as a result of the sip-router project (ser-kamailio merge).
[...]
Andrei