Jiri Kuthan wrote:
While I have a lot of sympathies for your disappointment, I'm not really sure you are tying it with the proper causes. Let me explain my viewpoint:
- I think the email you received from Andrei explains, with references, why a Via(INVITE)==Via(BYE) requirement violates the standards and actual functionality as well. Why do you think you have not been getting good advice? What's wrong with the CANCEL suggestion?
The problem here is that BYE dropped out of the discussion some time ago. The branch tag in CANCELs not matching the branch tag in the INVITEs for the same call was the problem and what needed a fix. (I incorrectly assumed initially that BYE would be wrong as well, but since it is not part of a two-method message sequence, BYE can't get into trouble when the branch tag computation goes awry. Only the INVITE and the CANCEL for a call must always have the same branch tag.)
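To make the INVITE/CANCEL requirement concrete, here is a minimal sketch in C of the check a downstream device effectively performs. The function names via_branch() and cancel_matches_invite() are hypothetical and are not taken from SER; the parsing is deliberately simplified and assumes a lower-case ";branch=" with no surrounding whitespace.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: find the value of the ";branch=" parameter in a
 * Via header value. Returns a pointer into `via` and stores the value's
 * length in *len, or returns NULL if there is no branch parameter.
 * Simplified: no handling of whitespace or of upper-case "BRANCH". */
const char *via_branch(const char *via, size_t *len)
{
    const char *p = strstr(via, ";branch=");

    if (p == NULL)
        return NULL;
    p += strlen(";branch=");
    /* The value ends at the next parameter, comma, whitespace or EOL. */
    *len = strcspn(p, ";, \t\r\n");
    return p;
}

/* Returns 1 if the CANCEL's top Via carries the same branch value as the
 * INVITE's top Via, and 0 otherwise (including when either one is missing). */
int cancel_matches_invite(const char *invite_via, const char *cancel_via)
{
    size_t ilen, clen;
    const char *ib = via_branch(invite_via, &ilen);
    const char *cb = via_branch(cancel_via, &clen);

    if (ib == NULL || cb == NULL)
        return 0;
    return ilen == clen && memcmp(ib, cb, ilen) == 0;
}
```

A device that matches the CANCEL to the pending INVITE transaction this way has nothing to cancel when the two branch values differ, which is the failure described above.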
That's the problem with a discussion that bounces back and forth for weeks or months before you finally just climb under the hood, add hundreds of printfs/xlogs and figure out what it is doing wrong yourself and then have to figure out a way to make it right or at least closer to what the RFC and real-life equipment claim is right. Once that was done, the muddle of earlier theories went away.
Anyway, I have fixed the problem myself from all appearances, and the INVITE/CANCEL mismatch issue is closed here.
- the MD5 performance problem you are worried about has been addressed.
I admit we have answered by classifying this as a non-problem; still, this is our best knowledge of the matter, based on quite detailed profiling. Do you have any numbers supporting that this is a real problem?
The root problem was that the branch value that was computed for the INVITE wasn't the value that was computed for the CANCEL. How slowly or how quickly that was done was not the main problem, although I do see enormous waste in using MD5 for generating a hash that could have been devised from a hundred simpler algorithms. It's just a branch tag! It doesn't have to defeat NSA cryptographic analysis. time_t in hex down to the usec would easily exceed the uniqueness requirement and be a lot smaller, and it is usually right there in an integer, ready for saving a copy of and/or using.
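A minimal sketch of the timestamp idea in C, assuming a POSIX system with gettimeofday(). The names make_branch(), struct transaction, transaction_init() and transaction_branch() are hypothetical and are not SER's. The only fixed element is the "z9hG4bK" prefix, the magic cookie RFC 3261 requires at the start of a branch value. The branch is computed once and stored, so the CANCEL reuses the INVITE's value instead of recomputing it.

```c
#include <stdio.h>
#include <sys/time.h>

#define BRANCH_MAX 64

/* Hypothetical helper: build a branch value from the current time as
 * "z9hG4bK" + seconds in hex + microseconds in hex (5 digits).
 * Returns the number of characters written, or -1 on error. */
int make_branch(char *buf, size_t buflen)
{
    struct timeval tv;
    int n;

    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    n = snprintf(buf, buflen, "z9hG4bK%lx%05lx",
                 (unsigned long)tv.tv_sec, (unsigned long)tv.tv_usec);
    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return n;
}

/* Hypothetical per-call state: the branch is generated once, when the
 * INVITE is sent, and kept for any later CANCEL of the same call. */
struct transaction {
    char branch[BRANCH_MAX];
};

/* Call once when the INVITE goes out. Returns 0 on success, -1 on error. */
int transaction_init(struct transaction *t)
{
    return make_branch(t->branch, sizeof t->branch) < 0 ? -1 : 0;
}

/* Both the INVITE and the CANCEL read the stored value. */
const char *transaction_branch(const struct transaction *t)
{
    return t->branch;
}
```

A timestamp alone can repeat if two messages are handled within the same microsecond or by several processes at once, so a real implementation would probably append a process id or a counter; that is still far cheaper than a digest.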
Generally we don't think that, for any given hardware, performance is a problem with SER. Of course it can degrade, for example through use of a database or any other expensive operations, but I'm quite confident that SER's throughput is excellent, and if service logic consumes more resources, hardly anything can be done. At some point, more throughput takes more boxes, but I think that this threshold is actually very high with SER.
For this particular task, MD5 is just a needless waste of computational power. Inefficiencies in this and other tasks took CPU away that could have been used for other things, never mind what. I could have run more rtpproxy sessions on the computer if it wasn't so busy doing MD5 and other unnecessary math. (Understand that for years of my career I had to count T-states on individual instructions while writing assembly language drivers so that things could happen in the allotted time, so unnecessary math == bloat and I point out such poor practice when I see it.)
And that is one of the things that baffle me about SER as a whole. SER goes out of its way in some places to do things in a way that someone thought would make the code run really really fast, like using inline code macros, or building the 32-bit integer representations of strings you were expecting to find and comparing those, all clearly done in the name of speed. The latter probably doesn't help much on modern compilers with aggressive optimizers, but such coding practices make the memory footprint bigger and risk more paging. Meanwhile, the second technique created a hardcoding that broke the ability for SER to handle SIP-T/SIP-I, something SER could have passed transparently otherwise, so that was something of a foolish thing to do for the perceived speed gains. I even suspected the lack of a way to pass variables to functions was because of an obsession with speed, not because someone didn't know how to add four or five lines of rules to the lex file.
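For readers who have not seen the 32-bit comparison trick, here is a minimal sketch of the general technique in C. This is an illustration only and is not SER's actual parser code; the names PACK4, pack4_lower() and starts_with_via() are hypothetical.

```c
#include <stdint.h>

/* Pack four characters into one 32-bit value, first character in the
 * most significant byte. */
#define PACK4(a, b, c, d)                      \
    (((uint32_t)(unsigned char)(a) << 24) |    \
     ((uint32_t)(unsigned char)(b) << 16) |    \
     ((uint32_t)(unsigned char)(c) << 8)  |    \
      (uint32_t)(unsigned char)(d))

/* Pack the first four bytes of `s` and force bit 5 on in every byte,
 * which lower-cases ASCII letters. The caller must guarantee at least
 * four readable bytes. Note the trick is approximate: non-letters that
 * differ only in bit 5 also compare equal. */
uint32_t pack4_lower(const char *s)
{
    return PACK4(s[0], s[1], s[2], s[3]) | 0x20202020u;
}

/* Returns 1 if `s` begins with "via:" in any letter case, using a single
 * integer comparison instead of a string comparison. */
int starts_with_via(const char *s)
{
    return pack4_lower(s) == PACK4('v', 'i', 'a', ':');
}
```

Every string handled this way has to be wired into the code as a constant, which is the kind of hardcoding being criticized here: anything the constants do not anticipate falls through the fast path.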
So all those things were obviously done in the name of speed, but then what does SER do on every message to generate a measly branch tag? Why, it uses an intentionally slow and complex computational algorithm (MD5) when such intensive number crunching isn't needed. Something of a dual-personality there when it comes to wanting to be fast or not caring about speed.
With certainty I know that SER has been used in the big and in the *biggest* deployments. I'm worried that this may sound a bit unfriendly towards the effort you guys developed or purchased as professional service, but I think that the presence of such deployments demonstrates scale is a non-issue in a reasonably dimensioned environment. SER doesn't come with dimensioning plans, one of the reasons being that it is non-trivial to provide generally valid assertions (traffic, hardware, configuration, dependencies on database, network architecture: all of these differences in actual deployments make it hard to provide general rules of thumb).
And I know people at other telecom companies or companies that have a telecom presence or product, ones with really well known names and a zillion dollars. Some of these use SER, and you know what they tell me? They say, yeah we had to hire programmers to fix the problems in SER and write the missing bits, but it was the closest starting point to what we wanted that we could find.
So, yes, companies far bigger than mine are using it, but they are having to hire people to panel-beat it into a usable shape and document it. These outfits are on this list or can see its contents and are seeing my words (I know because they have commented on messages I have posted on this list before), but some of these companies have a rule to not post anything back to lists like this, because then their competition might think they were doing something in this or that area, or their enemies might know where weaknesses and vulnerabilities exist that could be exploited. It seems that this is one of those things that happen when your company gets big enough or has people at the top that are paranoid enough.
By the way, with maybe one exception, I plan to post all the improvements/fixes I have made to SER (most of which are in and around NAThelper and rtpproxy), and maybe they will be rolled into the main tree or maybe not, but at least they will be available to others and might help them avoid some of the headaches we have had to endure.
- I cannot possibly comment on interoperability of "high-dollar SBCs and switches" to the general extent you are implying. I'm worried you may be dramatically disappointed if you tie all your expectations to dollars. I only know with certainty that, in the specific cases we have encountered, I cannot possibly assert that "high-dollar" and "brands" mean knowing what an INVITE should look like. In fact, we have been using SER to fix INVITEs from high-dollar brands so that they look the way they are supposed to. That is a double-edged sword, as turning a message into something that A likes frequently makes it hard to swallow for B. Unless you are in a single-vendor environment, the likelihood of having to address interop issues is, say, higher than noticeable.
My point here is that when the SER maintainers or active respondents get feedback on these lists that these well-respected devices do not behave well with SER on specific points, and that these devices rebel against coding shortcuts (like INVITE!=CANCEL or syn_branch=1), the response here should be to address the problem and devise a fix, not to tell me or some other messenger how something else, ANYTHING else, should change but not SER. I think it unlikely that I or anybody else on this list could get the maker of an SBC that costs $250,000 per box to change their device to overlook a point in the RFC that uses the word MUST three times. When you get caught not being compatible, undertake the work to be compatible.
- I agree that lack of parameter passing is a shortcoming. I agree the
documentation is suboptimal. I'm very thankful to all participants who spend their valuable time and return SER's value by their contributions, but there is no "central contribution control" that would allow someone to cause the participants to address your particular wishes.
I agree, and to those participants who provided constructive suggestions over the past two years: I thanked them publicly and privately, and do so again now in case they missed it. As for the developers et al., well, to be honest I haven't seen much of them. I mean, I count 70 times that Jiri has posted here in this group from Feb 2008 thru Oct 2009. (I posted a higher number over the same period.) Anyway, I know that each developer put a lot of effort into writing this piece or that part of SER at some point in the past. I also realize that people get other jobs, get families, run out of the spare time needed to do stuff like this, and so software and documentation fall into the marginally maintained category. Maybe that is what I'm seeing here, but I don't know.
There are also cases where the continued development is in a pay-for version and the free version languishes, with the carrot that if we pay for it, that version will be better. I expected the latter was the case for SER. Great, except that offering to pay for help and fixes didn't work either.
Believe me, two years ago my company tried five different ways to get someone at the "pay support" entities listed on the SER web site to pay attention to us and tell us how much; we were prepared to put them on a plane and have them configure our lab setup and make it work cleanly and efficiently. However, we couldn't even get a reply to the voice-mails and e-mails we left. So even the pay-for support didn't seem too promising, and after three weeks of being ignored with cash in hand and deadlines looming, we just resigned ourselves to the fact that if it didn't work right or didn't do something we wanted, we would have to fix it or write it ourselves, and here we are.
Frank Durda IV - send mail to this address and remove the "LOSE" and adjust the month/year password accordingly: <uhclemLOSE.dec09%nemesis.lonestar.org> http://nemesis.lonestar.org "The guy that said that the only stupid question is the one that was never asked clearly has never worked a computer center help desk." Copyright 2009, ask before reprinting.