Frank Durda IV wrote:
Jiri Kuthan wrote:
While I have a lot of sympathy for your disappointment, I'm not really sure you are tying it to the proper causes. Let me explain my viewpoint:
- I think the email you received from Andrei explains, even with references, why the Via(INVITE)==Via(BYE) requirement violates the standards as well as actual functionality. Why do you think you have not been getting good advice? What's wrong with the CANCEL suggestion?
The problem here is that the discussion of BYE being involved went away some time ago. The branch tag in CANCELs not matching the branch tag in the INVITEs for the same call was the problem, and that is what needed a fix. (I incorrectly assumed initially that BYE would be wrong as well, but since it is not part of a two-method message sequence, BYE can't get into trouble with the branch tag computation going awry. Only the INVITE & CANCEL for a call must always have the same branch tag.)
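To make the INVITE/CANCEL requirement concrete, here is a minimal sketch of one way to guarantee the match: store the branch that went out with the INVITE and look it up again for the CANCEL, instead of computing it a second time. This is not SER's code. The class, the method names and the (Call-ID, CSeq number) key are all hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: none of these names come from SER.
class ClientTransactions:
    """Remembers the Via branch sent with each INVITE so a CANCEL can reuse it."""

    def __init__(self):
        # (call_id, cseq_number) -> branch string that was sent in the INVITE
        self._branches = {}

    def remember_invite(self, call_id, cseq_number, branch):
        self._branches[(call_id, cseq_number)] = branch

    def branch_for_cancel(self, call_id, cseq_number):
        # Look the branch up; it is never recomputed, so it cannot drift.
        return self._branches[(call_id, cseq_number)]


txns = ClientTransactions()
txns.remember_invite("abc123@host.example", 101, "z9hG4bK74bf9")
cancel_branch = txns.branch_for_cancel("abc123@host.example", 101)
print(cancel_branch)
```

Because the CANCEL path only reads what the INVITE path wrote, it does not matter how the branch was generated in the first place.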
That's the problem with a discussion that bounces back and forth for weeks or months before you finally just climb under the hood, add hundreds of printfs/xlogs and figure out what it is doing wrong yourself and then have to figure out a way to make it right or at least closer to what the RFC and real-life equipment claim is right. Once that was done, the muddle of earlier theories went away.
Anyway, I have fixed the problem myself from all appearances, and the INVITE/CANCEL mismatch issue is closed here.
Why was syn_branch set to 0?
- The MD5 performance problem you are worried about has been addressed.
I admit we have answered by classifying this as a non-problem; still, this is our best knowledge of the matter, based on quite detailed profiling. Do you have any numbers showing this is a real problem?
The root problem was that the branch value computed for the INVITE wasn't the value computed for the CANCEL. How slowly or how quickly that was done was not the main problem, although I do see enormous waste in using MD5 to generate a hash that could have been devised from a hundred simpler algorithms. It's just a branch tag! It doesn't have to defeat NSA cryptographic analysis. A timestamp in hex down to the usec (time_t plus the microseconds field) would easily exceed the uniqueness requirement, be a lot smaller, and is usually right there in an integer, ready to be copied and/or used.
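A rough sketch of the timestamp idea, with an MD5-based tag next to it for size comparison. The function names and the seed are made up for this example and nothing here is taken from SER. The only fixed piece is the "z9hG4bK" prefix that RFC 3261 requires at the start of a branch.

```python
import hashlib
import time

MAGIC_COOKIE = "z9hG4bK"  # branch prefix required by RFC 3261


def branch_from_timestamp(sec, usec):
    """Build a branch tag from seconds and microseconds, both in hex."""
    # usec is at most 999999 (0xF423F), so five hex digits always suffice.
    return f"{MAGIC_COOKIE}{sec:x}{usec:05x}"


def branch_from_md5(seed):
    """Build a branch tag from the MD5 digest of an arbitrary byte string."""
    return MAGIC_COOKIE + hashlib.md5(seed).hexdigest()


# Split the current time into whole seconds and leftover microseconds.
sec, usec = divmod(time.time_ns() // 1000, 1_000_000)
print(branch_from_timestamp(sec, usec))
print(branch_from_md5(b"example seed"))
```

One caveat with this sketch: two requests stamped in the same microsecond, or by two processes at once, would get the same tag, so a real version would probably append a per-process counter or similar.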
Generally we don't think that performance is a problem with SER on any given hardware. Of course it can degrade, for example through use of a database or any other expensive operation, but I'm quite confident that SER's throughput is excellent, and if the service logic consumes more resources, hardly anything can be done. At some point, more throughput takes more boxes, but I think that this threshold is actually very high with SER.
For this particular task, MD5 is just a needless waste of computational power. Inefficiencies in this and other tasks took CPU away that could have been used for other things, never mind what. I could have run more rtpproxy sessions on the computer if it wasn't so busy doing MD5 and other unnecessary math.
While I don't disagree with that, let me reiterate that the performance penalty doesn't show in profiling and therefore does not seem a very compelling problem. In other words, I don't think it would help you to boost the throughput of your system at all.
On the topic of compliance with standards, the following quotation from RFC 3261 is probably of interest to you:
" The algorithm used to compute the hash is implementation-dependent, but MD5 (RFC 1321 [35]), expressed in hexadecimal, is a reasonable choice. "
(Understand that for years of my career I had to count T-states on individual instructions while writing assembly language drivers so that things could happen in the allotted time, so unnecessary math == bloat and I point out such poor practice when I see it.)
I don't dispute that's unnecessary despite the standard recommendation. The point is that as long as it has no impact on overall performance, this is unlikely to change for you, especially when the typical SER sentiment is that there is too much performance optimization.
And that is one of the things that baffle me about SER as a whole. SER goes out of its way in some places to do things in a way that someone thought would make the code run really really fast, like using inline code macros, or building the 32-bit integer representations of strings you were expecting to find and comparing those, all clearly done in the name of speed. The latter probably doesn't help much on modern compilers with aggressive optimizers, but such coding practices make the memory footprint bigger and risk more paging. Meanwhile, the second technique created a hardcoding that broke the ability for SER to handle SIP-T/SIP-I,
Can you share more on that with me? I mean we are passing massive amounts of SIP-? and haven't yet run into such problems....
something SER could have passed transparently otherwise, so that was something of a foolish thing to do for the perceived speed gains. I even suspected the lack of a way to pass variables to functions was because of an obsession with speed, not because someone didn't know how to add four or five lines of rules to the lex file.
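For anyone unfamiliar with the 32-bit string comparison trick mentioned above, here is a minimal sketch of the general technique. It is not SER's parser. The helper names and the choice of "via:" as the example constant are mine, purely for illustration.

```python
def pack4(data):
    """Pack the first four bytes, lowercased, into one little-endian 32-bit integer."""
    return int.from_bytes(data[:4].lower(), "little")


# Precomputed constant for the four bytes "via:" (hypothetical example key).
VIA_COLON = pack4(b"via:")


def looks_like_via(line):
    """Return True if the line starts with 'via:' in any letter case."""
    # One integer comparison replaces a four-character string comparison.
    return len(line) >= 4 and pack4(line) == VIA_COLON


print(looks_like_via(b"Via: SIP/2.0/UDP host.example"))
print(looks_like_via(b"Via : SIP/2.0/UDP host.example"))
```

The second call fails to match only because of the space before the colon. That is the price of this approach: the constants recognize exactly the byte patterns somebody anticipated and nothing else.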
So all those things were obviously done in the name of speed, but then what does SER do on every message to generate a measly branch tag? Why, it uses an intentionally slow and complex computational algorithm (MD5) when such intensive number crunching isn't needed. Something of a dual-personality there when it comes to wanting to be fast or not caring about speed.
With certainty I know that SER has been used in the big and in the *biggest* deployments. I'm worried that this may sound a bit unfriendly towards the effort you guys developed or purchased as a professional service, but I think that the presence of such deployments demonstrates that scale is a non-issue in a reasonably dimensioned environment. SER doesn't come with dimensioning plans, one of the reasons being that it is non-trivial to provide generally valid assertions (traffic, hardware, configuration, dependencies on database, network architecture: all of these differences in actual deployments make it hard to provide general rules of thumb).
And I know people at other telecom companies or companies that have a telecom presence or product, ones with really well known names and a zillion dollars. Some of these use SER, and you know what they tell me? They say, yeah we had to hire programmers to fix the problems in SER and write the missing bits, but it was the closest starting point to what we wanted that we could find.
I think I hear similar stories. Is there something you find surprising or unexpected in it? I also happen to think that if any of these companies were to share one of the zillions, or a fraction of it, it would find noble folks on the mailing list to reprogram SER for them to control power grids :-)
So, yes, companies far bigger than mine are using it, but they are having to hire people to panel-beat it into a usable shape and document it. These outfits are on this list or can see its contents and are seeing my words (I know because they have commented on messages I have posted on this list before), but some of these companies have a rule not to post anything back to lists like this, because then their competition might think they were doing something in this or that area, or their enemies might learn where weaknesses and vulnerabilities exist that could be exploited. It seems that this is one of those things that happen when your company gets big enough or has people at the top who are paranoid enough.
Yes. It is a pity, but unfortunately I don't know what to change so that this improves.
By the way, with maybe one exception, I plan to post all the improvements/fixes I have made to SER (most of which are in and around NAThelper and rtpproxy), and maybe they will be rolled into the main tree or maybe not, but at least they will be available to others and might help them avoid some of the headaches we have had to endure.
That's a man's word :-) Thank you very much indeed!
- I cannot possibly comment on interoperability of "high-dollar SBCs and switches" to the general extent you are implying. I'm worried you may be dramatically disappointed if you tie all your expectations to dollars. I only know with certainty that, in the specific cases we have encountered, I cannot possibly assert that "high-dollar" and "brands" mean knowing what an INVITE should look like. In fact, we have been using SER to fix INVITEs from high-dollar brands so that they look the way they are supposed to. This is a double-edged sword, as turning a message into something that A likes frequently makes it hard to swallow for B. Unless you are in a single-vendor environment, the likelihood of having to address interop issues is, say, higher than noticeable.
My point here is that when the SER maintainers or active respondents get feedback on these lists that these well-respected devices do not behave well with SER on specific points, and that these devices rebel against coding shortcuts (like INVITE!=CANCEL or syn_branch=1), the response here should be to address the problem and devise a fix, not to tell me or some other messenger how something else, ANYTHING else, should change but not SER.
There are a couple of things. First of all, it is impossible to give you an SLA for solving your problems via the mailing list. Folks on this mailing list are volunteers and have their obligations too. I think there is a general sense of willingness to help, but as with most other volunteering activities there is simply no service level agreement.
The other thing is that I don't think the problem has been understood yet (despite a lot of info you have supplied). Yet another thing is that many think (myself among them) that there is a certain bar which it is not wise to cross in aligning to non-compliant implementations. There are actually some features trying to address broken implementations, but our experience is that when we went too far in accommodating some broken implementations, the resulting behaviour broke yet other ones.
I think it unlikely that I or anybody else on this list could get the maker of a SBC that costs $250,000 per box to change their device to overlook a point in the RFC that uses the word MUST three times. When you get caught not being compatible, undertake the work to be compatible.
I don't think there is a compliance problem here. I think you confirmed that the BYE problem is not a problem, and I don't think we have material showing what's wrong with CANCEL and syn_branch=1. For syn_branch=0 you have shown errors in upstream devices which SER amplifies, but then I would choose not to make use of this config option.
- I agree that the lack of parameter passing is a shortcoming. I agree the documentation is suboptimal. I'm very thankful to all participants who spend their valuable time and return SER's value through their contributions, but there is no "central contribution control" that would allow someone to cause the participants to address your particular wishes.
I agree, and to those participants who provided constructive suggestions over the past two years, I thanked them publicly and privately, and do again now in case they missed it. As for the developers et al., well, to be honest I haven't seen much of them. I mean, I count 70 times that Jiri has posted here in this group from Feb 2008 thru Oct 2009. (I posted a higher number over the same period.)
Well, I feel honored that someone has considered it relevant to count the number of my emails :-) Anyhow, for those who are interested in that, I hope to keep contributing in my modest ways, but probably not by increasing my mailing-list rate. That's almost a full-time job and I already have one.
Anyway, I know that each developer put a lot of effort into writing this piece or that part of SER at some point in the past. I also realize that people get other jobs, get families, run out of the spare time needed to do stuff like this, and so software and documentation fall into the marginally maintained category. Maybe that is what I'm seeing here, but I don't know.
At least for me with a job and family you are quite right.
There are also cases where the continued development is in a pay-for version and the free version languishes, with the carrot that if we pay for it, that version will be better. I expected the latter was the case for SER. Great, except that offering to pay for help and fixes didn't work either.
It is a fair question to ask ourselves.
There are companies that offer services (I'm not very experienced with these, though) and products. A company I happen to be intimately familiar with delivers a SER-based product in a rack on a truck with a TL9000 certificate. I'm not aware of anything in between, resembling shareware in that it is low-priced and highly productized. (That is probably easier to find for consumer software with large quantities and absolutely no customization.)
Believe me, two years ago my company tried five different ways to get someone at the "pay support" entities listed on the SER web site to pay attention to us and tell us how much, and we were prepared to put them on a plane and have them configure our lab setup and make it work cleanly and efficiently. However, we couldn't even get a reply to the voice-mails and e-mails we left. So even the pay-for support didn't seem too promising, and after three weeks of being ignored with cash in hand and deadlines looming, we just resigned ourselves to the fact that if it didn't work right or didn't do something we wanted, we would have to fix it or write it ourselves, and here we are.
Just to add my 2 cents -- frequently I have been given offers in the style of "come and fix our bugs and the pieces we are stalled on, and we will pay you an hourly wage". Please don't think I'm trying to imply that's what you did -- I'm merely explaining why I became somewhat mistrustful of "come and help us" offers, and probably many others did as well. I remember companies for whom we offered to fix a compelling problem before settling on terms, and we heard nothing about compensation until their system began crashing again. I remember several integrators who wanted to be paid for a poor equivalent of what we had in store, and were asking us to stand by and fix their errors. That didn't appear very appealing to us.
For many like myself, work satisfaction is more important than compensation, and improving the baseline more attractive than customization. Frequently, too, some come and ask for things we have created in our commercial offerings built on top of SER. These things are available on the market at a market price which DIYer companies wish to compete with, not realizing the amount of work involved (algorithm, testing, interop, maturing with other customers, field feedback). Frequently they end up with unexpected DIYer costs.
It all goes back to the ambitions of SER and its business model. SER is really a standalone server and not a whole system. Except ai1, it does not come as a package with web and database, and it does not come with a scalability and redundancy concept. I'm also aware of proprietary modules that are not publicly available and go beyond the GPLed codebase (particularly with routing and manageability). I suspect that with SER's performance, it is particularly an operator vehicle, and the "few big customers" model makes the economics harder for an operator-grade package to be affordably available. Not that we haven't built such, but it has typically not been on par with the pricing expectations of many, whose expectations were moderated more by the notion of open source than by the solution quality.
Anyhow if I had an ambition to operate a reasonably large critical setup, I would probably try to identify companies that have put such together as a package, deployed it with other customers, and align pricing expectations to that.
Well -- just a long excuse for myself should your message be on my voicemail :-) ... which I hopefully don't think it is.
-jiri