On 09/30/08 08:17, Alex Balashov wrote:
Daniel,
Daniel-Constantin Mierla wrote:
There are use cases even when doing stateful processing. So:
- hash over call id - it is fast, good distribution, can be used for
calls to be sent to gateways, etc, works for stateless processing as well
- hash over from uri - caller is sent to same server, good for cdr
collection, authentication, etc
- hash over to uri - good to send registrations for a user to same
location server
- hash over r-uri - good to send calls to same location server as the
registration server for that user
With a farm of servers grouped by users, you can combine the last three to route the SIP messages inside your network so that auth, acc and location services all work, and use the first one to send calls to gateways :-)
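A minimal sketch of the idea in the list above, assuming a plain "hash modulo number of servers" selection. The server names, the `pick_server` helper and the use of crc32 are illustrative assumptions; the module's own hash function is different.

```python
import zlib
from typing import Sequence


def pick_server(key: str, servers: Sequence[str]) -> str:
    """Map a key to one server; the same key always gives the same server."""
    # crc32 is only a stand-in here, not the module's own hash function.
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


# Hypothetical server groups, for illustration only.
LOCATION_SERVERS = ["loc1.example.net", "loc2.example.net", "loc3.example.net"]
GATEWAYS = ["gw1.example.net", "gw2.example.net"]

# REGISTER: hash over the To URI.
register_target = pick_server("sip:alice@example.com", LOCATION_SERVERS)
# INVITE to the same user: hash over the R-URI. Same key, so same server.
invite_target = pick_server("sip:alice@example.com", LOCATION_SERVERS)
# Call towards the PSTN: hash over the Call-ID to spread calls over gateways.
gateway = pick_server("a84b4c76e66710@pc33.example.com", GATEWAYS)

print(register_target, invite_target, gateway)
```

The point of the sketch is that the INVITE reaches the server holding the registration without any lookup table, because both messages hash the same string.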
I understand the concept of same keys hashing to the same values. :-) If one hashes a value that stays consistent within a dialog, then all requests within that dialog will go to the same place (and not just the transaction, which is the only thing TM is good for). If one hashes a value that's going to always be the same for a given user (such as a From URI), they will always be directed to the same gateway, etc.
What I still don't understand is what benefit this deterministic domain of values - this sameness - confers from a practical perspective. Yes, I know that if I hash the From URI, the caller will be sent to the same server, but, which server? Clearly, the answer is, "Whichever server their From URI hashes to." Sure. But what particular usefulness does that have, whether one is doing stateful or stateless processing?
Obviously, using a hash is more elegant and simple than statically assigning my users various bindings, as you point out in examples like "good to send calls to the same location server as the registration server for that user." But still, I am brought to ask - without having some means of determining exactly what that server will be, what's the advantage? It's obviously not load balancing, unless I know that my From URIs are going to have a certain desirable distribution when hashed, which I don't. Just keeping certain paths the same is nice, of course, but I fail to see how it's actually useful.
Sure, it's great if my registrants always go to the same location server, but if 90% of my users end up going to one location server because of the distribution that the variance of their From URIs provides, what does this really give me except a predictable route? It's not as if I can use the hash to "find" a user's location server -- unless the location server was determined using the same hash also. What's the point?
The hash function was tested and gives a pretty fair distribution for AoRs, and most of the hashed values follow this format. If 90% of your users end up on the same server, then you may need to code a bit :-) and add an alternative hash function to the module. For me the existing one has seemed good so far.
I do not need to know where a user's call is going. In practice, the servers could share the same db backend for auth, while the location and other user profile details may be kept in memory for speed. In what I am doing, all the servers in a group have the same config; if I add a new one, I get a new dispersion of the users across servers.
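To illustrate the "new dispersion" when a server is added, here is a rough sketch, again assuming hash-modulo selection with crc32 as a stand-in hash and made-up subscriber URIs. It counts how many users map to a different server when the group grows from 3 to 4.

```python
import zlib


def slot(key: str, n_servers: int) -> int:
    """Index of the server a key maps to (crc32 is a stand-in hash)."""
    return zlib.crc32(key.encode("utf-8")) % n_servers


def moved_users(user_ids, old_n: int, new_n: int):
    """Users whose server index changes when the group size changes."""
    return [u for u in user_ids if slot(u, old_n) != slot(u, new_n)]


# Hypothetical subscriber base.
users = [f"sip:user{i}@example.com" for i in range(1000)]
moved = moved_users(users, 3, 4)
print(f"{len(moved)} of {len(users)} users change server going from 3 to 4")
```

With a modulo-based scheme a large share of users typically moves, which matters if per-user state is held in memory on the old server.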
Getting the distribution is quite simple: take the hash function and make a simple app that takes a string as a parameter and outputs the hash value. Knowing your subscriber base IDs, you can estimate the results of dispatching.
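Such an app could look roughly like the sketch below. It uses crc32 as a placeholder; to get real numbers you would substitute the module's actual hash function. The function names and the generated subscriber URIs are assumptions for illustration.

```python
import zlib
from collections import Counter


def hash_value(key: str) -> int:
    """Placeholder hash; replace with the module's real hash function."""
    return zlib.crc32(key.encode("utf-8"))


def estimate_distribution(subscriber_ids, n_servers: int):
    """Count how many subscribers would land on each server index."""
    counts = Counter(hash_value(s) % n_servers for s in subscriber_ids)
    return [counts.get(i, 0) for i in range(n_servers)]


if __name__ == "__main__":
    # Hypothetical subscriber base; in practice, load your real AoRs.
    subscribers = [f"sip:user{i}@example.com" for i in range(1000)]
    for index, count in enumerate(estimate_distribution(subscribers, 4)):
        print(f"server {index}: {count} subscribers")
```

If one server index ends up with a disproportionate count for your real IDs, that is the signal to try a different hash function.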
If you are looking for a fairer distribution, round robin is your solution, though it has its own limitations as well. Some people even use a random hash value, and it meets their needs. So I believe we see the benefit of something once we have a use case for it. I do not use many of the kamailio/openser features and modules, but I am sure they have a practical use somewhere and that I may need them some day.
Cheers, Daniel