[kamailio/kamailio] rtpengine: hashing algorithm not offering a good enough distribution (#1911) - sr-dev

27 Mar 2019


We are having problems in our environment with the way rtpengine distributes RTP sessions. The distribution among RTP nodes is not good enough, mostly because the current algorithm is very simple: it sums the characters of the Call-ID and then applies a 0xFF mask to the result (which degrades the distribution even further). This may affect our systems under heavy load, so we have been looking for an alternative.
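For readers who want to see the behaviour concretely, here is a minimal Python sketch of the algorithm as described above (byte sum, then a 0xFF mask). It is an illustration of the description, not code copied from the rtpengine module, and the function name `additive_hash` is made up for this example.

```python
def additive_hash(call_id: bytes) -> int:
    """Sum the bytes of the Call-ID and keep only the low 8 bits."""
    return sum(call_id) & 0xFF


if __name__ == "__main__":
    # Addition ignores byte order, so reordered Call-IDs always collide.
    first = additive_hash(b"abc123@host")
    second = additive_hash(b"321cba@host")
    print(first, second, first == second)
```

Whatever the input, the result can only take 256 distinct values, and any two Call-IDs made of the same bytes in a different order hash to the same value.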
To have something to compare against, I wrote a testing application that generates Call-IDs, applies a hashing algorithm, and assigns the result to a set/node. For example, with 10 nodes, each of weight 10, a hash value of 125 gives 125 mod (10 * 10) = 25, so the call goes to node 3.
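The node assignment in that example can be sketched as follows. This is a hedged illustration in Python: `pick_node` is a hypothetical helper, and it assumes that each node owns a contiguous range of slots proportional to its weight, with nodes numbered from 1.

```python
def pick_node(hash_value: int, weights: list[int]) -> int:
    """Return the 1-based index of the node that owns the slot for hash_value."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("the sum of the weights must be positive")
    slot = hash_value % total
    # Walk the nodes, subtracting each node's share until the slot fits.
    for index, weight in enumerate(weights, start=1):
        if slot < weight:
            return index
        slot -= weight
    raise AssertionError("unreachable when all weights are non-negative")


if __name__ == "__main__":
    # 125 mod 100 = 25, and slots 20..29 belong to node 3.
    print(pick_node(125, [10] * 10))
```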
I have run tests over various hashing algorithms: SHA-256, Jenkins (the simple hashing algorithm found on Wikipedia), MD5, SHA-1, RIPEMD, CRC32 and the one currently used in rtpengine. For each algorithm I test with 1,000 to 10 million randomly generated Call-IDs, multiplying the count by 10 at each step (5 tests per algorithm). Each Call-ID has 16 randomly generated bytes and 16 fixed bytes that are always the same. This is meant to reproduce the way Call-IDs are typically generated, with most of them containing a random part and a fixed part.
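A reduced version of such a test can be sketched in Python as below. It is not the testing application used for the attached results. Several details are assumptions made for the sketch: the fixed suffix `@example.invalid`, the choice to reduce each digest to an integer by taking its first 4 bytes, the fixed random seed, and the subset of algorithms (only those available in the Python standard library).

```python
import hashlib
import random
import zlib

NODES, WEIGHT = 10, 10              # 10 nodes of weight 10, as in the example
FIXED_PART = b"@example.invalid"    # hypothetical 16-byte fixed part


def additive(data: bytes) -> int:
    """Byte sum masked to 8 bits, following the description of the current hash."""
    return sum(data) & 0xFF


def digest_hash(name: str):
    """Build a hash function from a hashlib digest (first 4 bytes as an int)."""
    def hash_fn(data: bytes) -> int:
        return int.from_bytes(hashlib.new(name, data).digest()[:4], "big")
    return hash_fn


ALGORITHMS = {
    "additive": additive,
    "crc32": zlib.crc32,
    "md5": digest_hash("md5"),
    "sha1": digest_hash("sha1"),
    "sha256": digest_hash("sha256"),
}


def distribution(hash_fn, count: int, seed: int = 1) -> list[int]:
    """Count how many generated Call-IDs land on each node."""
    rng = random.Random(seed)
    per_node = [0] * NODES
    for _ in range(count):
        # 16 random bytes followed by the 16 fixed bytes.
        call_id = bytes(rng.getrandbits(8) for _ in range(16)) + FIXED_PART
        slot = hash_fn(call_id) % (NODES * WEIGHT)
        per_node[slot // WEIGHT] += 1
    return per_node


def spread(per_node: list[int]) -> int:
    """Difference between the busiest and the least busy node."""
    return max(per_node) - min(per_node)


if __name__ == "__main__":
    for name, hash_fn in ALGORITHMS.items():
        counts = distribution(hash_fn, 10_000)
        print(f"{name:>8}: spread={spread(counts):5d} counts={counts}")
```

One thing this setup makes visible: an 8-bit hash has only 256 possible values, and 256 is not a multiple of 100, so slots 0 to 55 are each reachable from three hash values while slots 56 to 99 are reachable from only two. The first nodes are therefore favoured even when the 8-bit values themselves are evenly spread.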
...
From the results I can say for sure that the multi-round hashing algorithms from libssl (MD5, SHA-1, RIPEMD, SHA-2) offer a much better distribution than a trivial hashing function, so we are looking to replace the current algorithm with SHA-1, because SHA-2 takes more time without a significant improvement.
It would be nice to push this upstream, but there are some issues. First, using these functions would make the rtpengine module depend on libssl. What is your opinion about this? I have looked at whether we could copy any of the algorithms from libssl into the code, but this is not as easy as it seems. Second, if you agree to replacing the algorithm, how would you approach it: only replace the current one, or add the possibility for the user to select between multiple algorithms?
Any suggestions for testing scenarios or hashing algorithms that we have not considered are very welcome.
The results of the tests are attached to this post.
[h_funcs_compare.txt](https://github.com/kamailio/kamailio/files/3014285/h_funcs_compare.txt)
https://github.com/kamailio/kamailio/issues/1911