We are having problems in our environment with the way rtpengine distributes RTP sessions. It doesn't provide a good enough distribution among RTP nodes and this is mostly because the current algorithm is quite simple, doing the sum of the characters in the callid and after that applying a 0xFF mask over this(which contributes even more to the distribution of this algorithm). This may affect our systems under heavy load and we've been looking for an alternative. To have something to compare I have designed a tesing application in which I generate CallIds, apply a hashing algorithm and assign the result to a set/node. For example I have 10 nodes each of weight 10. Let's say hash value is 125 I do 125 mod (10 * 10) = 25 therefore the hash will go to node 3. I've been running some tests over various hashing algorithms such as: SHA 256, jenkins(simple hashing algorithm found on wikipedia), md5, sha1, ripemd, crc32 and the one used in the rtpengine. For each algorithm I test from 1000 randomly generated callids to 10 million with a multiplyer of 10(5 tests for each algorithm). The callIds have 16 randomly generated bytes and 16 fixed bytes which are always the same. This way I'm trying to reproduce the way callIds are generated most of them containing a randomly generated part and a fixed part.
From the results I can tell for sure that multiple rounded hashing algorithms from libssl(md5, sha1, ripemd, sha 2) offer a much better distribution than a trivial hashing function so we're looking to change the current algorithm with SHA 1 because SHA 2 takes more time without significant improvement.
It would be nice to push this upstream, but there are some issues. First, using these functions, rtpengine module will depend on libssl. What is your opinion about this? I've looked if we can copy any of the algos from libssl into the code but this is not as easy as it seems. Secondly, if you agree to replacing the algorithm, how would you approach the issue: only replace the current one or add the posibility for the user to select between multiple algorithms? Any suggestion of testing scenarios or hashing algorithms that we've not considered is much welcome.
The results of the tests are attached to his post. [h_funcs_compare.txt](https://github.com/kamailio/kamailio/files/3014285/h_funcs_compare.txt)
If you want to use crypto hashing from libcrypto, the best way to do it is to add what you need to crypto module and then have an option via modparam in rtpengine module to bind to crypto module and use that function. Otherwise, it should be using the internal hashing function (can be changed from what is doing now with another hash function from those available in the core).
Not exactly the same, but a similar solution was made for having better "random" call-id for requests generated by Kamailio. That is done also via "crypto" module, but setting some core callback, instead of exporting an intermodule API. See this parameter:
* https://www.kamailio.org/docs/modules/stable/modules/crypto.html#crypto.p.re...
Great. The only issue I see is can I break tm() if I use generate_callid()? I see a static counter in crypto_generate_callid(). At first glance it looks safe, even though using tm + rtpengine won't generate the same results as tm only.
You do not have to use what exists now, like the generate_callid(), that can stay as it is and used only by tm.
You can add something in a similar fashion, in the way that the code depending on libcrypto/libssl is in the crypto module and then you call it from rtpengine. If you don't know how that can be done, add the function you want in crypto module, then I will export it to be used by rtpengine.
I see. I'll write a function in crypto generating hash for a given value and export it through srapi. Is that ok?
I think it is ok with such option, too. Maybe we other parts of code want to use such hashing function in the future. That's the reason I added in this way for callid, thinking that maybe even requests generated in other parts to be sent stateless will use it.
Otherwise, there is a way to export a function using module interface. You can look at rr module, it exports load_rr(...) via mod interface, then in the rr/api.h is an inline function load_rr_api(...) that makes it easy for other modules to load exported API. Then you can look at path or uac modules how they load the rr API and use it.
Great. Thank you for the tips!
Just to add another comment here: Using a "real" cryptographic hash function is of course better from mathematical Point of view. But it might be good enough to just use the CRC32 function, that e.g. also carrierroute uses. If I remember the 1&1 internal test suite good enough, this is for a larger number of request quite ok (+/- 1% accuracy). Have a look to the carrierroute module and to core/hash_func.[c,h]. This could save you a bit of work and would be usable without the openssl dependency.
@ionutionita92 any news on this enhancement idea from your side?
@ionutionita92 any news on this enhancement idea from your side?
the patch is ready, i have to test it in our systems to see how it behaves, but lately I've been busy withan internal project and didn't have much time for this. Will try to come with an update as soon as possible.
Thank you for the feedback, no hurry necessary.
Pull request has been merged, close this one as well.
Closed #1911.