- update weighted/relative weighted algos to use probability-based sort
- calls from 0-99 now follow the weighted distributions
- removed the now-unused shuffle_uint100array() function
#### Pre-Submission Checklist
- [x] Commit message has the format required by CONTRIBUTING guide
- [x] Commits are split per component (core, individual modules, libs, utils, ...)
- [x] Each component has a single commit (if not, squash them into one commit)
- [x] No commits to README files for modules (changes must be done to docbook files in `doc/` subfolder, the README file is autogenerated)
#### Type Of Change
- [x] Small bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds new functionality)
- [ ] Breaking change (fix or feature that would change existing functionality)
#### Checklist:
- [ ] PR should be backported to stable branches
- [ ] Tested changes locally
- [ ] Related to issue #XXXX (replace XXXX with an open issue number)
#### Description
The previous random sort for weights in a dispatcher set distributed calls correctly only after a minimum of n=100 calls. This PR fixes the sort algorithm to distribute calls based on those weights even for n<100 calls. The practical benefit is that customers can now see the weighted distribution take effect sooner when testing.
You can view, comment on, or merge this pull request online at:
https://github.com/kamailio/kamailio/pull/3531
-- Commit Summary --
* dispatcher: fix weighted sort for n<100 calls
-- File Changes --
M src/modules/dispatcher/dispatch.c (313)
M src/modules/dispatcher/doc/dispatcher.xml (9)
M src/modules/dispatcher/doc/dispatcher_admin.xml (20)
-- Patch Links --
https://github.com/kamailio/kamailio/pull/3531.patch
https://github.com/kamailio/kamailio/pull/3531.diff
Thanks for this PR! However, it brings significant changes to the existing behaviour/design of the weight distribution, which was done on purpose to be randomised across the 100 calls.
I think your PR should be changed to add a new dispatching algorithm, leaving the existing one as it is.
A few more comments on coding style, to keep it as coherent as possible: declare variables at the beginning of the function or block; one-line comments should be kept on a single line as `/* ... */`, not spread across three lines.
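For illustration, a tiny hypothetical snippet in the requested style (not from the patch):

```c
/* good: a one-line comment kept on a single line */
static int sum_weights(const int *weights, int n)
{
	/* variables declared at the beginning of the block */
	int i;
	int total = 0;

	for(i = 0; i < n; i++) {
		total += weights[i];
	}
	return total;
}
```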
@miconda thanks for reviewing this. I chose to go with replacement since I believe this better represents the original intention of the weighted / relative weighted algorithms.
Now, to be honest, I could be completely wrong in thinking that. At n>100 calls, or simply when scaling, the new algorithm distributes the same percentage of calls to each destination; the only difference is the order in which the destinations are selected.
If you think that is incorrect then I can just move it to a new dispatching algorithm.
And I'll update the formatting on those comments as well.
If I understand it correctly, then going for a new algorithm only makes sense if we consider the broken behaviour for fewer than 100 calls a feature of this algorithm. ;-) It's probably a better idea to fix the bug instead of offering our users two algorithms, a broken and a fixed one.
@devopsec pushed 1 commit.
13351fab04743058a30212f39ec39cc0dca81c08 dispatcher: fix weighted sort for n<100 calls
@miconda updated formatting per https://github.com/kamailio/kamailio/pull/3531#issuecomment-1668288141
@henningw I agree. I think it is a bug that calls are not distributed per the weights until a minimum of 100 calls are made.
@henningw: since the weight is expressed as a percentage, you can't ensure proper accuracy for fewer than 100 calls (maybe in some particular cases, but not for most of the possible variants). Take a simple example: 100 destinations, each with weight 1. If you have 2 calls, then two destinations get 50% of the traffic and the rest get 0%. So try to fix that; I would love to see a solution.
@devopsec: I haven't checked the code, but I guess it is more like getting to a round robin with some destinations appearing many times in the group. For example, with two addresses, the first with weight 75 and the second with weight 25, the first is added 3 times and then the second (... and repeat). Is it randomised within these 4, or does it just go to the next one for each new call? If my guess is confirmed, then it is a new algorithm and the previous behaviour is lost: it is no longer a randomised distribution across 100 calls, but an ordered distribution (or one randomised in a group smaller than 100).
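For reference, a small sketch of the expansion being guessed at here (purely hypothetical, not taken from the patch): the weights are divided by their greatest common divisor and each destination is repeated that many times in the group.

```c
#include <stdio.h>

/* expand weights 75/25 into a repeating group of 4 slots:
 * destination 0 three times, then destination 1 once
 * (illustration of the guessed mechanism only) */
int main(void)
{
	int weights[2] = {75, 25};
	int gcd = 25; /* gcd of 75 and 25, hardcoded for brevity */
	int group[4];
	int i, d, pos = 0;

	for(d = 0; d < 2; d++) {
		for(i = 0; i < weights[d] / gcd; i++)
			group[pos++] = d;
	}

	/* prints: dst1, dst1, dst1, dst2 */
	for(i = 0; i < pos; i++)
		printf("slot %d -> dst%d\n", i, group[i] + 1);
	return 0;
}
```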
@devopsec: looking at the updated commit, some fields of pointers are used before checking that the pointer is NULL, e.g., in the declaration of arrays sized by a field of a structure pointer. We tried to avoid variable-sized arrays in the past for compatibility with old compilers/C standards. I am not really against them (iirc they are allowed by C99), but they have to be relocated after the NULL check. The old code does `pkg_malloc/free()` for `ds_dests_flags` for that reason.
For the record, support for variable-length arrays is no longer mandatory from C11 on, though modern compilers likely still support them. Here is a good summary comparing them with malloc:
- https://stackoverflow.com/questions/58163105/do-i-really-need-malloc/5816365...
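To illustrate the concern, a minimal sketch contrasting the two patterns (hypothetical function names; `ds_set_t` and its `nr` field are assumed to match the module's destination-set struct, and `pkg_malloc()/pkg_free()` come from Kamailio's core memory API):

```c
#include "../../core/mem/mem.h" /* pkg_malloc()/pkg_free() */
#include "dispatch.h"           /* ds_set_t (assumed) */

/* problematic: the VLA is sized from dset->nr in its declaration,
 * so a NULL dset is dereferenced before the check below ever runs */
static int ds_sort_vla(ds_set_t *dset)
{
	unsigned int flags[dset->nr]; /* dereference happens here */

	if(dset == NULL)
		return -1;
	/* ... use flags ... */
	return 0;
}

/* safer pattern used by the old code: allocate after the NULL check */
static int ds_sort_pkg(ds_set_t *dset)
{
	unsigned int *flags = NULL;

	if(dset == NULL)
		return -1;

	flags = pkg_malloc(dset->nr * sizeof(unsigned int));
	if(flags == NULL)
		return -1;
	/* ... use flags ... */
	pkg_free(flags);
	return 0;
}
```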
@miconda yes, the new code does not use randomized selection for the destination. It selects based on probability throughout (recalculating the percentage of calls to each destination on every iteration, out of the total calls made). Think of the sorting loop as if it were "processing a new call" on each iteration of `i`. Maybe an explanation based on your example above (weight1=75, weight2=25) will clarify.
On the first iteration of `i`, no calls have been processed, so the call goes to dst1. On the second iteration, dst1 has 1/1 calls; 100% is greater than the intended weight, so we distribute the call to dst2. For `i=3`, both dsts have 50% of the calls, but dst1 is intended to have 75%, so the call goes to dst1. This repeats until 100 "calls" have been processed.
The effect is exactly the same in terms of the distribution of calls as a percentage, i.e. the intention of the algorithm, but the "bug fix" is that the distribution is equivalent at high and low call volumes. Thus, it is true that the new algorithm does not use random sorting under 100 calls like the last one did, but I believe that is a bug fix, not a new feature. I think that was a decent explanation, but if not, please ask for more clarification.
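A minimal, self-contained sketch of the selection logic described above (hypothetical names and a plain stdout demo; the real change in dispatch.c sorts the destination index array rather than printing):

```c
#include <stdio.h>

#define NDST 2
#define NCALLS 100

/* greedy probability-based ordering: on each iteration, send the
 * "call" to the destination whose actual share of calls so far falls
 * furthest below its intended weight */
int main(void)
{
	int weight[NDST] = {75, 25}; /* intended percentages */
	int served[NDST] = {0, 0};   /* calls assigned so far */
	int i, d, pick;
	double deficit, best;

	for(i = 0; i < NCALLS; i++) {
		pick = 0;
		best = -1e9;
		for(d = 0; d < NDST; d++) {
			/* intended share minus actual share (in percent) */
			deficit = (double)weight[d]
					- (i > 0 ? 100.0 * served[d] / i : 0.0);
			if(deficit > best) {
				best = deficit;
				pick = d;
			}
		}
		served[pick]++;
		printf("call %3d -> dst%d\n", i + 1, pick + 1);
	}
	return 0;
}
```

Running this with weights 90 and 10 reproduces the sequence described in the next comment: dst1, dst2, then nine calls to dst1 before dst2 is selected again.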
On the topic of memory allocation, thanks for the reference, I am always happy to learn! In my live tests the compiler happily accepted my code and gdb gave me good output back... I don't claim to be a master of memory management, so I will defer to you on whether `pkg_malloc/free()` is required here to be compliant with all the intended C specs.
@devopsec: at the end you get, more or less, a specific order of selecting the addresses to route to. For example, with two addresses, one with weight 99 and the other with weight 1, the one with weight 1 is going to be placed 2nd (or 1st, depending on the initial order). If the weights are 90 and 10, it should end up like `1st, 2nd, 1st, 1st, 1st, 1st, 1st, 1st, 1st, 1st, 1st, 2nd, 1st, 1st...`. For calls of the same type (e.g., codecs, audio/video) and duration, this could bring some benefits, but for calls of random type and duration the random ordering proved to give a pretty good distribution of load across resources.
With the above remarks, and considering that the random ordering was done on purpose at the time of implementation, I don't consider this PR a bug fix, but a different distribution algorithm (for example, it could be called percentage-ordered distribution).
@miconda Hmmm, I see your point there. No worries, I'll resubmit as a new algorithm.
@devopsec pushed 1 commit.
056007d615aaa9a060f14ff430347c8bc2c01ef0 Merge remote-tracking branch 'upstream/master'
@devopsec pushed 1 commit.
972723b29c20ae05be22285ca759414987424a68 htable add column packing features
Is any work still planned on refactoring to a new algorithm? Maybe this can be closed and a new PR created when it is done.
Closed #3531.
@miconda I'll open this as a new PR when I can get back around to it, thanks!