Hi,
What is the practical limit to the number of async worker processes?
With SIP child processes, it seems to be about the number of available CPUs in /proc/cpuinfo. After that--at least, per my testing--one begins to hit the point of diminishing returns, presumably due to SHM IPC and synchronisation issues.
Is the restriction similar in the async execution context?
Also, what is the point of the core async_workers setting versus the 'workers' modparam to async? Are they supposed to equal each other? Does one override the other?
On 23/10/14 04:03, Alex Balashov wrote:
Also, what is the point of the core async_workers setting versus the 'workers' modparam to async? Are they supposed to equal each other? Does one override the other?
The async_workers from the core are shared by all modules; the decision was not to have each module that wants async operations create its own pool of processes. The workers defined by the async module are only for that module and are used only by async_route() and async_sleep().
The implementation is also different: the async module's workers are more like timer processes (because both async_route() and async_sleep() need to sleep for some interval of time). The module keeps its lists of tasks in a structure optimized for timer execution. Each of these async module workers checks from time to time whether there is a task to be executed, executes whatever matches the current time, then sleeps again for 100ms (iirc), then checks again...
The async_workers from the core were designed to receive a job immediately. Because of that, there is inter-process communication based on in-memory sockets: the async workers listen on them, so once a SIP worker sends a task, one of the async workers receives it.
Hopefully I was able to explain enough to understand what happens behind the scenes.
Cheers, Daniel
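To make the distinction concrete, a minimal kamailio.cfg sketch along the lines Daniel describes might look like this; the worker counts, route name and delay below are illustrative, not recommendations:

    # core parameter: shared pool of asynchronous worker processes, available
    # to any module that hands tasks to the core async framework
    async_workers=4

    loadmodule "tm.so"
    loadmodule "async.so"

    # async module parameter: its own timer-style workers, used only by
    # async_route() and async_sleep()
    modparam("async", "workers", 2)

    request_route {
        # suspend the transaction and resume in route[RESUME] after roughly
        # 2 seconds, handled by one of the async module's own workers
        async_route("RESUME", "2");
    }

    route[RESUME] {
        t_relay();
    }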
On Oct 24, 2014, at 9:15 AM, Daniel-Constantin Mierla miconda@gmail.com wrote:
I don't understand this stuff at all. I do know that when FreeSWITCH started using timerfd, these sorts of issues got quite a bit better. Maybe that would help here? Maybe you're already using it?
--FC
Thanks, Daniel. So, does this mean that the module "reserves" a "workers" number of common async_workers for its exclusive use? Or do those async_workers just receive whatever they are sent from any number of modules? In that case, what exactly is the role of the "workers" parameter? To limit the number of async_workers to which the module will send requests?
-- Sent from my mobile, and thus lacking in the refinement one might expect from a fully fledged keyboard.
Alex Balashov - Principal Evariste Systems LLC 235 E Ponce de Leon Ave Suite 106 Decatur, GA 30030 United States Tel: +1-678-954-0671 Web: http://www.evaristesys.com/, http://www.alexbalashov.com
Hello,
On 23/10/14 03:36, Alex Balashov wrote:
Is the restriction similar in the async execution context?
No specific restriction. Also, I haven't seen any degradation when using more SIP worker processes than CPUs, which I always do (at least 2 per CPU), because a worker can do quite a lot of I/O.
Cheers, Daniel
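As a concrete illustration of that rule of thumb, assuming a 4-core machine (the value is only an example):

    # assuming a 4-core machine: at least 2 SIP workers per core, since a
    # worker spends much of its time waiting on I/O
    children=8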
On 10/24/2014 09:07 AM, Daniel-Constantin Mierla wrote:
No specific restriction. Also, I haven't seen any degradation when using more SIP worker processes than CPUs, which I always do (at least 2 per CPU), because a worker can do quite a lot of I/O.
I have seen such degradation, at least in my VM testing environment. I find that I get the highest CPS (e.g. with sipp) with the smallest number of workers.
Actually, I get as good throughput with 4 workers (in an 8 "CPU" scenario on a quad-core processor) as I do with 8! Increasing beyond 8 leads to diminishing returns.
With real traffic, some of the workers can be busy with blocking operations such as database queries or hostname resolution, so I prefer to add more workers to avoid a situation where all of my workers are blocked on something.
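In configuration terms, that preference simply means sizing the pool above the 2-per-core baseline, for example (again assuming 4 cores; the value is illustrative):

    # deliberately more than 2 workers per core, so that a few workers blocked
    # on a database query or DNS lookup do not stall all SIP processing
    children=12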
Vitaliy,
The argument against more workers holds that the specific inter-process communication they use causes one to reach the point of diminishing returns rather quickly, due to contention and locking. In many applications, one can create dozens or hundreds of workers in such a situation, and for precisely the reason you mention. In my experience, however, 2 * num_cpus is the absolute maximum in Kamailio before one starts to lose more to contention than one gains from the throughput of an additional worker.
Actual results vary and depend on the anatomy of the workload (i.e. how much data is operated upon in shared memory, amongst the processes, versus package memory). But in general, the number of workers Kamailio can usefully run is quite low.
For example, I have a quad-core CPU. I can get about 400-500 CPS with 8 children. If I increase the number of children to 16, it plummets to 200 or less. When I increase further, throughput falls further.
-- Sent from my mobile, and thus lacking in the refinement one might expect from a fully fledged keyboard.
Alex Balashov - Principal Evariste Systems LLC 235 E Ponce de Leon Ave Suite 106 Decatur, GA 30030 United States Tel: +1-678-954-0671 Web: http://www.evaristesys.com/, http://www.alexbalashov.com