Hi,
What is the practical limit to the number of async worker processes?
With SIP child processes, it seems to be about the number of available CPUs in /proc/cpuinfo. After that--at least, per my testing--one begins to hit the point of diminishing returns, presumably due to SHM IPC and synchronisation issues.
Is the restriction similar in the async execution context?
There is no specific restriction. Also, I haven't seen any degradation when using more SIP worker processes than CPUs, which I always do (at least 2 per CPU), because a worker can do quite a lot of I/O.
I have seen such degradation, at least in my VM testing environment. I find that I get the highest CPS (e.g. with sipp) with the smallest number of workers.
Actually, I get as good throughput with 4 workers (in an 8 "CPU" scenario on a quad-core processor) as I do with 8! Increasing beyond 8 leads to diminishing returns.
With real traffic, some of the workers can be busy with blocking operations such as database queries or host name resolution. So I prefer to add more workers, to avoid a situation where all of my workers are blocked by something.
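To put rough numbers on the two observations in this thread, here is a minimal back-of-the-envelope sketch in Python. It is not taken from any SIP server. The function names and the per-request timings (`cpu_ms`, `blocked_ms`) are hypothetical, and the model is idealised: it assumes each request uses a fixed amount of CPU time and a fixed amount of blocked time, and it ignores SHM locking and synchronisation overhead entirely.

```python
import math


def workers_to_saturate(cpus: int, cpu_ms: float, blocked_ms: float) -> int:
    """Smallest worker count that can keep every CPU busy in this model.

    cpu_ms:     CPU time one request needs (hypothetical figure)
    blocked_ms: time the worker sits blocked on I/O (DB, DNS) per request
    """
    if cpus < 1 or cpu_ms <= 0 or blocked_ms < 0:
        raise ValueError("need cpus >= 1, cpu_ms > 0, blocked_ms >= 0")
    # A worker is on-CPU for cpu_ms out of every (cpu_ms + blocked_ms),
    # so each CPU needs (1 + blocked_ms / cpu_ms) workers to stay busy.
    return math.ceil(cpus * (1 + blocked_ms / cpu_ms))


def max_throughput(workers: int, cpus: int, cpu_ms: float, blocked_ms: float) -> float:
    """Upper bound on requests per second for a given worker count."""
    # Limit imposed by the workers: each finishes one request per cycle.
    worker_limit = workers * 1000.0 / (cpu_ms + blocked_ms)
    # Limit imposed by the CPUs: total CPU time available per second.
    cpu_limit = cpus * 1000.0 / cpu_ms
    return min(worker_limit, cpu_limit)


if __name__ == "__main__":
    # Synthetic, CPU-bound load (no blocking): extra workers gain nothing.
    print(max_throughput(4, 4, 1.0, 0.0), max_throughput(8, 4, 1.0, 0.0))
    # Load that blocks as long as it computes: doubling workers helps.
    print(max_throughput(4, 4, 1.0, 1.0), max_throughput(8, 4, 1.0, 1.0))
```

Under these assumptions, a purely CPU-bound test load gains nothing from having more workers than CPUs, whereas traffic that blocks on database or DNS lookups does benefit. Because the model leaves out lock contention, it can only show throughput levelling off. It cannot reproduce the degradation reported above.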