Dear Daniel,
A small concern with Kamailio: the existing TCP stack is not doing load distribution among the children for SIP messages. For example: UE --> kamailio1 (TCP listen, 4 children) --> kamailio2 (TCP listen, 4 children).
As per the stack, when the UE opens a connection with kamailio1, the FD is shared with a free child, so there is no problem on the path from the UE to kamailio1. Then the same SIP message needs to be forwarded to kamailio2; here kamailio1 opens a connection to kamailio2 and forwards it. However, if we forward two requests to kamailio2, only one child out of the four gets used. With this design, if we want to run Kamailio as two SIP servers, we may not achieve good results when performance matters.
Please suggest something on this so we can move forward.
Regards, Surendra
Hello,
On 07/03/2017 08:50, Surendra Pullaiah wrote:
Dear Daniel,
A small concern with Kamailio: the existing TCP stack is not doing load distribution among the children for SIP messages. For example: UE --> kamailio1 (TCP listen, 4 children) --> kamailio2 (TCP listen, 4 children).
As per the stack, when the UE opens a connection with kamailio1, the FD is shared with a free child, so there is no problem on the path from the UE to kamailio1. Then the same SIP message needs to be forwarded to kamailio2; here kamailio1 opens a connection to kamailio2 and forwards it. However, if we forward two requests to kamailio2, only one child out of the four gets used. With this design, if we want to run Kamailio as two SIP servers, we may not achieve good results when performance matters.
Please suggest something on this so we can move forward.
As I haven't implemented the TCP client/server part of Kamailio myself, checking the source code may be required to safely prove my next statements ...
So, I expect Kamailio will reuse the connection between kamailio1 and kamailio2. The TCP manager process selects the least loaded TCP worker when a new connection is accepted, and that worker keeps consuming the packets on it until there is nothing left to read. The reason behind this approach is that a proxy typically sends back a 100 Trying or some other provisional response while handling the request. If the connection is very busy, so there are always packets to read, then practically the selected TCP worker keeps processing the traffic and never releases the TCP connection back to the TCP manager.
If kamailio1 and kamailio2 are very close to each other, so that the TCP connect is very fast, you can try setting the option to close the connection as soon as the request is forwarded, via set_forward_close(); then each request will go out on a different connection.
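A minimal kamailio.cfg sketch of that option (not from the original mail): the kamailio2 address 10.0.0.2:5060 and the stateful forwarding via tm are only illustrative assumptions.

    loadmodule "tm.so"
    loadmodule "sl.so"

    request_route {
        # close the outbound TCP connection right after the request is sent out,
        # so each new request triggers a fresh connect() towards kamailio2
        set_forward_close();

        # forward to kamailio2 over TCP (address is just an example)
        $du = "sip:10.0.0.2:5060;transport=tcp";
        if (!t_relay()) {
            sl_reply_error();
        }
        exit;
    }

The trade-off is the extra connect/teardown per request, which is why this is mainly attractive when the two servers are close to each other.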
An alternative would be to listen on 4 ports in kamailio2 and do round robin forwarding to each of these ports.
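A rough sketch of that alternative (again only an illustration): kamailio2 opens four TCP listeners, and kamailio1 uses the dispatcher module to spread requests over them with the round-robin algorithm; the addresses, ports and dispatcher.list path are assumptions.

    # kamailio2: four TCP listeners (address/ports are examples)
    listen=tcp:10.0.0.2:5060
    listen=tcp:10.0.0.2:5061
    listen=tcp:10.0.0.2:5062
    listen=tcp:10.0.0.2:5063

    # kamailio1: /etc/kamailio/dispatcher.list defines the four ports as set 1:
    #   1 sip:10.0.0.2:5060;transport=tcp
    #   1 sip:10.0.0.2:5061;transport=tcp
    #   1 sip:10.0.0.2:5062;transport=tcp
    #   1 sip:10.0.0.2:5063;transport=tcp
    loadmodule "tm.so"
    loadmodule "sl.so"
    loadmodule "dispatcher.so"
    modparam("dispatcher", "list_file", "/etc/kamailio/dispatcher.list")

    request_route {
        # algorithm "4" = round-robin over the destinations of set "1"
        if (!ds_select_dst("1", "4")) {
            sl_send_reply("500", "No destination available");
            exit;
        }
        t_relay();
        exit;
    }

Each connection then targets a different listener, so the TCP manager on kamailio2 can hand them to different workers.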
Of course, it is open source, so there is the option of patching the C code to add an option that releases the TCP connection back to the TCP manager as soon as the SIP request has been read.
Cheers, Daniel
Daniel-Constantin Mierla writes:
So, I expect Kamailio will reuse the connection between kamailio1 and kamailio2. The TCP manager process selects the least loaded TCP worker when a new connection is accepted, and that worker keeps consuming the packets on it until there is nothing left to read. The reason behind this approach is that a proxy typically sends back a 100 Trying or some other provisional response while handling the request. If the connection is very busy, so there are always packets to read, then practically the selected TCP worker keeps processing the traffic and never releases the TCP connection back to the TCP manager.
So, for example, if k2 is a presence server and k1 is forwarding SUBSCRIBE/PUBLISH requests to it, only one process at k2 would be processing them, since the TCP connection between k1 and k2 is reused?
-- Juha
On 08/03/2017 01:36, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
So, I expect Kamailio will reuse the connection between kamailio1 and kamailio2. The TCP manager process selects the least loaded TCP worker when a new connection is accepted, and that worker keeps consuming the packets on it until there is nothing left to read. The reason behind this approach is that a proxy typically sends back a 100 Trying or some other provisional response while handling the request. If the connection is very busy, so there are always packets to read, then practically the selected TCP worker keeps processing the traffic and never releases the TCP connection back to the TCP manager.
So, for example, if k2 is a presence server and k1 is forwarding SUBSCRIBE/PUBLISH requests to it, only one process at k2 would be processing them, since the TCP connection between k1 and k2 is reused?
Those were the observations described by the person who opened this thread, and I responded with why I think it happens that way; but, as I wrote, I haven't really checked the source code yet.
Cheers, Daniel
Daniel-Constantin Mierla writes:
So, for example, if k2 is a presence server and k1 is forwarding SUBSCRIBE/PUBLISH requests to it, only one process at k2 would be processing them, since the TCP connection between k1 and k2 is reused?
Those were the observations described by the person who opened this thread, and I responded with why I think it happens that way; but, as I wrote, I haven't really checked the source code yet.
If that is true, then the K TCP implementation is really broken. Why can't the TCP manager distribute the requests that come over a single shared connection to the workers?
-- Juha
On 08/03/2017 08:01, Juha Heinanen wrote:
Daniel-Constantin Mierla writes:
So, for example, if k2 is a presence server and k1 is forwarding SUBSCRIBE/PUBLISH requests to it, only one process at k2 would be processing them, since the TCP connection between k1 and k2 is reused?
Those were the observations described by the person who opened this thread, and I responded with why I think it happens that way; but, as I wrote, I haven't really checked the source code yet.
If that is true, then the K TCP implementation is really broken. Why can't the TCP manager distribute the requests that come over a single shared connection to the workers?
Well, if you think this (my assumption) is broken, then your approach is also broken, because then you would have a single SIP TCP reader (the TCP manager), so you just shift the problem to another place.
The TCP manager process deals with TCP connections only; it is not doing any SIP processing.
This situation shows up only when there is a trunk between two servers; otherwise, when phones connect via TCP/TLS, there is a connection for each of them. I already explained my assumption behind keeping the connection in the worker: the SIP TCP worker sends back some reply in most cases (auth challenge, Trying...).
So, I wouldn't call the current implementation broken at all; I know deployments with many hundreds of thousands of active TCP/TLS connections that have been working for many years without issues. It is just that someone reported a case that is not handled as one would expect.
Besides writing some code to implement an enhancement for this trunking case, another solution that came to my mind with the current version is to leverage the async layer via the config file: if the request comes from the other server, suspend the transaction and resume it in another process (rtimer or maybe the async workers), as sketched below. I already suggested two other solutions: using different ports, or closing the connections after sending out.
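A rough config sketch of that async idea (my own illustration, not from the mail): it assumes the async module, a few core async workers, that traffic from the other server can be recognized by its source IP (10.0.0.1 is just an example), and hypothetical route names.

    # core async worker processes used by async_task_route()
    async_workers=4

    loadmodule "tm.so"
    loadmodule "sl.so"
    loadmodule "async.so"

    request_route {
        # requests arriving over the trunk from the other server are suspended
        # and resumed in an async worker, freeing the TCP worker that read them
        if ($si == "10.0.0.1") {
            async_task_route("TRUNK_RESUME");
            exit;
        }
        route(RELAY);
    }

    route[TRUNK_RESUME] {
        # executed in one of the async worker processes after the resume
        route(RELAY);
    }

    route[RELAY] {
        if (!t_relay()) {
            sl_reply_error();
        }
        exit;
    }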
Cheers, Daniel