Hi,
I recently ran into an issue with a single TCP connection between two
Kamailio servers:
1. KAM1 sends two forked INVITEs to KAM2
2. KAM2 starts config route processing for INVITE1, but blocks for ~1s
because the rtpengine module is pinging some inactive IPs
3. KAM1 retransmits forked INVITE2
Worth mentioning that:
1. KAM2 also receives KDMQs over the same TCP connection. During that
period, I noticed the KDMQ default_callback error being triggered due to a
timeout. So clearly no KDMQs were processed during that time either.
2. No errors related to the TCP connection were logged
3. Kamailio version is 5.8, with tcp_reuse_port=yes, route_locks_size not
set, and 4 socket workers used for that specific TCP connection
I spent quite a while reading tcp_main.c and tcp_read.c, trying to figure
out how TCP connections are handled in general, and came to the following
conclusion:
The TCP connection structure is held by one TCP socket worker process
until the SIP request has been completely received in the buffer, parsed
*and* run through the routing config. Only afterwards does the TCP socket
worker release the TCP connection structure, by signalling this back to
the TCP_MAIN process. From that point on, another TCP socket worker is
able to handle the *next* SIP request on *the same* TCP connection.
...but while one TCP socket worker is executing the config route, no other
TCP socket worker is able to handle the *next* SIP request on *the same*
TCP connection, regardless of how many workers are idle.
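
To make the suspected behaviour concrete, here is a toy Python model of it.
This is not Kamailio code and only mirrors my reading of the core as
described above: the names (simulate, conn_owner, pending) are made up for
illustration, and a lock stands in for the hand-off of the connection
between TCP_MAIN and a socket worker.

```python
import threading
import time


def simulate(messages, workers=4):
    """Toy model: one connection, owned by a single worker at a time.

    messages: list of (name, route_seconds) in arrival order.
    Returns a list of (name, start, end) in seconds since the start.
    """
    conn_owner = threading.Lock()   # hypothetical stand-in for connection ownership
    pending = list(messages)        # requests already queued on the connection
    log = []
    t0 = time.monotonic()

    def worker():
        while True:
            # The worker keeps the connection for read + parse + config route.
            with conn_owner:
                if not pending:
                    return
                name, route_seconds = pending.pop(0)
                start = time.monotonic() - t0
                time.sleep(route_seconds)   # blocking config route (e.g. slow ping)
                log.append((name, start, time.monotonic() - t0))
            # Connection released here; another worker may take the next request.

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for name, start, end in simulate([("INVITE1", 1.0), ("INVITE2", 0.0), ("KDMQ", 0.0)]):
        print(f"{name}: started at {start:.2f}s, finished at {end:.2f}s")
```

In this model INVITE2 and the KDMQ cannot start before INVITE1 has left the
config route, even though three of the four workers are idle, which matches
what I observed.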
My questions are:
1. Is the above conclusion correct? It would explain the issue described
above, and I want to double check that I understood the core TCP code
correctly.
2. Can async socket workers solve this?
Thank you,
Stefan