handle_tcpconn_ev(): connect failed - sr-users

7 Sep 2017


Hi.
Recently I've come across a TCP connection problem.
The topology is as follows:
DNS SRV load balancer - two Kamailio proxy servers - one routing server.
The client queries a NAPTR record such as sip.domain.com, so DNS returns
one of the proxy servers to the client (depending on weight/priority).
Currently both Kamailio servers have the same priority and weight (the
goal is load balancing).
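To illustrate the selection step, here is a minimal sketch of how a client might choose between two SRV targets with equal priority and weight. It is a simplification of the RFC 2782 procedure, and the hostnames, priorities and weights are placeholders, not values from the real zone.

```python
import random
from collections import Counter

# Hypothetical SRV records: (priority, weight, target).
# The hostnames and numbers are placeholders for illustration only.
SRV_RECORDS = [
    (10, 50, "kamailio1.domain.com"),
    (10, 50, "kamailio2.domain.com"),
]


def pick_srv_target(records, rng=random):
    """Pick one target: lowest priority wins, then a weighted random choice."""
    lowest = min(priority for priority, _, _ in records)
    candidates = [(w, t) for p, w, t in records if p == lowest]
    weights = [w for w, _ in candidates]
    targets = [t for _, t in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(targets)
    return rng.choices(targets, weights=weights, k=1)[0]


def distribution(records, lookups=10000, seed=0):
    """Count how often each target is chosen over many independent lookups."""
    rng = random.Random(seed)
    return Counter(pick_srv_target(records, rng) for _ in range(lookups))
```

With equal weights, distribution(SRV_RECORDS) should come out at roughly half of the lookups per proxy. Every lookup is independent, so nothing ties a later in-dialog request to the proxy that handled the first one.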
The routing server (currently Asterisk) runs chan_pjsip.so, which
supports NAPTR/SRV records.
It is able to resolve Record-Route / Route headers whose value is
sip.domain.com (the proxy servers add this to the Record-Route headers
when relaying requests to it).
This topology is meant to keep existing dialogs alive even if the proxy
that originally processed them is dead.
But the problem appears when the routing server (Asterisk) sends in-dialog
requests to the proxy that wasn't used to establish the dialog.
For example, the routing server receives a 200 OK from the endpoint (relayed
to it by kamailio1) and sends back an ACK, but not to kamailio1; it sends it
to kamailio2 (because it resolves the NAPTR record sip.domain.com and gets
the IP of the second Kamailio). Kamailio2 processes the request as usual,
because both Kamailio servers share the same DB for the dialog module, but
when it tries to relay the request to the endpoint, it gets the error:
ERROR: <core> [tcp_main.c:4070]: handle_tcpconn_ev(): connect XXX.XXX.XXX.XXX:52185 failed
The port that kamailio2 tries to connect to when relaying the ACK is the
port the endpoint used to establish the dialog with kamailio1, and the
endpoint's TCP connection is actually established with kamailio1.
So kamailio2 tries to open a new connection to that same port and gets the
error.
And this is proper behavior, I think.
There is no problem with UDP transport.
Has anyone seen a similar problem? It is arguably not a bug, but
proper behavior.
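The difference between the two transports can be reproduced outside of Kamailio. Below is a minimal sketch on loopback, assuming no NAT or firewall is involved. The names proxy1, proxy2 and endpoint are stand-ins for kamailio1, kamailio2 and the client, not real SIP code.

```python
import socket


def tcp_vs_udp_demo(host="127.0.0.1"):
    """Return (tcp_result, udp_result) for a second proxy reaching the endpoint."""
    # "proxy1" stands in for kamailio1: a plain TCP listener.
    proxy1 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    proxy1.bind((host, 0))
    proxy1.listen(1)

    # "endpoint" connects out to proxy1 from an ephemeral source port.
    endpoint = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    endpoint.connect(proxy1.getsockname())
    accepted, peer = proxy1.accept()
    ephemeral_port = peer[1]  # the port proxy1 sees as the endpoint's address

    # "proxy2" stands in for kamailio2: it has no connection to the endpoint,
    # so it must open a new one towards the address stored for the dialog.
    proxy2 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Bind first so that, on loopback, the kernel cannot reuse the endpoint's
    # own port as our source port (which could cause a TCP self-connect).
    proxy2.bind((host, 0))
    proxy2.settimeout(2)
    try:
        proxy2.connect((host, ephemeral_port))
        tcp_result = "connected"
    except OSError:
        # Nothing listens on the ephemeral port, so the connect fails.
        tcp_result = "failed"
    finally:
        proxy2.close()

    # UDP is connectionless: any sender can reach the endpoint's bound port.
    udp_endpoint = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp_endpoint.bind((host, 0))
    udp_endpoint.settimeout(2)
    udp_proxy2 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    udp_proxy2.sendto(b"ACK", udp_endpoint.getsockname())
    data, _ = udp_endpoint.recvfrom(1024)
    udp_result = "delivered" if data == b"ACK" else "lost"

    for sock in (accepted, endpoint, proxy1, udp_proxy2, udp_endpoint):
        sock.close()
    return tcp_result, udp_result
```

Calling tcp_vs_udp_demo() shows the TCP attempt failing while the UDP datagram arrives. The TCP side fails because the endpoint's source port belongs to an outgoing connection to the first proxy and is not a listening socket.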
-- 
BR, Donat Zenichev
Wnet VoIP team
Tel:  +380(44) 5-900-808
http://wnet.ua