On 4 Apr 2025, at 10:55, Pyry Aaltonen via sr-users sr-users@lists.kamailio.org wrote:
Hello,
I've googled around about TCP timeouts on Kamailio TCP connections when the TCP connection is broken.
I found this old question, and it feels like I'm in the same situation as the original poster: https://www.mail-archive.com/sr-users@lists.kamailio.org/msg15020.html
I also found this: https://kamailio.org/mailman3/hyperkitty/list/sr-dev@lists.kamailio.org/thre... and it seems to describe pretty much what I'm seeing. When the TCP (or TLS) connection is interrupted in the network, it takes around 15 min for Kamailio to reset the outgoing TCP connection.
I see in the logs that when restarting the Kamailio process it logs:
2025-03-27 12:25:18.697 { "level": "INFO", "module": "core", "file": "core/tcp_main.c", "line": 3282, "function": "tcp_init", "logprefix": "", "message": "Set TCP_USER_TIMEOUT=10000 ms" }
So I think the fix from this https://github.com/kamailio/kamailio/commit/d893f3af1444c8c4c5db6cd53fb57770... is applied.
This was tested with 5.8.5. I set up a dispatcher TCP connection to an external host, then dropped traffic to that host with iptables and waited for Kamailio to notice that the destination was down. After removing the iptables rule, it takes around 15 min to recover. (Restarting Kamailio also helps and restores the connection.)
This is what Kamailio prints during the test:
Apr 4 08:29:15 kamailio[156947]: { "level": "ERROR", "module": "xlog", "file": "xlog.c", "line": 278, "function": "", "logprefix": "", "message": "Destination down: OPTIONS sip:ext-host;transport=tcp (<null>)" }
Apr 4 08:44:43 kamailio[156957]: { "level": "ERROR", "module": "core", "file": "core/tcp_read.c", "line": 267, "function": "tcp_read_data", "logprefix": "", "message": "error reading: Connection timed out (110) ([]:5060 -> []:47492)" }
Apr 4 08:44:43 kamailio[156957]: { "level": "ERROR", "module": "core", "file": "core/tcp_read.c", "line": 1524, "function": "tcp_read_req", "logprefix": "", "message": "error reading - c: 0x7f8625dea9b0 r: 0x7f8625deaad8 (-1)" }
Apr 4 08:45:05 kamailio[156957]: { "level": "ERROR", "module": "xlog", "file": "xlog.c", "line": 278, "function": "", "logprefix": "Source:[ext-host]:5060, Call-id:4d0e1ccd315c817f-156947@int-host, CSeq:10", "message": "Destination up: OPTIONS sip:ext.host;transport=tcp (<null>)" }
Any advice on how to lower the timeout so that recovery is quicker in such an event?
That depends on the operating system. It is one of the reasons why SIP Outbound calls for two active TCP connections in order to get fast failover.
If you google for “unix tcp timeout” you will find many documents with hints. I think Geoff Huston wrote a good summary once upon a time, but I can’t find it any more.
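As a rough illustration of where a figure of about 15 minutes can come from, here is a small Python sketch. It assumes a Linux host with the default net.ipv4.tcp_retries2 = 15, a minimum retransmission timeout of 200 ms and a maximum of 120 s; none of this is confirmed for the system in this thread. The function name hypothetical_timeout is made up for the example. With those values the retransmission timers add up to roughly 924.6 seconds, which is close to the gap between 08:29:15 and 08:44:43 in the log above. The real value depends on the measured round-trip time of the connection.

```python
def hypothetical_timeout(retries=15, rto_min=0.2, rto_max=120.0):
    """Add up the retransmission timers before TCP gives up.

    Assumptions (Linux defaults, not taken from this thread):
      retries  -- net.ipv4.tcp_retries2
      rto_min  -- smallest retransmission timeout, in seconds
      rto_max  -- largest retransmission timeout, in seconds
    """
    total = 0.0
    rto = rto_min
    # One timer for the original transmission plus one per retransmission.
    for _ in range(retries + 1):
        total += rto
        # Exponential backoff: the timer doubles until it hits the cap.
        rto = min(rto * 2, rto_max)
    return total


if __name__ == "__main__":
    seconds = hypothetical_timeout()
    print(f"{seconds:.1f} s (~{seconds / 60:.1f} min)")  # 924.6 s (~15.4 min)
```

Calling hypothetical_timeout(retries=8) shows how much a lower tcp_retries2 would shorten the wait under the same assumptions. Whether changing that sysctl is appropriate for a given host is a separate decision.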
/O
https://datatracker.ietf.org/doc/html/rfc5626
“For a UA to receive incoming requests, the UA has to connect to a server. Since the server can't connect to the UA, the UA has to make sure that a flow is always active. This requires the UA to detect when a flow fails. Since such detection takes time and leaves a window of opportunity for missed incoming requests, this mechanism allows the UA to register over multiple flows at the same time.”