Hi All,
In these days I'm facing an issue or a misconfiguration, I don't know, with our
remote apps (SIP clients) and our cloud server in which there's a Kamailio version
4.4.4.
Sometimes our app takes some seconds (about 10-12sec) to connect to the Kamailio socket
(sips on 5061) and our app is stuck on connect sytem call.
To debug this behaviour I put a rule in the iptable like this:
sudo iptables -I INPUT -p tcp -m tcp --dport 5061 -m state --state NEW -j LOG --log-level
1 --log-prefix "New Connection "
when everything goes well, there is one row which describes the new incoming connection
Mar 4 07:08:40 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=15819 DPT=5061 WINDOW=65535 RES=0x00 CWR ECE SYN
URGP=0
when our app gets stuck in the connect, there are many attempts and seems that Kamailio is
not ready to accept this connection:
Mar 4 07:11:54 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 CWR ECE SYN
URGP=0 Mar 4 07:11:54 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:11:55 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:11:55 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:11:55 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:11:56 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:11:56 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:11:56 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:11:58 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:12:00 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=64 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0 Mar
4 07:12:05 server-xip-01 kernel: New Connection IN=eth0 OUT=
MAC=bc:76:4e:20:1f:fb:e8:ed:f3:3a:52:41:08:00 SRC=X.X.X.X DST=Y.Y.Y.Y LEN=48 TOS=0x00
PREC=0x00 TTL=45 ID=0 DF PROTO=TCP SPT=52900 DPT=5061 WINDOW=65535 RES=0x00 SYN URGP=0
When it happens our app takes more than 10 sec to connect to the Kamailio socket.
I tried to increase the children configuration parameter in the cfg file because I thought
there were not enough tcp receiver but with no luck.
Here are some configuration parameters:
children = 16 // was 8 before found this issue
# life time of TCP connection when there is no traffic # - a bit higher than registration
expires to cope with UA behind NAT
tcp_connection_lifetime=3605
open_files_limit=8192
tcp_max_connections=8192
tls_max_connections=8192
Here is the output of the ulimit command:
ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 31393
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65535
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 31393
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
In the kamailio log file I don't see anything wrong, I don't see messages like
"[tcp_main.c:3913]: handle_new_connect(): maximum number of tls connections
exceeded".
Here is the output of kamctl ps command
Process:: ID=0 PID=11725 Type=main process - attendant
Process:: ID=1 PID=11727 Type=udp receiver child=0 sock=10.208.194.26:5061
Process:: ID=2 PID=11728 Type=udp receiver child=1 sock=10.208.194.26:5061
Process:: ID=3 PID=11729 Type=udp receiver child=2 sock=10.208.194.26:5061
Process:: ID=4 PID=11730 Type=udp receiver child=3 sock=10.208.194.26:5061
Process:: ID=5 PID=11731 Type=udp receiver child=4 sock=10.208.194.26:5061
Process:: ID=6 PID=11732 Type=udp receiver child=5 sock=10.208.194.26:5061
Process:: ID=7 PID=11733 Type=udp receiver child=6 sock=10.208.194.26:5061
Process:: ID=8 PID=11734 Type=udp receiver child=7 sock=10.208.194.26:5061
Process:: ID=9 PID=11735 Type=udp receiver child=8 sock=10.208.194.26:5061
Process:: ID=10 PID=11736 Type=udp receiver child=9 sock=10.208.194.26:5061
Process:: ID=11 PID=11737 Type=udp receiver child=10 sock=10.208.194.26:5061
Process:: ID=12 PID=11738 Type=udp receiver child=11 sock=10.208.194.26:5061
Process:: ID=13 PID=11739 Type=udp receiver child=12 sock=10.208.194.26:5061
Process:: ID=14 PID=11740 Type=udp receiver child=13 sock=10.208.194.26:5061
Process:: ID=15 PID=11741 Type=udp receiver child=14 sock=10.208.194.26:5061
Process:: ID=16 PID=11742 Type=udp receiver child=15 sock=10.208.194.26:5061
Process:: ID=17 PID=11743 Type=slow timer
Process:: ID=18 PID=11744 Type=timer
Process:: ID=19 PID=11745 Type=secondary timer
Process:: ID=20 PID=11746 Type=MI FIFO
Process:: ID=21 PID=11747 Type=USRLOC Timer
Process:: ID=22 PID=11748 Type=USRLOC Timer
Process:: ID=23 PID=11749 Type=USRLOC Timer
Process:: ID=24 PID=11750 Type=USRLOC Timer
Process:: ID=25 PID=11751 Type=ctl handler
Process:: ID=26 PID=11752 Type=RTIMER SEC EXEC
Process:: ID=27 PID=11753 Type=TIMER NH
Process:: ID=28 PID=11754 Type=tcp receiver (generic) child=0
Process:: ID=29 PID=11755 Type=tcp receiver (generic) child=1
Process:: ID=30 PID=11756 Type=tcp receiver (generic) child=2
Process:: ID=31 PID=11757 Type=tcp receiver (generic) child=3
Process:: ID=32 PID=11758 Type=tcp receiver (generic) child=4
Process:: ID=33 PID=11759 Type=tcp receiver (generic) child=5
Process:: ID=34 PID=11760 Type=tcp receiver (generic) child=6
Process:: ID=35 PID=11761 Type=tcp receiver (generic) child=7
Process:: ID=36 PID=11762 Type=tcp receiver (generic) child=8
Process:: ID=37 PID=11763 Type=tcp receiver (generic) child=9
Process:: ID=38 PID=11764 Type=tcp receiver (generic) child=10
Process:: ID=39 PID=11765 Type=tcp receiver (generic) child=11
Process:: ID=40 PID=11766 Type=tcp receiver (generic) child=12
Process:: ID=41 PID=11767 Type=tcp receiver (generic) child=13
Process:: ID=42 PID=11768 Type=tcp receiver (generic) child=14
Process:: ID=43 PID=11769 Type=tcp receiver (generic) child=15
Process:: ID=44 PID=11770 Type=tcp main process
Can anyone address me to find out this happens, please?
Please let me know if you need further information...
Thanks in advance,
Best Regards
Tomas
Tomas Zanet
R&D Software Designer
tzanet(a)came.com
CAME S.p.A.