Hi everybody,
Regarding our TCP/TLS stability problems, we have now decided to run tests with Kamailio 1.5.1.
Nevertheless, it would be interesting to know whether there is a chance to get rid of these problems.
Is anybody using TLS?
Modules used: SNMP, MySQL
Summary of problems
Errors may be related to the following log file entries:
Jun 17 09:01:27 si-.... /usr/local/sbin/openser[13921]: WARNING:core:send2child: no free tcp receiver, connection passed to the leastbusy one (6)
Jun 17 08:54:52 si-.... /usr/local/sbin/openser[13921]: ERROR:core:tcpconn_new: shared memory allocation failure
Jun 17 08:54:52 si-... /usr/local/sbin/openser[13921]: ERROR:core:handle_new_connect: tcpconn_new failed, closing socket
And a few of these also (7613 times):
Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]: ERROR:core:tls_accept: some error in SSL:
Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]: ERROR:core:tls_print_errstack: error:1409C041:SSL routines:SSL3_SETUP_BUFFERS:malloc failure
Shared memory consumption
Shared memory usage is continuously increasing (now set to 1024).
PKG_MEM is 1 MB.
High CPU load for some openser processes
Usually after a few days, a small number of the openser processes show a high CPU load (50-90%).
It looks like an endless loop and requires a restart of openser.
There may be an endless loop in pass_fd.c:
again:
    ret=sendmsg(unix_socket, &msg, 0);
    if (ret<0){
        if (errno==EINTR) goto again;
        LM_CRIT("sendmsg failed on %d: %s\n", unix_socket, strerror(errno));
    }
Any comments on that?
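For discussion, here is a minimal sketch of the same retry with an upper bound on EINTR retries. In the excerpt above the loop only repeats while sendmsg keeps returning EINTR; any other error falls through after the log line. Whether this loop is really what burns the CPU is not established. The names send_msg_bounded and MAX_EINTR_RETRIES are hypothetical and not part of OpenSER, and fprintf stands in for LM_CRIT so that the sketch compiles on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical cap on retries; not an OpenSER setting. */
#define MAX_EINTR_RETRIES 16

/* Send msg on unix_socket, retrying only when sendmsg() is interrupted
 * (EINTR), and giving up after MAX_EINTR_RETRIES retries.
 * Returns the byte count from sendmsg(), or -1 with errno preserved. */
static ssize_t send_msg_bounded(int unix_socket, const struct msghdr *msg)
{
    int retries = 0;
    ssize_t ret;

    assert(msg != NULL);
    for (;;) {
        ret = sendmsg(unix_socket, msg, 0);
        if (ret >= 0)
            return ret;
        if (errno != EINTR || ++retries > MAX_EINTR_RETRIES) {
            int saved = errno; /* fprintf may overwrite errno */
            fprintf(stderr, "sendmsg failed on %d after %d retries: %s\n",
                    unix_socket, retries, strerror(saved));
            errno = saved;
            return -1;
        }
    }
}
```

If the retry counter ever showed up in the log with a large value, that would support the endless-loop theory; if it stays at 0, the CPU load comes from somewhere else.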
Best regards
Albert Munder
Robert Bosch GmbH
IT Systems Engineering (CI/ISE)
Postfach 30 02 20
70442 Stuttgart
GERMANY
www.bosch.com
Tel. +49 711 811-40562
Fax +49 711 811-5113333
Albert.Munder(a)de.bosch.com
________________________________
From: Henning Westerholt [mailto:henning.westerholt@1und1.de]
Sent: Tuesday, 30 June 2009 17:25
To: users(a)lists.kamailio.org
Cc: Munder Albert (CI/ISE)
Subject: Re: [Kamailio-Users] OpenSER stability problems in pilot project
On Tuesday, 30 June 2009, Munder Albert (CI/ISE) wrote:
[..]
We are running OpenSER in a pilot project and
unfortunately have some stability problems.
Hello Albert,
* Approx. 5000 subscriber accounts
* Approx. 1200 simultaneously registered users
* Signalling encrypted with TLS
* Media data encrypted with SRTP
* Clients: softphones and hardphones
* Re-registration time for clients: 3600 sec
I don't have that much experience with TCP, but I don't think these numbers should
be a problem in a setup like this.
OpenSER configuration
* Works as a stateful SIP proxy
* MySQL database
* Version 1.3.4-TLS
* Tcp_children: 100 --> Is it recommended to increase this number?
That is quite a lot of children, but OK.
* Udp_children: 20
* Tcp_connection_timeout: 3600
* Shared memory:
  * -m 512 when the error occurred
  * Now set to 1024
How much PKG_MEM do you use? The default value?
Problems
* Shared memory consumption
Shared memory usage is permanently increasing (about 50 MB per day)
Application already crashed twice
This could be a memory leak. Which modules do you use, and do you use any proprietary
modules? You could use the memory debugging to investigate this further:
http://www.kamailio.org/dokuwiki/doku.php/troubleshooting:memory
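As a rough check on the reported figures, the growth rate can be turned into an upper bound on the time until the pool runs out. This is only a sketch: it assumes the growth is linear at about 50 MB per day, and since the baseline usage after startup is not reported, it starts from an empty pool. The function name days_until_exhausted is made up for illustration.

```c
#include <assert.h>

/* Days until a shared memory pool of pool_mb is exhausted, given the current
 * usage used_mb and an assumed steady growth of growth_mb_per_day.
 * Returns 0 if the pool is already full. */
static double days_until_exhausted(double pool_mb, double used_mb,
                                   double growth_mb_per_day)
{
    assert(growth_mb_per_day > 0.0);
    if (used_mb >= pool_mb)
        return 0.0;
    return (pool_mb - used_mb) / growth_mb_per_day;
}
```

With -m 512 this gives at most 512 / 50 = 10.24 days, and with -m 1024 at most 1024 / 50 = 20.48 days, so doubling the pool only postpones the failure if there really is a leak.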
The first messages were these, repeated thousands of times (5915 times):
Jun 17 08:54:52 si-.... /usr/local/sbin/openser[13921]: ERROR:core:tcpconn_new: shared memory allocation failure
Jun 17 08:54:52 si-... /usr/local/sbin/openser[13921]: ERROR:core:handle_new_connect: tcpconn_new failed, closing socket
And a few of these also (7613 times):
Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]: ERROR:core:tls_accept: some error in SSL:
Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]: ERROR:core:tls_print_errstack: error:1409C041:SSL routines:SSL3_SETUP_BUFFERS:malloc failure
These are caused by insufficient memory conditions. I can't comment on the TCP and
TLS errors. But before really starting to investigate this problem, would it be possible
for you to use a more recent version, e.g. Kamailio 1.5.1, for testing?
* TCP errors, lost SIP messages
Examples from error messages:
14,100 times in the log file from 17 June 2009:
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]: ERROR:core:tcp_blocking_connect: poll error: flags 18
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]: ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (111) Connection refused
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]: ERROR:core:tcpconn_connect: tcp_blocking_connect failed
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]: ERROR:core:tcp_send: connect failed
Jun 17 04:03:15 si-.. /usr/local/sbin/openser[13863]: ERROR:tm:msg_send: tcp_send failed
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]: ERROR:tm:t_forward_nonack: sending request failed
The following appears at least 20,000 times; on the day of the last shared memory
errors, it appeared 225,794 times in the log file (note that the number in
parentheses is usually 1 or 2, but on that day it reached 6):
Jun 17 09:01:27 si-.... /usr/local/sbin/openser[13921]: WARNING:core:send2child: no free tcp receiver, connection passed to the leastbusy one (6)
Jun 17 09:01:27 si-... /usr/local/sbin/openser[13921]: WARNING:core:send2child: no free tcp receiver, connection passed to the leastbusy one (5)
* Certificate validation problems
TCP traffic is currently increased significantly by some (approx. 70)
clients that fail to validate the TLS certificate. They repeat the
registration every 5 seconds.
About 30,000 per day (on that day, 37,162 times in the log):
Jun 17 04:03:10 si-024lc008 /usr/local/sbin/openser[13801]: ERROR:core:tls_accept: some error in SSL:
Jun 17 04:03:10 si-024lc008 /usr/local/sbin/openser[13801]: ERROR:core:tls_print_errstack: error:14094418:SSL routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
Best regards,
Henning