Hi everybody,
regarding our TCP/TLS stability problems, we have now decided to run tests
with Kamailio 1.5.1.
Nevertheless, it would be interesting to know whether there is a chance to
get rid of these problems.
Is anybody using TLS?
Used modules: SNMP, mySQL
Summary of problems
Errors may be related to the following log file entries
Jun 17 09:01:27 si-…. /usr/local/sbin/openser[13921]:
WARNING:core:send2child: no free tcp receiver, connection passed to the
leastbusy one (6)
That means that all of the TCP workers currently have connections
assigned to them. It does not mean that the worker processes are
really busy.
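A minimal sketch of the selection this warning describes, not the actual send2child code: prefer a worker with no connections, otherwise fall back to the one with the fewest. The struct, the function name and the idea that the number in parentheses is that worker's connection count are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-worker bookkeeping: how many connections are
 * currently assigned to this TCP worker. */
struct tcp_worker {
    int busy;
};

/* Return the index of a worker with no connections if one exists,
 * otherwise the index of the worker with the fewest connections.
 * *had_idle is set to 1 in the first case and to 0 in the second
 * (the case in which the warning would be logged). */
static size_t pick_worker(const struct tcp_worker *w, size_t n, int *had_idle)
{
    size_t best = 0;

    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        if (w[i].busy == 0) {
            *had_idle = 1;
            return i;
        }
        if (w[i].busy < w[best].busy)
            best = i;
    }
    *had_idle = 0;
    return best;
}
```

Under this reading, a worker counts as "busy" as soon as it holds one long-lived connection, even if that connection is idle.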
Jun 17 08:54:52 si-…. /usr/local/sbin/openser[13921]:
ERROR:core:tcpconn_new: shared memory allocation failure
Jun 17 08:54:52 si-… /usr/local/sbin/openser[13921]:
ERROR:core:handle_new_connect: tcpconn_new failed, closing socket
You are running out of shared memory. Either you allocate too much or
there is a memory leak somewhere.
Please debug according to the following howto:
And a few of these also (7613 times):
Jun 17 08:57:24 si-… /usr/local/sbin/openser[13880]:
ERROR:core:tls_accept: some error in SSL:
Jun 17 08:57:24 si-… /usr/local/sbin/openser[13880]:
ERROR:core:tls_print_errstack: error:1409C041:SSL
routines:SSL3_SETUP_BUFFERS:malloc failure
openssl is running out of memory. openssl does not use openser's memory
manager but uses the standard OS malloc.
Maybe there are so many TCP/TLS connections that you run out of memory?
Strange.
*shared memory consumption*
shared memory is continuously increasing (set to 1024)
What do you mean by "continuously increasing"? Openser's memory manager
allocates the memory for shared memory during startup. During runtime,
openser's shared memory stays constant.
If you experience increasing shared memory then this must be caused by
standard OS malloc, which is used by other libraries (e.g. openssl,
libxml, mysqlclient, ...)
In this case there can be a bug in the library itself or openser uses
the library in a wrong way.
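To illustrate the difference, here is a minimal sketch of a pool that is reserved once and never grows. It is not openser's memory manager; the struct and function names are hypothetical. A request that does not fit simply fails, which is the situation behind "shared memory allocation failure", while a library calling malloc directly bypasses the pool and grows the process instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical fixed-size pool, reserved once at startup. */
struct pool {
    unsigned char *base;
    size_t size;   /* total bytes, constant after pool_init() */
    size_t used;   /* bytes handed out so far */
};

/* Reserve the whole pool up front. Returns 0 on success, -1 on failure. */
static int pool_init(struct pool *p, size_t size)
{
    p->base = malloc(size);
    p->size = size;
    p->used = 0;
    return p->base ? 0 : -1;
}

/* Hand out n bytes from the pool. Returns NULL when the pool is
 * exhausted; the pool itself never grows. */
static void *pool_alloc(struct pool *p, size_t n)
{
    void *r;

    assert(p->base != NULL);
    if (n > SIZE_MAX - 15)
        return NULL;
    n = (n + 15) & ~(size_t)15;   /* keep offsets a multiple of 16 */
    if (n > p->size - p->used)
        return NULL;
    r = p->base + p->used;
    p->used += n;
    return r;
}

static void pool_destroy(struct pool *p)
{
    free(p->base);
    p->base = NULL;
}
```

In this sketch a leak shows up as `used` creeping towards `size`, not as a growing process size.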
regards
Klaus
PKG_MEM is 1 MB
*high CPU load for some openser processes*
Normally, after some days, we get a high CPU load (50-90%) for a small
number of the openser processes.
It looks like an endless loop and requires a restart of openser.
There may be an endless loop in
pass_fd.c:
    again:
        ret=sendmsg(unix_socket, &msg, 0);
        if (ret<0){
            if (errno==EINTR) goto again;
            LM_CRIT("sendmsg failed on %d: %s\n", unix_socket, strerror(errno));
        }
any comments on that?
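For reference, as far as the quoted fragment shows, the loop repeats only while sendmsg keeps failing with EINTR; any other error falls through to LM_CRIT. A minimal self-contained sketch of the same pattern with a retry limit follows. The function name and the max_retries parameter are hypothetical and are not part of pass_fd.c.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Send one buffer over unix_socket, retrying after EINTR at most
 * max_retries times (hypothetical limit, not in the original code).
 * Returns the number of bytes sent, or -1 with errno set. */
static ssize_t send_retry(int unix_socket, const void *buf, size_t len,
                          int max_retries)
{
    struct iovec iov;
    struct msghdr msg;
    ssize_t ret;
    int tries = 0;

    assert(max_retries >= 0);
    memset(&msg, 0, sizeof(msg));
    iov.iov_base = (void *)buf;
    iov.iov_len = len;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;

    do {
        ret = sendmsg(unix_socket, &msg, 0);
    } while (ret < 0 && errno == EINTR && tries++ < max_retries);

    return ret;
}
```

With a limit in place, a process that is interrupted again and again returns an error to its caller instead of spinning, which would make this loop easy to rule in or out as the source of the CPU load.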
Best regards
*Albert Munder*
Robert Bosch GmbH
IT Systems Engineering (CI/ISE)
Postfach 30 02 20
70442 Stuttgart
GERMANY
www.bosch.com
Tel. +49 711 811-40562
Fax +49 711 811-5113333
Albert.Munder(a)de.bosch.com
------------------------------------------------------------------------
*From:* Henning Westerholt [mailto:henning.westerholt@1und1.de]
*Sent:* Tuesday, 30 June 2009 17:25
*To:* users(a)lists.kamailio.org
*Cc:* Munder Albert (CI/ISE)
*Subject:* Re: [Kamailio-Users] OpenSER stability problems in pilot project
On Tuesday, 30 June 2009, Munder Albert (CI/ISE) wrote:
[..]
We are running OpenSER in a pilot project and
unfortunately have some stability problems.
Hello Albert,
* Approx. 5000 subscriber accounts
* Approx. 1200 simultaneously registered users
* Signalling encrypted with TLS
* Media data encrypted with SRTP
* Clients: softphones and hardphones
* Re-registration time for clients: 3600 sec
I don't have much experience with TCP, but I don't think that these
numbers should be a problem in a setup like this.
OpenSER configuration
* Works as stateful SIP Proxy
* mySQL database
* Version 1.3.4.-TLS
* Tcp_children: 100 --> is it recommended to increase this number?
That is quite a lot of children, but OK.
* Udp_children: 20
* Tcp_connection_timeout: 3600
* Shared memory:
  * -m 512 when error occurred
  * Now set to 1024
How much PKG_MEM do you use? The default value?
Problems
* Shared memory consumption
Shared memory usage is permanently increasing (about 50 MB per day)
Application already crashed twice
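As a rough plausibility check of these figures, a minimal sketch that estimates how long a pool lasts. It assumes linear growth and a pool that starts empty, both of which are simplifications; the function name is hypothetical.

```c
#include <assert.h>

/* Whole days until a pool of pool_mb megabytes is used up, assuming
 * usage starts at used_mb and grows linearly by growth_mb_per_day.
 * All three inputs are illustrative assumptions. */
static long days_until_full(long pool_mb, long used_mb, long growth_mb_per_day)
{
    assert(growth_mb_per_day > 0);
    if (used_mb >= pool_mb)
        return 0;
    return (pool_mb - used_mb) / growth_mb_per_day;
}
```

With about 50 MB per day, `days_until_full(512, 0, 50)` gives 10 and `days_until_full(1024, 0, 50)` gives 20, so under these assumptions doubling -m only postpones the failure.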
This could be a memory leak. Which modules do you use? And do you use any
proprietary modules? You could use the memory debugging to further
investigate this:
http://www.kamailio.org/dokuwiki/doku.php/troubleshooting:memory
First messages were these, repeated thousands of times (5915 times):
Jun 17 08:54:52 si-.... /usr/local/sbin/openser[13921]:
ERROR:core:tcpconn_new: shared memory allocation failure
Jun 17 08:54:52 si-... /usr/local/sbin/openser[13921]:
ERROR:core:handle_new_connect: tcpconn_new failed, closing socket
And a few of these also (7613 times):
Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]:
ERROR:core:tls_accept: some error in SSL:
Jun 17 08:57:24 si-... /usr/local/sbin/openser[13880]:
ERROR:core:tls_print_errstack: error:1409C041:SSL
routines:SSL3_SETUP_BUFFERS:malloc failure
These are caused by insufficient memory. I can't comment on
the TCP and TLS errors. But before really starting to investigate this
problem, would it be possible for you to use a more recent version, e.g.
Kamailio 1.5.1, for testing?
* TCP errors, lost SIP messages
Examples from error messages:
14,100 times in the log file from 17 June 2009:
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]:
ERROR:core:tcp_blocking_connect: poll error: flags 18
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]:
ERROR:core:tcp_blocking_connect: failed to retrieve SO_ERROR (111)
Connection refused
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]:
ERROR:core:tcpconn_connect: tcp_blocking_connect failed
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]:
ERROR:core:tcp_send: connect failed
Jun 17 04:03:15 si-.. /usr/local/sbin/openser[13863]:
ERROR:tm:msg_send: tcp_send failed
Jun 17 04:03:15 si-... /usr/local/sbin/openser[13863]:
ERROR:tm:t_forward_nonack: sending request failed
Appears at least 20,000 times; on the day of the last shared memory
errors, it was 225,794 times in the log file (note that the number in
parentheses is usually 1 or 2, but on that day it reached 6):
Jun 17 09:01:27 si-.... /usr/local/sbin/openser[13921]:
WARNING:core:send2child: no free tcp receiver, connection passed to the
leastbusy one (6)
Jun 17 09:01:27 si-... /usr/local/sbin/openser[13921]:
WARNING:core:send2child: no free tcp receiver, connection passed to the
leastbusy one (5)
* Certificate validation problems
TCP traffic is currently significantly increased by some (approx. 70)
clients which failed to validate the TLS certificate. Registration is
repeated every 5 sec.
Circa 30 thousand per day (on that day, it was 37,162 times in the log):
Jun 17 04:03:10 si-024lc008 /usr/local/sbin/openser[13801]:
ERROR:core:tls_accept: some error in SSL:
Jun 17 04:03:10 si-024lc008 /usr/local/sbin/openser[13801]:
ERROR:core:tls_print_errstack: error:14094418:SSL
routines:SSL3_READ_BYTES:tlsv1 alert unknown ca
Best regards,
Henning
------------------------------------------------------------------------