Good catch!

As I said in my first mail, I also had the issue with the latest 5.2.X, so I suppose the deb package has the same issue for 5.2.X.

Is the extra binary to load still there? I will check that as soon as I'm online...

Thanks a lot!
Aymeric

On Mon, 16 Dec 2019 at 11:16, Daniel-Constantin Mierla <miconda@gmail.com> wrote:

Hello,

for some reason the binary doesn't seem to have the libssl mutex fix. On my system, with libssl 1.1, the output is:

# kamailio -I
Print out of kamailio internals
  Version: kamailio 5.3.1 (x86_64/linux) f36ac2
  Default config: /tmp/kamailio-5.3/etc/kamailio/kamailio.cfg
  Default paths to modules: /tmp/kamailio-5.3/lib64/kamailio/modules
  Compile flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
  MAX_RECV_BUFFER_SIZE=262144
  MAX_URI_SIZE=1024
  BUF_SIZE=65535
  DEFAULT PKG_SIZE=8MB
  DEFAULT SHM_SIZE=64MB
  ADAPTIVE_WAIT_LOOPS=1024
  TCP poll methods: poll, epoll_lt, epoll_et, sigio_rt, select
  Source code revision ID: f36ac2
  Compiled with: gcc 9.2.1
  Compiled architecture: x86_64
  Compiled on: 11:11:20 Dec 16 2019
Thank you for flying kamailio!

The important part above is the presence of the TLS_PTHREAD_MUTEX_SHARED compile-time flag in the output.
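
The check described above can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_compile_flags` and `has_libssl_mutex_fix` are made up for this example, and the sketch assumes the flags appear on a single comma-separated "Compile flags:" line, as in the output shown here.

```python
def parse_compile_flags(internals_output: str) -> set:
    """Return the flags listed on the 'Compile flags:' line of `kamailio -I` output."""
    prefix = "Compile flags:"
    for line in internals_output.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            # Flags are assumed to be comma-separated on this one line.
            return {flag.strip() for flag in stripped[len(prefix):].split(",") if flag.strip()}
    return set()


def has_libssl_mutex_fix(internals_output: str) -> bool:
    """True if the binary reports the TLS_PTHREAD_MUTEX_SHARED compile-time flag."""
    return "TLS_PTHREAD_MUTEX_SHARED" in parse_compile_flags(internals_output)
```

The text passed in could be captured with `subprocess.run(["kamailio", "-I"], capture_output=True, text=True).stdout`, or simply pasted from a terminal.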

It needs to be investigated why the deb packages ship the kamailio binary without the libssl mutex fix enabled.

Cheers,
Daniel

On 16.12.19 09:22, Aymeric Moizard wrote:
Hi Daniel,

Thanks a lot for looking at it.

$ ldd /usr/lib/x86_64-linux-gnu/kamailio/modules/tls.so
        linux-vdso.so.1 (0x00007fff997dd000)
        libssl.so.1.1 => /usr/lib/x86_64-linux-gnu/libssl.so.1.1 (0x00007fe40b53c000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fe40b19d000)
        libcrypto.so.1.1 => /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1 (0x00007fe40ad03000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fe40aaff000)
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fe40a8e2000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fe40ba4a000)

$ /usr/sbin/kamailio -I
Print out of kamailio internals
  Version: kamailio 5.3.1 (x86_64/linux)
  Default config: /etc/kamailio/kamailio.cfg
  Default paths to modules: /usr/lib/x86_64-linux-gnu/kamailio/modules
  Compile flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
  MAX_RECV_BUFFER_SIZE=262144
  MAX_URI_SIZE=1024
  BUF_SIZE=65535
  DEFAULT PKG_SIZE=8MB
  DEFAULT SHM_SIZE=64MB
  ADAPTIVE_WAIT_LOOPS=1024
  TCP poll methods: poll, epoll_lt, epoll_et, sigio_rt, select
  Source code revision ID: unknown
  Compiled with: gcc 6.3.0
  Compiled architecture: x86_64
  Compiled on:
Thank you for flying kamailio!

Additional note:
I have tried to better understand the pike module, and after reading the end of the module documentation,
I now better understand the "Tree of IP" and the settings.

The pike documentation, for each setting and its description, should refer to the section "Chapter 3. Developer Guide";
otherwise, the parameters cannot be understood. Also, in my opinion, it's not possible to work out the actual time
it takes to remove an IP from the tree (removing it entirely, or only the last node of the IP).

Looking again at my statistics, I feel the first graph is definitely showing an issue. This graph shows
"$stat(location-users)" and "$stat(location-contacts)". During the 10 hours, many users were banned, unregistered, etc.,
so it is really not expected that the number of registered users stayed constant. From what I understand, the fact
that the stats went down when the deadlock disappeared obviously means the kamailio threads had been in a bad state for the
last 10 hours...

If you need more information, let me know...
Regards
Aymeric

On Mon, 16 Dec 2019 at 08:22, Daniel-Constantin Mierla <miconda@gmail.com> wrote:

Hello,

can you provide output of ldd for tls.so and output of "kamailio -I" (that's an uppercase i)?

Cheers,
Daniel

On 13.12.19 16:39, Aymeric Moizard wrote:
Hi List,

History:
* In the past, I had a deadlock which was, most probably, related to ssl1.1.
  We discussed this issue, and a fix was supposed to work around the issue that was detected.
* With latest 5.2.X, I have experienced ONCE a similar behavior with TCP and TLS being mostly stuck. I have not been using this version much, but the fix was supposed to be in the core of kamailio.

The status of the server last night:
* I'm currently running version: kamailio 5.3.1 (x86_64/linux)
* Installed on stretch using the http://deb.kamailio.org/kamailio53 repository.
* This version uses libssl1.1
* A user reported that he can't connect with TCP
* An average of 5000 IPs per 10 minutes are being banned by the pike module
   (could be twice the same)
Yesterday/Today:
* at the end of the outage, I had 2479 IPs in my ipban htable (which matches my statistics: about 2 bans per IP every 10 minutes = 5000).
* looking at my logs, it appears that most (ALL?) of the IPs being banned... are my regular users.
* looking at my logs, I can't understand why pike would block them.

This is a graph for statistics on my service for the last 24 hours:
https://www.antisip.com/sip-antisip-com-register/status2.html  

Yesterday, at 22:18:39, kamailio started to BAN some IPs. 52 IPs were banned in a period of 10 minutes. I can confirm this from my logs.

My pike configuration is this one:

modparam("pike", "sampling_time_unit", 2)
modparam("pike", "reqs_density_per_unit", 64)
modparam("pike", "remove_latency", 4)
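
As a rough sanity check of these values, here is a minimal sketch that compares a steady request rate with the nominal threshold. It assumes a simplified model in which requests are simply counted per sampling unit; pike's real counting works per node of its IP tree and is more involved, so this only indicates how far a given rate is from the configured density. The function names are made up for this example.

```python
SAMPLING_TIME_UNIT = 2       # seconds, taken from the modparam above
REQS_DENSITY_PER_UNIT = 64   # requests allowed per sampling unit, from the modparam above


def requests_per_unit(rate_per_second: float,
                      sampling_time_unit: int = SAMPLING_TIME_UNIT) -> float:
    """Requests seen in one sampling unit for a steady rate (simplified model)."""
    return rate_per_second * sampling_time_unit


def exceeds_density(rate_per_second: float,
                    sampling_time_unit: int = SAMPLING_TIME_UNIT,
                    reqs_density_per_unit: int = REQS_DENSITY_PER_UNIT) -> bool:
    """True if a steady rate goes over the configured density in this simplified model."""
    return requests_per_unit(rate_per_second, sampling_time_unit) > reqs_density_per_unit
```

Under this model, one message per second gives 2 requests per sampling unit, far below 64, so `exceeds_density(1.0)` is false; a steady source would need more than 32 requests per second to cross the threshold.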

When detecting the issue, this morning, I typed:

$> sudo kamctl stats
$> sudo kamcmd htable.dump ipban
//FAILURE (answer too large...)
$> sudo kamctl trap

Then, I started an agent with TCP and it worked...???
Then, a few seconds, maybe a minute, later:

$> sudo kamcmd htable.dump ipban
//SUCCESS and shows 2479 banned IPs.

and... everything is back to normal in a few minutes.

I haven't restarted kamailio, and all statistics are as expected, as usual.

Thus, it looks like "sudo kamctl trap" has triggered something. I already
experienced a similar behavior when testing my ssl1.1 deadlock last year.

2 questions:
1/ I believe my "pike" configuration should not ban users. Is my pike configuration wrong?
As an example, pike has banned an IP sending one message/second. I believe my configuration should accept that?

2/ Could there still be a TLS issue with libssl1.1?

This is the result of the "kamctl trap":


Sorry for the long story & hoping to find a long term solution or at least a workaround!

Regards
Aymeric

--

_______________________________________________
Kamailio (SER) - Users Mailing List
sr-users@lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
-- 
Daniel-Constantin Mierla -- www.asipto.com
www.twitter.com/miconda -- www.linkedin.com/in/miconda
Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com

