Tks Daniel,
I have installed the workaround.
lsof seems to indicate that I have installed and
pre-loaded openssl_mutex_shared.so correctly.
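A minimal sketch of an equivalent check without lsof, assuming Linux, where /proc/<pid>/maps lists the shared objects mapped into a process. The function names lib_is_mapped and check_preload are hypothetical, and the process name "kamailio" in the usage note is an assumption:

```shell
# Return 0 if the maps file mentions the given library name.
# The maps file is normally /proc/<pid>/maps (Linux only).
lib_is_mapped() {
    maps_file=$1
    lib_name=$2
    grep -F -q -- "$lib_name" "$maps_file"
}

# Print one line per process with the given name, saying whether the
# library is mapped. Reading another user's maps file needs root.
check_preload() {
    proc_name=$1
    lib_name=$2
    for pid in $(pgrep -x "$proc_name"); do
        if lib_is_mapped "/proc/$pid/maps" "$lib_name"; then
            echo "$pid: $lib_name mapped"
        else
            echo "$pid: $lib_name NOT mapped"
        fi
    done
}
```

Run as root, for example: check_preload kamailio openssl_mutex_shared.so. A preloaded library appears in the maps of every process that inherited the preload.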
I will let you know if I see the issue again.
Tks!
Aymeric
On Mon, 20 May 2019 at 09:49, Daniel-Constantin Mierla <miconda(a)gmail.com>
wrote:
Hello,
this kind of behaviour, with long blocking and then moving on, is a
symptom of the same issue. One of the observed behaviours was that
attaching with gdb and then detaching makes the code continue running,
and that is what kamctl trap does. I haven't looked deeper, but my guess
is that some signals are sent during the gdb operations.
It would be good if you could test with the workaround and see the results.
There was already a report that, with the workaround, the issue was not
seen again over a rather long running time.
Cheers,
Daniel
On 17.05.19 16:03, Aymeric Moizard wrote:
Hi!
I haven't used the workaround yet: I'm focusing on making sure I have
the same issue, or on figuring out how to force it to happen.
I started checking the server again today, beginning with this command:
$> sudo kamcmd tls.list
In my previous description, the above was a deadlock. Today, it finally
completed, but only after 5 minutes. (I suspect 5 minutes is abnormal.)
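To notice this kind of stall without watching the terminal, the same command can be run under a deadline. This is only a sketch: run_with_deadline is a hypothetical helper name, it relies on timeout(1) from GNU coreutils (which exits with status 124 when the deadline is hit), and any threshold chosen is an assumption.

```shell
# Run a command with a deadline in seconds and report how it ended.
# The command's own output is discarded; only the outcome is printed.
run_with_deadline() {
    limit=$1
    shift
    start=$(date +%s)
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    elapsed=$(( $(date +%s) - start ))
    if [ "$status" -eq 124 ]; then
        echo "TIMEOUT after ${elapsed}s"
    else
        echo "exit=$status after ${elapsed}s"
    fi
    return "$status"
}
```

For example, run_with_deadline 30 kamcmd tls.list (run as root) would report a timeout if the RPC call is still blocked after 30 seconds.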
During the long-running command:
-> UDP was working
-> TCP was not:
   -> The TCP connection reached ESTABLISHED, but the SIP message got no
      reply (this was the behavior I had before).
At the same time (during the deadlock), I took a trap with
"sudo kamctl trap":
-> one thread is in "tls_list" (tls_rpc.c:154)
-> one thread is in tcpconn_get (core/tcp_main.c:1449), called from
   tcp_send (core/tcp_main.c:1716), and seems to be sending a
   484 Address Incomplete on a TLS connection
-> 2 threads are in CRYPTO_THREAD_write_lock, with a backtrace showing
   "SSL_do_handshake/tls_accept"
Suddenly, "sudo kamcmd tls.list" completed, and then my TCP agent received
4 answers from kamailio for the last 4 REGISTERs sent.
I have a network capture for my TCP agent.
I have a trap showing 2 threads waiting on "CRYPTO_THREAD_write_lock".
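A rough way to count such waiters in a saved trap file, assuming the file contains plain gdb backtraces in which every backtrace starts with a "#0" frame line. count_waiters is a hypothetical helper name, and the exact layout of the kamctl trap output is an assumption here:

```shell
# Count the backtraces in a gdb dump that contain a given symbol.
# A new backtrace is assumed to begin at each line starting with "#0".
count_waiters() {
    file=$1
    symbol=$2
    awk -v sym="$symbol" '
        /^#0[ \t]/     { if (hit) n++; hit = 0 }   # close previous backtrace
        index($0, sym) { hit = 1 }                 # symbol seen in this one
        END            { if (hit) n++; print n + 0 }
    ' "$file"
}
```

For example, count_waiters TRAPFILE CRYPTO_THREAD_write_lock, where TRAPFILE is a placeholder for the file written by kamctl trap. A symbol that appears in several frames of one backtrace is counted once.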
Conclusion:
This case showed that the blocking was VERY long.
This case showed that the blocking was TEMPORARY...
Side note: from my understanding of the multi-fork/openssl issue, I would
expect the deadlock to happen very soon after a kamailio restart. Is that
correct?
Do you expect the preload workaround to help with this kind of behavior?
Or do you consider that my issue is different?
Because there is no "real" deadlock, I don't understand why "my" issue
would be related to libssl1.1...
My gdb trap and network capture are available privately if you need
them (please ask me by direct email).
Tks
Aymeric
--
Antisip -
http://www.antisip.com