Some additional information:
I made a new capture: after the failure, I can see that a TCP
connection is made to the second SRV record,
sip.mobipouce.com (91.199.234.46).
I got:
SYN, ACK -> sip.mobipouce.com
ACK <- sip.mobipouce.com
PSH, ACK <- sip.mobipouce.com
ACK -> sip.mobipouce.com
I'm guessing that this is where the process in the stack trace is
deadlocked, because no SUBSCRIBE is sent after that...
-> #2 0x080a93fd in tcp_send ()
Strangely, in this "tcp_send" function there is no
TCPCONN_LOCK/TCPCONN_UNLOCK; instead, there is a
    lock_get(&c->write_lock);
    ...
    lock_release(&c->write_lock);
Maybe this is still correct anyway...
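To illustrate why a backtrace could end in sched_yield() directly under tcp_send(), here is a minimal sketch of a spin-style lock that yields the CPU while waiting. This is NOT the Kamailio implementation: demo_lock_t, demo_lock_get() and demo_lock_release() are hypothetical names, and I am only assuming that lock_get() on the write lock behaves in a broadly similar way (spinning, with a yield between attempts). If that assumption holds, a process waiting on a write lock that is never released would sit in sched_yield() forever.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-connection write lock.
 * Initialise it with: demo_lock_t l = { ATOMIC_FLAG_INIT }; */
typedef struct {
    atomic_flag held;
} demo_lock_t;

/* Try to take the lock, yielding the CPU between attempts.
 * max_spins < 0 means "spin forever": with a lock that is never
 * released, the caller stays in sched_yield() indefinitely.
 * max_spins >= 0 bounds the number of yields so the demo can give up. */
static bool demo_lock_get(demo_lock_t *l, long max_spins)
{
    long spins = 0;

    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        if (max_spins >= 0 && spins++ >= max_spins)
            return false;   /* gave up: someone else still holds the lock */
        sched_yield();      /* let another process run, then retry */
    }
    return true;            /* lock acquired */
}

/* Release the lock so that a waiting demo_lock_get() can succeed. */
static void demo_lock_release(demo_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

With this sketch, a second demo_lock_get() on a lock that is already held only returns because of the spin bound; with max_spins < 0 it would never return, which is the kind of behaviour the backtrace suggests.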
Thanks,
Aymeric MOIZARD / ANTISIP
amsip - http://www.antisip.com
osip2 - http://www.osip.org
eXosip2 - http://savannah.nongnu.org/projects/exosip/
On Thu, 28 Jan 2010, Henning Westerholt wrote:
On Thursday 28 January 2010, Aymeric Moizard wrote:
Here is the backtrace I have, unfortunately
without debug symbols!
I found the same for many of the kamailio processes: "sched_yield"
is pending forever. My system is a debian/etch.
#0 0xffffe424 in __kernel_vsyscall ()
#1 0xb7cef4ac in sched_yield () from /lib/tls/i686/cmov/libc.so.6
#2 0x080a93fd in tcp_send ()
#3 0xb7975679 in send_pr_buffer () from /usr/lib/kamailio/modules/tm.so
#4 0xb79789ac in t_forward_nonack () from /usr/lib/kamailio/modules/tm.so
#5 0xb7974784 in t_relay_to () from /usr/lib/kamailio/modules/tm.so
#6 0xb7983a11 in load_tm () from /usr/lib/kamailio/modules/tm.so
#7 0x081cf810 in mem_pool ()
#8 0x00000000 in ?? ()
I guess most t_relay operations towards my "mobipouce.com" domain,
with one IP being down, break each kamailio process one after the
other... I'm not sure every such t_relay operation always breaks
exactly one thread each time.
I went through the lock/unlock calls in tcp_main.c, but it seems every
lock has a matching unlock at least...
Hi Aymeric,
I remember that we observed these "sched_yield" problems on one old 0.9 system
after some time (like weeks or months). We did not find the solution in this
case; after a restart it was gone again.
You mentioned in an earlier mail that you see this related to UDP traffic, but
from the log file and also from your investigations you think it's related to TCP?
Regards,
Henning