Re: [sr-dev] Problem with TCP and EPOLL

14 Feb 2012


Sorry, this was originally posted incorrectly, so I’m reposting.
I have been having problems with TCP under load.  What I am seeing is that 
TCP write buffers are not serviced and, when wr_timeout exceeds the 
configured value for tcp_send_timeout, Kamailio kills the connection. 
Increasing tcp_send_timeout doesn't help: even setting it to a large value 
(such as 45 seconds) just delays the disconnection.
Putting some tracing into the code shows that wbufq_add() is called 
repeatedly for the affected connection, but wbufq_run() is called for that 
connection far less often than I would expect.  Meanwhile, wbufq_run() is 
frequently called for other connections.  It looks as though wbufq_run() 
doesn't get called while lots of wbufq_add() calls are happening for a 
connection; it only appears to be called for that connection once some time 
has passed since the last wbufq_add().
The connection in question is a local loopback between the RLS and Presence 
modules (both running in the same Kamailio instance).  However, it may just 
be a coincidence that this is the affected connection as it is also the one 
with the most traffic.
My suspicion is that the bug is in the io_wait_loop_epoll() routine.
Can anybody with experience of this part of the code help?
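For anyone wanting to check the epoll side in isolation, here is a minimal 
standalone sketch (Linux only) of how EPOLLOUT is reported for an idle, 
writable socket in level-triggered versus edge-triggered mode.  It is not 
Kamailio code and does not reproduce io_wait_loop_epoll(); the helper name 
count_epollout_reports() is made up for illustration, and whether this 
behaviour is related to the problem above is only an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of Kamailio): register one end of a
 * socketpair for EPOLLOUT, then poll `rounds` times with a zero timeout
 * and count how many polls report the socket as writable.  Nothing is
 * written to the socket, so it stays writable throughout.
 * Pass 0 for level-triggered or EPOLLET for edge-triggered.
 * Returns the number of EPOLLOUT reports, or -1 on setup failure. */
int count_epollout_reports(uint32_t extra_flags, int rounds)
{
    int sv[2];
    int ep;
    int reports = 0;
    struct epoll_event ev = {0};
    struct epoll_event out;

    assert(rounds >= 0);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    ep = epoll_create1(0);
    if (ep < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    ev.events = EPOLLOUT | extra_flags;
    ev.data.fd = sv[0];
    if (epoll_ctl(ep, EPOLL_CTL_ADD, sv[0], &ev) != 0) {
        close(ep);
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    for (int i = 0; i < rounds; i++) {
        /* Timeout 0: return immediately with whatever is ready now. */
        int n = epoll_wait(ep, &out, 1, 0);
        if (n == 1 && (out.events & EPOLLOUT))
            reports++;
    }

    close(ep);
    close(sv[0]);
    close(sv[1]);
    return reports;
}
```

Calling count_epollout_reports(0, 3) should report writability on every 
round, whereas count_epollout_reports(EPOLLET, 3) should report it only once, 
because edge-triggered mode signals a change in readiness rather than the 
current state.  If the write queue is only flushed in response to such 
notifications, it would be worth checking which mode is in use and whether a 
notification can be missed while data is still being queued.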
Paul Pankhurst
Engineering Director
Crocodile RCS Ltd
