Hello,
So it is not a crash, right? There is no core dump or segfault report; it just doesn't start. Did I understand that correctly?
Given that you run a lot of instances, maybe you are running out of file descriptors. Can you check the OS limits for them?
Also, running out of memory might result in such behaviour.
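As a rough illustration of the file descriptor check suggested above, here is a minimal sketch, assuming Linux and Python 3. The function names are made up for this example, and the limits reported are those of the process running the script, which may differ from the limits Kamailio gets from its init script or service manager. On Linux, /proc/<pid>/limits shows the values for an already running process.

```python
import os
import resource


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count the descriptors a process has open, via /proc (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    soft, hard = fd_limits()
    print(f"soft limit: {soft}, hard limit: {hard}")
    print(f"descriptors open in this process: {open_fd_count()}")
```

Passing a Kamailio process ID to open_fd_count() (as the same user or as root) would show how close that instance is to its soft limit.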
Cheers,
Daniel
Hi,

we have a machine running 16 Kamailio instances, and while upgrading to 4.4.1 (from 4.3.5), 8 of them wouldn't start. When downgrading to 4.3.5, they all start again. All of them have nearly identical configuration files, except for IPs, ports and some code enabled or disabled via defines.

After comparing a working and a non-working configuration and adjusting them setting by setting, we finally ended up with a working configuration. The difference is that Kamailio won't start when a part of the code DOES NOT get included. If it gets included, it will start.

This is the mentioned part of our main route:

    #!ifdef ENABLE_INV_RATELIMIT
        # Check for INVITE limit
        if (is_method("INVITE") && $au == $null && !($ua =~ "sipgate") ) {
            $var(invcount) = $shtcn(invcount=>%~$fU);
            xlog("L_INFO", "INVITE Requests from $fU in last 30 seconds: $var(invcount)\n");
            if ($var(invcount) < 12) {
                $var(uniqcid) = $ci + $Ts + $ft;
                $var(tkey) = $fU + '-' + $(var(uniqcid){s.md5}{s.substr,0,10});
                $sht(invcount=>$(var(tkey))) = 1;
                $var(uniqcid) = $null;
                $var(tkey) = $null;
            }
            if ($var(invcount) > 10) {
                if ($var(invcount) == 11 ) {
                    xlog("L_NOTICE", "User $fU ($var(domain2use)) over ratelimit for new calls, rejecting.\n");
                }
                # Enable this only after evaluating the impact!
                append_to_reply("Retry-After: 30\r\n");
                sl_send_reply("503", "Call Rate Limit Exceeded");
                exit;
            }
        }
    #!endif

If we put this line at the top of the configuration file, everything works:

    #!define ENABLE_INV_RATELIMIT

If we delete this line, startup does not work. It just sits in ps for one minute without forking, and then gets terminated.

We enabled a bit of debugging, and this is apparently the error causing Kamailio to shut down:

    May 25 14:50:15 kammel /usr/sbin/kamailio[24989]: DEBUG: <core> [sr_module.c:920]: init_mod_child(): rank 53: nathelper
    May 25 14:50:15 kammel /usr/sbin/kamailio[24987]: DEBUG: <core> [local_timer.c:61]: init_local_timer(): timer_list between 0x9f0428 and 0xa34428
    May 25 14:50:15 kammel /usr/sbin/kamailio[24987]: DEBUG: <core> [io_wait.h:376]: io_watch_add(): DBG: io_watch_add(0x9f0240, 82, 1, (nil)), fd_no=0
    May 25 14:50:15 kammel /usr/sbin/kamailio[24987]: ERROR: <core> [io_wait.h:459]: io_watch_add(): epoll_ctl failed: Bad file descriptor [9]
    May 25 14:50:15 kammel /usr/sbin/kamailio[24987]: CRITICAL: <core> [tcp_read.c:1747]: tcp_receive_loop(): failed to add tcp main socket to the fd list
    May 25 14:50:15 kammel /usr/sbin/kamailio[24987]: CRITICAL: <core> [tcp_read.c:1815]: tcp_receive_loop(): exiting...

I have no idea how this part of the code could lead to this error, but it is reproducible: at least on this system, setting or removing this define fixes or breaks the startup.

Does anybody have an idea what's happening there?

Best Regards,
Sebastian

_______________________________________________
SIP Express Router (SER) and Kamailio (OpenSER) - sr-users mailing list
sr-users@lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-users

-- 
Daniel-Constantin Mierla
http://www.asipto.com - http://www.kamailio.org
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda