Hi Daniel,
This happens randomly. It is not on a
testbed, and I have never been able to
reproduce it on the test server.
I can still give you access if you wish to
analyse the core files to check what's
happening here exactly.
It simply feels like the evapi socket gives
up under heavy load. Earlier I was
running 5.0.2, and with that version kamailio
would get stuck and stop sending any async
events on the evapi socket. Not even the
evapi:connection-closed event got
triggered. After I upgraded to the latest
stable version (5.1.5), it at least started
crashing. Here are two core dumps put onto
pastebin.
https://pastebin.com/nn6gJapm
https://pastebin.com/ph7b8vFH
Thanks for all the support,
- Jayesh
On Fri, Sep 7, 2018 at 1:55 AM
Daniel-Constantin Mierla <miconda(a)gmail.com
<mailto:miconda@gmail.com>> wrote:
Hello,
are you able to reproduce it somehow?
Or does it just happen randomly?
Is it on a testbed where I could get
access to investigate the corefiles? If
not, then I will ask for more details
from cores over email, first would be
'bt full' for both cores and 'p
*dead_cell' for the second one.
Cheers,
Daniel
On 05.09.18 17:47, Jayesh Nambiar wrote:
> Hi Daniel,
> Have got these core dumps. Let me know
> if I should be doing a bt full. I'll
> pastebin and send. Thanks,
>
> Core was generated by
> `/usr/local/kamailio/sbin/kamailio -P
> /var/run/siptrunk.pid -f
> /usr/local/carrie'.
> Program terminated with signal
> SIGSEGV, Segmentation fault.
> #0 0x0000000000505d30 in
> sip_msg_shm_clone
> (org_msg=0x7f6ab0d9f618,
> sip_msg_len=0x7ffdddb2e8bc,
> clone_lumps=1) at core/sip_msg_clone.c:491
> 491 LUMP_LIST_LEN(len, org_msg->add_rm);
> (gdb) bt
> #0 0x0000000000505d30 in
> sip_msg_shm_clone
> (org_msg=0x7f6ab0d9f618,
> sip_msg_len=0x7ffdddb2e8bc,
> clone_lumps=1) at core/sip_msg_clone.c:491
> #1 0x00007f6af0bdf68d in fake_req
> (shmem_msg=0x7f6ab0d9f618,
> extra_flags=0, uac=0x7f6ab1738980,
> len=0x7ffdddb2e8bc) at t_reply.c:854
> #2 0x00007f6af0c3aa27 in
> t_continue_helper (hash_index=58039,
> label=413633661, rtact=0x7f6af10500f0,
> cbname=0x0, cbparam=0x0) at
> t_suspend.c:293
> #3 0x00007f6af0c3eed4 in t_continue
> (hash_index=58039, label=413633661,
> route=0x7f6af10500f0) at t_suspend.c:583
> #4 0x00007f6aae4dd010 in w_t_continue
> (msg=0x7ffdddb2fa60,
> idx=0x7f6af1098e90 "8\306\t\361j\177",
> lbl=0x7f6af1098ff0
> "\240\264\t\361j\177",
> rtn=0x7f6af1099150 "0\275\t\361j\177")
> at tmx_mod.c:760
> #5 0x000000000045b477 in do_action
> (h=0x7ffdddb2f850, a=0x7f6af109ab38,
> msg=0x7ffdddb2fa60) at core/action.c:1085
> #6 0x0000000000467fd5 in run_actions
> (h=0x7ffdddb2f850, a=0x7f6af1096630,
> msg=0x7ffdddb2fa60) at core/action.c:1565
> #7 0x000000000045b234 in do_action
> (h=0x7ffdddb2f850, a=0x7f6af10a0f80,
> msg=0x7ffdddb2fa60) at core/action.c:1058
> #8 0x0000000000467fd5 in run_actions
> (h=0x7ffdddb2f850, a=0x7f6af10a0f80,
> msg=0x7ffdddb2fa60) at core/action.c:1565
> #9 0x0000000000468797 in
> run_top_route (a=0x7f6af10a0f80,
> msg=0x7ffdddb2fa60, c=0x0) at
> core/action.c:1654
> #10 0x00007f6aabe79370 in
> evapi_run_cfg_route
> (evenv=0x7ffdddb30250, rt=3,
> rtname=0x7f6aac08cb18 <_evapi_rts+56>)
> at evapi_dispatch.c:161
> #11 0x00007f6aabe7f271 in
> evapi_recv_client
> (loop=0x7f6aabe698e0,
> watcher=0x27af5e0, revents=1) at
> evapi_dispatch.c:467
> #12 0x00007f6aabc5fd73 in
> ev_invoke_pending () from
> /usr/lib/x86_64-linux-gnu/libev.so.4
> #13 0x00007f6aabc633de in ev_run ()
> from /usr/lib/x86_64-linux-gnu/libev.so.4
> #14 0x00007f6aabe7867c in ev_loop
> (loop=0x7f6aabe698e0, flags=0) at
> /usr/include/ev.h:835
> #15 0x00007f6aabe83fc6 in
> evapi_run_dispatcher
> (laddr=0x7f6af0f72300 "127.0.0.1",
> lport=8060) at evapi_dispatch.c:705
> #16 0x00007f6aabe6e262 in child_init
> (rank=0) at evapi_mod.c:213
> #17 0x0000000000542cad in
> init_mod_child (m=0x7f6af0f71b70,
> rank=0) at core/sr_module.c:943
> #18 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f72968,
> rank=0) at core/sr_module.c:939
> #19 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f73d38,
> rank=0) at core/sr_module.c:939
> #20 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f74670,
> rank=0) at core/sr_module.c:939
> #21 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f76708,
> rank=0) at core/sr_module.c:939
> #22 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f76c08,
> rank=0) at core/sr_module.c:939
> #23 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f770d0,
> rank=0) at core/sr_module.c:939
> #24 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f77cf0,
> rank=0) at core/sr_module.c:939
> #25 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f78808,
> rank=0) at core/sr_module.c:939
> #26 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f78bd8,
> rank=0) at core/sr_module.c:939
> #27 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f794c8,
> rank=0) at core/sr_module.c:939
> #28 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f79920,
> rank=0) at core/sr_module.c:939
> #29 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f7a330,
> rank=0) at core/sr_module.c:939
> #30 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f7afd0,
> rank=0) at core/sr_module.c:939
> #31 0x0000000000542971 in
> init_mod_child (m=0x7f6af0f7bc80,
> rank=0) at core/sr_module.c:939
> #32 0x000000000054303d in init_child
> (rank=0) at core/sr_module.c:970
> #33 0x0000000000425399 in main_loop ()
> at main.c:1701
> #34 0x000000000042bd5c in main
> (argc=13, argv=0x7ffdddb31088) at
> main.c:2638
>
> And this:
> [New LWP 15804]
> [Thread debugging using libthread_db
> enabled]
> Using host libthread_db library
> "/lib/x86_64-linux-gnu/libthread_db.so.1".
> Core was generated by
> `/usr/local/kamailio/sbin/kamailio -P
> /var/run/siptrunk.pid -f
> /usr/local/carrie'.
> Program terminated with signal
> SIGSEGV, Segmentation fault.
> #0 0x00007f6af0b90b70 in
> free_cell_helper
> (dead_cell=0x7f6ab0a6baa8, silent=0,
> fname=0x7f6af0c8f630 "timer.c",
> fline=654) at h_table.c:230
> 230 foo = tt->next;
> (gdb) bt
> #0 0x00007f6af0b90b70 in
> free_cell_helper
> (dead_cell=0x7f6ab0a6baa8, silent=0,
> fname=0x7f6af0c8f630 "timer.c",
> fline=654) at h_table.c:230
> #1 0x00007f6af0c24409 in wait_handler
> (ti=932640643, wait_tl=0x7f6ab0a6bb28,
> data=0x7f6ab0a6baa8) at timer.c:654
> #2 0x00000000004bb445 in
> timer_list_expire (t=932640643,
> h=0x7f6ab03ad158,
> slow_l=0x7f6ab03ae480, slow_mark=271)
> at core/timer.c:874
> #3 0x00000000004bb8ab in
> timer_handler () at core/timer.c:939
> #4 0x00000000004bbd30 in timer_main
> () at core/timer.c:978
> #5 0x00000000004250f9 in main_loop ()
> at main.c:1691
> #6 0x000000000042bd5c in main
> (argc=13, argv=0x7ffdddb31088) at
> main.c:2638
>
> On Wed, Sep 5, 2018 at 3:13 PM
> Daniel-Constantin Mierla
> <miconda(a)gmail.com
> <mailto:miconda@gmail.com>> wrote:
>
> Hello,
>
> the backtrace doesn't show any
> hint about kamailio, only from
> closelog() up.
>
> It may be the core generated by
> shutdown procedure, have you
> enabled one core file per
> pid/process? If not, do it and
> reproduce the issue again, you may
> get two core files, one being the
> runtime issue and the other one
> from shutdown procedure, which
> likely is an effect of the other
> one. The one from the runtime is
> more relevant.
>
> Cheers,
> Daniel
>
>
> On 05.09.18 10:09, Jayesh Nambiar
> wrote:
>> Hello,
>> I'm using kamailio 5.1.5 with
>> evapi. I have a node.js application
>> connecting to kamailio evapi; I send
>> events to it and also consume events
>> from it, based on which I do the
>> routing. I have 8 evapi
>> workers defined in the config.
>> The problem is that kamailio
>> randomly crashes under high load.
>> I'm assuming that it is related
>> to the evapi module, as the rest of
>> the config is pretty
>> straightforward. I could get a core file
>> and here's the core dump:
>> [New LWP 14042]
>> [Thread debugging using
>> libthread_db enabled]
>> Using host libthread_db library
>> "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> Core was generated by
>> `/usr/local/kamailio/sbin/kamailio
>> -P /var/run/siptrunk.pid -f
>> /usr/local/carrie'.
>> Program terminated with signal
>> SIGABRT, Aborted.
>> #0 0x00007f9995283428 in
>> __GI_raise (sig=sig@entry=6) at
>> ../sysdeps/unix/sysv/linux/raise.c:54
>> 54
>> ../sysdeps/unix/sysv/linux/raise.c:
>> No such file or directory.
>> (gdb) bt
>> #0 0x00007f9995283428 in
>> __GI_raise (sig=sig@entry=6) at
>> ../sysdeps/unix/sysv/linux/raise.c:54
>> #1 0x00007f999528502a in
>> __GI_abort () at abort.c:89
>> #2 0x000000000041a029 in
>> sig_alarm_abort (signo=14) at
>> main.c:646
>> #3 <signal handler called>
>> #4 0x00007f999534f497 in
>> __libc_cleanup_routine
>> (f=<optimized out>) at
>> ../sysdeps/nptl/libc-lockP.h:291
>> #5 closelog () at
>> ../misc/syslog.c:415
>> #6 0x0000000000000000 in ?? ()
>>
>> Any help in this regard would
>> allow me to identify the reason
>> for the crash. Thanks for the support.
>>
>> - Jayesh
>>
>>
>> _______________________________________________
>> Kamailio (SER) - Users Mailing List
>> sr-users(a)lists.kamailio.org
>> <mailto:sr-users@lists.kamailio.org>
>> https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
>
> --
> Daniel-Constantin Mierla -- www.asipto.com
> www.twitter.com/miconda -- www.linkedin.com/in/miconda
> Kamailio World Conference -- www.kamailioworld.com
> Kamailio Advanced Training, Nov 12-14, 2018, in Berlin -- www.asipto.com
>