Submitting this on behalf of Jacob Greene, who posted a day or two ago:
https://lists.kamailio.org/pipermail/sr-users/2020-May/109315.html
Version in production is Debian 9 and Kamailio 5.3.4 from Debian packages:
ii kamailio 5.3.4+bpo9 amd64 very fast, dynamic and configurable SIP server
ii kamailio-postgres-modules:amd64 5.3.4+bpo9 amd64 PostgreSQL database connectivity module for the Kamailio SIP server
ii kamailio-presence-modules:amd64 5.3.4+bpo9 amd64 SIP presence modules for Kamailio
ii kamailio-python3-modules:amd64 5.3.4+bpo9 amd64 The app_python3 module for the Kamailio SIP server
rc kamailio-xml-modules:amd64 5.2.6+stretch amd64 XML based extensions for Kamailio's Management Interface
Anyway, a Kamailio child process periodically exits -- about ten times a day -- with a SIGABRT
set off by a failed assertion somewhere inside the app_python3
bureaucracy:
(gdb) where
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x00007f2c2f76042a in __GI_abort () at abort.c:89
#2 0x00007f2c2f757e67 in __assert_fail_base (fmt=<optimized out>,
assertion=assertion@entry=0x7f2c2892bda8 "PyBytes_Check(pframe->f_code->co_name)",
file=file@entry=0x7f2c2892bd9c "apy_kemi.c", line=line@entry=1819,
function=function@entry=0x7f2c2892d0b0 "sr_apy_kemi_exec_func") at assert.c:92
#3 0x00007f2c2f757f12 in __GI___assert_fail (assertion=0x7f2c2892bda8 "PyBytes_Check(pframe->f_code->co_name)",
file=0x7f2c2892bd9c "apy_kemi.c", line=1819, function=0x7f2c2892d0b0 "sr_apy_kemi_exec_func") at assert.c:101
#4 0x00007f2c2890bd57 in sr_apy_kemi_exec_func () from /usr/lib/x86_64-linux-gnu/kamailio/modules/app_python3.so
#5 0x00007f2c288d525c in ?? () from /usr/lib/x86_64-linux-gnu/kamailio/modules/app_python3.so
#6 0x00007f2c282ad259 in PyCFunction_Call () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#7 0x00007f2c283de739 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#8 0x00007f2c283de6ad in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#9 0x00007f2c284a0074 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#10 0x00007f2c283dcadf in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#11 0x00007f2c283de6ad in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#12 0x00007f2c284a0074 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#13 0x00007f2c284a0153 in PyEval_EvalCodeEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#14 0x00007f2c28333558 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#15 0x00007f2c2840fa37 in PyObject_Call () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#16 0x00007f2c2845b95c in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#17 0x00007f2c2840fa37 in PyObject_Call () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#18 0x00007f2c2849eb77 in PyEval_CallObjectWithKeywords () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
#19 0x00007f2c2891c136 in apy_exec () from /usr/lib/x86_64-linux-gnu/kamailio/modules/app_python3.so
#20 0x00007f2c288dfe94 in sr_kemi_config_engine_python () from /usr/lib/x86_64-linux-gnu/kamailio/modules/app_python3.so
#21 0x0000558ec29ba7b4 in sr_kemi_route ()
#22 0x0000558ec2ab7c03 in receive_msg ()
#23 0x0000558ec2997e4a in udp_rcv_loop ()
#24 0x0000558ec28de718 in main_loop ()
#25 0x0000558ec28e7a39 in main ()
The fuller backtrace for the top few frames:
Thread 1 (Thread 0x7f2c3031db40 (LWP 26211)):
#0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
set = {__val = {0, 140720998929208, 139827752272384, 139827751795724, 140724530937856, 94071669029552,
94071669029653, 94071669029552, 94071669029552, 94071669029661, 94071669029852, 94071669029552, 94071669029852,
0, 0, 0}}
pid = <optimized out>
tid = <optimized out>
#1 0x00007f2c2f76042a in __GI_abort () at abort.c:89
save_stage = 2
act = {__sigaction_handler = {sa_handler = 0x558ec469d990, sa_sigaction = 0x558ec469d990}, sa_mask = {__val = {
139827751803746, 140720998929248, 109, 139827755099200, 1819, 139827635998896, 139827635638836,
139827752272384, 139827751811289, 139827755115808, 139827751803256, 139827752815328, 139827752801040,
94071641717939, 139827763892224, 139827635994024}}, sa_flags = 1819, sa_restorer = 0x7f2c2892d0b0}
sigs = {__val = {32, 0 <repeats 15 times>}}
#2 0x00007f2c2f757e67 in __assert_fail_base (fmt=<optimized out>,
assertion=assertion@entry=0x7f2c2892bda8 "PyBytes_Check(pframe->f_code->co_name)",
file=file@entry=0x7f2c2892bd9c "apy_kemi.c", line=line@entry=1819,
function=function@entry=0x7f2c2892d0b0 "sr_apy_kemi_exec_func") at assert.c:92
str = 0x558ec466a760 ""
total = 4096
#3 0x00007f2c2f757f12 in __GI___assert_fail (assertion=0x7f2c2892bda8 "PyBytes_Check(pframe->f_code->co_name)",
file=0x7f2c2892bd9c "apy_kemi.c", line=1819, function=0x7f2c2892d0b0 "sr_apy_kemi_exec_func") at assert.c:101
No locals.
#4 0x00007f2c2890bd57 in sr_apy_kemi_exec_func () from /usr/lib/x86_64-linux-gnu/kamailio/modules/app_python3.so
No symbol table info available.
#5 0x00007f2c288d525c in ?? () from /usr/lib/x86_64-linux-gnu/kamailio/modules/app_python3.so
No symbol table info available.
#6 0x00007f2c282ad259 in PyCFunction_Call () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#7 0x00007f2c283de739 in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#8 0x00007f2c283de6ad in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#9 0x00007f2c284a0074 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#10 0x00007f2c283dcadf in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#11 0x00007f2c283de6ad in PyEval_EvalFrameEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#12 0x00007f2c284a0074 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#13 0x00007f2c284a0153 in PyEval_EvalCodeEx () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#14 0x00007f2c28333558 in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#15 0x00007f2c2840fa37 in PyObject_Call () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#16 0x00007f2c2845b95c in ?? () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#17 0x00007f2c2840fa37 in PyObject_Call () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#18 0x00007f2c2849eb77 in PyEval_CallObjectWithKeywords () from /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0
No symbol table info available.
#19 0x00007f2c2891c136 in apy_exec () from /usr/lib/x86_64-linux-gnu/kamailio/modules/app_python3.so
No symbol table info available.
#20 0x00007f2c288dfe94 in sr_kemi_config_engine_python () from /usr/lib/x86_64-linux-gnu/kamailio/modules/app_python3.so
No symbol table info available.
#21 0x0000558ec29ba7b4 in sr_kemi_route ()
No symbol table info available.
#22 0x0000558ec2ab7c03 in receive_msg ()
No symbol table info available.
#23 0x0000558ec2997e4a in udp_rcv_loop ()
No symbol table info available.
#24 0x0000558ec28de718 in main_loop ()
No symbol table info available.
#25 0x0000558ec28e7a39 in main ()
No symbol table info available.
For what it's worth, line 1819 in apy_kemi.c
in the latest 5.3 HEAD
(which I'm not sure this binary package reflects) is:
1801 if(unlikely(cfg_get(core, core_cfg, latency_limit_action)>0)
1802 && is_printable(cfg_get(core, core_cfg, latency_log))) {
1803 gettimeofday(&tve, &tz);
1804 tdiff = (tve.tv_sec - tvb.tv_sec) * 1000000
1805 + (tve.tv_usec - tvb.tv_usec);
1806 if(tdiff >= cfg_get(core, core_cfg, latency_limit_action)) {
1807 pstate = PyThreadState_GET();
1808 if (pstate != NULL && pstate->frame != NULL) {
1809 pframe = pstate->frame;
1810 }
1811
1812 LOG(cfg_get(core, core_cfg, latency_log),
1813 "alert - action KSR.%s%s%s(...)"
1814 " took too long [%u ms] (file:%s func:%s line:%d)\n",
1815 (ket->mname.len>0)?ket->mname.s:"",
1816 (ket->mname.len>0)?".":"", ket->fname.s, tdiff,
1817 (pframe)?PyBytes_AS_STRING(pframe->f_code->co_filename):"",
1818 (pframe)?PyBytes_AS_STRING(pframe->f_code->co_name):"",
1819 (pframe)?PyCode_Addr2Line(pframe->f_code, pframe->f_lasti):0);
1820 }
1821 }
1822
1823 return ret;
1824 }
1825
That's inside sr_apy_kemi_exec_func()
.
Some suspicion falls upon this Python function:
def set_send_socket(self):
KSR.info('[socket] forced_socket is: {}'.format(KSR.pv.get('$fs')))
if KSR.pv.get('$fs'):
KSR.info('[socket] socket already set')
return
elif not self.is_local():
KSR.info('[socket] set socket to 5065')
KSR.corex.set_send_socket(MY_IP + ':5065')
else:
KSR.info('[socket] set socket to 5060')
KSR.corex.set_send_socket(MY_IP + ':5060')
KSR.info('[socket] socket set')
... simply because the last message out of KEMI which appears prior to the crash is one of its log emissions ...
May 21 18:20:25 proxy1 /usr/sbin/kamailio[26210]: INFO: 1697311439-5060-88@BA.CFC.EH.IF - {NOTIFY}: <core> [core/kemi.c:104]: sr_kemi_core_info(): domain exists result: None
May 21 18:20:25 proxy1 /usr/sbin/kamailio[26211]: INFO: 1361696919-5060-39@BA.CFC.EH.BFC - {NOTIFY}: <core> [core/kemi.c:104]: sr_kemi_core_info(): [socket] forced_socket is: None
May 21 18:20:25 proxy1 /usr/sbin/kamailio[26213]: INFO: 813204231-5060-53@BA.BF.J.CBF - {NOTIFY}: <core> [core/receive.c:518]: receive_msg(): reply-route executed in: 260 usec
May 21 18:20:25 proxy1 /usr/sbin/kamailio[26212]: INFO: 1941306664-5060-17@BA.CFC.EH.CAG - {NOTIFY}: <core> [core/receive.c:518]: receive_msg(): reply-route executed in: 412 usec
May 21 18:20:25 proxy1 /usr/sbin/kamailio[26209]: ALERT: <core> [main.c:766]: handle_sigs(): child process 26210 exited by a signal 6
May 21 18:20:25 proxy1 /usr/sbin/kamailio[26209]: ALERT: <core> [main.c:769]: handle_sigs(): core was not generated
May 21 18:20:25 proxy1 /usr/sbin/kamailio[26209]: INFO: <core> [main.c:792]: handle_sigs(): terminating due to SIGCHLD
However, it's hard to troubleshoot without debug symbols and due to the third-party binary libraries, of course. Furthermore, the problem is not possible to reproduce on demand, though its incidence does seem to correlate to higher call volumes. Would building from source to provide more flexibility be recommended at this stage?
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.