Hey liangbaolin,
I tried to look into this, but the backtrace looks strange: the line
numbers don't match the source at all.
Maybe I'm looking at a different version than you are, or the core is
from a different version of the source code?
Cheers,
-Dragos
On Fri, Oct 11, 2024 at 12:27 PM liangbaolin via sr-dev <
sr-dev(a)lists.kamailio.org> wrote:
Description
Hi, I ran into a problem where the CDP module is extremely prone to
process crashes. Screenshots of the logs and the core file output follow.
I couldn't find the code that causes the crash, but I suspect that the TCP
link was established correctly while the peer was not initialized, or did
not handle the error properly, so that a packet was parsed incorrectly and
the connection was dropped later. In addition, since the socket does not
belong to a normal peer, the link is re-established constantly, but CDP
does not recognize and handle this properly. The number of sockets keeps
growing, while the number of peers does not increase.
_20241011175120.png (view on web)
<https://github.com/user-attachments/assets/88a0fc31-9ff6-4f46-9394-f49c787850e8>
Troubleshooting Reproduction Debugging Data
Core was generated by `/usr/sbin/kamailio -f /etc/kamailio_dra/kamailio_dra.cfg -P
/var/run/kamailio_d'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f9860fae336 in cc_acc_client_stateful_sm_process (s=0x7f9861bd8980,
event=21874, msg=0x55720d57c702 <qm_free+8029>) at acctstatemachine.c:304
304 }
(gdb) bt
#0 0x00007f9860fae336 in cc_acc_client_stateful_sm_process (s=0x7f9861bd8980,
event=21874, msg=0x55720d57c702 <qm_free+8029>) at acctstatemachine.c:304
#1 0x00007f9860fae391 in atomic_get_and_set_int (var=0x776f6e6b6e752072, v=32766) at
../../core/mem/../atomic/atomic_x86.h:242
#2 0x00007f9860fb16eb in disconnect_serviced_peer (sp=0x7f98e28287d0, locked=0) at
receiver.c:232
#3 0x00007f9860fbd56b in receive_loop (original_peer=0x0) at receiver.c:942
#4 0x00007f9860fb4c02 in receiver_process (p=0x0) at receiver.c:488
#5 0x00007f9860f50c80 in diameter_peer_start (blocking=0) at diameter_peer.c:278
#6 0x00007f9860f413df in cdp_child_init (rank=0) at cdp_mod.c:274
#7 0x000055720d3df6c2 in init_mod_child (m=0x7f98e279b180, rank=0) at
core/sr_module.c:920
#8 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279be50, rank=0) at
core/sr_module.c:912
#9 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279c2c0, rank=0) at
core/sr_module.c:912
#10 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279cc40, rank=0) at
core/sr_module.c:912
#11 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279d130, rank=0) at
core/sr_module.c:912
#12 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279d6c0, rank=0) at
core/sr_module.c:912
#13 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279db10, rank=0) at
core/sr_module.c:912
#14 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279dff0, rank=0) at
core/sr_module.c:912
#15 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279fae0, rank=0) at
core/sr_module.c:912
#16 0x000055720d3dfff0 in init_child (rank=0) at core/sr_module.c:999
#17 0x000055720d23d70a in main_loop () at main.c:1942
#18 0x000055720d2488f5 in main (argc=16, argv=0x7ffea4fde838) at main.c:3256
(gdb) bt full
#0 0x00007f9860fae336 in cc_acc_client_stateful_sm_process (s=0x7f9861bd8980,
event=21874, msg=0x55720d57c702 <qm_free+8029>) at acctstatemachine.c:304
x = 0x7f9861b97000
ret = 441
rc = 1627304660
record_type = 32664
__func__ = "cc_acc_client_stateful_sm_process"
#1 0x00007f9860fae391 in atomic_get_and_set_int (var=0x776f6e6b6e752072, v=32766) at
../../core/mem/../atomic/atomic_x86.h:242
No locals.
#2 0x00007f9860fb16eb in disconnect_serviced_peer (sp=0x7f98e28287d0, locked=0) at
receiver.c:232
__llevel = 0
__func__ = "disconnect_serviced_peer"
#3 0x00007f9860fbd56b in receive_loop (original_peer=0x0) at receiver.c:942
__llevel = -1526863824
rfds = {__fds_bits = {0, 0, 0, 0, 1024, 0 <repeats 11 times>}}
efds = {__fds_bits = {0 <repeats 16 times>}}
tv = {tv_sec = 0, tv_usec = 883496}
n = 1
max = 298
cnt = 0
msg = 0x0
sp = 0x7f98e28287d0
sp2 = 0x7f98e2827e30
p = 0x0
fd = 295
fd_exchange_pipe_local = 28
__func__ = "receive_loop"
#4 0x00007f9860fb4c02 in receiver_process (p=0x0) at receiver.c:488
__llevel = -990730168
__func__ = "receiver_process"
#5 0x00007f9860f50c80 in diameter_peer_start (blocking=0) at diameter_peer.c:278
pid = 0
k = 1
seed = 1112701621
p = 0x0
__func__ = "diameter_peer_start"
#6 0x00007f9860f413df in cdp_child_init (rank=0) at cdp_mod.c:274
__llevel = 0
__func__ = "cdp_child_init"
#7 0x000055720d3df6c2 in init_mod_child (m=0x7f98e279b180, rank=0) at
core/sr_module.c:920
ret = 0
__func__ = "init_mod_child"
#8 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279be50, rank=0) at
core/sr_module.c:912
ret = 1
__func__ = "init_mod_child"
#9 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279c2c0, rank=0) at
core/sr_module.c:912
ret = 0
__func__ = "init_mod_child"
#10 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279cc40, rank=0) at
core/sr_module.c:912
ret = 0
__func__ = "init_mod_child"
#11 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279d130, rank=0) at
core/sr_module.c:912
ret = 0
__func__ = "init_mod_child"
#12 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279d6c0, rank=0) at
core/sr_module.c:912
ret = 0
__func__ = "init_mod_child"
#13 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279db10, rank=0) at
core/sr_module.c:912
ret = 32766
__func__ = "init_mod_child"
#14 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279dff0, rank=0) at
core/sr_module.c:912
ret = 21874
__func__ = "init_mod_child"
#15 0x000055720d3df2b5 in init_mod_child (m=0x7f98e279fae0, rank=0) at
core/sr_module.c:912
ret = 12
__func__ = "init_mod_child"
#16 0x000055720d3dfff0 in init_child (rank=0) at core/sr_module.c:999
ret = 0
type = 0x55720d6ece7b "PROC_MAIN"
__func__ = "init_child"
#17 0x000055720d23d70a in main_loop () at main.c:1942
i = 1639542784
pid = 50
si = 0x0
si_desc =
"\240\233\"\rrU\000\000@\361\367a\230\177\000\000\000\343\375\244\376\177\000\000\327~2\rrU\000\000\000\343\375\244\376\177\000\000\025\t>\r\005\000\000\000\000\000\000\000\037\000\000\000\000;\213\222\213\017\r\202h\r\000\000\000\000\000\000\060\000\000\000\000\000\000\000\240\233\"\rrU\000\000\060\350\375\244\376\177",
'\000' <repeats 18 times>, "\340\342\375\244\376\177\000\000
\351O\rrU\000"
nrprocs = 21874
woneinit = 0
__func__ = "main_loop"
#18 0x000055720d2488f5 in main (argc=16, argv=0x7ffea4fde838) at main.c:3256
cfg_stream = 0x55720ef75260
c = -1
r = 0
tmp = 0x7ffea4fdfee3 ""
tmp_len = 32766
port = 5060
proto = 0
aproto = 0
ahost = 0x0
aport = 0
options = 0x55720d6b3698
":f:cm:M:dVIhEeb:B:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
ret = -1
seed = 2959679812
rfd = 4
debug_save = 0
debug_flag = 0
dont_fork_cnt = 2
n_lst = 0x0
p = 0x7f996220a3d0 ""
st = {st_dev = 50, st_ino = 2995745, st_nlink = 1, st_mode = 16877, st_uid = 103,
st_gid = 105, __pad0 = 0, st_rdev = 0, st_size = 4096, st_blksize = 4096, st_blocks = 8,
st_atim = {tv_sec = 1726638407, tv_nsec = 558385502}, st_mtim = {tv_sec = 1727831142,
tv_nsec = 822744119}, st_ctim = {tv_sec = 1727831142, tv_nsec = 822744119},
__glibc_reserved = {0, 0, 0}}
l1 = 2048
tbuf =
"pQ!b\231\177\000\000\070-\000b\231\177\000\000\020\347\375\244\376\177\000\000\367\344\376a\231\177",
'\000' <repeats 18 times>,
"\001\000\000\000\000\000\000\000(W!b\231\177\000\000\000Q!b\231\177\000\000\001\000\000\000\000\000\000\000\300\200
b\231\177\000\000\017Q\377a\231\177\000\000\020W!b\231\177", '\000'
<repeats 19 times>,
"S\376\244\376\177\000\000\300\212\225\001\000\000\000\000\207\026=a\231\177\000\000`\347\375\244\376\177\000\000\220Q\376\244\376\177\000\000\002\000\000\000\231\177\000\000\000\000\000\000\000\000\000\000\300\346\375\244\376\177\000\000\003\000\000\000\000\000\000\000\260\346\375\244\376\177\000\000\000\000\000\000\000\000\000\000"...
option_index = 0
long_options = {{name = 0x55720d6b59f6 "help", has_arg = 0, flag = 0x0,
val = 104}, {name = 0x55720d6b087c "version", has_arg = 0, flag = 0x0, val =
118}, {name = 0x55720d6b59fb "alias", has_arg = 1, flag = 0x0, val = 1024},
{name = 0x55720d6b5a01 "subst", has_arg = 1,
flag = 0x0, val = 1025}, {name = 0x55720d6b5a07 "substdef", has_arg
= 1, flag = 0x0, val = 1026}, {name = 0x55720d6b5a10 "substdefs", has_arg = 1,
flag = 0x0, val = 1027}, {name = 0x55720d6b5a1a "server-id", has_arg = 1, flag =
0x0, val = 1028}, {
name = 0x55720d6b5a24 "loadmodule", has_arg = 1, flag = 0x0, val =
1029}, {name = 0x55720d6b5a2f "modparam", has_arg = 1, flag = 0x0, val = 1030},
{name = 0x55720d6b5a38 "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {name
= 0x55720d6b5a43 "debug", has_arg = 1,
flag = 0x0, val = 1032}, {name = 0x55720d6b5a49 "cfg-print",
has_arg = 0, flag = 0x0, val = 1033}, {name = 0x55720d6b5a53 "atexit", has_arg =
1, flag = 0x0, val = 1034}, {name = 0x55720d6b5a5a "all-errors", has_arg = 0,
flag = 0x0, val = 1035}, {name = 0x0,
has_arg = 0, flag = 0x0, val = 0}}
__func__ = "main"
(gdb)
(gdb) info locals
cfg_stream = 0x55720ef75260
c = -1
r = 0
tmp = 0x7ffea4fdfee3 ""
tmp_len = 32766
port = 5060
proto = 0
aproto = 0
ahost = 0x0
aport = 0
options = 0x55720d6b3698
":f:cm:M:dVIhEeb:B:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:x:X:Y:"
ret = -1
seed = 2959679812
rfd = 4
debug_save = 0
debug_flag = 0
dont_fork_cnt = 2
n_lst = 0x0
p = 0x7f996220a3d0 ""
st = {st_dev = 50, st_ino = 2995745, st_nlink = 1, st_mode = 16877, st_uid = 103, st_gid
= 105, __pad0 = 0, st_rdev = 0, st_size = 4096, st_blksize = 4096, st_blocks = 8, st_atim
= {tv_sec = 1726638407, tv_nsec = 558385502}, st_mtim = {tv_sec = 1727831142, tv_nsec =
822744119},
st_ctim = {tv_sec = 1727831142, tv_nsec = 822744119}, __glibc_reserved = {0, 0, 0}}
l1 = 2048
tbuf =
"pQ!b\231\177\000\000\070-\000b\231\177\000\000\020\347\375\244\376\177\000\000\367\344\376a\231\177",
'\000' <repeats 18 times>,
"\001\000\000\000\000\000\000\000(W!b\231\177\000\000\000Q!b\231\177\000\000\001\000\000\000\000\000\000\000\300\200
b\231\177\000\000\017Q\377a\231\177\000\000\020W!b\231\177", '\000'
<repeats 19 times>,
"S\376\244\376\177\000\000\300\212\225\001\000\000\000\000\207\026=a\231\177\000\000`\347\375\244\376\177\000\000\220Q\376\244\376\177\000\000\002\000\000\000\231\177\000\000\000\000\000\000\000\000\000\000\300\346\375\244\376\177\000\000\003\000\000\000\000\000\000\000\260\346\375\244\376\177\000\000\000\000\000\000\000\000\000\000"...
option_index = 0
long_options = {{name = 0x55720d6b59f6 "help", has_arg = 0, flag = 0x0, val =
104}, {name = 0x55720d6b087c "version", has_arg = 0, flag = 0x0, val = 118},
{name = 0x55720d6b59fb "alias", has_arg = 1, flag = 0x0, val = 1024}, {name =
0x55720d6b5a01 "subst", has_arg = 1,
flag = 0x0, val = 1025}, {name = 0x55720d6b5a07 "substdef", has_arg = 1,
flag = 0x0, val = 1026}, {name = 0x55720d6b5a10 "substdefs", has_arg = 1, flag =
0x0, val = 1027}, {name = 0x55720d6b5a1a "server-id", has_arg = 1, flag = 0x0,
val = 1028}, {
name = 0x55720d6b5a24 "loadmodule", has_arg = 1, flag = 0x0, val = 1029},
{name = 0x55720d6b5a2f "modparam", has_arg = 1, flag = 0x0, val = 1030}, {name =
0x55720d6b5a38 "log-engine", has_arg = 1, flag = 0x0, val = 1031}, {name =
0x55720d6b5a43 "debug", has_arg = 1,
flag = 0x0, val = 1032}, {name = 0x55720d6b5a49 "cfg-print", has_arg = 0,
flag = 0x0, val = 1033}, {name = 0x55720d6b5a53 "atexit", has_arg = 1, flag =
0x0, val = 1034}, {name = 0x55720d6b5a5a "all-errors", has_arg = 0, flag = 0x0,
val = 1035}, {name = 0x0, has_arg = 0,
flag = 0x0, val = 0}}
__func__ = "main"
(gdb) list
299 if(s) {
300 AAASessionsUnlock(s->hash);
301 }
302
303 return ret;
304 }
Log Messages
2024-10-11T08:23:20.234947572Z 21(60) ERROR: cdp [receiver.c:783]: receive_loop():
select_recv(): Bad file descriptor
2024-10-11T08:23:24.906121404Z 21(60) ERROR: cdp [receiver.c:783]: receive_loop():
select_recv(): Bad file descriptor
2024-10-11T08:23:41.857946233Z 21(60) ERROR: cdp [receiver.c:783]: receive_loop():
select_recv(): Bad file descriptor
2024-10-11T08:25:18.095639136Z 25(64) WARNING: cdp [peermanager.c:337]: peer_timer():
Inactivity on peer [
scscf32.ims.mnc011.mcc460.3gppnetwork.org] and no DWA, Closing
peer...
2024-10-11T08:43:38.138243609Z 31(70) CRITICAL: <core> [core/pass_fd.c:281]:
receive_fd(): EOF on 34
2024-10-11T08:43:47.476315811Z 0(39) ALERT: <core> [main.c:805]: handle_sigs():
child process 60 exited by a signal 11
2024-10-11T08:43:47.476380054Z 0(39) ALERT: <core> [main.c:809]: handle_sigs():
core was generated
2024-10-11T08:43:47.503780368Z 0(39) CRITICAL: cdp [diameter_peer.c:447]:
diameter_peer_destroy(): destroy_diameter_peer(): Bye Bye from C Diameter Peer test
- *Operating System*:
version: kamailio 5.8.1 (x86_64/linux) 07b761
flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS,
DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, MEM_JOIN_FREE,
Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX,
FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR,
USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED
ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144,
MAX_SEND_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT
PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 07b761
compiled on 10:00:57 Oct 9 2024 with gcc 7.5.0
View it on GitHub: <https://github.com/kamailio/kamailio/issues/3999>
_______________________________________________
Kamailio (SER) - Development Mailing List