Thank you @charlesrchance, I did some tests with this setup.

kamailio.cfg (meaningful lines):

```
fork=yes
children=1
tcp_connection_lifetime=3605
pv_buffer_size=8192

# ----- dmq params -----
modparam("dmq", "server_address", DMQ_SERVER_ADDRESS)
modparam("dmq", "notification_address", DMQ_NOTIFICATION_ADDRESS)
modparam("dmq", "multi_notify", 1)
modparam("dmq", "num_workers", 1)
modparam("dmq", "ping_interval", 15)
modparam("dmq", "worker_usleep", 1000)

# ----- htable params -----
modparam("htable", "enable_dmq", 1)
modparam("htable", "dmq_init_sync", 1)
# Keep track of concurrent channels for accounts. Should be same as dialog
modparam("htable", "htable", "ht=>size=16;dmqreplicate=1;autoexpire=10800;")
modparam("htable", "htable", "ht1=>size=16;dmqreplicate=1;autoexpire=10800;")
modparam("htable", "htable", "ht2=>size=16;dmqreplicate=1;autoexpire=10800;")
modparam("htable", "htable", "ht3=>size=16;dmqreplicate=1;autoexpire=10800;")

#!define ONEK "really 1 k chars, believe me :)"

event_route[htable:mod-init] {
    $var(name) = POD_NAME + "\n";
    xlog("L_ALERT", "$var(name)");
    if(POD_NAME == "kama-0") {
        $var(count) = 0;
        while($var(count) < 99) {
            $sht(ht=>$var(count)) = ONEK;
            $sht(ht1=>$var(count)) = ONEK;
            $sht(ht2=>$var(count)) = ONEK;
            $sht(ht3=>$var(count)) = ONEK;
            $var(count) = $var(count) + 1;
        }
    }
}

request_route {
    if ($rm == "KDMQ") {
        dmq_handle_message();
    }
    exit;
}
```
Then:
- Started kama-0, which now holds 4 htables of ~99K each.
- Started 10 Kubernetes pods and, on each pod, launched kamailio 100 times with a timeout of 3 seconds.

So we had roughly 1000 kamailio instances trying to get these htables from kama-0. I didn't see any dangerous CPU spike, and the loop doesn't happen anymore.
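For reference, the per-pod launcher was roughly the following (a sketch: the binary and config paths are simplified, not the exact script used in the cluster):

```shell
#!/bin/sh
# Sketch of the per-pod stress loop: start kamailio in foreground (-DD),
# log to stderr (-E), and kill it after 3 seconds so that every start
# triggers a fresh dmq_init_sync request towards kama-0.
for i in $(seq 1 100); do
    timeout 3 kamailio -DD -E -f /etc/kamailio/kamailio.cfg || true
done
```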
There's something I'm worried about, though: the memory of the DMQ worker (measured with top), which usually stays around 0.1%, is now stable at 1.4% and is not going down again.

I fear there's a memory leak somewhere, but I'm not sure where. While debugging the loop issue I had some doubts about how the json_t structures are freed, but that could just be me not knowing the code well. Can you give us any hints to help understand this issue better?
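If it helps, I can re-run the test with memory accounting enabled on the receiving side. This is the kind of thing I'd add to the config (a sketch: the values are guesses, and it needs a kamailio build with memory debugging enabled):

```
# core parameters to make allocator activity more observable
memdbg=5          # debug level at which memory allocator messages are logged
memlog=5          # debug level for per-process memory status dumps
mem_summary=3     # dump pkg/shm usage summaries at shutdown (bitmask)
```

and then poll the DMQ worker with `kamcmd pkg.stats` while the sync is running, to see whether pkg usage keeps growing or just plateaus.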
Thanks