I am troubleshooting intermittent latency on a server, several processes will report slow action at the same time every now and then, something odd it that some of the actions are most likely not using any mutexes, I can not imagine how they can end up blocking unless the process is actually suspended by the OS, however the CPU usage and load average is quite low on this server.
Is there any reason why assigning to a PKG var could be delayed 3 seconds.
2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ?
2019-11-13T21:58:04.018748710Z 42(61) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:221] took too long [2222010 us] } // Samething here, simply exiting a scope
2019-11-13T21:58:04.015675095Z 11(30) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay_cancel (24)] cfg [/etc/kamailio/kamailio.cfg:86] took too long [2588818 us] } if (!t_relay_cancel()) { // Indeed a t_relay_cancel
2019-11-13T21:58:04.018647988Z 26(45) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay (24)] cfg [/etc/kamailio/kamailio.cfg:289] took too long [2739366 us]
t_relay(); // Indeed a t_relay
2019-11-13T21:58:04.018651243Z 15(34) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [sl_send_reply (26)] cfg [/etc/kamailio/kamailio.cfg:112] took too long [2587812 us] sl_send_reply("100", "Trying"); // Indeed a sl_send_reply
Not sure which mailing list is best for this question Regards Julien
I guess action type 16 IF_T is accounting for the entire scope block since it is recursive.
I wonder why var assignment in the main route scope would do the same :
2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ?
On Wed, Nov 13, 2019 at 2:22 PM Julien Chavanton jchavanton@gmail.com wrote:
I am troubleshooting intermittent latency on a server, several processes will report slow action at the same time every now and then, something odd it that some of the actions are most likely not using any mutexes, I can not imagine how they can end up blocking unless the process is actually suspended by the OS, however the CPU usage and load average is quite low on this server.
Is there any reason why assigning to a PKG var could be delayed 3 seconds.
2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ?
2019-11-13T21:58:04.018748710Z 42(61) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:221] took too long [2222010 us] } // Samething here, simply exiting a scope
2019-11-13T21:58:04.015675095Z 11(30) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay_cancel (24)] cfg [/etc/kamailio/kamailio.cfg:86] took too long [2588818 us] } if (!t_relay_cancel()) { // Indeed a t_relay_cancel
2019-11-13T21:58:04.018647988Z 26(45) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay (24)] cfg [/etc/kamailio/kamailio.cfg:289] took too long [2739366 us]
t_relay(); // Indeed a t_relay
2019-11-13T21:58:04.018651243Z 15(34) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [sl_send_reply (26)] cfg [/etc/kamailio/kamailio.cfg:112] took too long [2587812 us] sl_send_reply("100", "Trying"); // Indeed a sl_send_reply
Not sure which mailing list is best for this question Regards Julien
Is this a virtual machine? How often does it happen?
Sometime the VM manager "freezes" the VM for different purposes, like backup snapshots, ...
Cheers, Daniel
On 13.11.19 23:40, Julien Chavanton wrote:
I guess action type 16 IF_T is accounting for the entire scope block since it is recursive.
I wonder why var assignment in the main route scope would do the same :
2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ?
On Wed, Nov 13, 2019 at 2:22 PM Julien Chavanton <jchavanton@gmail.com mailto:jchavanton@gmail.com> wrote:
I am troubleshooting intermittent latency on a server, several processes will report slow action at the same time every now and then, something odd it that some of the actions are most likely not using any mutexes, I can not imagine how they can end up blocking unless the process is actually suspended by the OS, however the CPU usage and load average is quite low on this server. Is there any reason why assigning to a PKG var could be delayed 3 seconds. 2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ? 2019-11-13T21:58:04.018748710Z 42(61) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:221] took too long [2222010 us] } // Samething here, simply exiting a scope 2019-11-13T21:58:04.015675095Z 11(30) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay_cancel (24)] cfg [/etc/kamailio/kamailio.cfg:86] took too long [2588818 us] } if (!t_relay_cancel()) { // Indeed a t_relay_cancel 2019-11-13T21:58:04.018647988Z 26(45) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay (24)] cfg [/etc/kamailio/kamailio.cfg:289] took too long [2739366 us] t_relay(); // Indeed a t_relay 2019-11-13T21:58:04.018651243Z 15(34) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [sl_send_reply (26)] cfg [/etc/kamailio/kamailio.cfg:112] took too long [2587812 us] sl_send_reply("100", "Trying"); // Indeed a sl_send_reply Not sure which mailing list is best for this question Regards Julien
Kamailio (SER) - Users Mailing List sr-users@lists.kamailio.org https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
It is running in docker, the interval is not stable ~30sec. I know this is not related to DNS slow queries since I am monitoring them as well. I wonder what else can be intermittently blocking ... I am going to look at the TM locks and the callbacks.
Not many modules used on this one, maybe Dialog module using a DB local sqlite
modparam("dialog", "db_update_period", 60) modparam("dialog", "db_mode", 3)
On Wed, Nov 13, 2019 at 10:43 PM Daniel-Constantin Mierla miconda@gmail.com wrote:
Is this a virtual machine? How often does it happen?
Sometime the VM manager "freezes" the VM for different purposes, like backup snapshots, ...
Cheers, Daniel On 13.11.19 23:40, Julien Chavanton wrote:
I guess action type 16 IF_T is accounting for the entire scope block since it is recursive.
I wonder why var assignment in the main route scope would do the same :
2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ?
On Wed, Nov 13, 2019 at 2:22 PM Julien Chavanton jchavanton@gmail.com wrote:
I am troubleshooting intermittent latency on a server, several processes will report slow action at the same time every now and then, something odd it that some of the actions are most likely not using any mutexes, I can not imagine how they can end up blocking unless the process is actually suspended by the OS, however the CPU usage and load average is quite low on this server.
Is there any reason why assigning to a PKG var could be delayed 3 seconds.
2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ?
2019-11-13T21:58:04.018748710Z 42(61) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:221] took too long [2222010 us] } // Samething here, simply exiting a scope
2019-11-13T21:58:04.015675095Z 11(30) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay_cancel (24)] cfg [/etc/kamailio/kamailio.cfg:86] took too long [2588818 us] } if (!t_relay_cancel()) { // Indeed a t_relay_cancel
2019-11-13T21:58:04.018647988Z 26(45) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay (24)] cfg [/etc/kamailio/kamailio.cfg:289] took too long [2739366 us]
t_relay(); // Indeed a t_relay
2019-11-13T21:58:04.018651243Z 15(34) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [sl_send_reply (26)] cfg [/etc/kamailio/kamailio.cfg:112] took too long [2587812 us] sl_send_reply("100", "Trying"); // Indeed a sl_send_reply
Not sure which mailing list is best for this question Regards Julien
Kamailio (SER) - Users Mailing Listsr-users@lists.kamailio.orghttps://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
-- Daniel-Constantin Mierla -- www.asipto.comwww.twitter.com/miconda -- www.linkedin.com/in/miconda Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com
Hi,
Identifed root cause, appears to be a regression in the path module
When using : modparam("path", "use_received", 1) t_relay is not using the receive param of the route header anymore.
Causing a lot of transaction problems exacerbating the server. What is surprising is that I would have expected high CPU / LOAD.
Anyway I will open a bug to verify why this is not working in Kamailio 5 I tested a few things to make it work quickly and ended up creating a patch from the routing script itself until I can clarify the path module modifications.
more investigation in the bug report
On Wed, Nov 13, 2019 at 11:44 PM Julien Chavanton jchavanton@gmail.com wrote:
It is running in docker, the interval is not stable ~30sec. I know this is not related to DNS slow queries since I am monitoring them as well. I wonder what else can be intermittently blocking ... I am going to look at the TM locks and the callbacks.
Not many modules used on this one, maybe Dialog module using a DB local sqlite
modparam("dialog", "db_update_period", 60) modparam("dialog", "db_mode", 3)
On Wed, Nov 13, 2019 at 10:43 PM Daniel-Constantin Mierla < miconda@gmail.com> wrote:
Is this a virtual machine? How often does it happen?
Sometime the VM manager "freezes" the VM for different purposes, like backup snapshots, ...
Cheers, Daniel On 13.11.19 23:40, Julien Chavanton wrote:
I guess action type 16 IF_T is accounting for the entire scope block since it is recursive.
I wonder why var assignment in the main route scope would do the same :
2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ?
On Wed, Nov 13, 2019 at 2:22 PM Julien Chavanton jchavanton@gmail.com wrote:
I am troubleshooting intermittent latency on a server, several processes will report slow action at the same time every now and then, something odd it that some of the actions are most likely not using any mutexes, I can not imagine how they can end up blocking unless the process is actually suspended by the OS, however the CPU usage and load average is quite low on this server.
Is there any reason why assigning to a PKG var could be delayed 3 seconds.
2019-11-13T21:58:04.018631614Z 23(42) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:115] took too long [2579486 us] $var(reply) = 0; // I wonder how this can endup blocking ?
2019-11-13T21:58:04.018748710Z 42(61) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [corefunc (16)] cfg [/etc/kamailio/kamailio.cfg:221] took too long [2222010 us] } // Samething here, simply exiting a scope
2019-11-13T21:58:04.015675095Z 11(30) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay_cancel (24)] cfg [/etc/kamailio/kamailio.cfg:86] took too long [2588818 us] } if (!t_relay_cancel()) { // Indeed a t_relay_cancel
2019-11-13T21:58:04.018647988Z 26(45) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [t_relay (24)] cfg [/etc/kamailio/kamailio.cfg:289] took too long [2739366 us]
t_relay(); // Indeed a t_relay
2019-11-13T21:58:04.018651243Z 15(34) WARNING: <core> [core/action.c:1586]: run_actions(): alert - action [sl_send_reply (26)] cfg [/etc/kamailio/kamailio.cfg:112] took too long [2587812 us] sl_send_reply("100", "Trying"); // Indeed a sl_send_reply
Not sure which mailing list is best for this question Regards Julien
Kamailio (SER) - Users Mailing Listsr-users@lists.kamailio.orghttps://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
-- Daniel-Constantin Mierla -- www.asipto.comwww.twitter.com/miconda -- www.linkedin.com/in/miconda Kamailio World Conference - April 27-29, 2020, in Berlin -- www.kamailioworld.com