As explained at https://www.kamailio.org/wiki/tutorials/troubleshooting/memory#pkg_with_syst... valgrind can be used to check for memory leaks if kamailio is compiled without ``*_MALLOC``.
I would like to support valgrind even if kamailio is compiled to use its own memory managers. In order to achieve this, valgrind has to be informed of how those managers control the memory, as described at http://valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools
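For context, the mempool client requests from `valgrind/memcheck.h` that the manual describes look roughly like this. This is only a minimal sketch of where the calls would sit, with placeholder names, not the exact code in the branch:

```C
#include <stddef.h>
#include <valgrind/memcheck.h>

/* Sketch only: "pool" stands for the memory manager's block pointer,
 * "rzB" for the red-zone size; the real hooks go into the init, malloc
 * and free paths of q_malloc.c. */
static void qm_vg_sketch(void *pool, size_t pool_size, size_t rzB,
		void *chunk, size_t chunk_size)
{
	/* once, after the manager has set up its arena: declare it as a pool
	 * with rzB red-zone bytes around each chunk, and mark everything
	 * inaccessible until it is handed out */
	VALGRIND_CREATE_MEMPOOL(pool, rzB, 0);
	VALGRIND_MAKE_MEM_NOACCESS(pool, pool_size);

	/* in the allocator, after a chunk has been carved out:
	 * [chunk, chunk + chunk_size) is now allocated but undefined */
	VALGRIND_MEMPOOL_ALLOC(pool, chunk, chunk_size);

	/* in the free path: the chunk becomes inaccessible again */
	VALGRIND_MEMPOOL_FREE(pool, chunk);

	/* on shutdown */
	VALGRIND_DESTROY_MEMPOOL(pool);
}
```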
I have added valgrind instrumentation directives to **q_malloc**, but I suspect memcheck doesn't know how to check shared memory accessed by several processes.
For pkg memory, it gives correct reports such as:

```
==134243== 13,584 bytes in 104 blocks are definitely lost in loss record 438 of 451
==134243==    at 0x64E0D0: qm_detach_free (q_malloc.c:270)
==134243==    by 0x64EBC7: qm_malloc (q_malloc.c:415)
==134243==    by 0x70A26C: addstr (cfg.lex:1382)
==134243==    by 0x70A1DB: addchar (cfg.lex:1366)
==134243==    by 0x70653D: yylex (cfg.lex:1170)
==134243==    by 0x71305D: yyparse (cfg.tab.c:4775)
==134243==    by 0x425A8C: main (main.c:2137)
```

(this "leak" is expected though)
But for any operation in shm, it gives only inconsistent reports (and tons of them), such as:

```
==134284== Invalid read of size 4
==134284==    at 0x4C94D2: atomic_get_and_set_int (atomic_x86.h:223)
==134284==    by 0x4C9677: futex_release (futexlock.h:134)
==134284==    by 0x4EEAFB: tcpconn_do_send (tcp_main.c:2548)
==134284==    by 0x4EBEC9: tcpconn_send_put (tcp_main.c:2289)
==134284==    by 0x4E9733: tcp_send (tcp_main.c:2046)
==134284==    by 0x6E2F612: msg_send_buffer (forward.h:218)
==134284==    by 0x6E3204F: send_pr_buffer (t_funcs.c:70)
==134284==    by 0x6DD4A5F: _reply_light (t_reply.c:554)
==134284==    by 0x6DD56C6: _reply (t_reply.c:659)
==134284==    by 0x6DDC7D5: t_reply (t_reply.c:1552)
==134284==    by 0x6DF1E02: w_t_reply (tm.c:1246)
==134284==    by 0x6DF7EE4: w_t_reply_wrp (tm.c:2041)
==134284==  Address 0x9691ce0 is not stack'd, malloc'd or (recently) free'd
```
I've pushed the work in this branch: https://github.com/kamailio/kamailio/tree/tmp/valgrind
Do you have any info on whether memcheck can deal with shared pools?
> Do you have any info on whether memcheck can deal with shared pools?
I do not, but I will keep searching for info.
I don't know either, but maybe for now it should track pkg memory only. There is a `type` field inside the qm block that can be used to test whether the operation is over pkg or shm.
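Something along these lines, I guess; the helper and the `MEM_TYPE_PKG` constant name are only illustrative, the real check would go wherever the client requests are issued in q_malloc.c:

```C
#include <stddef.h>
#include <valgrind/memcheck.h>
#include "q_malloc.h"	/* struct qm_block with its "type" field */

/* Illustrative helper: only instrument the private (pkg) pool and skip
 * shm.  MEM_TYPE_PKG is an assumed name for the pkg value of
 * qm_block.type. */
static inline void qm_vg_pool_alloc(struct qm_block *qm, void *p,
		size_t size)
{
	if (qm->type == MEM_TYPE_PKG)
		VALGRIND_MEMPOOL_ALLOC(qm, p, size);
}
```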
I've restricted memcheck to pkg only, and also benefited from the "red zone" feature of memcheck: an empty zone between each chunk that triggers an error in memcheck if it is accessed. This makes it easier to catch a memory error exactly when it happens (there is a small standalone sketch of the mechanism after the errors below). Executing this code (new function in the **malloc_test** module):

```C
static int mt_pkg_overflow_f(struct sip_msg* msg, char *p1, char *p2)
{
	int i;
	unsigned long *a;

	a = pkg_malloc(1024 * sizeof(unsigned long));
	if (!a) {
		LM_ERR("no more pkg\n");
		return -1;
	}
	/* write one word before and one word after the allocated block */
	*(a - 1) = 0xdeadbeef;
	*(a + 1024) = 0xdeadc0de;
	for (i = 0; i < 1024; i++) {
		a[i] = (long)i;
	}
	return 1;
}
```

immediately triggers these errors:

```
==56832== Invalid write of size 8
==56832==    at 0x93E1CD1: mt_pkg_overflow_f (malloc_test.c:689)
==56832==    by 0x456E8B: do_action (action.c:1054)
==56832==    by 0x463590: run_actions (action.c:1552)
==56832==    by 0x463CFD: run_top_route (action.c:1641)
==56832==    by 0x589484: receive_msg (receive.c:264)
==56832==    by 0x49BDFC: receive_tcp_msg (tcp_read.c:1230)
==56832==    by 0x49E0EB: tcp_read_req (tcp_read.c:1445)
==56832==    by 0x4A0CB6: handle_io (tcp_read.c:1619)
==56832==    by 0x493237: io_wait_loop_epoll (io_wait.h:1065)
==56832==    by 0x4A2B05: tcp_receive_loop (tcp_read.c:1789)
==56832==    by 0x509C52: tcp_init_children (tcp_main.c:4796)
==56832==    by 0x423465: main_loop (main.c:1708)
==56832==  Address 0x56d6fc8 is 632,184 bytes inside a fragment data (init) of size 8,119,728 client-defined
==56832==    at 0x64E1EB: qm_malloc_init (q_malloc.c:261)
==56832==    by 0x6584E7: qm_malloc_init_pkg_manager (q_malloc.c:1117)
==56832==    by 0x642A23: pkg_init_manager (pkg.c:68)
==56832==    by 0x4244FD: main (main.c:1931)
==56832==
==56832== Invalid write of size 8
==56832==    at 0x93E1CE3: mt_pkg_overflow_f (malloc_test.c:690)
==56832==    by 0x456E8B: do_action (action.c:1054)
==56832==    by 0x463590: run_actions (action.c:1552)
==56832==    by 0x463CFD: run_top_route (action.c:1641)
==56832==    by 0x589484: receive_msg (receive.c:264)
==56832==    by 0x49BDFC: receive_tcp_msg (tcp_read.c:1230)
==56832==    by 0x49E0EB: tcp_read_req (tcp_read.c:1445)
==56832==    by 0x4A0CB6: handle_io (tcp_read.c:1619)
==56832==    by 0x493237: io_wait_loop_epoll (io_wait.h:1065)
==56832==    by 0x4A2B05: tcp_receive_loop (tcp_read.c:1789)
==56832==    by 0x509C52: tcp_init_children (tcp_main.c:4796)
==56832==    by 0x423465: main_loop (main.c:1708)
==56832==  Address 0x56d8fd0 is 640,384 bytes inside a fragment data (init) of size 8,119,728 client-defined
==56832==    at 0x64E1EB: qm_malloc_init (q_malloc.c:261)
==56832==    by 0x6584E7: qm_malloc_init_pkg_manager (q_malloc.c:1117)
==56832==    by 0x642A23: pkg_init_manager (pkg.c:68)
==56832==    by 0x4244FD: main (main.c:1931)
```
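For reference, the red zone is just the second argument of `VALGRIND_CREATE_MEMPOOL`. Here is a standalone sketch (not kamailio code, pool layout simplified) of how it catches the same kind of overflow when run under valgrind:

```C
#include <stdlib.h>
#include <valgrind/memcheck.h>

#define POOL_SIZE (64 * 1024)
#define REDZONE   16	/* bytes kept inaccessible around each chunk */

int main(void)
{
	char *pool = malloc(POOL_SIZE);
	unsigned long *a;

	/* declare the arena as a pool with red zones and hide all of it */
	VALGRIND_CREATE_MEMPOOL(pool, REDZONE, 0);
	VALGRIND_MAKE_MEM_NOACCESS(pool, POOL_SIZE);

	/* hand out one chunk from inside the pool, leaving room for the
	 * red zone before it */
	a = (unsigned long *)(pool + REDZONE);
	VALGRIND_MEMPOOL_ALLOC(pool, a, 1024 * sizeof(unsigned long));

	a[0] = 1;		/* fine: inside the chunk */
	*(a - 1) = 0xdeadbeef;	/* red zone: memcheck flags this write here */

	VALGRIND_MEMPOOL_FREE(pool, a);
	VALGRIND_DESTROY_MEMPOOL(pool);
	free(pool);
	return 0;
}
```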
The custom naming of the blocks ("fragment data") and the offsets reported do not work as I expect: I'd expect the first error to happen in a block named "fragment header" at its end, and the last one in "fragment data" at its end too, so either my expectations are wrong or my code is.
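For reference, the "client-defined" descriptions in the reports come from `VALGRIND_CREATE_BLOCK`, roughly like this, with placeholder arguments rather than the exact call sites in the branch:

```C
#include <stddef.h>
#include <valgrind/memcheck.h>

/* Illustrative: give each region of a fragment its own description so
 * memcheck can say "inside a fragment header" vs "inside a fragment data".
 * frag/frag_size and data/data_size stand for whatever header pointer and
 * payload the real code in q_malloc.c computes. */
static void qm_vg_describe(void *frag, size_t frag_size,
		void *data, size_t data_size)
{
	VALGRIND_CREATE_BLOCK(frag, frag_size, "fragment header");
	VALGRIND_CREATE_BLOCK(data, data_size, "fragment data");
}
```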
Also, I did not cover the instrumentation of the memory join code yet, so this should only be correct when **mem_join=0**.
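If someone wants to pick that part up, my rough (untested) understanding is that the join path only has to keep memcheck's view consistent when two adjacent free fragments are merged, something like the placeholder below; `merged_start`/`merged_size` stand for the values the real join code would compute:

```C
#include <stddef.h>
#include <valgrind/memcheck.h>

/* Rough idea only, not in the branch: after two adjacent free fragments
 * are merged, the bytes that used to be the second fragment's header and
 * red zone are plain free space again, so their addressability has to be
 * updated explicitly. */
static void qm_vg_after_join(void *merged_start, size_t merged_size)
{
	/* free memory stays inaccessible to the application */
	VALGRIND_MAKE_MEM_NOACCESS(merged_start, merged_size);
}
```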
This issue is stale because it has been open 6 weeks with no activity. Remove stale label or comment or this will be closed in 2 weeks.
Closed #949 as not planned.