tutorials:troubleshooting:memory

Differences

This shows you the differences between two versions of the page.

tutorials:troubleshooting:memory [2015/01/16 16:57]
miconda [Using GDB]
tutorials:troubleshooting:memory [2021/06/01 20:44] (current)
giavac [Insufficient Memory]
Line 36: Line 36:
  
   * too small PKG or SHM - insufficient size to accommodate all data needed to be stored in memory   * too small PKG or SHM - insufficient size to accommodate all data needed to be stored in memory
-  * memory leak - some part of code allocates memory at runtine and does not free it+  * memory leak - some part of code allocates memory at runtime and does not free it
  
 ===== Monitoring Memory ===== ===== Monitoring Memory =====
Line 46: Line 46:
 <code> <code>
 kamctl stats shmem kamctl stats shmem
 +kamcmd mod.stats all shm
 </code> </code>
  
Line 52: Line 53:
 <code> <code>
 kamcmd pkg.stats kamcmd pkg.stats
 +kamcmd mod.stats all pkg
 </code> </code>
  
 Notice that for SHM only one group of statistics is printed, being one zone of memory, while for PKG you get a list with many groups of statistics, each specific for a Kamailio process (child). Notice that for SHM only one group of statistics is printed, being one zone of memory, while for PKG you get a list with many groups of statistics, each specific for a Kamailio process (child).
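As a rough sketch of how the per-process PKG statistics could be scanned, the shell function below reads `kamcmd pkg.stats` output on standard input and reports the process with the least free PKG memory. It assumes that each process block contains a `pid: <number>` line followed by a `free: <number>` line; that layout and the function name `pkg_lowest_free` are assumptions made for illustration, so adjust the field names to what your Kamailio version prints.

```shell
# Hypothetical helper: find the process with the least free PKG memory.
# Assumes "pid: N" appears before "free: N" within each process block.
pkg_lowest_free() {
  awk '
    $1 == "pid:"  { pid = $2 }
    $1 == "free:" {
      if (min == "" || $2 + 0 < min) { min = $2 + 0; minpid = pid }
    }
    END { if (minpid != "") print minpid, min }
  '
}
```

Used as `kamcmd pkg.stats | pkg_lowest_free`, it prints the pid and the reported free size of that process, which can help pick the process to inspect first.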
 +
 +To merge free memory fragments, enable memory join. It is disabled by default (mem_join=0).
 +<code>
 +mem_join=1
 +</code>
 +
  
 ===== Analysis of Memory Incidents ===== ===== Analysis of Memory Incidents =====
Line 63: Line 71:
   * if the number of subscribers and the traffic are constant, and no larger data was reloaded (e.g., dispatcher, lcr), then there is very likely a memory leak that has to be discovered and fixed   * if the number of subscribers and the traffic are constant, and no larger data was reloaded (e.g., dispatcher, lcr), then there is very likely a memory leak that has to be discovered and fixed
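One way to check the "constant load, yet memory keeps growing" case is to sample the used memory size at regular intervals and test whether it only ever goes up. The sketch below is a minimal illustration, not part of Kamailio: the function name `used_size_is_growing` is hypothetical, and it assumes you have already collected the samples yourself, one number per line, oldest first.

```shell
# Hypothetical helper: succeed (exit 0) when the samples never decrease
# and the last sample is larger than the first one.
used_size_is_growing() {
  awk '
    NF == 0 { next }                       # skip empty lines
    n == 0 { first = $1 + 0 }              # remember the first sample
    n > 0 && $1 + 0 < prev { dropped = 1 } # any decrease breaks the trend
    { prev = $1 + 0; n++ }
    END { exit !(n >= 2 && !dropped && prev > first) }
  '
}
```

For example, `used_size_is_growing < samples.txt && echo "steady growth"` flags a series that deserves a closer look. A steady increase is only a hint: growth caused by more subscribers or reloaded data looks the same in the numbers.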
  
-===== Troubleshooting ===== 
  
 ===== Memory Manager Debugging ===== ===== Memory Manager Debugging =====
Line 78: Line 85:
  
 <code> <code>
-MEMMNG=1 MEMDBG=1 make cfg ...+MEMDBG=1 make cfg ...
 </code> </code>
  
Line 99: Line 106:
  
 Then restart and wait a bit for getting some traffic processed. Then restart and wait a bit for getting some traffic processed.
 +
 +Selecting the memory allocation algorithm (fm, qm, tlsf) at compile time with MEMMNG=0/1/2 is deprecated. Use the "-x" command line parameter when running kamailio instead; see kamailio -h for more details.
  
 To get the list of chunks from memory manager, there are two ways: To get the list of chunks from memory manager, there are two ways:
Line 114: Line 123:
     * for SHM memory:     * for SHM memory:
 <code> <code>
-kamcmd cfg.set_now_int core mem_dump_shm+kamcmd cfg.set_now_int core mem_dump_shm 1
 </code> </code>
  
Line 136: Line 145:
 For PKG the format is similar, with SHM replaced by PKG in the messages. For PKG the format is similar, with SHM replaced by PKG in the messages.
  
 +To generate a summary report, run:
 +
 +<code>
 +# first set memlog lower than debug
 +kamcmd cfg.set_now_int core memlog 1
 +
 +kamcmd corex.shm_summary
 +</code>
 +
 +The log for f_malloc with debug enabled should look like:
 +
 +<code>
 +20(4082) NOTICE: fm_status: summarizing all alloc'ed. fragments:
 +20(4082) NOTICE: fm_status:  count=     1 size=     16640 bytes from <core>: counters.c: counters_prefork_init(207)
 +20(4082) NOTICE: fm_status:  count=     1 size=     14560 bytes from debugger: debugger_api.c: dbg_init_pid_list(572)
 +20(4082) NOTICE: fm_status:  count=     1 size=      4992 bytes from sl: sl_stats.c: init_sl_stats_child(125)
 +20(4082) NOTICE: fm_status:  count=     1 size=       256 bytes from tmx: tmx_pretran.c: tmx_init_pretran_table(90)
 +20(4082) NOTICE: fm_status:  count=     1 size=      6656 bytes from tm: t_stats.c: init_tm_stats_child(60)
 +20(4082) NOTICE: fm_status:  count=     1 size=      1248 bytes from kex: pkg_stats.c: pkg_proc_stats_init(79)
 +20(4082) NOTICE: fm_status:  count=     2 size=        64 bytes from <core>: cfg/cfg_struct.c: cfg_clone_str(130)
 +20(4082) NOTICE: fm_status:  count=     1 size=       704 bytes from <core>: cfg/cfg_struct.c: cfg_shmize(217)
 +20(4082) NOTICE: fm_status:  count=     3 size=        64 bytes from usrloc: udomain.c: build_stat_name(51)
 +</code>
 +
 +If you dumped the status with qm_malloc, you can extract the logs from syslog and count the unique allocations with the following commands:
 +
 +<code>
 +grep qm_status /var/log/syslog >qm_status.txt
 +
 +# or:
 +# grep qm_status /var/log/messages >qm_status.txt
 +
 +grep alloc qm_status.txt | awk '{ print substr( $0, 16, length($0) ) }' | sort | uniq -c | sort -k1n
 +</code>
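For the f_malloc summary format shown earlier (lines containing `count=` and `size=`), a similar post-processing step could rank the allocation sites by their reported size. This is a sketch under assumptions: the function name `fm_summary_top` is hypothetical, and the parsing relies on the `count= N size= N bytes from ...` layout of the sample log above, which may differ between Kamailio versions.

```shell
# Hypothetical helper: rank fm_status summary lines by reported size.
# Reads log lines on stdin; prints "size count origin", largest size first.
fm_summary_top() {
  awk '
    /fm_status:.*count=.*size=/ {
      line = $0
      sub(/.*count= */, "", line); count = line + 0   # leading number is the count
      sub(/.*size= */, "", line);  size = line + 0    # leading number is the size
      sub(/.*bytes from /, "", line)                  # rest is module: file: function(line)
      printf "%d %d %s\n", size, count, line
    }
  ' | sort -k1,1nr
}
```

A possible use is `grep fm_status /var/log/syslog | fm_summary_top | head`, which lists the largest entries first. Comparing two such lists taken some time apart shows which allocation sites keep growing.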
 ===== Using GDB ===== ===== Using GDB =====
  
Line 148: Line 191:
 if($i>2000) if($i>2000)
 if($a->u.is_free==0) if($a->u.is_free==0)
 +printf "=========== non-free fragment: %d\n", $i
 +p $a
 +p (void*)((char*)($a)+sizeof(struct qm_frag))
 +printf "----------- content\n"
 p *$a p *$a
 end end
 +end
 +set $a = ((struct qm_frag*)((char*)($a)+sizeof(struct qm_frag)+((struct qm_frag*)$a)->size+sizeof(struct qm_frag_end)))
 +set $i = $i + 1
 +end
 +</code>
 +
 +An alternative is to print all used chunks, but be aware that it may take some time:
 +
 +
 +<code c>
 +set $i=0
 +set $a = mem_block->first_frag
 +while($a < mem_block->last_frag_end)
 +if($a->u.is_free==0)
 +printf "=========== non-free fragment: %d\n", $i
 +p $a
 +p (void*)((char*)($a)+sizeof(struct qm_frag))
 +printf "----------- content\n"
 +p *$a
 end end
 set $a = ((struct qm_frag*)((char*)($a)+sizeof(struct qm_frag)+((struct qm_frag*)$a)->size+sizeof(struct qm_frag_end))) set $a = ((struct qm_frag*)((char*)($a)+sizeof(struct qm_frag)+((struct qm_frag*)$a)->size+sizeof(struct qm_frag_end)))
Line 167: Line 233:
 gdb --batch --command=/tmp/kamailio-dump-used-pkg.gdb /usr/sbin/kamailio 21907 gdb --batch --command=/tmp/kamailio-dump-used-pkg.gdb /usr/sbin/kamailio 21907
 </code> </code>
 +
 +===== PKG With System Malloc =====
 +
 +Kamailio can be compiled to use the system malloc and free for PKG memory. To do so, remove the PKG_MALLOC define from Makefile.defs; optionally, add DBG_SYS_MALLOC to get more verbose logging of memory operations in debug mode.
 +
 +The following diff shows the changes in Makefile.defs; note that the line numbers may vary with your Kamailio version.
 +
 +<code c>
 +diff --git a/Makefile.defs b/Makefile.defs
 +index 3890668..12ca37a 100644
 +--- a/Makefile.defs
 ++++ b/Makefile.defs
 +@@ -621,7 +621,7 @@ C_DEFS= $(extra_defs) \
 +         -DSER_VER=$(SER_VER) \
 +         -DCFG_DIR='"$(cfg_target)"'\
 +         -DRUN_DIR='"$(run_target)"'\
 +-        -DPKG_MALLOC \
 ++        -DDBG_SYS_MALLOC \
 +         -DSHM_MEM  -DSHM_MMAP \
 +         -DDNS_IP_HACK \
 +         -DUSE_MCAST \
 +</code>
 +
 +After updating Makefile.defs, recompile and reinstall.
 +
 +Other tools (e.g., valgrind) can then be used to track the PKG memory operations done by Kamailio.
  
 ===== OS Memory Reports ===== ===== OS Memory Reports =====
Line 172: Line 264:
 It may happen that various tools report memory usage increase on the server. That could be due to a leak issue or due to caching done by kernel. The memory for cache can be reclaimed and it is better to verify whether the increase is due to it or not, before going ahead to investigate other applications. It may happen that various tools report memory usage increase on the server. That could be due to a leak issue or due to caching done by kernel. The memory for cache can be reclaimed and it is better to verify whether the increase is due to it or not, before going ahead to investigate other applications.
  
-Kamailio itself uses in very few components extra memory directly from the system. Most of the operations are done in the memory zone reserved at startup and when that is filled, it starts printing out of memory errors. +Kamailio itself uses extra memory directly from the system in very few components (those are the only ones that can cause a system memory leak). Most of the operations are done in the memory zone reserved at startup and, when that is filled, it starts printing out of memory errors. Kamailio will not get more system memory at runtime for those operations, even if there is sufficient memory available in the system: the size reserved at startup is fixed.
  
 Here is an article that explains the situation better: Here is an article that explains the situation better:
  
-   http://blog.logicmonitor.com/2014/10/09/more-linux-memory-free-memory-that-is-not-free-nor-cache/+   https://www.logicmonitor.com/blog/more-linux-memory-free-memory-that-is-not-free-nor-cache/ 
 + 
 +A relevant excerpt from the blog article: 
 + 
 +<code> 
 +Looking at the contents of /proc/meminfo showed these two lines: 
 + 
 +Slab: 4126212 kB  
 +SReclaimable: 4096172 kB 
 + 
 +So – almost 4G of memory was in use by the kernel slab memory structures – but 
 + almost all of that memory was reclaimable. (Or, in effect, free.) 
 + 
 +So reclaimable slab space is yet another way that Linux memory can be in 
 + effect free, but not show up in the free memory statistics. 
 + 
 +</code>
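To check the same two values on your own server, a small sketch like the one below could be used. The function name `slab_report` is hypothetical; it reads `Slab` and `SReclaimable` from a file in `/proc/meminfo` format (by default `/proc/meminfo` itself) and prints how much of the slab memory is not reclaimable.

```shell
# Hypothetical helper: report slab memory and its reclaimable part, in kB.
# Optional argument: a file in /proc/meminfo format (default: /proc/meminfo).
slab_report() {
  awk '
    $1 == "Slab:"         { slab = $2 }
    $1 == "SReclaimable:" { recl = $2 }
    END {
      if (slab == "" || recl == "") exit 1   # fields not found
      printf "slab_kB=%d reclaimable_kB=%d unreclaimable_kB=%d\n", slab, recl, slab - recl
    }
  ' "${1:-/proc/meminfo}"
}
```

If most of the slab is reclaimable, as in the excerpt above, the apparent memory usage increase is likely kernel caching rather than a leak in an application.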
  
 The respective memory can be reclaimed with command: The respective memory can be reclaimed with command:
tutorials/troubleshooting/memory.1421423872.txt.gz · Last modified: 2015/01/16 16:57 by miconda