I will reproduce the tests and hope I can catch something ...
Cheers, Daniel
On 03/19/07 18:52, Christian Schlatter wrote:
...
The memory statistics indeed show a high number of memory fragments:
before 'out of memory':
shmem:total_size = 536870912
shmem:used_size = 59607040
shmem:real_used_size = 60106488
shmem:max_used_size = 68261536
shmem:free_size = 476764424
shmem:fragments = 9897
after 'out of memory' (about 8000 calls per process):
shmem:total_size = 536870912
shmem:used_size = 4171160
shmem:real_used_size = 4670744
shmem:max_used_size = 68261536
shmem:free_size = 532200168
shmem:fragments = 57902
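To compare such snapshots more easily, the statistics can be parsed and a few derived figures computed. The following is only a sketch: the helper names parse_shmem_stats and summarize are made up for illustration, and reading real_used_size minus used_size as allocator bookkeeping overhead is an assumption, not something stated in this thread. The input strings are the two snapshots quoted above.

```python
import re


def parse_shmem_stats(text):
    """Parse 'shmem:name = value' pairs into a dict of ints."""
    pairs = re.findall(r"shmem:(\w+)\s*=\s*(\d+)", text)
    return {name: int(value) for name, value in pairs}


def summarize(stats):
    """Derive a few figures from one statistics snapshot."""
    return {
        # In the snapshots above, real_used_size + free_size equals total_size.
        "accounted": stats["real_used_size"] + stats["free_size"],
        # Assumption: the difference is allocator bookkeeping overhead.
        "overhead": stats["real_used_size"] - stats["used_size"],
        "free_pct": 100.0 * stats["free_size"] / stats["total_size"],
        "fragments": stats["fragments"],
    }


BEFORE = ("shmem:total_size = 536870912 shmem:used_size = 59607040 "
          "shmem:real_used_size = 60106488 shmem:max_used_size = 68261536 "
          "shmem:free_size = 476764424 shmem:fragments = 9897")
AFTER = ("shmem:total_size = 536870912 shmem:used_size = 4171160 "
         "shmem:real_used_size = 4670744 shmem:max_used_size = 68261536 "
         "shmem:free_size = 532200168 shmem:fragments = 57902")

if __name__ == "__main__":
    for label, raw in (("before", BEFORE), ("after", AFTER)):
        print(label, summarize(parse_shmem_stats(raw)))
```

In both snapshots the pool is fully accounted for and more than 88% of it is reported as free, while the fragment count grows from 9897 to 57902.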
You can try to compile openser with -DQM_JOIN_FREE (add it to the DEFS variable in Makefile.defs) and test again. Free fragments should then be merged and fragmentation should not occur, although processing will be slower. For the next release we will try to provide a better solution for that.
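The effect that joining free fragments is meant to have can be shown with a toy model. This is a deliberately simplified first-fit allocator written for illustration only; it is not OpenSER's q_malloc code, and the class ToyPool and the function demo are hypothetical names. It shows the general mechanism, not a diagnosis of the problem reported in this thread.

```python
class ToyPool:
    """A toy first-fit allocator over one contiguous pool."""

    def __init__(self, size, join_free):
        self.join_free = join_free
        # Each fragment is [offset, size, is_free], kept sorted by offset.
        self.frags = [[0, size, True]]

    def alloc(self, n):
        for i, (off, size, is_free) in enumerate(self.frags):
            if is_free and size >= n:
                self.frags[i] = [off, n, False]
                if size > n:
                    # Split off the unused tail as a new free fragment.
                    self.frags.insert(i + 1, [off + n, size - n, True])
                return off
        return None  # no single free fragment is large enough

    def free(self, off):
        i = next(k for k, f in enumerate(self.frags)
                 if f[0] == off and not f[2])
        self.frags[i][2] = True
        if not self.join_free:
            return
        # Merge with the following fragment if it is free.
        if i + 1 < len(self.frags) and self.frags[i + 1][2]:
            self.frags[i][1] += self.frags.pop(i + 1)[1]
        # Merge with the preceding fragment if it is free.
        if i > 0 and self.frags[i - 1][2]:
            self.frags[i - 1][1] += self.frags.pop(i)[1]

    def free_bytes(self):
        return sum(size for _, size, is_free in self.frags if is_free)


def demo(join_free):
    """Allocate ten small chunks, free them all, then request a bigger one."""
    pool = ToyPool(1000, join_free)
    small = [pool.alloc(100) for _ in range(10)]
    for off in small:
        pool.free(off)
    big = pool.alloc(200)
    return big is not None, pool.free_bytes(), len(pool.frags)


if __name__ == "__main__":
    print("without join:", demo(False))
    print("with join:   ", demo(True))
```

Without joining, the 200-byte request fails although all 1000 bytes are free, because the pool has been cut into ten 100-byte fragments. With joining, the freed neighbours merge back into one fragment and the request succeeds.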
Compiling openser with -DQM_JOIN_FREE did not help. I'm not sure how big of a problem this fragmentation issue is.
What is the number of fragments with QM_JOIN_FREE after flooding?
The numbers included above are with QM_JOIN_FREE enabled.
Do you think it would make sense to restart our production openser instances from time to time just to make sure they're not running into these memory fragmentation limits?
The issue will occur only when the call rate reaches the limits of the proxy's memory. Otherwise the chunks are reused. The sizes of transactions and AVPs are rounded up to minimize the number of different memory chunk sizes. It wasn't reported too often, which may be why it did not get much attention. This memory system has been in place since the beginning of ser. The alternative is to use SysV shared memory together with the libc private memory manager, but that is much slower.
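The idea behind rounding up sizes can be sketched as follows. The rounding unit of 16 bytes and the request sizes are assumptions picked purely for illustration; the actual values used by openser are not given in this thread.

```python
def round_up(size, quantum):
    """Round size up to the next multiple of quantum."""
    return (size + quantum - 1) // quantum * quantum


# Hypothetical request sizes in bytes (illustrative values only).
requests = [100, 101, 107, 112, 120, 129]
QUANTUM = 16  # assumed rounding unit, not taken from the openser sources

rounded = [round_up(n, QUANTUM) for n in requests]

if __name__ == "__main__":
    print("distinct sizes requested:", sorted(set(requests)))
    print("distinct sizes allocated:", sorted(set(rounded)))
```

With fewer distinct chunk sizes, a freed chunk is more likely to fit a later request exactly, so it can be reused instead of being split further.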
I've done some more testing and the same out-of-memory stuff happens when I run sipp with 10 calls per second only. I tested with 'children=1' and I only could get through about 8200 calls (again those 8000 calls / process). And this is with QM_JOIN_FREE enabled.
Memory statistics:
before:
shmem:total_size = 536870912
shmem:used_size = 2311976
shmem:real_used_size = 2335720
shmem:max_used_size = 2465816
shmem:free_size = 534535192
shmem:fragments = 183
after:
shmem:total_size = 536870912
shmem:used_size = 1853472
shmem:real_used_size = 1877224
shmem:max_used_size = 2465816
shmem:free_size = 534993688
shmem:fragments = 547
So I'm not sure if this is really a fragmentation issue. 10 cps surely doesn't reach the limits of the proxy's memory.
Thoughts?
Christian
On 03/18/07 01:21, Christian Schlatter wrote:
Christian Schlatter wrote: ...
I always had 768MB shared memory configured though, so I still can't explain the memory allocation errors I got. Some more test runs revealed that I only get these errors when using a more production-oriented config that loads more modules than the one posted in my earlier email. I am now trying to figure out what exactly causes these memory allocation errors, which happen reproducibly after about 220s at 400 cps.
I think I found the cause for the memory allocation errors. As soon as I include an AVP write operation in the routing script, I get 'out of memory' messages after a certain number of calls generated with sipp.
The routing script to reproduce this behavior looks like (full config available at http://www.unc.edu/~cschlatt/openser/openser.cfg):
route{
    $avp(s:ct) = $ct;   # commenting this line solves
                        # the memory problem

    if (!method=="REGISTER") record_route();
    if (loose_route()) route(1);
    if (uri==myself) rewritehost("xx.xx.xx.xx");
    route(1);
}

route[1] {
    if (!t_relay()) sl_reply_error();
    exit;
}
An example log file showing the 'out of memory' messages is available at http://www.unc.edu/~cschlatt/openser/openser.log .
Some observations:
- The 'out of memory' messages always appear after about 8000 test calls per worker process. One call consists of two SIP transactions and six end-to-end SIP messages. An openser with 8 children handles about 64'000 calls, whereas 4 children only handle about 32'000 calls. The sipp call rate doesn't matter, only the number of calls.
- The 8000 calls per worker process are independent of the amount of shared memory available. Running openser with -m 128 or -m 768 does not make a difference.
- The more AVP writes are done in the script, the fewer calls go through. It looks like each AVP write is leaking memory (unnoticed by the memory statistics).
- The fifo memory statistics do not reflect the 'out of memory' syslog messages. Even if openser does not route a single SIP message because of memory issues, the statistics still show a lot of 'free' memory.
All tests were done with the openser SVN 1.2 branch on Ubuntu Dapper x86. I think the same is true for version 1.1, but I haven't tested that yet.
Christian
Users mailing list Users@openser.org http://openser.org/cgi-bin/mailman/listinfo/users