On Friday 16 April 2010, Iñaki Baz Castillo wrote:
- A request arrives and is handled by a worker process.
- There is a memory leak in PKG MEM (or perhaps too little memory
allocated for it).
- The process can still do basic tasks such as parsing.
- The script creates a dialog and does other operations requiring SHM
memory.
- The call to t_relay() fails because there is not enough PKG mem, so
the transaction is not created.
- Depending on the memory status, the process may also be unable to
generate a SIP error response (so there is no response).
- The client starts retransmitting.
- These retransmissions do not match an existing transaction, so the
whole script is processed again (creating a new dialog and so on).
- Again t_relay() fails.
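The loop described in the steps above can be modelled with a small simulation. This is only a toy sketch in Python, not server code: the Worker class, its fields, the branch value and all byte counts are made up for illustration, and it assumes that a dialog created before a failed relay is not rolled back.

```python
class Worker:
    """Toy model of one worker process; all names and sizes are hypothetical."""

    def __init__(self, pkg_free, shm_free):
        self.pkg_free = pkg_free      # private (PKG) bytes left
        self.shm_free = shm_free      # shared (SHM) bytes left
        self.dialogs = 0              # dialogs created so far
        self.transactions = set()     # known transactions, keyed by branch id

    def handle(self, branch, dialog_cost=100, relay_cost=500):
        # A retransmission is absorbed only if a transaction already exists.
        if branch in self.transactions:
            return "absorbed"
        # The script runs: the dialog is created in shared memory.
        self.shm_free -= dialog_cost
        self.dialogs += 1
        # Relaying needs private memory; on failure no transaction is stored,
        # so the next retransmission runs the whole script again.
        if self.pkg_free < relay_cost:
            return "relay failed"
        self.transactions.add(branch)
        return "relayed"


worker = Worker(pkg_free=200, shm_free=10_000)
# One request followed by two retransmissions with the same branch.
results = [worker.handle("z9hG4bK-1") for _ in range(3)]
```

In this model every retransmission costs another dialog in shared memory, because the missing transaction means nothing ever absorbs it.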
Hello Inaki,
transactions live mainly in shared memory, so a lack of shared memory could be
another reason that t_relay()/t_newtran() fails. But you're right, it can also
fail due to insufficient private memory.
So what I have in mind is a new script function to get the currently
available PKG mem, so the script can decide not to process the
request and reply (if it can) with an error response. This would avoid the
creation of a new dialog, DB queries and so on.
There are already some functions that output the memory status, albeit only to
the log. Take a look at pkg_status()/shm_status() in cfgutils. So one could of
course implement a PV that returns the amount of available memory, or a
function that checks for a certain range.
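To make the idea of a range check concrete, here is a minimal sketch in Python rather than in the config language. Nothing in it is an existing cfgutils function: pkg_in_range(), route_request(), the 64 KB threshold and the 500 reply are all assumptions chosen for illustration, and the free-memory value is passed in by hand instead of being read from the allocator.

```python
def pkg_in_range(free_bytes, minimum, maximum=None):
    """Return True if free_bytes lies within [minimum, maximum].

    Hypothetical helper; maximum=None means no upper bound.
    """
    if free_bytes < minimum:
        return False
    return maximum is None or free_bytes <= maximum


def route_request(free_bytes, min_free=64 * 1024):
    """Decide early whether a request should be processed at all.

    min_free is an assumed threshold, not a recommended value.
    """
    if not pkg_in_range(free_bytes, min_free):
        # Reject before any dialog is created or any DB query is made.
        return "reply 500"
    return "process"
```

With such a check at the top of the script, route_request(10 * 1024) would reject the request up front, while route_request(1024 * 1024) would let it continue to the normal processing.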
But I wonder if this is really necessary. The last out-of-memory condition we
observed in production was years ago, if I remember correctly. So I'd suggest
that you just use a properly sized PKG mem pool (like 10MB per process) and
also enough shared memory, as RAM is really cheap these days. If you then still
get an out-of-memory condition, there is a memory leak in the code, and
this should just be fixed instead of trying to work around it in the script.
Best regards,
Henning