Hi,
What is the dialog module's "failed_dlgs" counter variable (defined in modules_k/dialog.c) supposed to indicate?
My understanding would be that it shows the number of dialogs that could not reach the "confirmed" call state, i.e., calls for which the caller received a non-2xx final response to the initial INVITE during routing. However, the variable is incremented in situations where a dialog could not be set up in the first place, due to
- malformed SIP headers in the INVITE request,
- lack of shared memory, and
- unsuccessful callback registrations to the rr or tm module
which are all checked in dlg_handler.c's dlg_new_dialog() function prior to any further response processing.
At the very least, I'd like to additionally increment the counter every time a non-2xx final response is received, to indicate call failure in the SIP sense. Going one step further, I'd remove the increment from its current place, because I do not see much value in keeping statistics on hard errors like the ones listed above. Those cases are already covered by ERROR-level log messages, so they will be reported anyway.
To sum up my goal: If no one objects, I will change the code such that hard errors are no longer counted in failed_dlgs, but failures to establish SIP sessions are.
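
To illustrate the intended behaviour, here is a rough standalone sketch. It is not the module's actual code: on_setup_error() and on_final_reply() are made-up names for this example, and the counter is a plain variable standing in for the real statistics variable.

```c
#include <stdio.h>

/* Stand-in for the module's failed_dlgs statistic (assumption: the real
 * one is a statistics variable, not a plain integer). */
unsigned long failed_dlgs = 0;

/* Hypothetical hook for hard errors during dialog setup (malformed
 * headers, no shared memory, failed rr/tm callback registration).
 * These are only logged and no longer touch the counter. */
void on_setup_error(const char *reason)
{
    fprintf(stderr, "ERROR: could not create dialog: %s\n", reason);
}

/* Hypothetical hook for the final response to the initial INVITE.
 * Any 3xx-6xx response counts as a failed dialog; 2xx does not, and
 * provisional 1xx codes are ignored if they ever get passed in. */
void on_final_reply(int status_code)
{
    if (status_code >= 300 && status_code <= 699)
        failed_dlgs++;
}
```

With this, a 486 or 503 reply would bump failed_dlgs, while a shared memory shortage would only show up in the log.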
Cheers,
--Timo