hello all
recently we are seeing some weird messages handling with dialogs in
Kamailio version 5.0
we sometimes are seeing messages like
/usr/local/kamailio/sbin/kamailio[15372]: NOTICE: dialog
[dlg_hash.c:249]: dlg_clean_run(): dialog in delete state is too old
(0x7fa65445c850 ref 3)
/usr/local/kamailio/sbin/kamailio[15372]: NOTICE: dialog
[dlg_hash.c:235]: dlg_clean_run(): dialog in early state is too old
(0x7fa652d57110 ref 1)
we increased the debug description adding some lines to the dialog
module code so we could track the calls of the calls that these messages
belong to, and we could see that those messages appeared in calls just
released at that moment, for example:
<134>Nov 4 11:21:38 localhost /usr/local/kamailio/sbin/kamailio[4108]:
INFO: mad-localhost-1 Call 97980 / Call-ID
1409565771_82382809(a)195.219.240.46: Creating dialog [8043:21772] with
hash id 21772 and hash entry 8043
<134>Nov 4 11:21:38 localhost /usr/local/kamailio/sbin/kamailio[4106]:
INFO: mad-localhost-1 Call 97980 / Call-ID
1409565771_82382809(a)195.219.240.46: Status 100, 6610
<134>Nov 4 11:21:39 localhost /usr/local/kamailio/sbin/kamailio[4111]:
INFO: mad-localhost-1 Call 97980 / Call-ID
1409565771_82382809(a)195.219.240.46: CANCEL received in A-Leg, relaying
downstream
<134>Nov 4 11:21:39 localhost /usr/local/kamailio/sbin/kamailio[4112]:
INFO: mad-localhost-1 Call 97980 / Call-ID
1409565771_82382809(a)195.219.240.46: Status 487, 6610
<133>Nov 4 11:21:39 localhost /usr/local/kamailio/sbin/kamailio[4139]:
NOTICE: dialog [dlg_hash.c:251]: dlg_clean_run(): dialog in delete state
is too old (0x7fa0c02a6870 ref 3) with callid
'1409565771_82382809(a)195.219.240.46'
<129>Nov 4 11:21:39 mad-proxy-inout-1
/usr/local/kamailio/sbin/kamailio[4112]: ALERT: dialog
[dlg_handlers.c:1715]: dlg_run_event_route(): after event route - dialog
not found [8043:21772] (1/5) (0x7fa0c02a6870) with callid
'1409565771_82382809(a)195.219.240.46'
we printed the dialog id and entry hash values and we can see there are
no other calls creating same values in the previous hours, or using same
memory allocation, or same callid, so it seems like there was some kind
of strange issue with the dialog timers....¿?
By the way, this is happening only few times (80-100 times) a day having
many thousands of calls, so it's quite difficult for us to duplicate, we
couldn't do it until now.
We also tried to use the timer_procs 0 or 1 to use a different proc
timer but seems the issue happens in both scenarios.
The configuration change we made and seems it was done when these
messages started to appear is to use dialog event_route when ended and
failed to do some stuff there managing some dialog variables.
Does ti make any sense that attempting to use those variables could
cause these behaviour?
Do you have any idea about it could be or how we can check it deeper?
thanks a lot and regards
david escartin
Show replies by date