2010/7/8 Timo Reimann <timo.reimann(a)1und1.de>:
It occurs in a
server usually processing 500-700 concurrent calls, but
today the traffic has increased up to 1000-1200. I didn't see
these errors before today.
I don't think it's legal to send a BYE request for a call that hasn't
even transitioned to the "early" state, so without further insights I'd
speculate the UAC to be flawed.
I strongly think something went wrong with my Kamailio 1.5.4
today, as the reported number of concurrent calls (~1200) is not real
(confirmed by inspecting the CDRs in the gateways).
As I said in another mail, the Monit service detected that, at some
point, the server CPU reached 40% usage (really impossible under normal
circumstances) and the dialog module showed exaggerated values for the
ongoing calls.
- So first I see the message "bogus event 7 in state 2 for dlg" (never
seen before and there are no new clients, so very strange).
- After ~3 minutes the CPU goes up (which makes no sense).
- Also LCR load_gw() returned -1 (no gws available), which is only
possible if both gateways didn't reply to the OPTIONS (not the case),
but worse is the fact that LCR didn't log this error (gws OFFLINE) but
did log "gws ONLINE" after 4 minutes, when the CPU usage decreased again.
- There was no real increase of calls, but the dialog module shows one
(I've already seen something similar when, for some reason, kamailio
1.5.1 leaked memory and PKG memory filled up; then the number of
dialogs increased more and more, both in the statistics and using the
dialog MI functions).
With the logs I have there is no way to understand what happened. I
also monitor the usage of PKG every 5 minutes for the first UDP
listener process and there was no PKG problem during the issue (at
least in the first listener).
Well, I must investigate it :)
--
Iñaki Baz Castillo
<ibc(a)aliax.net>