Hi Daniel,
Thanks for the explanation. I've been doing some testing and I've come
across the following situation:
ds_probing_threshold = 1
ds_probing_mode = 0
In the failure route (when a timeout occurs) I do:
ds_mark_dst("ip")
The state changes from active to inactive and the mode is set to probing,
which is correct. The dispatcher then sends 3 ping messages to the
destination in probing state, receives no response to the probes, and sets
the destination inactive. It looks like probing in the inactive state
also honours ds_probing_threshold.
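For reference, the setup described above could be sketched roughly as the
following config fragment. The route name RTF_DISPATCH, the ping interval
value and the use of ds_next_dst() for failover are assumptions made for
illustration and are not part of the setup reported here.

```
# Sketch only: the route name and the ping interval are assumed values.
modparam("dispatcher", "ds_ping_interval", 10)    # assumed; probing needs a non-zero interval
modparam("dispatcher", "ds_probing_threshold", 1)
modparam("dispatcher", "ds_probing_mode", 0)

failure_route[RTF_DISPATCH] {
    if (t_is_canceled()) {
        exit;
    }
    # On a local timeout, mark the destination inactive ("i") and
    # request probing ("p") for it.
    if (t_branch_timeout() && !t_branch_replied()) {
        ds_mark_dst("ip");
        # Assumed failover step: try the next destination in the set.
        if (ds_next_dst()) {
            t_on_failure("RTF_DISPATCH");
            t_relay();
            exit;
        }
    }
}
```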
If I wanted to keep pinging the destination while it's down, how would I
achieve that? For example, I have a destination in active state, the
destination goes down for some reason, and I mark the destination as
inactive but want to keep probing it until it comes back. In this case I
would keep sending probes to the destination until it comes back, at
which point I receive a 200 OK and the dispatcher sets the state back to
active.
Currently, what happens is: the destination is active, it crashes, I set
state & mode to inactive/probing, a probe goes out to the destination, it
times out, and the dispatcher sets the state to inactive with no probing.
Therefore the destination will never be selected unless it is manually
set to active/trying via a fifo command when the gateway is back alive.
The kamailio version I was testing with is:
# ./kamailio -V
version: kamailio 3.3.0-dev1 (i386/linux) 0b8f2e
flags: STATS: Off, USE_IPV6, USE_TCP, USE_TLS, TLS_HOOKS, USE_RAW_SOCKS,
DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC,
DBG_QM_MALLOC, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE,
USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16,
MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 4MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 0b8f2e
compiled on 10:24:56 Oct 28 2011 with gcc 4.1.2
On 27/10/2011 17:49, Daniel-Constantin Mierla wrote:
Hello,
On 10/27/11 5:30 PM, Asgaroth wrote:
Hi Daniel,
[...]
Since with 3.2 seemed that it was lost capability to go inactive after
a certain number of failures (ds_probing_threshold), there is a new
state 'trying' that can be used for it. Means that you can set a
destination in trying state couple of times and then it becomes
inactive. In 3.1 it was using a confusing mechanism based on probing
mode.
Can you explain this trying state and "lost capability to go inactive
after certain number of failures" a little more please and how it
relates to the new trying->inactive states. I would like to understand
how these states relate so that I can test better.
I was not using the feature in the past, but from the source code I
could see that there was a way not to go directly in probing mode
(which in the past meant not to select the gateway anymore), but just
count failure until a threshold is reached and then set probing.
So if threshold was 3 and there were (in 3.1.x-):
ds_mark_dst(p) => state still active (no probing, gateway still selected)
ds_mark_dst(p) => state still active (no probing, gateway still selected)
ds_mark_dst(p) => state goes to probing (inactive, gateway not selected)
Now (3.3.x+), since probing can be always on, even for active
destinations (to detect when they go down), you can get behavior like
the previous one with the trying state:
ds_mark_dst(t) => state trying (gateway still selected)
ds_mark_dst(t) => state trying (gateway still selected)
ds_mark_dst(t) => state goes to inactive (gateway not selected)
Default failure counter threshold is 1, so goes to inactive as soon as
you set trying, but you can change it via ds_probing_threshold parameter.
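A minimal sketch of the counting behaviour described above, assuming a
threshold of 3 and a failure route named RTF_DISPATCH (both chosen only
for illustration):

```
# Sketch only: the threshold value and the route name are assumptions.
modparam("dispatcher", "ds_probing_threshold", 3)

failure_route[RTF_DISPATCH] {
    if (t_branch_timeout() && !t_branch_replied()) {
        # Per the description above: the first two calls for a destination
        # leave it in trying state (still selected); the third reaches the
        # threshold and the destination goes inactive.
        ds_mark_dst("t");
    }
}
```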
So right now there are states: active, inactive, trying and disabled,
plus modes: probing, not-probing. A destination can be selected only
if it is active or trying. It will not be selected in inactive and
disabled. Probing mode specifies whether keepalives should be sent to
destinations; it can be set per address or globally with the module
parameter ds_probing_mode. If a keepalive gets no reply, the address
is marked as trying first and later becomes inactive if it keeps
being non-responsive.
OK, so if I understand the above paragraph correctly, if I have
ds_probing_mode = 0, then I need to set the mode manually to probing for a
gateway that has failed "ds_probing_threshold" times? If a server times
out and I set state/mode to "ip", then I assume probing will commence.
In this case the server will not respond to probe requests (as it has
crashed); does this mean that the state will then change to "trying"
because no probe response was received from the destination?
Probing is no longer a gw selection state, but a mode switch to send
keepalives or not to a gateway. So if you want these keepalives and
ds_probing_mode=0, you have to set 'p' in any of the states where you want
keepalives. Depending on the reply code from keepalives, the state in
probing mode is changed to active if it is 200 OK or a reply code
configured in a module parameter, or to trying if it is a failure (which
may end up in inactive when the failure threshold is met). ds_probing_mode
also controls whether a keepalive reply will maintain the probing mode
or not.
Cheers,
Daniel