Hi Klaus,
Just a quick response to what you describe below:
We have a different scenario based on three facts:
- We have complete control and monitoring of all participating RADIUS
servers
- Each ser instance has a RADIUS server on its local LAN, and the server center
is managed as a whole (i.e. individual components should not be unavailable)
- We do not tolerate RADIUS downtime at all. Our 24x7 operations center will
immediately respond and correct the situation
Thus, we have never experienced the scenario below. However, if something
does go wrong, we are more likely to start NAKing all requests by
default. This of course causes the clients to re-register, but ser does not
slow down.
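To illustrate the NAK-by-default idea, here is a minimal Python sketch. It is not ser code or ser configuration. The function name authorize_or_nak and the query callable are hypothetical, and the deadline value is only an example. The point is that the caller gets a reject as soon as a short deadline expires, instead of waiting on a RADIUS server that is not answering.

```python
import concurrent.futures as cf
import time


def authorize_or_nak(query, timeout_s):
    """Return the verdict of query(), or False (NAK) if it fails or is too slow.

    query is a hypothetical zero-argument callable standing in for a
    RADIUS lookup; it returns True for accept and False for reject.
    """
    pool = cf.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(query)
    try:
        return bool(future.result(timeout=timeout_s))
    except Exception:
        # Timeout or lookup error: reject rather than keep the caller waiting.
        return False
    finally:
        # Do not wait for a hung lookup; it finishes (or not) in the background.
        pool.shutdown(wait=False)


def slow_lookup():
    """Stand-in for a RADIUS server that answers far too late."""
    time.sleep(0.5)
    return True


def fast_accept():
    """Stand-in for a healthy RADIUS server that accepts the request."""
    return True
```

With a 50 ms deadline, authorize_or_nak(slow_lookup, 0.05) rejects without waiting for the slow server, while authorize_or_nak(fast_accept, 0.05) passes the accept through.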
As you proxy the requests, you probably have a re-send from the RADIUS
proxy to the other servers as well, in addition to ser's resend. This adds
up, as ser will send a new request before the proxy has finished its resends.
You could probably turn off the re-send on your RADIUS proxy completely and
rely only on ser's resend. It depends on the network in between and the level
of monitoring you have on all the servers. If you have complete control and
tight monitoring, you can probably turn off resend and set very low
time-outs. Thus, when a server is down, ser will NAK and the client will
retry later (which is probably as good as anything, because something is
probably seriously wrong and retrying every 4 seconds won't help...).
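The arithmetic behind this can be sketched in a few lines of Python. All numbers are illustrative assumptions, not measured values or ser defaults, and the model is deliberately simple: a RADIUS client that makes a fixed number of tries with a fixed timeout, a proxy that treats every resend from ser as a new request, and Little's law (arrival rate times waiting time) as a rough estimate of how many workers are tied up.

```python
def worst_case_block_seconds(tries, timeout_s):
    """Time one ser worker waits if the RADIUS side never answers."""
    return tries * timeout_s


def packets_to_dead_server(ser_tries, proxy_tries):
    """Packets sent toward the dead server for one REGISTER, assuming
    the proxy runs its own resend cycle for every try it gets from ser."""
    return ser_tries * proxy_tries


def busy_workers(register_rate_per_s, block_s):
    """Little's law estimate of workers stuck waiting on RADIUS."""
    return register_rate_per_s * block_s


# Assumed example: resends on both ser and the proxy, 4 s timeout.
stacked_block = worst_case_block_seconds(tries=3, timeout_s=4.0)
stacked_packets = packets_to_dead_server(ser_tries=3, proxy_tries=3)

# Assumed example: no resend at all and a very low timeout.
lean_block = worst_case_block_seconds(tries=1, timeout_s=0.5)
lean_packets = packets_to_dead_server(ser_tries=1, proxy_tries=1)

# At an assumed 10 REGISTERs per second toward the dead server:
print(busy_workers(10, stacked_block), "workers blocked with stacked resends")
print(busy_workers(10, lean_block), "workers blocked with low time-outs")
```

Under these assumptions the stacked setup blocks each worker for 12 seconds and sends 9 packets per REGISTER, while the lean setup blocks for half a second and sends one, so the number of workers tied up drops by the same factor.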
Hope this made sense.
g-)
Klaus Darilion wrote:
Hi Greger!
Greger V. Teigre wrote:
...
Agree. We use RADIUS-based authentication and authorization with
distributed RADIUS servers. Only usrloc is stored in mysql (we use
I want to ask about your radius experiences. We (www.at43.at) are also
using radius authentication. All the radius requests are sent to a
local radius proxy which forwards the request to the radius server of
the participating groups (universities, schools ...).
If one of the remote radius servers is down, we have problems
with ser. Ser's threads are busy waiting for the radius authorization
responses, and ser slows down. Then the clients start to
retransmit their REGISTER messages, and ser gets busier and
busier until all threads are busy with authentication requests. Thus,
the complete service goes down even if only one of the radius servers
is down.
We have reduced the proxy load by replying "100 Trying" to all
REGISTER requests, which reduces retransmissions in case of slow
authentication. We also tried to tweak the radius retransmission and
timeout settings but have not found a satisfactory solution yet.
Do you also have problems in your distributed radius setup? Maybe you
could post a little about your experience with distributed radius.
All other radius users are also welcome to post their radius
experiences.
regards,
klaus
PS: I hope Maxim's patch for stateful authentication is going into
0.9.0