Once I commented out the three lines below, failover works fine.
failure_route[1] {
    xlog(" FAILED FAILURE_ROUTE[1]\n");
    if (t_any_timeout()) {
        xlog(" TIMEOUT!\n");
        # append_branch();
        # xlog(" host is now $rd; all is $ru\n");
        # route(1);
    }
}
Just so I understand this correctly: there is not necessarily any need for a
failure route in my case, correct? The only reason I am going to keep it is so that the
TIMEOUT! message can notify me of a host failure, and I can then remove the record from
DNS so that the proxy doesn't keep trying it.
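The failover observed here without append_branch() would be consistent with DNS
failover being handled in the core rather than in the script. A minimal sketch of the
relevant global parameters, assuming Kamailio 3.x (the values are illustrative and are
not taken from the configuration under discussion):

    # Global section of kamailio.cfg (assumed values, for illustration only)
    use_dns_cache = on       # DNS failover relies on the internal DNS cache
    use_dns_failover = on    # on failure, try the next SRV/A record for the destination
    dns_srv_lb = yes         # weight-based selection among SRV records of equal priority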
One curious thing: say the first INVITE gets a 401 Unauthorized from the second
server (the one that is online). When the client responds with its second INVITE
carrying the authorization info, Kamailio 3.1.0 retries the dead host again, whereas
1.5.5 does not retry the dead host a second time. In 1.5.5 that could have been a fluke:
it completely requeries DNS for each INVITE, so it may just have been luck that in 10
out of the 10 times I tested it, it randomly chose the live host for the second INVITE.
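One possible way to keep a recently failed host from being tried again on the next
INVITE is the core destination blacklist. A sketch, assuming the Kamailio 3.x core
parameters below; whether this accounts for the difference between 1.5.5 and 3.1.0 is
not established here:

    # Global section of kamailio.cfg (assumed values, for illustration only)
    use_dst_blacklist = on      # remember destinations that recently failed
    dst_blacklist_expire = 60   # seconds a failed destination stays blacklisted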
Either way this appears to work now.
One question that did result from these tests is that a typical transaction looks like:
0(2856) ERROR: <script>: [Mon Nov 29 20:30:34 2010] INVITE
0(2856) ERROR: <script>: FAILED FAILURE_ROUTE[1]
0(2856) ERROR: <script>: TIMEOUT!
0(2856) ERROR: <script>: [Mon Nov 29 20:30:35 2010] ACK
0(2856) ERROR: <script>: [Mon Nov 29 20:30:35 2010] INVITE
There is no FAILED FAILURE_ROUTE[1] or TIMEOUT! on the second INVITE, even though it did
time out. As I am typing this, the idea came to me that the reason the failure route did
not run the second time around is that no 401 was received the second time. If this is
the case, then what happens when the client's request is not rejected as unauthorized?
Or will the server always reply with a 401 the first time?
-Eric
Date: Mon, 29 Nov 2010 14:48:25 +0100
From: ibc(a)aliax.net
To: marius.zbihlei(a)1and1.ro
CC: sr-users(a)lists.sip-router.org
Subject: Re: [SR-Users] Failover with SRV records
2010/11/29 marius zbihlei <marius.zbihlei(a)1and1.ro>:
AFAIR using raw sockets checking ICMP notifications would be possible
(not yet implemented, but possible as I remember from a thread with
Andrei).
Possible, but not easily implementable, as ICMP Host Unreachable messages are sent
asynchronously by the kernel. Also, the current sendto() call does not
guarantee delivery on all Unixes (Linux should be fine); connected UDP
sockets would have to be used instead.
IMHO this would be very useful because if a UDP port is unreachable
and there is an ICMP notification about it, the proxy should generate
an internal 503 (transport error) rather than a 408 (fr_timer
timeout).
Well, this means that we should disable dns_failover (or equivalents)
completely and handle ICMP errors in failure_route blocks (just test whether the
transaction issued a 503).
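As a rough sketch of what such a test could look like in the script, assuming the tm
module functions t_check_status() and t_any_timeout(), and keeping in mind that the
ICMP-to-503 mapping discussed here is not implemented:

    failure_route[1] {
        if (t_check_status("503")) {
            # hypothetical case: transport error reported as an internal 503
            xlog("L_WARN", "transport error (503) for $ru\n");
        } else if (t_any_timeout()) {
            # current behaviour: an unreachable host shows up as a timeout (408)
            xlog("L_WARN", "timeout for $ru\n");
        }
    }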
Humm, I expect that when discovering the destination (DNS SRV) N
branches should be generated in serial forking fashion in case there
are various priorities in the received response, am I wrong?
If I recall RFC 3263 correctly, this would mean another
server discovery (as the new request generates a new transaction), so again
there is the possibility that the broken host is selected. If we use this
DNS fallback (IMHO this is a nice feature; I personally rely on it), how do
we decide to generate a 503?
503 should be the final winning response in case all the branches fail.
If the host is already an IP address, then it would be OK to send a 503, as
no DNS failover is possible.
Yes.
Ideas?
I think that what I've proposed in this mail requires a big change,
so... not sure if it's feasible right now.
--
Iñaki Baz Castillo
<ibc(a)aliax.net>