Interesting solution. Using an undefined algorithm ID in the dispatcher functions may help: the module will then use the first entry, and if failover is turned on, the next addresses will be loaded into the AVP list, so there is no need to choose round robin. Adding a new ID for an algorithm that selects the destinations in the order they appear in the list sounds like a good idea.
On the other hand, the plan is to have the DNS failover available in the next major release, 1.2.0.
Cheers, Daniel
On 01/12/07 18:46, T.R. Missner wrote:
We are currently working on a bit of a hackish way to honor SRV records. Please poke holes as appropriate.
We will run a separate process that looks up an SRV record and then generates local A records based on the results. Using the dispatcher module in a similar fashion to what Christian describes below, we then dispatch to the pre-agreed domain, which is resolved locally. When the SRV results change, we re-resolve the pre-agreed domain to reflect the change.
Example:
SRV returns 2 A records
server1.carrier.com priority 1 resolves to 4.5.6.7
server2.carrier.com priority 2 resolves to 5.6.7.8
We then build a fake local domain to match:
server1.carrier.fake 4.5.6.7
server2.carrier.fake 5.6.7.8
The dispatcher list looks like this:
1 server1.carrier.fake
1 server2.carrier.fake
Now openser will always pick the first server and fail over to the second server using the mechanism Christian describes.
At some predefined interval our external process will check the SRV record of the carrier. Let's say it has now changed so that server1 is priority 2 and server2 is priority 1 (reversed from before). The external process updates our local DNS to resolve server1.carrier.fake to 5.6.7.8 and server2.carrier.fake to 4.5.6.7.
In this manner, using an external process and local DNS (in our case we use DJB's tinydns), we are honoring dynamic priority changes in SRV records with openser.
This is still a work in progress. Will update once we have it up and running.
A couple of caveats: we assume the number of records returned from the SRV lookup will be consistent, since we have to build the same number of fake domains. We also assume we are dealing with priority routing, not round robin, though round robin would work; only the dispatcher algorithm would need to change.
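A minimal sketch, in Python, of the mapping step such an external process could perform. It assumes the SRV lookup has already been done and its results are available as (priority, target, address) tuples. The function name build_fake_records, the input shape, the TTL value and the choice of tinydns-data "+fqdn:ip:ttl" lines as output are illustrative assumptions, not the actual implementation. SRV weights are ignored.

```python
from typing import List, Tuple

# (priority, SRV target, resolved IPv4 address); hypothetical input shape
SrvResult = Tuple[int, str, str]


def build_fake_records(results: List[SrvResult],
                       fake_names: List[str],
                       ttl: int = 60) -> List[str]:
    """Map SRV results, ordered by priority, onto fixed fake names.

    Returns tinydns-data lines of the form "+fqdn:ip:ttl" (A record only).
    """
    # Caveat from the text: the number of SRV records must stay constant,
    # because the dispatcher list refers to a fixed set of fake names.
    if len(results) != len(fake_names):
        raise ValueError("SRV record count changed; dispatcher list is stale")

    # The lowest priority value is the most preferred target.
    ordered = sorted(results, key=lambda r: r[0])
    return [f"+{name}:{ip}:{ttl}"
            for name, (_prio, _target, ip) in zip(fake_names, ordered)]


fake = ["server1.carrier.fake", "server2.carrier.fake"]

before = build_fake_records(
    [(1, "server1.carrier.com", "4.5.6.7"),
     (2, "server2.carrier.com", "5.6.7.8")], fake)

# Priorities reversed by the carrier: the fake names swap addresses.
after = build_fake_records(
    [(2, "server1.carrier.com", "4.5.6.7"),
     (1, "server2.carrier.com", "5.6.7.8")], fake)
```

The dispatcher list never changes in this scheme; only the addresses behind the fake names do. Writing the lines into the tinydns data file and rebuilding it is left out here.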
T.R.
On 1/11/07 11:36 AM, "Christian Schlatter" cs@unc.edu wrote:
Staffan,
Kerker Staffan wrote: ...
Now, if I disable one of the Gateways, I hang every second call. OpenSER does not try the second A record address if the first doesn't answer. How can I solve this? Shouldn't OpenSER fail over to the second A record listed in the NAPTR => SRV resolving? Or will OpenSER continue to resend all SIP INVITEs until timers fire? Would it help if the proxy received an ICMP port/destination unreachable from the network? Is there any way to get around this? In the other direction, from POTS to SIP, the PGW2200 nicely switches over to the second of my two OpenSER servers if I shut one of them down. These servers have the same DNS entries (but for another SIP domain, NAPTR => SRV => 2x A record).
Yes, OpenSER, or for that matter every transaction stateful proxy, should do RFC 3263 based fail-over. But as you can imagine, this is pretty complex to implement, and that's why openser does not support it yet; it is listed on the development roadmap. The newest release of SER does support DNS failover.
It is still possible to implement failover with OpenSER; you just have to configure it manually on the proxy, and you have to adjust the SIP session timers of the tm module to achieve fast failovers.
Here is an overview of how I implemented failover with OpenSER (there are other ways to do that):
I use the dispatcher module with a non-random dispatcher algorithm to get deterministic failover.
dispatcher config file could look like:
1 sip:gw-1.example.com
1 sip:gw-2.example.com
In the openser config file, I call the ds_select_domain() function just before t_relay. And in the failure_route I then use ds_next_domain() to select the next target from the dispatcher config file.
In order to get short failover times, one has to adjust fr_timer for INVITE transactions. For INVITE transactions, fr_timer is the maximum time openser waits for a reply from the downstream SIP entity. As soon as openser receives such a reply, it will use fr_inv_timer as the final response timer. By default fr_timer is 30 seconds, so openser would wait about 30 seconds before trying the next target.
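To make the timing concrete, here is a small Python model (not OpenSER code) of the ordered failover described above. It assumes a disabled gateway stays completely silent until fr_timer fires and that a live gateway answers immediately. The function name simulate_failover and its parameters are hypothetical.

```python
from typing import List, Optional, Set, Tuple


def simulate_failover(targets: List[str],
                      dead: Set[str],
                      fr_timer: float) -> Tuple[Optional[str], float]:
    """Try targets in list order; return (chosen target, seconds waited).

    Each unresponsive target costs one full fr_timer before the next
    target is tried, mirroring ds_select_domain() followed by
    ds_next_domain() in the failure route.
    """
    waited = 0.0
    for target in targets:
        if target not in dead:
            return target, waited
        waited += fr_timer  # timer fires, move on to the next target
    return None, waited  # every target timed out


gateways = ["sip:gw-1.example.com", "sip:gw-2.example.com"]

# Default timer: 30 seconds are lost before gw-2 is tried.
slow = simulate_failover(gateways, {"sip:gw-1.example.com"}, fr_timer=30)
# Timer lowered to 3 seconds: the same failover costs 3 seconds.
fast = simulate_failover(gateways, {"sip:gw-1.example.com"}, fr_timer=3)
```

In this model the delay before reaching the k-th target is (k-1) times fr_timer, which is why lowering the timer matters more as the dispatcher list grows.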
An openser config that does failover between gw-1.example.com and gw-2.example.com for gw.example.com could look like:
    modparam("tm", "fr_timer_avp", "i:24")  # AVP to set fr_timer
    modparam("avpops","avp_aliases","fr_timer=i:24")

    # failover support --> store dests in avp value
    modparam("dispatcher", "flags", 2)

    route[0] {
        ...
        if (is_method("INVITE") && uri=~"sip:.*@gw.example.com") {
            # replace domain part with first dispatcher target of group 1
            ds_select_domain("1", "9");  # alg 9 --> use first, second, etc
            # set fr_timer to 3 seconds (3 seconds for failover)
            avp_write("i:3", "$avp(fr_timer)");
            t_on_failure("1");
            t_relay();
            exit;
        }
        ...
    }

    failure_route[1] {
        ...
        # status is 408 if openser session timer fires
        if (t_check_status("408")) {
            # replace domain part with next dispatcher target
            if (ds_next_domain()) {
                t_relay();
                exit;
            }
        }
        ...
    }
- Christian
I would love some best-practice implementation clues regarding OpenSER and multiple GW failover, if any of you has such knowledge or experience.
Best regards, /Staffan
Staffan Kerker, Saab Communication Ljungadalsgatan 2, 35180 Växjö, Sweden
p. +46 470 42185 c. +46 705 391365 m. staffan.kerker@saabgroup.com w. http://www.saabgroup.com
Users mailing list Users@openser.org http://openser.org/cgi-bin/mailman/listinfo/users