Greger V. Teigre wrote:
And I also want to avoid load balancers, heartbeat and other standby solutions.
:-) Well, I don't like standby solutions either, but I like some of the characteristics of load balancing. If you have two data centers (using SRV on the client side for load balancing and redundancy), and each data center is a small LVS cluster, it is very easy to add servers to increase capacity.
I haven't found good documentation for building a SIP cluster using LVS, but I'm looking forward to finding some on onsip.org soon ;o) Unfortunately I don't have the time to play around with LVS myself.
I like the idea of using OPTIONS. If you find a way to solve that, it may very well become a best practice. However, you need to make sure the NAT binding is open before you send the INVITE from SER-2, so you either have to continuously send OPTIONS to all your clients or you have to make sure that SER-1 sends the OPTIONS right before you send the INVITE from SER-2.
Well, the basic idea is that SER-1, directly after receiving and replicating a REGISTER, sends an OPTIONS request like the following sipp-based template to UAC-1. UAC-1 will then send its reply back via SER-2 (assuming I get rid of SER-1's Via header):
<send>
  <![CDATA[

    OPTIONS sip:[service]@[remote_ip] SIP/2.0
    Via: SIP/2.0/[transport] [field0]:[field1]
    Via: SIP/2.0/[transport] [local_ip]:[local_port]
    To: <sip:[service]@[remote_ip]:[remote_port]>
    From: <sip:foo@[remote_ip]:[remote_port]>
    Contact: <sip:foo@[local_ip]:[local_port]>;transport=[transport]
    Call-ID: [call_id]
    CSeq: 1 OPTIONS
    Content-Length: 0

  ]]>
</send>
where field0 is the IP of SER-2, field1 the port of SER-2.
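To make the Via ordering easier to see, here is a rough Python sketch of the same message. The names build_options and top_via_target are made up for illustration, and the addresses in the usage note are documentation placeholders. It assumes UAC-1 replies to the sent-by address of the topmost Via, which is the premise of the idea; a UA that honors a "received" parameter may send its reply elsewhere.

```python
import uuid


def build_options(service, remote_ip, remote_port, local_ip, local_port,
                  ser2_ip, ser2_port, transport="UDP", call_id=None):
    """Build an OPTIONS request whose topmost Via names SER-2 (hypothetical helper)."""
    call_id = call_id or f"{uuid.uuid4()}@{local_ip}"
    lines = [
        f"OPTIONS sip:{service}@{remote_ip} SIP/2.0",
        # Topmost Via: SER-2 (field0:field1 in the sipp template).
        f"Via: SIP/2.0/{transport} {ser2_ip}:{ser2_port}",
        # Second Via: the actual sender, SER-1 in this scenario.
        f"Via: SIP/2.0/{transport} {local_ip}:{local_port}",
        f"To: <sip:{service}@{remote_ip}:{remote_port}>",
        f"From: <sip:foo@{remote_ip}:{remote_port}>",
        f"Contact: <sip:foo@{local_ip}:{local_port}>;transport={transport}",
        f"Call-ID: {call_id}",
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
    ]
    # SIP headers are CRLF-terminated; an empty line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


def top_via_target(message):
    """Return (host, port) taken from the sent-by part of the topmost Via header."""
    for line in message.split("\r\n"):
        if line.lower().startswith("via:"):
            # "Via:", "SIP/2.0/UDP", "host:port;params"
            sent_by = line.split(None, 2)[2].split(";")[0].strip()
            host, _, port = sent_by.partition(":")
            return host, int(port)
    raise ValueError("no Via header found")
```

Calling build_options("service", "192.0.2.10", 5060, "198.51.100.1", 5060, "198.51.100.2", 5060) and passing the result to top_via_target should give the SER-2 address, i.e. where the reply is expected to go.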
The reply from UAC-1 to SER-2 will open the NAT binding, and SER-2 will be able to continuously send UDP pings to keep it open. So SER-1 can die now without affecting UAC-1's reachability.
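The keepalive side could look roughly like the following Python sketch. It is not how SER does it internally; send_nat_pings is a made-up helper, and the 4-byte payload is an arbitrary assumption. The one thing that matters is that the pings leave from the same socket (address and port) toward which the binding was opened, i.e. SER-2's SIP port.

```python
import socket


def send_nat_pings(sock, contacts, payload=b"\x00\x00\x00\x00"):
    """Send one small UDP datagram to each (ip, port) contact.

    Returns the number of datagrams handed to the network stack.
    The payload is an arbitrary dummy value chosen for this sketch.
    """
    sent = 0
    for addr in contacts:
        try:
            sock.sendto(payload, addr)
            sent += 1
        except OSError:
            # An unreachable contact should not stop the remaining pings.
            pass
    return sent
```

SER-2 would call this periodically on the UDP socket bound to its SIP port, with an interval shorter than the NAT's binding timeout (which depends on the NAT device).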
That's for the theory so far :o)
Of course there are some other issues. If a client registers while one of the SERs is down, no NAT binding can be established toward that SER. Likewise, existing NAT bindings will time out while a SER is down, so the UACs are unreachable from that SER once it has recovered.
Maybe it's really time to evaluate iptelorg's SOP/SRR solution ;o)
Andy