So guys, I have good news.
I got it working as expected, but with some changes to the previous scheme.
I set NAPTR records for tcp/udp:
dig naptr sipnew.heremydomain.ua @ns2 +short
10 100 "S" "SIP+D2U" "" _sip._udp.sip.heremydomain.ua.
10 100 "S" "SIP+D2T" "" _sip._tcp.sip.heremydomain.ua.
I set SRV records for both types of transport:
dig srv _sip._udp.sip.heremydomain.wnet.ua +short @ns2
10 100 5060 kamailio1.heremydomain.ua.
10 100 5060 kamailio2.heremydomain.ua.
dig srv _sip._tcp.sip.heremydomain.ua +short @ns2
10 100 5060 kamailio1.heremydomain.ua.
10 100 5060 kamailio2.heremydomain.ua.
And of course A records for both fqdn names:
dig kamailio1.heremydomain.ua @ns2 +short
10.10.10.1
dig kamailio2.heremydomain.ua @ns2 +short
10.10.10.2
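As a rough illustration of how a client walks these records (NAPTR, then
SRV, then A), here is a minimal Python sketch. It uses an in-memory copy of
the dig output above instead of real DNS queries. The function name
resolve_sip and the dictionaries are made up for this example, the SRV keys
are taken from the NAPTR replacement fields, and weight-based randomisation
between SRV targets is left out.

```python
# In-memory stand-in for the DNS answers shown above (no real queries).
NAPTR = {
    "sipnew.heremydomain.ua.": [
        # (order, preference, flags, service, replacement)
        (10, 100, "S", "SIP+D2U", "_sip._udp.sip.heremydomain.ua."),
        (10, 100, "S", "SIP+D2T", "_sip._tcp.sip.heremydomain.ua."),
    ],
}
SRV_TARGETS = [
    # (priority, weight, port, target)
    (10, 100, 5060, "kamailio1.heremydomain.ua."),
    (10, 100, 5060, "kamailio2.heremydomain.ua."),
]
SRV = {
    "_sip._udp.sip.heremydomain.ua.": SRV_TARGETS,
    "_sip._tcp.sip.heremydomain.ua.": SRV_TARGETS,
}
A = {
    "kamailio1.heremydomain.ua.": ["10.10.10.1"],
    "kamailio2.heremydomain.ua.": ["10.10.10.2"],
}


def resolve_sip(domain, service):
    """Walk NAPTR -> SRV -> A and return a list of (ip, port) candidates."""
    candidates = []
    # Lower order/preference first, as NAPTR processing requires.
    for order, pref, flags, svc, replacement in sorted(NAPTR.get(domain, [])):
        if svc != service or flags.upper() != "S":
            continue
        # Sort by priority only; equal-priority weighting is omitted here.
        for prio, weight, port, host in sorted(SRV.get(replacement, []),
                                               key=lambda rec: rec[0]):
            for ip in A.get(host, []):
                candidates.append((ip, port))
    return candidates


tcp_candidates = resolve_sip("sipnew.heremydomain.ua.", "SIP+D2T")
print(tcp_candidates)
```

With both proxies in the list, a client that finds the first candidate
unreachable can simply move on to the next one.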
I have to say that I moved away from the idea of assigning two IP addresses
to each FQDN.
As Sebastian already said, clients indeed sent half of the requests to the
first address returned in the A records and half to the second one.
So I decided to put sip.heremydomain.ua into Record-Route instead of the
individual FQDNs of the Kamailio servers (kamailio1.heremydomain.ua /
kamailio2.heremydomain.ua).
And the result was excellent.
So the configuration of the Kamailio proxies was as follows:
1. dialog module working with an external database;
2. registrar module working with an external database;
3. usrloc module working with an external database;
4. both Kamailio instances write to the same database, and the config files
are identical on both (except for the listening interfaces);
5. a B2B user agent acting as routing server and media proxy at the same
time;
6. location records are visible from both Kamailio proxies.
Here is the sequence of the call process (I used TCP transport to test the
environment):
1. The client sets up a dialog through the first Kamailio, so the three-way
handshake is done and the media stream is up.
2. Then kamailio1 goes down, but the dialog stays up.
2.1 My client noticed that port 5060 on the first Kamailio was unreachable
and immediately re-registered on the second one (it resolves
sip.heremydomain.ua and gets the kamailio2 host).
2.2 The media stream does not go down, because of the external media proxy.
3. The client decides to end the call and sends a BYE request to the
kamailio2 host.
3.1 Kamailio2 knows everything about this client and dialog (through the
shared database), so it processes this BYE as usual.
3.2 The session is torn down properly.
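The reason step 3.1 works can be sketched with a toy model in Python. This
is not Kamailio code: the Proxy class, its handle method and the plain dict
standing in for the external database are all invented for illustration,
and the 481 reply is only how this toy marks a dialog it does not know.

```python
class Proxy:
    """Toy proxy that keeps dialog state in a store passed in from outside."""

    def __init__(self, name, dialog_store):
        self.name = name
        self.dialogs = dialog_store  # same object on both proxies = shared DB
        self.up = True

    def handle(self, method, call_id):
        if not self.up:
            raise ConnectionError(self.name + " is down")
        if method == "INVITE":
            self.dialogs[call_id] = "confirmed"
            return "200 OK"
        if method == "BYE":
            # The dialog is found no matter which proxy stored it.
            if self.dialogs.pop(call_id, None) is None:
                return "481 Call/Transaction Does Not Exist"
            return "200 OK"
        return "501 Not Implemented"


shared_db = {}
kamailio1 = Proxy("kamailio1", shared_db)
kamailio2 = Proxy("kamailio2", shared_db)

invite_reply = kamailio1.handle("INVITE", "call-1")  # dialog set up via kamailio1
kamailio1.up = False                                 # kamailio1 goes down
bye_reply = kamailio2.handle("BYE", "call-1")        # BYE lands on kamailio2

# For contrast: a proxy with its own private store does not know the dialog.
isolated = Proxy("isolated", {})
isolated_reply = isolated.handle("BYE", "call-1")

print(invite_reply, "|", bye_reply, "|", isolated_reply)
```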
With UDP transport the result was a bit different: in a few cases the
client did not detect that port 5060 was unreachable on the first
Kamailio, so it tried to send the BYE to the first host.
But if the first host was completely down, it worked in 100% of cases.
So I can say that this topology works for me.
2017-08-27 12:37 GMT+03:00 Donat Zenichev <donat.zenichev(a)gmail.com>:
Hi Igor.
Well, indeed I've already built solutions with heartbeat, but the main
idea now is to minimize the downtime of the SIP server.
Heartbeat needs time (depending on your configuration) to detect that the
primary is down, e.g. with the dead interval set to 10 seconds, a node is
considered dead if no activity has been seen during this period.
But if you set this interval lower, e.g. 2-3 seconds, you run the risk of
flaps (e.g. there is a delay on the IP route from the slave to the primary
node, so the slave brings up the shared IP and starts to process calls,
while the real master is in fact working fine and able to communicate).
So as for heartbeat, I decided to use it only inside the same physical
domain, where ucast/bcast packets reach the other node without any
problems.
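The trade-off around the dead interval can be shown with a few lines of
Python. This is only a sketch of the timing logic, not of any real
heartbeat implementation; the function name and the 4-second delay are
assumptions picked for the example.

```python
def node_considered_dead(last_heartbeat, now, dead_interval):
    """True if no heartbeat has been seen for longer than dead_interval."""
    return (now - last_heartbeat) > dead_interval


# Assumed scenario: the master is alive, but its heartbeats are delayed by
# 4 seconds on the path between the two nodes.
last_seen = 0.0
delayed_now = 4.0

flap_with_short_interval = node_considered_dead(last_seen, delayed_now, 2.0)
flap_with_long_interval = node_considered_dead(last_seen, delayed_now, 10.0)

print(flap_with_short_interval)  # slave would take over although master is fine
print(flap_with_long_interval)   # no takeover, but a real outage is noticed later
```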
As for my actual question, I've moved further and now think the following
scheme will work fine:
1. NAPTR records for every transport protocol (e.g.
"_sip._udp.domain.org").
2. SRV records for every NAPTR record (e.g. kamailio1.domain.org,
kamailio2.domain.org) with the same priority/weight for both of them, to
balance half of the INVITEs to the first one and half to the second one.
3. A records for every domain name (e.g. kamailio1.domain.org - 10.0.0.1,
10.0.0.2, where the second address actually belongs to kamailio2;
and the same for the FQDN kamailio2.domain.org - 10.0.0.2, 10.0.0.1).
So the sequence of dialog actions will be:
1. The INVITE from the UAC is balanced to kamailio1;
2. The dialog is established and the media stream is up;
3. Then kamailio1 goes down;
4. The BYE message tries to reach the host that was set in the
Record-Route header (kamailio1), but kamailio1 (10.0.0.1) is down, so the
BYE will be sent to 10.0.0.2 (kamailio2), because 10.0.0.2 is assigned to
the kamailio1 FQDN as its second IP.
5. The message will be processed by kamailio2, because of the common
dialog/usrloc database.
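Step 4 of this plan can be sketched in Python as follows. The sketch
assumes the client tries the addresses of the Record-Route host strictly in
the order listed, which real resolvers and clients do not guarantee; the
function name and the set of reachable addresses are made up for the
example.

```python
# A records as planned above: each FQDN lists its own address first and the
# other proxy's address second.
A_RECORDS = {
    "kamailio1.domain.org": ["10.0.0.1", "10.0.0.2"],
    "kamailio2.domain.org": ["10.0.0.2", "10.0.0.1"],
}


def pick_destination(route_host, reachable_ips):
    """Return the first address of route_host that is currently reachable."""
    for ip in A_RECORDS[route_host]:
        if ip in reachable_ips:
            return ip
    raise ConnectionError("no reachable address for " + route_host)


# kamailio1 (10.0.0.1) is down, so only 10.0.0.2 answers.
bye_target = pick_destination("kamailio1.domain.org", {"10.0.0.2"})
print(bye_target)
```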
I will try to set it up next week.
If it works, I will write a short report here.
2017-08-25 17:26 GMT+03:00 Donat Zenichev <donat.zenichev(a)gmail.com>:
I've searched through the sr-users list and found a few discussions on
this topic.
So the approach that (I think) is most relevant for Kamailio failover is
the DNS solution: NAPTR -> SRV records.
Like:
NAPTR record:
"IN NAPTR 10 10 "S" "SIP+D2U" "" _sip._udp.domain.org"
SRV records:
"_sip._udp.domain.org SRV 10 1 5060 kamailio1.domain.org"
"_sip._udp.domain.org SRV 10 1 5060 kamailio2.domain.org"
A records:
"kamailio1 IN A 10.0.0.1"
"kamailio2 IN A 10.0.0.2"
So each Kamailio will add a Record-Route with its own hostname, e.g.
kamailio1.domain.org.
That way the client will send in-dialog requests to the route with the
FQDN kamailio1.domain.org.
And I can't put sip.domain.org into the Record-Route, because then every
new request (whether initial or in-dialog) would be sent to either of the
Kamailio servers, while I need in-dialog requests to go to the same
Kamailio.
So for failover, I need more A records, like:
"kamailio1 IN A 10.0.0.1"
"kamailio1 IN A 10.0.0.2"
"kamailio2 IN A 10.0.0.2"
"kamailio2 IN A 10.0.0.1"
And when kamailio1 goes down, the UAC will have two destination IPs to send
the request to: 10.0.0.1 and 10.0.0.2 (where the second one is actually
kamailio2).
So as a result I will have one database for the usrloc and dialog modules,
and load balancing based on the SRV priority/weight fields.
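To show why equal priority and weight should split new requests roughly in
half, here is a simplified weighted pick in Python, written in the spirit
of the SRV selection rules (RFC 2782) but not a full implementation of
them. The function name pick_srv and the fixed random seed are choices
made only for this example.

```python
import random

# The two SRV records from this message: (priority, weight, port, target).
SRV_RECORDS = [
    (10, 1, 5060, "kamailio1.domain.org"),
    (10, 1, 5060, "kamailio2.domain.org"),
]


def pick_srv(records, rng):
    """Pick one record among those with the lowest priority, by weight."""
    lowest = min(rec[0] for rec in records)
    candidates = [rec for rec in records if rec[0] == lowest]
    total_weight = sum(rec[1] for rec in candidates)
    if total_weight == 0:
        return rng.choice(candidates)
    point = rng.uniform(0, total_weight)
    running = 0
    for rec in candidates:
        running += rec[1]
        if point <= running:
            return rec
    return candidates[-1]


rng = random.Random(0)  # fixed seed so repeated runs give the same counts
counts = {}
for _ in range(10000):
    target = pick_srv(SRV_RECORDS, rng)[3]
    counts[target] = counts.get(target, 0) + 1

print(counts)
```

Changing one weight to 3 in this sketch would shift the split towards
roughly three quarters for that target.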
And for failover, A records that make it possible to send requests first
to 10.0.0.1 and then to 10.0.0.2 (if the Record-Route was bound to
kamailio1).
And conversely, if the Record-Route was set to kamailio2, the request first
tries 10.0.0.2 (kamailio2) and then 10.0.0.1 (kamailio1).
Am I right at this point?
2017-08-22 21:57 GMT+03:00 Donat Zenichev <donat.zenichev(a)gmail.com>:
Hi.
I came up with the idea of setting up a test stand with two Kamailio
servers and one B2BUA server (for routing).
The idea is failover for dialogs and transactions.
So if one of the Kamailio nodes is down, the other one is able to pick up
the dialog and let users end the session properly.
To make it clearer, I will try to describe the idea step by step:
1. The UAC invites the UAS, they complete the three-way handshake, and the
media stream is up.
2. The Kamailio that processed this dialog goes down.
3. The users decide to end the session with a BYE, but the proxy that
processed their three-way handshake is down, so one of the UAs sends the
BYE to the destination route that contains the domain name (which both
Kamailio servers serve), and the BYE reaches the second Kamailio so that it
can end the dialog properly.
But there is a big "but": this second Kamailio has never known about this
dialog, it does not hold any transactions for it and, furthermore, it
doesn't know anything about this Call-ID.
So the solution, I think, lies in the DB mode for user location (columns
that contain Call-IDs, branches, etc.).
But I need to be sure that I'm on the right track.
To have one IP served by two nodes, I have two options:
- The first one: create a heartbeat cluster with two Kamailio nodes that
have one shared IP address, so when one node goes down, the other one
brings up the shared IP interface and does the same job the master did.
- The other method is to assign several IP addresses to one domain name
(the IP addresses of the different Kamailio proxies).
So the goal looks simple. If someone has ever done something like that,
I will be glad to read your ideas.
--
--
BR, Donat Zenichev
Wnet VoIP team
Tel: +380(44) 5-900-808
http://wnet.ua