DMQ Developers,
Our team would like to set up a DMQ bus http://kamailio.org/docs/modules/4.2.x/modules/dmq.html#idp2640048 that contains more than one notification address http://kamailio.org/docs/modules/4.2.x/modules/dmq.html#dmq.p.notification_address to support a high-availability, fault-tolerant system in which multiple servers maintain the list of DMQ nodes. Neither the failure of any one DMQ node nor the startup order should cause the DMQ bus to lose track of available nodes. As nodes go offline and come back online, the DMQ bus should be updated using information from the active nodes. To do this, we propose changing the way DMQ uses the notification address.
Currently, the notification address resolves a DNS name to a single IP address, even if an SRV record or multiple A/AAAA records are available. The proposal is to use all the A/AAAA records or SRV targets returned by a DNS query for that name as notification addresses.
Do you have any comments, concerns, suggestions, or recommendations regarding this proposal? Your input is definitely welcome.
Thank you, Bob
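The proposal in the message above can be illustrated with a minimal Python sketch (not the DMQ module's C code). It assumes a hypothetical host name `dmq.example.com` and a stand-in resolver returning made-up answers from the documentation address ranges, so it runs without a network. Note that `socket.getaddrinfo` only covers A/AAAA lookups; SRV resolution would need a separate DNS library and is left out here.

```python
import socket


def resolve_all(host, port, resolver=socket.getaddrinfo):
    """Return every distinct (ip, port) the resolver reports for host, in order."""
    targets = []
    for family, _type, _proto, _canon, sockaddr in resolver(
        host, port, type=socket.SOCK_DGRAM
    ):
        if family not in (socket.AF_INET, socket.AF_INET6):
            continue  # ignore anything that is not an IPv4/IPv6 answer
        target = (sockaddr[0], sockaddr[1])
        if target not in targets:
            targets.append(target)
    return targets


def fake_resolver(host, port, type=0):
    # Hypothetical DNS answer: two A records, one AAAA record, one duplicate.
    return [
        (socket.AF_INET, type, 17, "", ("192.0.2.10", port)),
        (socket.AF_INET, type, 17, "", ("192.0.2.11", port)),
        (socket.AF_INET6, type, 17, "", ("2001:db8::10", port, 0, 0)),
        (socket.AF_INET, type, 17, "", ("192.0.2.10", port)),
    ]


# Proposed behaviour: keep every address as a notification target.
all_targets = resolve_all("dmq.example.com", 5060, resolver=fake_resolver)
# Behaviour described as current: only a single address is used.
single_target = all_targets[:1]
print(all_targets)
print(single_target)
```

Passing the resolver as a parameter is only a convenience for the sketch; dropping the `resolver` argument makes the same function query the system resolver.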
Hello Bob,
As previously stated, it is a needed improvement.
My time is heavily consumed right now, so your patch will be gratefully received.
My only comment is that, once a node has connected to the cluster, notifications are sent to all nodes anyway, not just the one specified in the notification address parameter. So in practice, the only need for resolving multiple records is at start-up, to find the first available node. Upon contact with that node, a list of the other nodes is returned, all of which are stored and included in subsequent notifications. This is fairly fault-tolerant, provided your network is well designed. It is not tolerant of network splits and the like, where it is possible to end up with two or more separated clusters or split-brain situations. To overcome that, however, I think we're looking at a *major* redesign, far more than can be achieved with multiple notification addresses. We'd also likely need to consider quorum, electing a primary/master, etc. to do it properly.
Otherwise, it is a welcome addition in my opinion.
Cheers,
Charles
On 5 March 2015 at 21:41, Robert Boisvert rdboisvert@gmail.com wrote: [original message trimmed; it appears in full above]
sr-dev mailing list sr-dev@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
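The start-up behaviour Charles describes (contact the first available node, receive its list of other nodes, store them all) can be sketched roughly as follows. This is a toy Python model under stated assumptions, not the DMQ implementation: the `cluster` dictionary, the `bootstrap` function and the 192.0.2.x addresses are all hypothetical, and a node marked `None` stands for one that is down.

```python
def bootstrap(seeds, cluster):
    """Return the set of nodes learned from the first reachable seed.

    cluster maps a node address to the set of peers that node knows about,
    or to None when the node is not running (hypothetical model).
    """
    for seed in seeds:
        known_peers = cluster.get(seed)
        if known_peers is not None:
            # The first live seed answers with its node list; keep all of it.
            return {seed} | set(known_peers)
    return set()  # no seed answered, so nothing was learned


cluster = {
    "192.0.2.1": None,  # configured notification address, currently down
    "192.0.2.2": {"192.0.2.3", "192.0.2.4"},
    "192.0.2.3": {"192.0.2.2", "192.0.2.4"},
    "192.0.2.4": {"192.0.2.2", "192.0.2.3"},
}

# One notification address that happens to be down: the node learns nothing.
single_seed_view = bootstrap(["192.0.2.1"], cluster)
# Several resolved addresses: the second one answers and supplies the list.
multi_seed_view = bootstrap(["192.0.2.1", "192.0.2.2"], cluster)
print(single_seed_view)
print(multi_seed_view)
```

The model deliberately ignores network partitions, which is the limitation raised in the message: two seeds in separated halves of the network would each hand out a different node list.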
Charles,
I coded and tested the attached patch against the 4.2.3 code. So as not to disturb current functionality, I added a modparam called "multi_notify". When set to a non-zero value, it loads all IP addresses that the SIP URI resolves to, including those provided by DNS SRV records.
With regard to your point about notifications being sent to all nodes in the cluster: I noticed that if hosts A, B and C send notifications to D, and D is not available when the cluster starts, DMQ will not function even if D comes up some time later. However, if I create an SRV record or A records that resolve to hosts D and E, and send notifications to that group with multi_notify set to 1, then the cluster will always function as long as either D or E is functional. Even in a well-designed network, servers may be unavailable for a period of time, so this redundancy allows functionality to be preserved during downtime.
We are having good success with this current patch and would like to request that it be mainstreamed.
Thanks for considering this request and for your help, Bob
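The A/B/C/D/E scenario in the message above can be walked through with a small Python model. It is only a sketch: `notification_targets` and `cluster_can_form` are hypothetical helpers, the host letters stand in for resolved IP addresses, and treating "multi_notify off" as "use the first resolved address" is an assumption made for illustration rather than a statement about the patch.

```python
def notification_targets(resolved, multi_notify):
    """Choose which resolved addresses receive start-up notifications.

    multi_notify mirrors the modparam described above: non-zero keeps every
    address, zero keeps one (assumed here to be the first).
    """
    return list(resolved) if multi_notify else list(resolved[:1])


def cluster_can_form(targets, hosts_up):
    """The cluster can start if at least one notification target is reachable."""
    return any(target in hosts_up for target in targets)


resolved = ["D", "E"]            # what the DNS name resolves to
hosts_up = {"A", "B", "C", "E"}  # D is down when the cluster starts

without_multi = cluster_can_form(notification_targets(resolved, 0), hosts_up)
with_multi = cluster_can_form(notification_targets(resolved, 1), hosts_up)
both_down = cluster_can_form(notification_targets(resolved, 1), {"A", "B", "C"})
print(without_multi, with_multi, both_down)
```

In this model the redundancy only helps while at least one of D and E is up, which matches the condition stated in the message.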
Hi Bob,
Thanks for your patch. At a quick glance it looks great, but I will take a closer look over the next 24 hours and report back.
It is in my opinion a worthwhile addition and your time is very much appreciated.
Kind regards,
Charles
On 20 March 2015 at 20:46, Robert Boisvert rdboisvert@gmail.com wrote: [patch message trimmed; it appears in full above]
Hi Bob,
Sorry for the delay.
It all seems fine - the patch did not apply against the master branch, so I added it manually. I'm not sure we need the extra module parameter though, since it doesn't break existing functionality if only a single DNS record is present or an IP address is specified directly. Did you have a reason for adding the parameter?
I can either push directly to master, or if you'd prefer you can create a pull request (again, master, not 4.2).
Cheers,
Charles
On 23 March 2015 at 15:59, Charles Chance charles.chance@sipcentric.com wrote: [earlier messages trimmed; they appear in full above]
-- *Charles Chance* Managing Director
t. 0121 285 4400 m. 07932 063 891
Charles,
The parameter does make an important difference. Instead of using host names, it uses explicit IP addresses, which are saved in the DMQ structure.
I'm in the midst of several things right now so it will be faster if you can push. Next time, if it should happen, I will work with a pull request. Sorry for the extra work.
Thanks, Bob
On Mon, Mar 30, 2015 at 12:13 PM, Charles Chance <charles.chance@sipcentric.com> wrote: [earlier messages trimmed; they appear in full above]