#### Pre-Submission Checklist
- [X] Commit message has the format required by CONTRIBUTING guide
- [X] Commits are split per component (core, individual modules, libs, utils, ...)
- [X] Each component has a single commit (if not, squash them into one commit)
- [X] No commits to README files for modules (changes must be done to docbook files
in `doc/` subfolder, the README file is autogenerated)
#### Type Of Change
- [X] Small bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds new functionality)
- [ ] Breaking change (fix or feature that would change existing functionality)
#### Checklist:
- [X] PR should be backported to stable branches
- [X] Tested changes locally
- [ ] Related to issue #XXXX (replace XXXX with an open issue number)
#### Description
I've recently been experiencing a loop of node removals and additions that leads to
"ghost nodes".
Suppose there are three servers: A, B, and C.
Server C goes down uncleanly, so DMQ doesn't notify the other nodes. Server A is the
first to send its ping, with a nodelist that includes node C. After fr_timer, the transaction
for the message to node C times out and node C is removed from node A's nodelist.
Then node B sends its ping with a nodelist that still includes node C (which B still
considers alive), so node A sees node C as a new node and adds it back to its nodelist.
Node B then reaches its own fr_timer timeout and removes node C, but only until node A's
next ping advertises it again, and so on. The loop does not occur if the delta between
node A's and node B's pings is less than fr_timer.
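
The sequence above can be reproduced with a small event-driven model. This is only a sketch of the behaviour as described here, not the DMQ code: the function name `simulate`, the 30-second `fr_timer`, the 90-second ping interval, and the rule "add every advertised node that is not already known" are all assumptions made for illustration.

```python
import heapq


def simulate(offset_b, fr_timer=30, ping_interval=90, duration=600):
    """Return (time, node) pairs for every time a live node re-adds dead node C.

    Hypothetical model: timer values are illustrative, not DMQ defaults.
    """
    # Both live nodes start out believing that C is alive.
    nodelist = {"A": {"A", "B", "C"}, "B": {"A", "B", "C"}}
    alive = {"A", "B"}  # C crashed before t=0 without notifying anyone
    events = []  # heap of (time, sequence, kind, node, target)
    seq = 0
    for node, start in (("A", 0), ("B", offset_b)):
        heapq.heappush(events, (start, seq, "ping", node, None))
        seq += 1

    readded = []
    while events:
        t, _, kind, node, target = heapq.heappop(events)
        if t > duration:
            break
        if kind == "timeout":
            # The transaction towards the dead peer timed out: drop the peer.
            nodelist[node].discard(target)
            continue
        # kind == "ping": advertise the sender's current nodelist to every peer.
        advertised = set(nodelist[node])
        for peer in sorted(advertised - {node}):
            if peer in alive:
                # The receiver adds every advertised node it does not know.
                unknown = advertised - nodelist[peer]
                if "C" in unknown:
                    readded.append((t, peer))
                nodelist[peer] |= unknown
            else:
                # No reply will arrive; the timeout fires after fr_timer.
                heapq.heappush(events, (t + fr_timer, seq, "timeout", node, peer))
                seq += 1
        heapq.heappush(events, (t + ping_interval, seq, "ping", node, None))
        seq += 1
    return readded
```

In this model, `simulate(offset_b=45)` (delta greater than `fr_timer`) keeps re-adding C alternately on A and B for the whole run, while `simulate(offset_b=10)` (delta less than `fr_timer`) returns an empty list because both nodes drop C before either one advertises it again.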
What I propose here is that, upon a failed ping, the failing node is put in the disabled
state and is removed from the nodelist only after a 2nd failed ping. This should prevent
dead nodes from coming back.
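
The proposed rule amounts to a small state machine per node. The following is a minimal sketch of that rule only; `NodeStatus` and `on_ping_result` are hypothetical names, and the behaviour on a successful ping (a disabled node becomes active again) is an assumption that the description above does not spell out. It does not reproduce the actual change in `notification_peer.c`.

```python
from enum import Enum


class NodeStatus(Enum):
    ACTIVE = "active"
    DISABLED = "disabled"
    REMOVED = "removed"


def on_ping_result(status, ping_ok):
    """Return the next status of a node after one ping attempt."""
    if status is NodeStatus.REMOVED:
        # A removed node is no longer in the nodelist and is not pinged.
        return NodeStatus.REMOVED
    if ping_ok:
        # Assumption: any reply brings the node back to the active state.
        return NodeStatus.ACTIVE
    if status is NodeStatus.ACTIVE:
        # First failed ping: keep the node in the list, but disabled.
        return NodeStatus.DISABLED
    # Second consecutive failed ping: remove the node.
    return NodeStatus.REMOVED
```

Because the node stays in the list (disabled) after the first failure, a ping from another server that still advertises it does not make it look like a new node.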
You can view, comment on, or merge this pull request online at:
https://github.com/kamailio/kamailio/pull/1840
-- Commit Summary --
* dmq: wait for a 2nd failed ping before deleting a node
-- File Changes --
M src/modules/dmq/notification_peer.c (23)
-- Patch Links --
https://github.com/kamailio/kamailio/pull/1840.patch
https://github.com/kamailio/kamailio/pull/1840.diff