On 8/30/12 12:50 PM, Øyvind Kolbu wrote:
On 2012-08-30 at 11:32, Daniel-Constantin Mierla wrote:
On 8/30/12 10:29 AM, Øyvind Kolbu wrote:
Yes, I actually want both, HA + replication, by using db_cluster as the
replicator. And as I've demonstrated, the location table on non-primary
servers will lack entries after downtime, which is bad.
Replication is done with parallel writing as long as all nodes are
available. But, better said, what you are looking for is re-synchronization
after downtime, which is another story, not easy to code, because it is
database specific.
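For context, the parallel-write scheme being discussed corresponds roughly
to a db_cluster setup like the one below. It is only a sketch: connection
names, credentials and hostnames are placeholders, and the priority/mode
notation should be checked against the db_cluster README.

    loadmodule "db_cluster.so"

    # one connection per database server
    modparam("db_cluster", "connection", "con1=>mysql://kamailio:kamailiorw@db1.example.org/kamailio")
    modparam("db_cluster", "connection", "con2=>mysql://kamailio:kamailiorw@db2.example.org/kamailio")
    # reads: serial with failover (con1 preferred); writes: in parallel to both
    modparam("db_cluster", "cluster", "cls1=>con1=9s9p;con2=8s9p")
    # point usrloc at the cluster
    modparam("usrloc", "db_url", "cluster://cls1")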
I don't want Kamailio to synchronise my data, but I think it is reasonable
to expect it to treat the write targets individually, independent of the
result of the initial read.
The write is independent of the read operation. Reads cannot be performed
in parallel, as that would return duplicated records to the application.
Then, keep in mind that there are several operations, one after the other:
- update - which is done on all write db servers
- test affected rows (which always works on the last write (update)
connection)
- insert if affected rows is 0
So it is not an atomic operation like
updated_and_if_affected_rows_is_0_then_insert. All this layer is done in
usrloc, in sequential steps, which works fine for one server, but not for
multiple nodes.
I am not sure what it will take to implement this kind of operation
inside the database drivers, but then it may work. Based on quick thoughts,
the code is there, it just has to be structured for each db connector,
exported via the db api and propagated to db_cluster.
Perhaps enabling db_update_as_insert is the only current option.
This can end up in a lot of records, as there is no update - and if you set
constraints at the database layer, then you get failures on unique keys.
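For reference, enabling it is a single usrloc parameter. This is just a
sketch of the option being discussed; as noted above, it is only viable if
the location table has no unique key across the contact columns:

    # store every registration refresh as a new row (no UPDATE attempted)
    modparam("usrloc", "db_update_as_insert", 1)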
If this indeed is impossible, I'll have to continue with our current scheme
of SIP level replication.
If you use db mode 3, then you can do database level replication, it
amounts to the same thing.
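In cfg terms that would be roughly the following (a sketch; the db_url
value is a placeholder):

    # DB_ONLY mode: usrloc keeps no in-memory cache, every lookup and save
    # goes straight to the database, so database-level replication keeps
    # all nodes consistent
    modparam("usrloc", "db_mode", 3)
    modparam("usrloc", "db_url", "mysql://kamailio:kamailiorw@db1.example.org/kamailio")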
 -------        -------
 | DB1 |        | DB2 |
 -------        -------
    |              |
    |              |
    |              |
 -------        --------
 | LOC1|        | LOC2 |
 -------        --------
        \      /
       ----------
       | Phones |
       ----------
The above setup is my scenario. When everything is up, LOC1 will use DB1 for
reading and write to both DB1 and DB2. Similarly, LOC2 will use DB2 for
reading and write to both DB1 and DB2. Both use the "other" DB as a failover
for reading. LOC1 and LOC2 are set up with even load via an SRV record.
- While DB2 is down, say for a reboot to a new kernel, phone A boots and
  registers at LOC1 and is populated in the DB1 database. Reading works
  fine from LOC2 to DB1.
- DB2 is back again.
- Phone A re-registers at LOC1. The previous entry is found in the location
  table and an UPDATE is issued to both DB1 and DB2, but DB2 will still
  lack the entry.
DB2 will _never_ get an entry for it until phone A reboots and gets a new
Call-ID, or for some reason phone A chooses to register with LOC2 instead.
Then a duplicate entry will end up in DB1, as LOC2 will blindly issue an
INSERT to both DB1 and DB2.
As the location servers are evenly used, ~every second call to phone A will
fail with 404.
You have to do cross replication at the database layer and use db_cluster
for read/write failover access (e.g., try read/write on db1 and, if that
fails, try the other one).
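For example, something along these lines (a sketch; hostnames and
credentials are placeholders and the exact priority/mode notation should be
checked against the db_cluster README):

    # one connection per database server; db1 and db2 replicate each other
    # at the database layer (e.g., MySQL master-master replication)
    modparam("db_cluster", "connection", "con1=>mysql://kamailio:kamailiorw@db1.example.org/kamailio")
    modparam("db_cluster", "connection", "con2=>mysql://kamailio:kamailiorw@db2.example.org/kamailio")
    # serial (failover) mode for both read and write: use con1 while it is
    # up, fall back to con2 only when con1 fails
    modparam("db_cluster", "cluster", "cls1=>con1=9s9s;con2=8s8s")
    modparam("usrloc", "db_url", "cluster://cls1")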
Cheers,
Daniel
--
Daniel-Constantin Mierla - http://www.asipto.com
http://twitter.com/#!/miconda - http://www.linkedin.com/in/miconda
Kamailio Advanced Training, Berlin, Nov 5-8, 2012 - http://asipto.com/u/kat