Hello,
On 3/30/12 11:47 AM, Marius Zbihlei wrote:
On 03/27/2012 03:44 PM, Daniel-Constantin Mierla wrote:
Module: sip-router
Branch: master
Commit: 201fc2d600e48fbb717531c79013c1b971f82d76
URL: http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=201fc2d6...

Author: Daniel-Constantin Mierla <miconda@gmail.com>
Committer: Daniel-Constantin Mierla <miconda@gmail.com>
Date: Tue Mar 27 14:38:57 2012 +0200
Hello Daniel,
I have a few questions regarding the db_cluster module and especially the way it deals with errors:
For serial operation, let's consider two handlers, DB1 and DB2, with the same priority. Let's presume that the first write operation fails on DB1 (network congestion, MySQL deadlock, etc.), so the insert is done on DB2. For a subsequent serial select, however, DB1 is chosen first (I looked through the code and I see no way of caching the initial error), which means the result comes from DB1 even though the row was written to DB2. Since the operation might be an insert_update or an update, older info might also still be present in DB1. How does the module handle this?
I think the same scenario and question apply to round-robin mode as well.
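To make the scenario concrete, here is a minimal Python sketch of it. It is not the db_cluster code: FakeDB, serial_write and serial_read are hypothetical names, the databases are plain in-memory dicts, and the read path is modelled on the assumption that an empty result from the first handler is not treated as an error.

```python
class FakeDB:
    """Hypothetical in-memory stand-in for one DB handler."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.fail_next_write = False

    def insert(self, key, value):
        if self.fail_next_write:
            # Simulate a transient error (congestion, deadlock, ...).
            self.fail_next_write = False
            return False
        self.rows[key] = value
        return True

    def select(self, key):
        return self.rows.get(key)


def serial_write(handlers, key, value):
    """Try each handler in order and stop at the first success."""
    for db in handlers:
        if db.insert(key, value):
            return db.name
    return None


def serial_read(handlers, key):
    """Assumption: the first handler answers; an empty result is not an error."""
    return handlers[0].select(key)


db1, db2 = FakeDB("DB1"), FakeDB("DB2")
db1.fail_next_write = True  # the first write on DB1 fails once

written_to = serial_write([db1, db2], "alice", "sip:alice@10.0.0.1")
read_back = serial_read([db1, db2], "alice")
print(written_to, read_back)
```

In this model the write lands on DB2 while the later read asks DB1, which has no record of it and no memory of the earlier failure.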
The way we do this in p_usrloc is by keeping an error counter for each handler, associated with a state (on/off) and a timestamp (when it failed). This info can be used to disable the DB handler (and later put the handler in write-only mode until the data is synchronized).
Error detection and connection (auto-)enable/disable are not implemented yet. They were in mind, but I couldn't decide quickly which solution to use.
Each connection links to a structure in shared memory - dbcl_shared_t - one field there being 'state', planned to be used to mark the connection active/inactive.
Marking a connection inactive is easy: do it when a command fails. Bringing it back to active is more complex. I thought of:
- counting how many commands would have been sent while the connection is inactive and, once a threshold is reached, trying to reconnect
- keeping the timestamp of when the connection became inactive and trying to bring it back to active once a specified interval has elapsed
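The two reactivation ideas above could be combined roughly as in the following Python sketch. It is only an illustration under stated assumptions: ConnState and its methods are hypothetical names, the thresholds are arbitrary, and the state lives in a single process, whereas the real 'state' field would sit in shared memory.

```python
import time


class ConnState:
    """Hypothetical per-connection state (not the real dbcl_shared_t)."""

    def __init__(self, retry_after_cmds=100, retry_after_secs=30.0):
        self.active = True
        self.skipped = 0        # commands seen while inactive
        self.failed_at = None   # time the connection was marked inactive
        self.retry_after_cmds = retry_after_cmds
        self.retry_after_secs = retry_after_secs

    def mark_failed(self, now=None):
        """Call when a command fails on this connection."""
        self.active = False
        self.skipped = 0
        self.failed_at = time.monotonic() if now is None else now

    def should_try(self, now=None):
        """Return True if the next command may be sent on this connection."""
        if self.active:
            return True
        now = time.monotonic() if now is None else now
        self.skipped += 1
        # Idea 1: enough commands have been skipped since the failure.
        if self.skipped >= self.retry_after_cmds:
            return True
        # Idea 2: enough time has elapsed since the failure.
        return now - self.failed_at >= self.retry_after_secs

    def mark_ok(self):
        """Call when a retried command succeeds."""
        self.active = True
        self.skipped = 0
        self.failed_at = None
```

A caller would check should_try() before each command, then call mark_ok() or mark_failed() depending on the result, so a failed retry simply restarts both the counter and the timer.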
Other suggestions/contributions are welcome -- the code is in the repo, so if anyone wants to jump into development, feel free to do so...
To your questions: doing serial or round-robin writes to some databases and also using them for reads in the same fashion would require replication at the database layer.
I thought of cases such as:
- master db servers for writes in parallel (e.g., location/presence in db-only mode)
- slave db servers (replicated from masters) for reads in round robin
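As a rough illustration of that layout, the Python sketch below sends every write to all masters and rotates reads across the slaves. The Cluster class is hypothetical, the "databases" are plain dicts, and replication from masters to slaves is assumed to happen at the database layer, outside this code.

```python
import itertools


class Cluster:
    """Hypothetical model: parallel writes to masters, round-robin reads from slaves."""

    def __init__(self, masters, slaves):
        self.masters = masters
        self.slaves = slaves
        # Endless 0, 1, ..., n-1, 0, 1, ... index sequence over the slaves.
        self._next_slave = itertools.cycle(range(len(slaves)))

    def write(self, key, value):
        """'Parallel' here means every master receives the same write."""
        for master in self.masters:
            master[key] = value

    def read(self, key):
        """Return (slave index, value) from the next slave in rotation."""
        idx = next(self._next_slave)
        return idx, self.slaves[idx].get(key)


m1, m2, s1, s2 = {}, {}, {}, {}
cluster = Cluster([m1, m2], [s1, s2])
cluster.write("alice", "sip:alice@10.0.0.1")

# Stand-in for database-layer replication, which this sketch does not implement.
s1.update(m1)
s2.update(m2)

first = cluster.read("alice")
second = cluster.read("alice")
print(first, second)
```

Reads only see a write once replication has caught up, which is why this layout depends on the database layer rather than on the module itself.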
Of course, one could write to all servers and read from all servers, but if the traffic is very high, it might be good to reuse some db-layer features for scalability.
Cheers, Daniel
Cheers, Marius
Zbihlei Marius
Head of Linux Development Services Romania
1&1 Internet Development srl
Str Mircea Eliade 18
Sect 1, Bucuresti
71295, Romania
Tel KA: 754-9152
Tel RO: +40-31-223-9152
mailto: marius.zbihlei@1and1.ro
sr-dev mailing list sr-dev@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev