Re: [Serusers] SER usrloc caching, was: SER Reports "out of memory"

31 May 2005

      Inline.
----- Original Message ----- 
From: "Zeus Ng" zeus.ng@isquare.com.au
...
Anyway, I feel the idea of having a cached usrloc is not worthwhile.
Instead, a pure DB usrloc would be better.
Let's consider a pure VoIP scenario, that is, only REGISTER, INVITE, BYE,
ACK and CANCEL requests and nothing else.
Now UA_A sends its REGISTER request to SER_A. The contact is saved in
SER_A's memory and DB. The information is replicated to SER_B's DB by
either replication or a shared cluster. An INVITE request is sent to SER_B
to locate UA_A. The contact cannot be found in SER_B's memory, so a DB
lookup is done and the result is populated into SER_B's memory.
Now, UA_A sends another REGISTER request to SER_A with a different contact
(maybe after a reboot or a change of IP via DHCP). This information is
updated in SER_A's memory and DB, and it propagates to SER_B's DB as well.
UA_B sends another INVITE to SER_B to locate UA_A. SER_B finds the "old"
UA_A contact in memory, which differs from the DB version. The call
will not be established because of the stale cached information.
If we use a pure DB-only usrloc, that problem will not happen.
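
The stale-contact scenario above can be sketched in a few lines. This is only an illustration: SharedDB, CachingSer and the addresses are made-up names, the shared dictionary stands in for a replicated or clustered usrloc table, and nothing here is SER's actual code.

```python
class SharedDB:
    """Stands in for a usrloc table that both proxies see (assumption)."""
    def __init__(self):
        self.contacts = {}


class CachingSer:
    """Hypothetical proxy with a lazy-loaded cache that is never invalidated."""
    def __init__(self, db):
        self.db = db
        self.cache = {}

    def register(self, aor, contact):
        # REGISTER updates this proxy's memory and the DB.
        self.cache[aor] = contact
        self.db.contacts[aor] = contact

    def lookup(self, aor):
        # The DB is consulted only when the cache has no entry at all.
        if aor not in self.cache and aor in self.db.contacts:
            self.cache[aor] = self.db.contacts[aor]
        return self.cache.get(aor)


db = SharedDB()
ser_a, ser_b = CachingSer(db), CachingSer(db)

ser_a.register("ua_a", "sip:ua_a@192.0.2.5")
first = ser_b.lookup("ua_a")    # cache miss on SER_B, loaded from the DB

ser_a.register("ua_a", "sip:ua_a@192.0.2.9")
second = ser_b.lookup("ua_a")   # cache hit on SER_B, the old contact

print(second, "vs DB:", db.contacts["ua_a"])
```

SER_B keeps answering with the first contact because nothing tells its cache that the DB row changed.
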
Yes, I believe this is the same problem Karl pointed out? In itself it poses 
an issue, but I think the broader scalability impact is more important.
...
To the argument of heavy RW on usrloc, I don't find that caching helps.
Consider a UA that sends a REGISTER request every 5 minutes. The REGISTER
request will always behave the same.
Well, if the cached usrloc is the same as in the REGISTER request, you don't 
need to write, do you? This should have a huge impact on the number of 
necessary writes as without a cache, you must write every time.
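
A rough sketch of the write-suppression idea, under loud assumptions: save_contact is a hypothetical helper, the cache is a plain dictionary, and expiry handling is ignored entirely (a real registrar also has to track when the registration expires, which may itself require a write).

```python
def save_contact(cache, db_writes, aor, contact):
    """Record a DB write only when the contact differs from the cached one.

    cache     -- dict mapping address-of-record to contact (assumption)
    db_writes -- list standing in for writes sent to the DB (assumption)
    Returns True if a DB write was issued.
    """
    if cache.get(aor) == contact:
        return False              # unchanged re-registration, nothing to write
    cache[aor] = contact
    db_writes.append((aor, contact))
    return True


cache, db_writes = {}, []

# One hour of re-registrations every 5 minutes with an unchanged contact.
for _ in range(12):
    save_contact(cache, db_writes, "ua_a", "sip:ua_a@192.0.2.5")

# The contact changes once (for example a new DHCP lease).
save_contact(cache, db_writes, "ua_a", "sip:ua_a@192.0.2.9")

print(len(db_writes))
```

Without the comparison against the cache, each of the 13 REGISTER requests would have produced a write.
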
...
For other requests, there is always a DB read within that 5-minute
interval to populate the cache. Only subsequent requests go through the
cache. You may have a different call pattern, but I rarely find two
INVITEs to the same UA within a 5-minute interval. After those 5 minutes,
the cache becomes invalid and a DB read is required. Using a cache saves
nothing on DB reads here. (Obviously you can argue about a longer time
between REGISTER requests, or that contacts rarely change with hardphones,
but the same principle applies.)
I'm not sure I understand. The point here is to look at the lifetime of the 
usrloc, which arguably is (in most cases) far longer than five minutes. I 
don't follow you in that the cache is invalid after five minutes?!
...
Besides, I have good experience with MySQL caching results. So, a DB
lookup on every SIP request with proper DB tuning can achieve a similar
result.
That IS a good point.  However, most of the overhead is not in the actual
lookup, but in the call to mysql.  I'm not sure what that overhead is; if
it is very low, your point hits the "let each system do what it's best
at..." ;-)
...
I am in the process of writing a pure DB-based usrloc. The lookup() part
is running fine. I just need to finish the save() and expire() functions.
NAT handling will be added at a later stage. I will post the code once I
finish that.
If you have a hope that this may get into the CVS, I hope you contacted Jan 
before you started the development.  He will be able to point out any 
overlaps in development (ex. his own), as well as point you to any issues.
g-)
...
...
-----Original Message-----
From: serusers-bounces@lists.iptel.org
[mailto:serusers-bounces@lists.iptel.org] On Behalf Of Greger V. Teigre
Sent: Monday, 30 May 2005 10:11 PM
To: Java Rockx; Jiri Kuthan
Cc: serusers
Subject: Re: [Serusers] SER Reports "out of memory"
See inline.
Jiri Kuthan wrote:
...
At 09:24 AM 5/30/2005, Greger V. Teigre wrote:
[...]
...
...

1. when ser starts up usrloc is "lazy-loaded"
2. if a usrloc record is looked up in cache and is __NOT__ found,
   then MySQL will be queried. If found in MySQL then the usrloc record
   will be put into cache for future lookups
By doing these two things we should not have a problem with
excessively large subscriber bases.
Thoughts?
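
The lazy-loading proposal above could look roughly like this. It is a sketch under assumptions: an in-memory SQLite table stands in for MySQL, and the table name, column names and the lookup() helper are chosen for illustration only.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL usrloc table (assumption).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE location (username TEXT PRIMARY KEY, contact TEXT)")
db.execute("INSERT INTO location VALUES ('alice', 'sip:alice@192.0.2.10')")

cache = {}                  # starts empty: nothing is preloaded at startup
stats = {"db_queries": 0}   # counts how often the DB is actually hit


def lookup(username):
    """Return the contact, querying the DB only on a cache miss."""
    if username in cache:
        return cache[username]
    stats["db_queries"] += 1
    row = db.execute(
        "SELECT contact FROM location WHERE username = ?", (username,)
    ).fetchone()
    if row is None:
        return None         # unknown user, nothing is cached
    cache[username] = row[0]
    return row[0]


lookup("alice")   # miss: goes to the DB and fills the cache
lookup("alice")   # hit: served from memory
lookup("bob")     # miss: not in the DB either
print(stats["db_queries"])
```

Memory use then grows with the set of users actually looked up rather than with the whole subscriber base.
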
Makes sense.  This is how Berkeley DB and many other DBs work.  In
fact, the best would be to build an abstraction cache layer around
all the query functions that have data in the DB. This way you would
get the optimum performance/scalability.
I have to admit I am not sufficiently familiar with BDB. If I
understand it right, they do configurable in-memory caching and they
also support some kind of master-slave replication. I am not sure
though how this scales...(20 SERs with 20 BDBs, one of them master and
replicating UsrLoc changes to 19 slaves who are all able to identify
inconsistent cache?)
I mean the structural problem here is dealing with r-w intensive
Usrloc operations and still desiring to replicate for reliability.
There is a variety of algorithms to deal with it and I don't know well
what the respective DB systems actually do.
I'm not proposing to use BDB, it was just an example. Databases are very
good at replication, even two-way replication can be done quite
efficiently through locking etc. I just took Paul's setup with a cluster
back-end for granted and wrote my comments based on that...
Thinking a bit wider and building on your comments, Jiri:
    The challenge, I think, is to handle the following things in any
likely deployment scenario:
1. Usrloc writes to cache vs. DB
2. Replication of usrloc, multiple DBs vs. cluster, across LAN or WAN
3. Memory caching management (inconsistencies etc)
For the sake of the readers, here is how I understand SER's operations
today:
1. Usrloc is always written to cache, DB write is controlled through
   write-through parameter
2. Replication is handled by t_replicate
3. Management of cache is not needed, the cache is always updated.
   However, an updated DB (and thus dirty cache) will not be detected
Here is how I understand Paul's proposal (and with my annotated
suggestions from my last email :-):
1. Usrloc is always written to DB, cache is updated if it is already in
   the cache
2. Replication is handled by underlying database across DBs or in a
   cluster
3. If usrloc is not found, DB is checked. If cache is full, some
   mechanism for throwing out a usrloc is devised
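
Points 1 and 3 of the proposal can be sketched together. Everything here is an assumption made for illustration: UsrlocStore is a hypothetical class, a dictionary stands in for the DB, and least-recently-used eviction is just one possible "mechanism for throwing out a usrloc", not something the thread settled on.

```python
from collections import OrderedDict


class UsrlocStore:
    """Hypothetical DB-first usrloc with a bounded in-memory cache."""

    def __init__(self, db, max_entries):
        self.db = db                      # dict standing in for the DB
        self.max_entries = max_entries
        self.cache = OrderedDict()        # oldest-used entry first

    def save(self, aor, contact):
        self.db[aor] = contact            # always written to the DB
        if aor in self.cache:             # cache updated only if already cached
            self.cache[aor] = contact

    def lookup(self, aor):
        if aor in self.cache:
            self.cache.move_to_end(aor)   # mark as most recently used
            return self.cache[aor]
        contact = self.db.get(aor)        # cache miss: check the DB
        if contact is not None:
            if len(self.cache) >= self.max_entries:
                self.cache.popitem(last=False)   # evict least recently used
            self.cache[aor] = contact
        return contact


store = UsrlocStore(db={}, max_entries=2)
for name in ("a", "b", "c"):
    store.save(name, "sip:%s@192.0.2.1" % name)

store.lookup("a")
store.lookup("b")
store.lookup("a")        # "b" is now the least recently used entry
store.lookup("c")        # cache is full, so "b" is thrown out
print(list(store.cache))
```

Note that this keeps one proxy's cache consistent with its own writes only; a write arriving through DB replication from another proxy would still go unnoticed.
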
I must admit I often fall for the argument: "let each system do what it
is best at."
Following that, replication should only be done at an application level
if the underlying database is not capable of doing it (if we agree that a
DB is good at replication).  The only thing I see a DB is not capable of
is handling the NAT issues. So, if a given usrloc has to be represented
by a different location (ex. the registration server), then the DB cannot
do replication. However, if the NAT issue is handled through some other
means, ex. Call-Id aware LVS with one public IP, then the usrloc should
be the same across DBs and the DB should handle the replication.
You don't need many subscribers before you'll want redundancy and as
active-passive redundancy is a waste of resources, I believe an upgrade
of the replication mechanism should soon be imminent. ;-)
    I think I have said this before, but this is my enterprise-level
"dream" scenario:
1. Two geographically distributed server centers
2. DNS SRV for load distribution (and possibly using segmentation of
   clients through their configurations if they don't support DNS SRV)
3. Each data center has Call-Id sensitive LVS in front, with one or more
   servers at the back (a fair-sized LVS box can handle 8,000 UDP packets
   per second)
4. Each data center either has a DB cluster or two-way SER-based
   replication
5. The data centers replicate between each other using either DB-based
   replication or two-way SER-based replication
6. The SER-based replication is an enhanced version of t_replicate()
   where replication is to a set of servers and replication is ACKed and
   guaranteed (queue). I would suggest using the XMLRPC interface Jan has
   introduced
7. I think Paul's cache-suggestions are good regardless of decisions on
   replication
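
The "ACKed and guaranteed (queue)" idea in point 6 might be sketched as one pending queue per peer. This is not how t_replicate() works today; ReplicationQueue, the send callback and the peer names are all hypothetical, and the transport (XMLRPC or otherwise) is left out.

```python
from collections import deque


class ReplicationQueue:
    """Hypothetical per-peer queue: an update stays queued until it is ACKed."""

    def __init__(self, peers, send):
        self.send = send    # callable(peer, update) -> True when the peer ACKs
        self.pending = {peer: deque() for peer in peers}

    def replicate(self, update):
        for queue in self.pending.values():
            queue.append(update)
        self.flush()

    def flush(self):
        # Deliver in order; stop at the first update a peer does not ACK.
        for peer, queue in self.pending.items():
            while queue and self.send(peer, queue[0]):
                queue.popleft()


acked = []
down = {"ser_c"}            # peers that are currently unreachable


def send(peer, update):
    if peer in down:
        return False        # no ACK received
    acked.append((peer, update))
    return True


rq = ReplicationQueue(["ser_b", "ser_c"], send)
rq.replicate("ua_a -> sip:ua_a@192.0.2.5")
queued_for_c = len(rq.pending["ser_c"])   # still waiting for ser_c

down.clear()                # ser_c comes back
rq.flush()                  # the queued update is delivered now
print(acked)
```

A real implementation would also need to bound the queues and decide what happens to a peer that stays down for a long time.
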
Entry level scenario where the same box is running LVS, SER, and DB (you
can quickly add new boxes) has a very low cost.
...
...
However, there is one more thing: You need to decide on an
algorithm for selecting a usrloc record to replace when the cache is
full.  Do you store extra info in memory for each usrloc to make the
right decision (ex. based on the number of lookups)?
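
One way to read "based on the number of lookups" is to keep a counter per cached record and replace the least-looked-up one. The sketch below assumes exactly that; CountingCache is a hypothetical name and the policy is only one of several candidates.

```python
class CountingCache:
    """Hypothetical bounded cache that evicts the least-looked-up record."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = {}   # aor -> [contact, lookup_count]

    def get(self, aor):
        entry = self.entries.get(aor)
        if entry is None:
            return None
        entry[1] += 1       # the extra per-record info kept in memory
        return entry[0]

    def put(self, aor, contact):
        if aor in self.entries:
            self.entries[aor][0] = contact   # keep the existing counter
            return
        if len(self.entries) >= self.max_entries:
            victim = min(self.entries, key=lambda k: self.entries[k][1])
            del self.entries[victim]
        self.entries[aor] = [contact, 0]


cache = CountingCache(max_entries=2)
cache.put("a", "sip:a@192.0.2.1")
cache.put("b", "sip:b@192.0.2.2")
cache.get("a")
cache.get("a")
cache.get("b")
cache.put("c", "sip:c@192.0.2.3")   # cache is full: "b" has the fewest lookups
print(sorted(cache.entries))
```

The cost is one extra integer per record; a policy like this also tends to favour old, once-popular records unless the counters are aged.
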
You may also purchase more memory :)
Do you suggest that no mechanism should be devised when the cache limit
is hit? ;-)  Then maybe I can suggest an email alert to the operator when
a certain amount of the cache is full... :-D  I trust my people to act
fast and appropriate, but not that fast and appropriate!
g-)

Serusers mailing list
serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers
