[Serusers] SER Reports "out of memory"

List overview All Threads
Download

newer

older

[Serusers] How to get the via when...

[Serusers] req: rtpproxy video...

Java Rockx

24 May 2005 24 May '05

1:09 a.m.

Hi All.

I'm using ser-0.9.2 and testing how it works when the location table has many thousands of rows.

ser will not start when the location table has many rows. SER loads the first 1344 records and then pukes on the 1345th record with an out of memory error. I've started ser with "-m 512" but that doesn't seem to make any difference.

Am I doing something wrong?

Also, for my "test" data I simply created a bunch of bogus records in the location table. I did not do anything with the subscriber table. Does this matter? If so, what needs to be done in order to create a large number of "registered" users?

Regards, Paul

Attachments:

attachment.html (text/html — 706 bytes)

Show replies by date

Zeus Ng

24 May 24 May

2:24 a.m.

This may help.

http://lists.iptel.org/pipermail/serusers/2004-December/013593.html

...

-----Original Message----- From: serusers-bounces@lists.iptel.org [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Java Rockx Sent: Tuesday, 24 May 2005 11:09 AM To: serusers Subject: [Serusers] SER Reports "out of memory"

Hi All.

I'm using ser-0.9.2 and testing how it works when the location table has many thousands of rows.

ser will not start when the location table has many rows. SER loads the first 1344 records and then pukes on the 1345th record with an out of memory error. I've started ser with "-m 512" but that doesn't seem to make any difference.

Am I doing something wrong?

Also, for my "test" data I simply created a bunch of bogus records in the location table. I did not do anything with the subscriber table. Does this matter? If so, what needs to be done in order to create a large number of "registered" users?

Regards, Paul

Java Rockx

1:25 p.m.

Zeus...thanks for the post.

However, it does a raise new (and possibly disturbing) question about SER's memory usage architecture.

Here is the question; If I have 500000 records in the location table and for the sake of this example, let's assume that each usrloc contact occupies 1024 bytes of memory. To load all 500000 usrloc records we would need (500000 x 1024) bytes of memory plus the structure overhead. So let's just estimate it to be a total of 512MB of memory for discussions sake. (I realize that the actual memory requirements will be much lower).

Now when SER starts up it needs to load all usrloc records in to memory. Per the email in the archives, SER will do this in private memory space. I assume that after all usrloc records are loaded in to private memory that they are then copied to shared memory. The amount of private memory is specified in config.h as PKG_MEM_POOL_SIZE and affects __ALL__ ser processes.

This then implies that in order to load a large number of usrloc records I need to increase PKG_MEM_POOL_SIZE to approximately 512MB which then affects __ALL__ ser processes. If I specify ser to use 8 child processes in my ser.cfg file then will ser attempt to allocate 512MB x 8 or (4GB) of memory just to start up???

If this is the case when the fundamental architecture is flawed as it would make much more sense to have a single process load usrloc records in to shared memory and then fork the children with a much smaller amount of private memory.

Am I totally missing something here?

Regards, Paul

On 5/23/05, Zeus Ng zeus.ng@isquare.com.au wrote:

...

This may help.

http://lists.iptel.org/pipermail/serusers/2004-December/013593.html

...
-----Original Message----- From: serusers-bounces@lists.iptel.org [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Java Rockx Sent: Tuesday, 24 May 2005 11:09 AM To: serusers Subject: [Serusers] SER Reports "out of memory"

Hi All.

I'm using ser-0.9.2 and testing how it works when the location table has many thousands of rows.

ser will not start when the location table has many rows. SER loads the first 1344 records and then pukes on the 1345th record with an out of memory error. I've started ser with "-m 512" but that doesn't seem to make any difference.

Am I doing something wrong?

Also, for my "test" data I simply created a bunch of bogus records in the location table. I did not do anything with the subscriber table. Does this matter? If so, what needs to be done in order to create a large number of "registered" users?

Regards, Paul

Zeus Ng

10:39 p.m.

Paul,

Like you, I'm not totally familiar with SER's memory management. I just happened to bookmark that link which I think is useful.

Before we jump into conclusion that SER memory management is at flaw, can you just recompile SER and confirm that change fix your problem. I would like to know the result too.

The loading of location table is done in the mod_init() function, not child_init() function. Base on that information, I would assume that only the master process has to allocate that much memory. The child processes need only know the pointer to that list. Thus, there is no need of 4G as in your example. Again, this is only my assumption. It's better to raise another thread to ask the maintainer.

Regards,

Zeus

...

Zeus...thanks for the post.

However, it does a raise new (and possibly disturbing) question about SER's memory usage architecture.

Here is the question; If I have 500000 records in the location table and for the sake of this example, let's assume that each usrloc contact occupies 1024 bytes of memory. To load all 500000 usrloc records we would need (500000 x 1024) bytes of memory plus the structure overhead. So let's just estimate it to be a total of 512MB of memory for discussions sake. (I realize that the actual memory requirements will be much lower).

Now when SER starts up it needs to load all usrloc records in to memory. Per the email in the archives, SER will do this in private memory space. I assume that after all usrloc records are loaded in to private memory that they are then copied to shared memory. The amount of private memory is specified in config.h as PKG_MEM_POOL_SIZE and affects __ALL__ ser processes.

This then implies that in order to load a large number of usrloc records I need to increase PKG_MEM_POOL_SIZE to approximately 512MB which then affects __ALL__ ser processes. If I specify ser to use 8 child processes in my ser.cfg file then will ser attempt to allocate 512MB x 8 or (4GB) of memory just to start up???

If this is the case when the fundamental architecture is flawed as it would make much more sense to have a single process load usrloc records in to shared memory and then fork the children with a much smaller amount of private memory.

Am I totally missing something here?

Regards, Paul

On 5/23/05, Zeus Ng zeus.ng@isquare.com.au wrote:

This may help.

http://lists.iptel.org/pipermail/serusers/2004-December/013593.html

...
-----Original Message----- From: serusers-bounces@lists.iptel.org [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Java Rockx Sent: Tuesday, 24 May 2005 11:09 AM To: serusers Subject: [Serusers] SER Reports "out of memory"

Hi All.

I'm using ser-0.9.2 and testing how it works when the location table has many thousands of rows.

ser will not start when the location table has many rows. SER loads the first 1344 records and then pukes on the 1345th record with an out of memory error. I've started ser with "-m 512" but that doesn't seem to make any difference.

Am I doing something wrong?

Also, for my "test" data I simply created a bunch of bogus records in the location table. I did not do anything with the subscriber table. Does this matter? If so, what needs to be done in order to create a large number of "registered" users?

Regards, Paul

Java Rockx

25 May 25 May

1:07 a.m.

Zeus,

Thanks for the info. I did change that config.h define and now it works well.

My newest problem is the ser start time. In my very non-scientific test it took ser about 25 minutes before it began serving requests because it was loading usrloc information.

That was using 500000 records in the location table. The MySQL server was running on the same box as SER, which is also my workstation, so stuff like Firefox, X, etc, were in use.

But this does bring up an interesting problem namely - how can ser service SIP clients while loading large number of usrloc records? I'm kind of thinking that this could be a big problem. When dealing with massive user bases there is no such thing as a "quick restart".

We do have LVS fully "sip-aware" so we are doing true UDP load balancing based on the Call-ID header, but this is still a problem [potentially] with replication ucontact records while the server is starting up.

I wonder if it is possible to modify the behaviour of usrloc so that it loads in the background while ser is processing SIP messages. And when lookup("location") is executed, usrloc searching the the ser cache and then MySQL if no hit is found in cache -- or something like that.

Can anyone on serusers give some tips as to how to get ser to load usrloc entries optimized? I know the usual stuff like faster MySQL disks, faster network connection, dedicated app servers, etc, etc. But I'm looking for ser and/or MySQL tweaking hacks.

Regards, Paul

On 5/24/05, Zeus Ng zeus.ng@isquare.com.au wrote:

...

Paul,

Like you, I'm not totally familiar with SER's memory management. I just happened to bookmark that link which I think is useful.

Before we jump into conclusion that SER memory management is at flaw, can you just recompile SER and confirm that change fix your problem. I would like to know the result too.

The loading of location table is done in the mod_init() function, not child_init() function. Base on that information, I would assume that only the master process has to allocate that much memory. The child processes need only know the pointer to that list. Thus, there is no need of 4G as in your example. Again, this is only my assumption. It's better to raise another thread to ask the maintainer.

Regards,

Zeus

...
Zeus...thanks for the post.

However, it does a raise new (and possibly disturbing) question about SER's memory usage architecture.

Here is the question; If I have 500000 records in the location table and for the sake of this example, let's assume that each usrloc contact occupies 1024 bytes of memory. To load all 500000 usrloc records we would need (500000 x 1024) bytes of memory plus the structure overhead. So let's just estimate it to be a total of 512MB of memory for discussions sake. (I realize that the actual memory requirements will be much lower).

Now when SER starts up it needs to load all usrloc records in to memory. Per the email in the archives, SER will do this in private memory space. I assume that after all usrloc records are loaded in to private memory that they are then copied to shared memory. The amount of private memory is specified in config.h as PKG_MEM_POOL_SIZE and affects __ALL__ ser processes.

This then implies that in order to load a large number of usrloc records I need to increase PKG_MEM_POOL_SIZE to approximately 512MB which then affects __ALL__ ser processes. If I specify ser to use 8 child processes in my ser.cfg file then will ser attempt to allocate 512MB x 8 or (4GB) of memory just to start up???

If this is the case when the fundamental architecture is flawed as it would make much more sense to have a single process load usrloc records in to shared memory and then fork the children with a much smaller amount of private memory.

Am I totally missing something here?

Regards, Paul

On 5/23/05, Zeus Ng zeus.ng@isquare.com.au wrote:

This may help.

http://lists.iptel.org/pipermail/serusers/2004-December/013593.html

...
-----Original Message----- From: serusers-bounces@lists.iptel.org [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Java Rockx Sent: Tuesday, 24 May 2005 11:09 AM To: serusers Subject: [Serusers] SER Reports "out of memory"

Hi All.

I'm using ser-0.9.2 and testing how it works when the location table has many thousands of rows.

ser will not start when the location table has many rows. SER loads the first 1344 records and then pukes on the 1345th record with an out of memory error. I've started ser with "-m 512" but that doesn't seem to make any difference.

Am I doing something wrong?

Also, for my "test" data I simply created a bunch of bogus records in the location table. I did not do anything with the subscriber table. Does this matter? If so, what needs to be done in order to create a large number of "registered" users?

Regards, Paul

Zeus Ng

1:42 a.m.

See inline comment.

...

Thanks for the info. I did change that config.h define and now it works well.

Great to hear that the little change solve your problem.

...

My newest problem is the ser start time. In my very non-scientific test it took ser about 25 minutes before it began serving requests because it was loading usrloc information.

That was using 500000 records in the location table. The MySQL server was running on the same box as SER, which is also my workstation, so stuff like Firefox, X, etc, were in use.

But this does bring up an interesting problem namely - how can ser service SIP clients while loading large number of usrloc records? I'm kind of thinking that this could be a big

No, you can't. In fact, you will experience a temporary slow down when a hugh number of UA is un-registering because the table was locked during that period of time. I once use sipsak to register 5000 users in 15s. When they all expired about the same time, SER hang for a while for locking the table to release the record from memory.

...

problem. When dealing with massive user bases there is no such thing as a "quick restart".

Well, that's the trade-off of memory base db. You need to balance the startup time verse runtime performance.

...

We do have LVS fully "sip-aware" so we are doing true UDP load balancing based on the Call-ID header, but this is still a problem [potentially] with replication ucontact records while the server is starting up.

I wonder if it is possible to modify the behaviour of usrloc so that it loads in the background while ser is processing SIP messages. And when lookup("location") is executed, usrloc searching the the ser cache and then MySQL if no hit is found in cache -- or something like that.

This triggers me to bring up the common question asked on this list before. Can SER use just MySQL for usrloc? A similar concept has been done on the speeddial module. It would help load distribution, faster startup time and better redundancy. Of course, slower lookup as tradeoff.

I once consider replacing the build-in memory base DB with MySQL memory db. However, that idea was dropped due to time constrain and compatability (postgresql) issue.

...

Can anyone on serusers give some tips as to how to get ser to load usrloc entries optimized? I know the usual stuff like faster MySQL disks, faster network connection, dedicated app servers, etc, etc. But I'm looking for ser and/or MySQL tweaking hacks.

Good luck on your search.

...

Regards, Paul

Greger V. Teigre

29 May 29 May

8:28 a.m.

Interesting discussion. I believe most large-scale deployments (there aren't really that many...) divide the user base across several servers. I believe they use 20K users is a "good number" per server. So, one ser having to load that many records, is only if you have a cluster with no server divide. Loading all the contacts into memory is impossible to scale, at one point it will take too long time and take too much memory. So, a better architecture *for such a deployment scenario* would be a cache of some size and then a lookup of records in DB if not present in cache. Loading 330 records per second, you can load about 20,000 contacts in a minute, which probably is ok. g-)

Zeus Ng wrote:

...

See inline comment.

...
Thanks for the info. I did change that config.h define and now it works well.

Great to hear that the little change solve your problem.

...
My newest problem is the ser start time. In my very non-scientific test it took ser about 25 minutes before it began serving requests because it was loading usrloc information.

That was using 500000 records in the location table. The MySQL server was running on the same box as SER, which is also my workstation, so stuff like Firefox, X, etc, were in use.

But this does bring up an interesting problem namely - how can ser service SIP clients while loading large number of usrloc records? I'm kind of thinking that this could be a big

No, you can't. In fact, you will experience a temporary slow down when a hugh number of UA is un-registering because the table was locked during that period of time. I once use sipsak to register 5000 users in 15s. When they all expired about the same time, SER hang for a while for locking the table to release the record from memory.

...
problem. When dealing with massive user bases there is no such thing as a "quick restart".

Well, that's the trade-off of memory base db. You need to balance the startup time verse runtime performance.

...
We do have LVS fully "sip-aware" so we are doing true UDP load balancing based on the Call-ID header, but this is still a problem [potentially] with replication ucontact records while the server is starting up.

I wonder if it is possible to modify the behaviour of usrloc so that it loads in the background while ser is processing SIP messages. And when lookup("location") is executed, usrloc searching the the ser cache and then MySQL if no hit is found in cache -- or something like that.

This triggers me to bring up the common question asked on this list before. Can SER use just MySQL for usrloc? A similar concept has been done on the speeddial module. It would help load distribution, faster startup time and better redundancy. Of course, slower lookup as tradeoff.

I once consider replacing the build-in memory base DB with MySQL memory db. However, that idea was dropped due to time constrain and compatability (postgresql) issue.

...
Can anyone on serusers give some tips as to how to get ser to load usrloc entries optimized? I know the usual stuff like faster MySQL disks, faster network connection, dedicated app servers, etc, etc. But I'm looking for ser and/or MySQL tweaking hacks.

Good luck on your search.

...
Regards, Paul

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Java Rockx

1:15 p.m.

Actually, a minute delay would be a bad thing because replicated usrloc records, using t_replicate() would not make it in to peer SER server caches when the server is starting up.

Given this fact, and given the fact that most SER modules do not hash data upon server startup [like group.so, etc, etc] we are starting to see little value in caching usrloc. Our MySQL server is hit 12 times for an INVITE message and so complete caching of usrloc is of minimal performace gain.

Anyhow, we're not in process of modifying SER so that:

* when ser starts up usrloc is "lazy-loaded" * if a usrloc record is looked up in cache and is __NOT__ found, then MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts? Paul

On 5/29/05, Greger V. Teigre greger@teigre.com wrote:

...

Interesting discussion. I believe most large-scale deployments (there aren't really that many...) divide the user base across several servers. I believe they use 20K users is a "good number" per server. So, one ser having to load that many records, is only if you have a cluster with no server divide. Loading all the contacts into memory is impossible to scale, at one point it will take too long time and take too much memory. So, a better architecture *for such a deployment scenario* would be a cache of some size and then a lookup of records in DB if not present in cache. Loading 330 records per second, you can load about 20,000 contacts in a minute, which probably is ok. g-)

Zeus Ng wrote:

...
See inline comment.

...
Thanks for the info. I did change that config.h define and now it works well.

Great to hear that the little change solve your problem.

...
My newest problem is the ser start time. In my very non-scientific test it took ser about 25 minutes before it began serving requests because it was loading usrloc information.

That was using 500000 records in the location table. The MySQL server was running on the same box as SER, which is also my workstation, so stuff like Firefox, X, etc, were in use.

But this does bring up an interesting problem namely - how can ser service SIP clients while loading large number of usrloc records? I'm kind of thinking that this could be a big

No, you can't. In fact, you will experience a temporary slow down when a hugh number of UA is un-registering because the table was locked during that period of time. I once use sipsak to register 5000 users in 15s. When they all expired about the same time, SER hang for a while for locking the table to release the record from memory.

...
problem. When dealing with massive user bases there is no such thing as a "quick restart".

Well, that's the trade-off of memory base db. You need to balance the startup time verse runtime performance.

...
We do have LVS fully "sip-aware" so we are doing true UDP load balancing based on the Call-ID header, but this is still a problem [potentially] with replication ucontact records while the server is starting up.

I wonder if it is possible to modify the behaviour of usrloc so that it loads in the background while ser is processing SIP messages. And when lookup("location") is executed, usrloc searching the the ser cache and then MySQL if no hit is found in cache -- or something like that.

This triggers me to bring up the common question asked on this list before. Can SER use just MySQL for usrloc? A similar concept has been done on the speeddial module. It would help load distribution, faster startup time and better redundancy. Of course, slower lookup as tradeoff.

I once consider replacing the build-in memory base DB with MySQL memory db. However, that idea was dropped due to time constrain and compatability (postgresql) issue.

...
Can anyone on serusers give some tips as to how to get ser to load usrloc entries optimized? I know the usual stuff like faster MySQL disks, faster network connection, dedicated app servers, etc, etc. But I'm looking for ser and/or MySQL tweaking hacks.

Good luck on your search.

...
Regards, Paul

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Juha Heinanen

1:52 p.m.

Java Rockx writes:

...

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found, then MySQL

will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

how about changing on the fly the listening address of the secondary ser when primary ser fails? that way there would be no loading delay.

-- juha

Java Rockx

4 p.m.

That won't work for us. The reason is that we have LVS fully "sip-aware" and thus we have many SER routers that are all active at the same time. This means we don't have the concept of a "primary" sip router - as the LVS SIP scheduler determines where to route a SIP message.

On 5/29/05, Juha Heinanen jh@tutpro.com wrote:

...

Java Rockx writes:

...

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found, then

MySQL

...
will be queried. If found in MySQL then the usrloc record will be put in

to

...
cache for future lookups

how about changing on the fly the listening address of the secondary ser when primary ser fails? that way there would be no loading delay.

-- juha

Juha Heinanen

4:19 p.m.

Java Rockx writes:

...

That won't work for us. The reason is that we have LVS fully "sip-aware" and thus we have many SER routers that are all active at the same time. This means we don't have the concept of a "primary" sip router - as the LVS SIP scheduler determines where to route a SIP message.

then yuu should not have any problems at all. you just wait until all contacts are loaded before you add a new proxy to the pool.

-- juha

Jiri Kuthan

9:41 p.m.

At 03:15 PM 5/29/2005, Java Rockx wrote:

...

Actually, a minute delay would be a bad thing because replicated usrloc records, using t_replicate() would not make it in to peer SER server caches when the server is starting up.

Given this fact, and given the fact that most SER modules do not hash data upon server startup [like group.so, etc, etc] we are starting to see little value in caching usrloc. Our MySQL server is hit 12 times for an INVITE message and so complete caching of usrloc is of minimal performace gain.

indeed.

...

Anyhow, we're not in process of modifying SER so that:

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found, then MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Appears reasonable to me.

Still -- with the way you are suggesting, CallID based load distribution, how do you replicate UsrLoc changes across all the servers?

-jiri

Java Rockx

30 May 30 May

1:40 a.m.

Currently, usrloc is replicated via t_replicate() using db_mode=writeback.

However, our lazy-load patch would obsolete the need for t_replicate() because we have multiple MySQL servers that are active-active so __all__ replication really occurs at the database layer rather than the SIP layer.

So in this situation, when a REGISTER message hits any SIP router, SER will do the usual stuff that it currently does (excluding t_replicate), and when it persists the user contact to MySQL, the database will replicate to the other DB servers.

Then if a usrloc record is "looked-up" on any other SIP router and a match is not found in cache, the usrloc code will query MySQL for the record, which was replicated by MySQL.

By doing this we will have a "zero-delay" SER start time, regardless of the number of records in the subscriber table and we also eliminate the possiblity of t_replicate() sending a usrloc record to peer SIP routers which didn't process the request.

Jiri, do you see any pitfalls with this school of thought.

Regards, Paul

On 5/29/05, Jiri Kuthan jiri@iptel.org wrote:

...

At 03:15 PM 5/29/2005, Java Rockx wrote:

...
Actually, a minute delay would be a bad thing because replicated usrloc

records, using t_replicate() would not make it in to peer SER server caches when the server is starting up.

...
Given this fact, and given the fact that most SER modules do not hash

data upon server startup [like group.so, etc, etc] we are starting to see little value in caching usrloc. Our MySQL server is hit 12 times for an INVITE message and so complete caching of usrloc is of minimal performace gain.

indeed.

...
Anyhow, we're not in process of modifying SER so that:

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found, then

MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

...
By doing these two things we should not have a problem we excessively

large subscriber bases.

Appears reasonable to me.

Still -- with the way you are suggesting, CallID based load distribution, how do you replicate UsrLoc changes across all the servers?

-jiri

Jiri Kuthan

9:36 a.m.

At 03:40 AM 5/30/2005, Java Rockx wrote:

...

Currently, usrloc is replicated via t_replicate() using db_mode=writeback.

However, our lazy-load patch would obsolete the need for t_replicate() because we have multiple MySQL servers that are active-active so __all__ replication really occurs at the database layer rather than the SIP layer.

So this is the point which I am still struggling with. I mean generally there is a problem of read-write intenstive UsrLoc operations. We can move it from SIP to DB. However, whichever layer we choose to solve the problem, it takes careful dimensioning. Otherwise the replication mechanism may cause peformance problems.

What Mysql setup are you exactly using? Cluster? Master/slave replication?

Otherwise I think that the cache policy "load-on-demand" makes sense.

-jiri

Java Rockx

1:33 p.m.

Our current MySQL configuration has two servers in active-active state (ie, neither is a master, both are fully active and serving requests). The MySQL servers use a Network Appliance for storage.

Regards, Paul

On 5/30/05, Jiri Kuthan jiri@iptel.org wrote:

...

At 03:40 AM 5/30/2005, Java Rockx wrote:

...
Currently, usrloc is replicated via t_replicate() using

db_mode=writeback.

...
However, our lazy-load patch would obsolete the need for t_replicate()

because we have multiple MySQL servers that are active-active so __all__ replication really occurs at the database layer rather than the SIP layer.

So this is the point which I am still struggling with. I mean generally there is a problem of read-write intenstive UsrLoc operations. We can move it from SIP to DB. However, whichever layer we choose to solve the problem, it takes careful dimensioning. Otherwise the replication mechanism may cause peformance problems.

What Mysql setup are you exactly using? Cluster? Master/slave replication?

Otherwise I think that the cache policy "load-on-demand" makes sense.

-jiri

Karl H. Putz

1:48 p.m.

...

-----Original Message-----On Behalf Of Jiri Kuthan Sent: Monday, May 30, 2005 5:36 AM At 03:40 AM 5/30/2005, Java Rockx wrote:

...
Currently, usrloc is replicated via t_replicate() using db_mode=writeback.

However, our lazy-load patch would obsolete the need for

t_replicate() because we have multiple MySQL servers that are active-active so __all__ replication really occurs at the database layer rather than the SIP layer.

So this is the point which I am still struggling with. I mean generally there is a problem of read-write intenstive UsrLoc operations. We can move it from SIP to DB. However, whichever layer we choose to solve the problem, it takes careful dimensioning. Otherwise the replication mechanism may cause peformance problems.

What Mysql setup are you exactly using? Cluster? Master/slave replication?

Otherwise I think that the cache policy "load-on-demand" makes sense.

If pure DB replication is used, what would happen in the following scenario:

A given user receives multiple calls such that more than 1 physical SER server has userloc cache populated.

The user then phsyically moves or changes return contact registration information and re-registers.

It seems that the specific SER server that handled the registration would update cache and the backend DB would be updated. But any attempt to contact the user through a SER server that has not yet expired the old cache info would fail.

Karl

...

-jiri

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Java Rockx

3:07 p.m.

Good point. So then perhaps t_replicate() [AND] save_memory() should be used as normal. And SER should then just start up using a lazy-loading mechanism. Eventually all SER routers in the farm would then have a fully populated cache and the problem would be solved.

In other words the SER server that physically recieves the original REGISTER message would save() and t_replicate().

All the peers in the farm that receive a REGISTER via the t_replicate() function would only use save_memory().

MySQL replication still occurs and if a SER server is restarted it doesn't attempt to load usrloc info upon startup, but rather loads it over a period of time. All the while, if a usrloc record is looked-up and it is on in cache, then SER would query MySQL for the correct ucontact record.

Thanks for the qustion --- I hadn't thought about that before.

Regards, Paul

On 5/30/05, Karl H. Putz kputz@columbus.rr.com wrote:

...

...
-----Original Message-----On Behalf Of Jiri Kuthan Sent: Monday, May 30, 2005 5:36 AM At 03:40 AM 5/30/2005, Java Rockx wrote:

...
Currently, usrloc is replicated via t_replicate() using

db_mode=writeback.

...
...
However, our lazy-load patch would obsolete the need for

t_replicate() because we have multiple MySQL servers that are active-active so __all__ replication really occurs at the database layer rather than the SIP layer.

So this is the point which I am still struggling with. I mean generally there is a problem of read-write intenstive UsrLoc operations. We can move it from SIP to DB. However, whichever layer we choose to solve the problem, it takes careful dimensioning. Otherwise the replication mechanism may cause peformance problems.

What Mysql setup are you exactly using? Cluster? Master/slave

replication?

...
Otherwise I think that the cache policy "load-on-demand" makes sense.

If pure DB replication is used, what would happen in the following scenario:

A given user receives multiple calls such that more than 1 physical SER server has userloc cache populated.

The user then phsyically moves or changes return contact registration information and re-registers.

It seems that the specific SER server that handled the registration would update cache and the backend DB would be updated. But any attempt to contact the user through a SER server that has not yet expired the old cache info would fail.

Karl

...
-jiri

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Greger V. Teigre

8:04 p.m.

Yes, good point, but I don't like the solution. Both doing t_replicate AND DB replication is replication at both layers... I think we here have another argument for replication at the application layer. I'm starting to get blurred vision here: What was the reason for doing DB-level replication in the first place? (And BTW, what do you gain?) g-)

---- Original Message ---- From: Java Rockx To: Karl H. Putz Cc: serusers Sent: Monday, May 30, 2005 05:07 PM Subject: Re: [Serusers] SER Reports "out of memory"

...

Good point. So then perhaps t_replicate() [AND] save_memory() should be used as normal. And SER should then just start up using a lazy-loading mechanism. Eventually all SER routers in the farm would then have a fully populated cache and the problem would be solved.

In other words the SER server that physically recieves the original REGISTER message would save() and t_replicate().

All the peers in the farm that receive a REGISTER via the t_replicate() function would only use save_memory().

MySQL replication still occurs and if a SER server is restarted it doesn't attempt to load usrloc info upon startup, but rather loads it over a period of time. All the while, if a usrloc record is looked-up and it is on in cache, then SER would query MySQL for the correct ucontact record.

Thanks for the qustion --- I hadn't thought about that before.

Regards, Paul

On 5/30/05, Karl H. Putz kputz@columbus.rr.com wrote:

...
-----Original Message-----On Behalf Of Jiri Kuthan Sent: Monday, May 30, 2005 5:36 AM At 03:40 AM 5/30/2005, Java Rockx wrote:

...
Currently, usrloc is replicated via t_replicate() using db_mode=writeback.

However, our lazy-load patch would obsolete the need for

t_replicate() because we have multiple MySQL servers that are active-active so __all__ replication really occurs at the database layer rather than the SIP layer.

So this is the point which I am still struggling with. I mean generally there is a problem of read-write intenstive UsrLoc operations. We can move it from SIP to DB. However, whichever layer we choose to solve the problem, it takes careful dimensioning. Otherwise the replication mechanism may cause peformance problems.

What Mysql setup are you exactly using? Cluster? Master/slave replication?

Otherwise I think that the cache policy "load-on-demand" makes sense.

If pure DB replication is used, what would happen in the following scenario:

A given user receives multiple calls such that more than 1 physical SER server has userloc cache populated.

The user then phsyically moves or changes return contact registration information and re-registers.

It seems that the specific SER server that handled the registration would update cache and the backend DB would be updated. But any attempt to contact the user through a SER server that has not yet expired the old cache info would fail.

Karl

...
-jiri

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Greger V. Teigre

7:24 a.m.

...

Actually, a minute delay would be a bad thing because replicated usrloc records, using t_replicate() would not make it in to peer SER server caches when the server is starting up.

Yeah, I forgot about that scheme...

...

Given this fact, and given the fact that most SER modules do not hash data upon server startup [like group.so, etc, etc] we are starting to see little value in caching usrloc. Our MySQL server is hit 12 times for an INVITE message and so complete caching of usrloc is of minimal performace gain.

Anyhow, we're not in process of modifying SER so that:

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found, then

MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts?

Makes sense. This is how Berkeley DB and many other DBs work. In fact, the best would be to build an abstraction cache layer around all the query functions that have data in the DB. This way you would get the optimum performance/scalability.

However, there is one more thing: You need to decide on an algorithm for selecting a usrloc record to replace when the cache is full. Do you store extra info in memory for each usrloc to make the right decision (ex. based on the number of lookups). Also, what to do when you are storing a new location with save: Do you put it in the cache as well? Today this happens automatically. As you have continous registrations, you will fill up the cache with registered clients (and push out the ones having been called). What you really want is to keep the user locations you need (those required by lookup) in the cache. So I would suggest that in save(), you only write to the DB (and of course update the record if its in the cache) and that lookup() is the function that will control the activation and replacement of the records in the cache.

I think this approach to caching is of interest also to those who do not have a mysql cluster, but do regular replication, for example to reduce start-up time. I believe an implementation may get pretty involved (in terms of important functions you need to touch). However, I cannot see that you will need to touch the core.

g-)

...

Paul

On 5/29/05, Greger V. Teigre greger@teigre.com wrote: Interesting discussion. I believe most large-scale deployments (there aren't really that many...) divide the user base across several servers. I believe they use 20K users is a "good number" per server. So, one ser having to load that many records, is only if you have a cluster with no server divide. Loading all the contacts into memory is impossible to scale, at one point it will take too long time and take too much memory. So, a better architecture *for such a deployment scenario* would be a cache of some size and then a lookup of records in DB if not present in cache. Loading 330 records per second, you can load about 20,000 contacts in a minute, which probably is ok. g-)

Zeus Ng wrote:

...
See inline comment.

...
Thanks for the info. I did change that config.h define and now it works well.

Great to hear that the little change solve your problem.

...
My newest problem is the ser start time. In my very non-scientific test it took ser about 25 minutes before it began serving requests because it was loading usrloc information.

That was using 500000 records in the location table. The MySQL server was running on the same box as SER, which is also my workstation, so stuff like Firefox, X, etc, were in use.

But this does bring up an interesting problem namely - how can ser service SIP clients while loading large number of usrloc records? I'm kind of thinking that this could be a big

No, you can't. In fact, you will experience a temporary slow down when a hugh number of UA is un-registering because the table was locked during that period of time. I once use sipsak to register 5000 users in 15s. When they all expired about the same time, SER hang for a while for locking the table to release the record from memory.

...
problem. When dealing with massive user bases there is no such thing as a "quick restart".

Well, that's the trade-off of memory base db. You need to balance the startup time verse runtime performance.

...
We do have LVS fully "sip-aware" so we are doing true UDP load balancing based on the Call-ID header, but this is still a problem [potentially] with replication ucontact records while the server is starting up.

I wonder if it is possible to modify the behaviour of usrloc so that it loads in the background while ser is processing SIP messages. And when lookup("location") is executed, usrloc searching the the ser cache and then MySQL if no hit is found in cache -- or something like that.

This triggers me to bring up the common question asked on this list before. Can SER use just MySQL for usrloc? A similar concept has been done on the speeddial module. It would help load distribution, faster startup time and better redundancy. Of course, slower lookup as tradeoff.

I once consider replacing the build-in memory base DB with MySQL memory db. However, that idea was dropped due to time constrain and compatability (postgresql) issue.

...
Can anyone on serusers give some tips as to how to get ser to load usrloc entries optimized? I know the usual stuff like faster MySQL disks, faster network connection, dedicated app servers, etc, etc. But I'm looking for ser and/or MySQL tweaking hacks.

Good luck on your search.

...
Regards, Paul

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Jiri Kuthan

8:54 a.m.

At 09:24 AM 5/30/2005, Greger V. Teigre wrote:

[...]

...

...

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found, then

MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts?

Makes sense. This is how Berkeley DB and many other DBs work. In fact, the best would be to build an abstraction cache layer around all the query functions that have data in the DB. This way you would get the optimum performance/scalability.

I have to admit I am not sufficiently familiarized with BDB. If I understand it right, they do confgurable in-memory caching and they also support some kind of master-slave replication. I am not sure though how this scales...(20 SERs with 20 BDBs, one of them master and replicating UsrLoc changes to 19 slaves who are all able to identify inconsistent cache?)

I mean the structural problem here is dealing with r-w intensive Usrloc operations and still desiring to replicate for reliability. There is a variety of algorithms to deal with it and I don't know well what the respective DB systems actually do.

...

However, there is one more thing: You need to decide on an algorithm for selecting a usrloc record to replace when the cache is full. Do you store extra info in memory for each usrloc to make the right decision (ex. based on the number of lookups).

You may also purchase more memory :)

-jiri

Greger V. Teigre

12:11 p.m.

See inline. Jiri Kuthan wrote:

...

At 09:24 AM 5/30/2005, Greger V. Teigre wrote:

[...]

...
...

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found,

then MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts?

Makes sense. This is how Berkeley DB and many other DBs work. In fact, the best would be to build an abstraction cache layer around all the query functions that have data in the DB. This way you would get the optimum performance/scalability.

I have to admit I am not sufficiently familiarized with BDB. If I understand it right, they do confgurable in-memory caching and they also support some kind of master-slave replication. I am not sure though how this scales...(20 SERs with 20 BDBs, one of them master and replicating UsrLoc changes to 19 slaves who are all able to identify inconsistent cache?)

I mean the structural problem here is dealing with r-w intensive Usrloc operations and still desiring to replicate for reliability. There is a variety of algorithms to deal with it and I don't know well what the respective DB systems actually do.

I'm not proposing to use BDB, it was just an example. Databases are very good at replication, even two-way replication can be done quite efficiently through locking etc. I just took Paul's setup with cluster back-end as granted and wrote my comments based on that...

Thinking a bit wider and building on your comments, Jiri: The challenge, I think, is to handle the following things in any likely deployment scenario: 1. Usrloc writes to cache vs. DB 2. Replication of usrloc, multiple DBs vs. cluster, across LAN or WAN 3. Memory caching management (inconsistencies etc)

For the sake of the readers, here is how I understand SER's operations today: 1. Usrloc is always written to cache, DB write is controlled through write-through parameter 2. Replication is handled by t_replicate 3. Management of cache is not needed, the cache is always updated. However, an updated DB (and thus dirty cache) will not be detected

Here is how I understand Paul's proposal (and with my annotated suggestions from my last email :-): 1. Usrloc is always written to DB, cache is updated if it is already in the cache 2. Replication is handled by underlying database across DBs or in a cluster 3. If usrloc is not found, DB is checked. If cache is full, some mechanism for throwing out a usrloc is devised

I must admit I often fall for the argument: "let each system do what it is best at." Following that, replication should only be done at an application level if the underlying database is not capable of doing it (if we agree that a DB is good at replication). The only thing I see a DB is not capable of, is handling the NAT issues. So, if a given usrloc has to be represented by different location (ex. the registration server), then the DB cannot do replication. However, if the NAT issue is handled through some other means, ex. Call-Id aware LVS with one public IP, then the usrloc should be the same across DBs and the DB should handle the replication.

You don't need many subscribers before you'll want redundancy and as active-passive redundancy is a waste of resources, I believe an upgrade of the replication mechanism should soon be imminent. ;-) I think I have said this before, but this is my enterprise-level "dream" scenario: 1. Two geographically distributed server centers 2. DNS SRV for load distribution (and possible using segmentation of clients through their configurations if they don't support DNS SRV) 3. Each data center has Call-Id sensitive LVS in front, with one or more servers at the back (a fair-sized LVS box can handle 8,000 UDP packets per second) 4. Each data center either has a DB cluster or two-ways SER-based replication 5. The data centers replicate between each other using either DB-based replication or two-ways SER-based replication 6. The SER-based replication is an enhanced version of t_replicate() were replication is to a set of servers and replication is ACKed and guaranteed (queue). I would suggest using the XMLRPC interface Jan has introduced 7. I think Paul's cache-suggestions are good regardless of decisions on replication

Entry level scenario where the same box is running LVS, SER, and DB (you can quickly add new boxes) has a very low cost.

...

...
However, there is one more thing: You need to decide on an algorithm for selecting a usrloc record to replace when the cache is full. Do you store extra info in memory for each usrloc to make the right decision (ex. based on the number of lookups).

You may also purchase more memory :)

Do you suggest that no mechanism should be devised when the cache limit is hit? ;-) Then maybe I can suggest an email alert to the operator when a certain amount of the cache is full... :-D I trust my people to act fast and appropriate, but not that fast and appropriate!

g-)

Jan Janak

6:18 p.m.

On 30-05-2005 14:11, Greger V. Teigre wrote:

...

See inline. Jiri Kuthan wrote:

...
At 09:24 AM 5/30/2005, Greger V. Teigre wrote:

[...]

...
...

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found,

then MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts?

Makes sense. This is how Berkeley DB and many other DBs work. In fact, the best would be to build an abstraction cache layer around all the query functions that have data in the DB. This way you would get the optimum performance/scalability.

I have to admit I am not sufficiently familiarized with BDB. If I understand it right, they do confgurable in-memory caching and they also support some kind of master-slave replication. I am not sure though how this scales...(20 SERs with 20 BDBs, one of them master and replicating UsrLoc changes to 19 slaves who are all able to identify inconsistent cache?)

I mean the structural problem here is dealing with r-w intensive Usrloc operations and still desiring to replicate for reliability. There is a variety of algorithms to deal with it and I don't know well what the respective DB systems actually do.

I'm not proposing to use BDB, it was just an example. Databases are very good at replication, even two-way replication can be done quite efficiently through locking etc. I just took Paul's setup with cluster back-end as granted and wrote my comments based on that...

Thinking a bit wider and building on your comments, Jiri: The challenge, I think, is to handle the following things in any likely deployment scenario:

Usrloc writes to cache vs. DB

Replication of usrloc, multiple DBs vs. cluster, across LAN or WAN

Memory caching management (inconsistencies etc)

For the sake of the readers, here is how I understand SER's operations today:

Usrloc is always written to cache, DB write is controlled through

write-through parameter 2. Replication is handled by t_replicate 3. Management of cache is not needed, the cache is always updated. However, an updated DB (and thus dirty cache) will not be detected

I am working on that already. The entries in the usrloc cache will have an additional expires value and if that value expires then the usrloc code will refresh it from the database. Also there will be no full cache anymore -- usrloc will cache only a portion of the whole location database and old entries will be using LRU scheme.

The cache will be empty upon startup. When SER calls lookup then usrloc will search the cache -- if there is no entry or if it is expired then it will load it from the database and store in the cache for limited period of time. If there is no entry in the database then it will create a negative cache entry (to limit the amount of unsuccessful database queries).

Database updates will not assume anything about the state of the database so it should not matter if the entry still exists / does not exists / has been modified..

There is one drawback though -- nathelper as it is implemented right now will not work anymore -- we would need to rewrite it to use the contents of the database.

...

Here is how I understand Paul's proposal (and with my annotated suggestions from my last email :-):

Usrloc is always written to DB, cache is updated if it is already in the

cache 2. Replication is handled by underlying database across DBs or in a cluster 3. If usrloc is not found, DB is checked. If cache is full, some mechanism for throwing out a usrloc is devised

I must admit I often fall for the argument: "let each system do what it is best at." Following that, replication should only be done at an application level if the underlying database is not capable of doing it (if we agree that a DB is good at replication). The only thing I see a DB is not capable of, is handling the NAT issues. So, if a given usrloc has to be represented by different location (ex. the registration server), then the DB cannot do replication. However, if the NAT issue is handled through some other means, ex. Call-Id aware LVS with one public IP, then the usrloc should be the same across DBs and the DB should handle the replication.

Another approach would be to let the user agent handle NATs. Sipura phones, for example, can register with two proxy servers.

...

You don't need many subscribers before you'll want redundancy and as active-passive redundancy is a waste of resources, I believe an upgrade of the replication mechanism should soon be imminent. ;-) I think I have said this before, but this is my enterprise-level "dream" scenario:

Two geographically distributed server centers

DNS SRV for load distribution (and possible using segmentation of

clients through their configurations if they don't support DNS SRV) 3. Each data center has Call-Id sensitive LVS in front, with one or more servers at the back (a fair-sized LVS box can handle 8,000 UDP packets per second) 4. Each data center either has a DB cluster or two-ways SER-based replication 5. The data centers replicate between each other using either DB-based replication or two-ways SER-based replication 6. The SER-based replication is an enhanced version of t_replicate() were replication is to a set of servers and replication is ACKed and guaranteed (queue). I would suggest using the XMLRPC interface Jan has introduced 7. I think Paul's cache-suggestions are good regardless of decisions on replication

Entry level scenario where the same box is running LVS, SER, and DB (you can quickly add new boxes) has a very low cost.

...
...
However, there is one more thing: You need to decide on an algorithm for selecting a usrloc record to replace when the cache is full. Do you store extra info in memory for each usrloc to make the right decision (ex. based on the number of lookups).

You may also purchase more memory :)

Do you suggest that no mechanism should be devised when the cache limit is hit? ;-) Then maybe I can suggest an email alert to the operator when a certain amount of the cache is full... :-D I trust my people to act fast and appropriate, but not that fast and appropriate!

g-)

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Java Rockx

7:26 p.m.

Jan,

Great! I can only image that your very busy, however, do you have any sort of time frame for this to be commited to CVS?

Regards, Paul

On 5/30/05, Jan Janak jan@iptel.org wrote:

...

On 30-05-2005 14:11, Greger V. Teigre wrote:

...
See inline. Jiri Kuthan wrote:

...
At 09:24 AM 5/30/2005, Greger V. Teigre wrote:

[...]

...
...

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found,

then MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts?

Makes sense. This is how Berkeley DB and many other DBs work. In fact, the best would be to build an abstraction cache layer around all the query functions that have data in the DB. This way you would get the optimum performance/scalability.

I have to admit I am not sufficiently familiarized with BDB. If I understand it right, they do confgurable in-memory caching and they also support some kind of master-slave replication. I am not sure though how this scales...(20 SERs with 20 BDBs, one of them master and replicating UsrLoc changes to 19 slaves who are all able to identify inconsistent cache?)

I mean the structural problem here is dealing with r-w intensive Usrloc operations and still desiring to replicate for reliability. There is a variety of algorithms to deal with it and I don't know well what the respective DB systems actually do.

I'm not proposing to use BDB, it was just an example. Databases are very good at replication, even two-way replication can be done quite

efficiently

...
through locking etc. I just took Paul's setup with cluster back-end as granted and wrote my comments based on that...

Thinking a bit wider and building on your comments, Jiri: The challenge, I think, is to handle the following things in any likely deployment scenario:

Usrloc writes to cache vs. DB

Replication of usrloc, multiple DBs vs. cluster, across LAN or WAN

Memory caching management (inconsistencies etc)

For the sake of the readers, here is how I understand SER's operations today:

Usrloc is always written to cache, DB write is controlled through

write-through parameter 2. Replication is handled by t_replicate 3. Management of cache is not needed, the cache is always updated.

However,

...
an updated DB (and thus dirty cache) will not be detected

I am working on that already. The entries in the usrloc cache will have an additional expires value and if that value expires then the usrloc code will refresh it from the database. Also there will be no full cache anymore -- usrloc will cache only a portion of the whole location database and old entries will be using LRU scheme.

The cache will be empty upon startup. When SER calls lookup then usrloc will search the cache -- if there is no entry or if it is expired then it will load it from the database and store in the cache for limited period of time. If there is no entry in the database then it will create a negative cache entry (to limit the amount of unsuccessful database queries).

Database updates will not assume anything about the state of the database so it should not matter if the entry still exists / does not exists / has been modified..

There is one drawback though -- nathelper as it is implemented right now will not work anymore -- we would need to rewrite it to use the contents of the database.

...
Here is how I understand Paul's proposal (and with my annotated

suggestions

...
from my last email :-):

Usrloc is always written to DB, cache is updated if it is already in

the

...
cache 2. Replication is handled by underlying database across DBs or in a

cluster

...

If usrloc is not found, DB is checked. If cache is full, some

mechanism

...
for throwing out a usrloc is devised

I must admit I often fall for the argument: "let each system do what it

is

...
best at." Following that, replication should only be done at an application level

if

...
the underlying database is not capable of doing it (if we agree that a

DB

...
is good at replication). The only thing I see a DB is not capable of, is handling the NAT issues. So, if a given usrloc has to be represented by different location (ex. the registration server), then the DB cannot do replication. However, if the NAT issue is handled through some other

means,

...
ex. Call-Id aware LVS with one public IP, then the usrloc should be the same across DBs and the DB should handle the replication.

Another approach would be to let the user agent handle NATs. Sipura phones, for example, can register with two proxy servers.

...
You don't need many subscribers before you'll want redundancy and as active-passive redundancy is a waste of resources, I believe an upgrade

of

...
the replication mechanism should soon be imminent. ;-) I think I have said this before, but this is my enterprise-level "dream" scenario:

Two geographically distributed server centers

DNS SRV for load distribution (and possible using segmentation of

clients through their configurations if they don't support DNS SRV) 3. Each data center has Call-Id sensitive LVS in front, with one or more servers at the back (a fair-sized LVS box can handle 8,000 UDP packets

per

...
second) 4. Each data center either has a DB cluster or two-ways SER-based replication 5. The data centers replicate between each other using either DB-based replication or two-ways SER-based replication 6. The SER-based replication is an enhanced version of t_replicate()

were

...
replication is to a set of servers and replication is ACKed and

guaranteed

...
(queue). I would suggest using the XMLRPC interface Jan has introduced 7. I think Paul's cache-suggestions are good regardless of decisions on replication

Entry level scenario where the same box is running LVS, SER, and DB (you can quickly add new boxes) has a very low cost.

...
...
However, there is one more thing: You need to decide on an algorithm for selecting a usrloc record to replace when the cache is full. Do you store extra info in memory for each usrloc to make the right decision (ex. based on the number of lookups).

You may also purchase more memory :)

Do you suggest that no mechanism should be devised when the cache limit

is

...
hit? ;-) Then maybe I can suggest an email alert to the operator when a certain amount of the cache is full... :-D I trust my people to act fast and appropriate, but not that fast and appropriate!

g-)

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Greger V. Teigre

7:56 p.m.

Jan Janak wrote:

...

On 30-05-2005 14:11, Greger V. Teigre wrote:

...
For the sake of the readers, here is how I understand SER's operations today:

Usrloc is always written to cache, DB write is controlled through

write-through parameter 2. Replication is handled by t_replicate 3. Management of cache is not needed, the cache is always updated. However, an updated DB (and thus dirty cache) will not be detected

I am working on that already. The entries in the usrloc cache will have an additional expires value and if that value expires then the usrloc code will refresh it from the database. Also there will be no full cache anymore -- usrloc will cache only a portion of the whole location database and old entries will be using LRU scheme.

Excellent news!

...

The cache will be empty upon startup. When SER calls lookup then usrloc will search the cache -- if there is no entry or if it is expired then it will load it from the database and store in the cache for limited period of time. If there is no entry in the database then it will create a negative cache entry (to limit the amount of unsuccessful database queries).

Database updates will not assume anything about the state of the database so it should not matter if the entry still exists / does not exists / has been modified..

I assume another ser server receiving an update must both write to DB and do a t_replicate to other ser servers who then do save_memory() as Paul responded in his last email to Karl?

...

There is one drawback though -- nathelper as it is implemented right now will not work anymore -- we would need to rewrite it to use the contents of the database.

Are you referring to the ping feature only or are there other things as well? Reading all NATed devices from DB every 30 seconds?

...

...
I must admit I often fall for the argument: "let each system do what it is best at." Following that, replication should only be done at an application level if the underlying database is not capable of doing it (if we agree that a DB is good at replication). The only thing I see a DB is not capable of, is handling the NAT issues. So, if a given usrloc has to be represented by different location (ex. the registration server), then the DB cannot do replication. However, if the NAT issue is handled through some other means, ex. Call-Id aware LVS with one public IP, then the usrloc should be the same across DBs and the DB should handle the replication.

Another approach would be to let the user agent handle NATs. Sipura phones, for example, can register with two proxy servers.

Yes, good point. But I'm not aware of many other user agents with such support yet?!

g-)

Jan Janak

8:33 p.m.

On 30-05-2005 21:56, Greger V. Teigre wrote:

...

...
Database updates will not assume anything about the state of the database so it should not matter if the entry still exists / does not exists / has been modified..

I assume another ser server receiving an update must both write to DB and do a t_replicate to other ser servers who then do save_memory() as Paul responded in his last email to Karl?

t_replicate and save_memory will be necessary no more. Updates will propagate through the database. You can use mysql cluster and then all ser instances can be configured with the same table -- mysql cluster can handle conflict resolution internally.

...

...
There is one drawback though -- nathelper as it is implemented right now will not work anymore -- we would need to rewrite it to use the contents of the database.

Are you referring to the ping feature only or are there other things as well? Reading all NATed devices from DB every 30 seconds?

Yes, I am referring to the NAT pinger. We would need something that would walk through the entries in the database and send ping packets to those that are behind NAT.

Jan.

Greger V. Teigre

8:48 p.m.

Jan Janak wrote:

...

On 30-05-2005 21:56, Greger V. Teigre wrote:

...
...
Database updates will not assume anything about the state of the database so it should not matter if the entry still exists / does not exists / has been modified..

I assume another ser server receiving an update must both write to DB and do a t_replicate to other ser servers who then do save_memory() as Paul responded in his last email to Karl?

t_replicate and save_memory will be necessary no more. Updates will propagate through the database. You can use mysql cluster and then all ser instances can be configured with the same table -- mysql cluster can handle conflict resolution internally.

:-D Smashing! How is the scenario handled that Karl sketched? One ser is updated with a usrloc that exists in another ser's memory.

...

...
...
There is one drawback though -- nathelper as it is implemented right now will not work anymore -- we would need to rewrite it to use the contents of the database.

Are you referring to the ping feature only or are there other things as well? Reading all NATed devices from DB every 30 seconds?

Yes, I am referring to the NAT pinger. We would need something that would walk through the entries in the database and send ping packets to those that are behind NAT.

Hm. And doing this efficiently... g-)

Zeus Ng

31 May 31 May

3:10 a.m.

Should we start a new thread on this? It's becoming way pass the original question Paul asked.

Anyway, I feel the idea of having cached usrloc not worthy. Instead, a pure DB usrloc would be better off.

Lets consider a pure VoIP scenario. That is only REGISTER, INVITE, BYE, ACK and CANCEL requests and nothing else.

Now UA_A send it's REGISTER request to SER_A. The contact is being saved on SER_A's memory and DB. The information is replicated to SER_B's DB by either replication or shared cluster. An INVITE request is send to SER_B to locate UA_A. The contact cannot be found in SER_B's memory and a DB lookup is done and populate to SER_B's memory.

Now, UA_A sends another REGISTER request to SER_A with a different contact (maybe a reboot or change of IP via DHCP). This information is updated on SER_A's memory and DB. This information propagates to SER_B's DB as well.

UA_B sends another INVITE to SER_B to locate UA_A. SER_B can find the "old" UA_A contact detail in memory, which is different from DB version. The call will not be established because of wrong caching information.

If we are to use pure DB only usrloc. That problem will not happen.

To the argument of heavy RW on usrloc, I don't find caching helps. Consider a UA does a REGISTER request every 5 minutes. The REGISTER request will always behave the same. For other requests, there is always a DB read within that 5 minutes interval to populate the cache. Only subsequence requests go through the cache. You may have a difference call patten but I find rarely two INVITEs to the same UA within 5 minutes interval. After that 5 minutes, the cache becomes invalid and a DB read is required. Using cache saves nothing on DB read here. (Obviously you can argue about longer time between REGISTER request or contact rarely change with hardphone but the same principle applies) Besides, I have good experience with MySQL caching results. So, DB lookup on every SIP request with proper DB tunning can achieve similar result.

I am in the process of writing a pure DB base usrloc. The lookup() part is running fine. I just need to finish the save() and expire() functions. NAT handling will be added at a later stage. Will post the code once I finish that.

Zeus

...

-----Original Message----- From: serusers-bounces@lists.iptel.org [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Greger V. Teigre Sent: Monday, 30 May 2005 10:11 PM To: Java Rockx; Jiri Kuthan Cc: serusers Subject: Re: [Serusers] SER Reports "out of memory"

See inline. Jiri Kuthan wrote:

...
At 09:24 AM 5/30/2005, Greger V. Teigre wrote:

[...]

...
...

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found,

then MySQL will be queried. If found in MySQL then the

usrloc record

...
...
...
will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts?

Makes sense. This is how Berkeley DB and many other DBs work. In fact, the best would be to build an abstraction cache layer around all the query functions that have data in the DB. This way

you would

...
...
get the optimum performance/scalability.

I have to admit I am not sufficiently familiarized with BDB. If I understand it right, they do confgurable in-memory caching and they also support some kind of master-slave replication. I am not sure though how this scales...(20 SERs with 20 BDBs, one of them

master and

...
replicating UsrLoc changes to 19 slaves who are all able to

identify

...
inconsistent cache?)

I mean the structural problem here is dealing with r-w intensive Usrloc operations and still desiring to replicate for reliability. There is a variety of algorithms to deal with it and I

don't know well

...
what the respective DB systems actually do.

I'm not proposing to use BDB, it was just an example. Databases are very good at replication, even two-way replication can be done quite efficiently through locking etc. I just took Paul's setup with cluster back-end as granted and wrote my comments based on that...

Thinking a bit wider and building on your comments, Jiri: The challenge, I think, is to handle the following things in any likely deployment scenario:

Usrloc writes to cache vs. DB

Replication of usrloc, multiple DBs vs. cluster, across

LAN or WAN 3. Memory caching management (inconsistencies etc)

For the sake of the readers, here is how I understand SER's operations today:

Usrloc is always written to cache, DB write is controlled through

write-through parameter 2. Replication is handled by t_replicate 3. Management of cache is not needed, the cache is always updated. However, an updated DB (and thus dirty cache) will not be detected

Here is how I understand Paul's proposal (and with my annotated suggestions from my last email :-):

Usrloc is always written to DB, cache is updated if it is

already in the cache 2. Replication is handled by underlying database across DBs or in a cluster 3. If usrloc is not found, DB is checked. If cache is full, some mechanism for throwing out a usrloc is devised

I must admit I often fall for the argument: "let each system do what it is best at." Following that, replication should only be done at an application level if the underlying database is not capable of doing it (if we agree that a DB is good at replication). The only thing I see a DB is not capable of, is handling the NAT issues. So, if a given usrloc has to be represented by different location (ex. the registration server), then the DB cannot do replication. However, if the NAT issue is handled through some other means, ex. Call-Id aware LVS with one public IP, then the usrloc should be the same across DBs and the DB should handle the replication.

You don't need many subscribers before you'll want redundancy and as active-passive redundancy is a waste of resources, I believe an upgrade of the replication mechanism should soon be imminent. ;-) I think I have said this before, but this is my enterprise-level "dream" scenario:

Two geographically distributed server centers

DNS SRV for load distribution (and possible using

segmentation of clients through their configurations if they don't support DNS SRV) 3. Each data center has Call-Id sensitive LVS in front, with one or more servers at the back (a fair-sized LVS box can handle 8,000 UDP packets per second) 4. Each data center either has a DB cluster or two-ways SER-based replication 5. The data centers replicate between each other using either DB-based replication or two-ways SER-based replication 6. The SER-based replication is an enhanced version of t_replicate() were replication is to a set of servers and replication is ACKed and guaranteed (queue). I would suggest using the XMLRPC interface Jan has introduced 7. I think Paul's cache-suggestions are good regardless of decisions on replication

Entry level scenario where the same box is running LVS, SER, and DB (you can quickly add new boxes) has a very low cost.

...
...
However, there is one more thing: You need to decide on an algorithm for selecting a usrloc record to replace when

the cache is

...
...
full. Do you store extra info in memory for each usrloc

to make the

...
...
right decision (ex. based on the number of lookups).

You may also purchase more memory :)

Do you suggest that no mechanism should be devised when the cache limit is hit? ;-) Then maybe I can suggest an email alert to the operator when a certain amount of the cache is full... :-D I trust my people to act fast and appropriate, but not that fast and appropriate!

g-)

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Greger V. Teigre

5:18 a.m.

New subject: [Serusers] SER usrloc caching, was: SER Reports "out of memory"

Inline. ----- Original Message ----- From: "Zeus Ng" zeus.ng@isquare.com.au

...

Anyway, I feel the idea of having cached usrloc not worthy. Instead, a pure DB usrloc would be better off.

Lets consider a pure VoIP scenario. That is only REGISTER, INVITE, BYE, ACK and CANCEL requests and nothing else.

Now UA_A send it's REGISTER request to SER_A. The contact is being saved on SER_A's memory and DB. The information is replicated to SER_B's DB by either replication or shared cluster. An INVITE request is send to SER_B to locate UA_A. The contact cannot be found in SER_B's memory and a DB lookup is done and populate to SER_B's memory.

Now, UA_A sends another REGISTER request to SER_A with a different contact (maybe a reboot or change of IP via DHCP). This information is updated on SER_A's memory and DB. This information propagates to SER_B's DB as well.

UA_B sends another INVITE to SER_B to locate UA_A. SER_B can find the "old" UA_A contact detail in memory, which is different from DB version. The call will not be established because of wrong caching information.

If we are to use pure DB only usrloc. That problem will not happen.

Yes, I believe this is the same problem Karl pointed out? In itself it poses an issue, but I think the broader scalability impact is more important.

...

To the argument of heavy RW on usrloc, I don't find caching helps. Consider a UA does a REGISTER request every 5 minutes. The REGISTER request will always behave the same.

Well, if the cached usrloc is the same as in the REGISTER request, you don't need to write, do you? This should have a huge impact on the number of necessary writes as without a cache, you must write every time.

...

For other requests, there is always a DB read within that 5 minutes interval to populate the cache. Only subsequence requests go through the cache. You may have a difference call patten but I find rarely two INVITEs to the same UA within 5 minutes interval. After that 5 minutes, the cache becomes invalid and a DB read is required. Using cache saves nothing on DB read here. (Obviously you can argue about longer time between REGISTER request or contact rarely change with hardphone but the same principle applies)

I'm not sure I understand. The point here is to look at the lifetime of the usrloc, which arguably is (in most cases) far longer than five minutes. I don't follow you in that the cache is invalid after five minutes?!

...

Besides, I have good experience with MySQL caching results. So, DB lookup on every SIP request with proper DB tunning can achieve similar result.

That IS a good point. However, most of the overhead is not in the actual lookup, but in the call to mysql. I'm not sure what that overhead is, if it is very low, you're point hits the "let each system do what it's best at..." ;-)

...

I am in the process of writing a pure DB base usrloc. The lookup() part is running fine. I just need to finish the save() and expire() functions. NAT handling will be added at a later stage. Will post the code once I finish that.

If you have a hope that this may get into the CVS, I hope you contacted Jan before you started the development. He will be able to point out any overlaps in development (ex. his own), as well as point you to any issues. g-)

...

...
-----Original Message----- From: serusers-bounces@lists.iptel.org [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Greger V. Teigre Sent: Monday, 30 May 2005 10:11 PM To: Java Rockx; Jiri Kuthan Cc: serusers Subject: Re: [Serusers] SER Reports "out of memory"

See inline. Jiri Kuthan wrote:

...
At 09:24 AM 5/30/2005, Greger V. Teigre wrote:

[...]

...
...

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found,

then MySQL will be queried. If found in MySQL then the

usrloc record

...
...
...
will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts?

Makes sense. This is how Berkeley DB and many other DBs work. In fact, the best would be to build an abstraction cache layer around all the query functions that have data in the DB. This way

you would

...
...
get the optimum performance/scalability.

I have to admit I am not sufficiently familiarized with BDB. If I understand it right, they do confgurable in-memory caching and they also support some kind of master-slave replication. I am not sure though how this scales...(20 SERs with 20 BDBs, one of them

master and

...
replicating UsrLoc changes to 19 slaves who are all able to

identify

...
inconsistent cache?)

I mean the structural problem here is dealing with r-w intensive Usrloc operations and still desiring to replicate for reliability. There is a variety of algorithms to deal with it and I

don't know well

...
what the respective DB systems actually do.

I'm not proposing to use BDB, it was just an example. Databases are very good at replication, even two-way replication can be done quite efficiently through locking etc. I just took Paul's setup with cluster back-end as granted and wrote my comments based on that...

Thinking a bit wider and building on your comments, Jiri: The challenge, I think, is to handle the following things in any likely deployment scenario:

Usrloc writes to cache vs. DB

Replication of usrloc, multiple DBs vs. cluster, across

LAN or WAN 3. Memory caching management (inconsistencies etc)

For the sake of the readers, here is how I understand SER's operations today:

Usrloc is always written to cache, DB write is controlled through

write-through parameter 2. Replication is handled by t_replicate 3. Management of cache is not needed, the cache is always updated. However, an updated DB (and thus dirty cache) will not be detected

Here is how I understand Paul's proposal (and with my annotated suggestions from my last email :-):

Usrloc is always written to DB, cache is updated if it is

already in the cache 2. Replication is handled by underlying database across DBs or in a cluster 3. If usrloc is not found, DB is checked. If cache is full, some mechanism for throwing out a usrloc is devised

I must admit I often fall for the argument: "let each system do what it is best at." Following that, replication should only be done at an application level if the underlying database is not capable of doing it (if we agree that a DB is good at replication). The only thing I see a DB is not capable of, is handling the NAT issues. So, if a given usrloc has to be represented by different location (ex. the registration server), then the DB cannot do replication. However, if the NAT issue is handled through some other means, ex. Call-Id aware LVS with one public IP, then the usrloc should be the same across DBs and the DB should handle the replication.

You don't need many subscribers before you'll want redundancy and as active-passive redundancy is a waste of resources, I believe an upgrade of the replication mechanism should soon be imminent. ;-) I think I have said this before, but this is my enterprise-level "dream" scenario:

Two geographically distributed server centers

DNS SRV for load distribution (and possible using

segmentation of clients through their configurations if they don't support DNS SRV) 3. Each data center has Call-Id sensitive LVS in front, with one or more servers at the back (a fair-sized LVS box can handle 8,000 UDP packets per second) 4. Each data center either has a DB cluster or two-ways SER-based replication 5. The data centers replicate between each other using either DB-based replication or two-ways SER-based replication 6. The SER-based replication is an enhanced version of t_replicate() were replication is to a set of servers and replication is ACKed and guaranteed (queue). I would suggest using the XMLRPC interface Jan has introduced 7. I think Paul's cache-suggestions are good regardless of decisions on replication

Entry level scenario where the same box is running LVS, SER, and DB (you can quickly add new boxes) has a very low cost.

...
...
However, there is one more thing: You need to decide on an algorithm for selecting a usrloc record to replace when

the cache is

...
...
full. Do you store extra info in memory for each usrloc

to make the

...
...
right decision (ex. based on the number of lookups).

You may also purchase more memory :)

Do you suggest that no mechanism should be devised when the cache limit is hit? ;-) Then maybe I can suggest an email alert to the operator when a certain amount of the cache is full... :-D I trust my people to act fast and appropriate, but not that fast and appropriate!

g-)

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Zeus Ng

9:48 a.m.

New subject: [Serusers] SER usrloc caching, was: SER Reports "out of memory"

If implemented properly, usrloc cache will no doubt help scaling SER. My arguments mainly focus on the DB RW issue. People can decide if pure DB usrloc is best for them or not. See inline for comments.

...

-----Original Message-----

...

...
If we are to use pure DB only usrloc. That problem will not happen.

Yes, I believe this is the same problem Karl pointed out? In itself it poses an issue, but I think the broader scalability impact is more important.

For me, integrity is much more important than scalability.

...

...
a UA does a REGISTER request every 5 minutes. The REGISTER

request will

...
always behave the same.

Well, if the cached usrloc is the same as in the REGISTER request, you don't need to write, do you? This should have a huge impact on the number of necessary writes as without a cache, you must write every time.

How about expire time? Even though contact and receive IP are the same, expire time will almost always be different. You can't ignore that, can you? Also, if a user has two UA and they register to two different SER instances, how are we going to sync the cache without writing to DB.

...

...
For other requests, there is always a DB read within that 5 minutes interval to populate the cache. Only subsequence requests go through the cache. You may have a difference call patten

but I find rarely

...
two INVITEs to the same UA within 5 minutes interval. After that 5 minutes, the cache becomes invalid and a DB read is required. Using

cache saves

...
nothing on DB read here. (Obviously you can argue about longer time between REGISTER request or contact rarely change with hardphone

but the same

...
principle applies)

I'm not sure I understand. The point here is to look at the lifetime of the usrloc, which arguably is (in most cases) far longer than five minutes. I don't follow you in that the cache is invalid after five minutes?!

The life time of UA and it's contact IP may be very long but the life time of a register request is limited. The 5 minutes is just an example to illustrate that once that register expired, the record should not be trust anymore. Only the new register request, with new expire time should be used. This is what I mean invalid cache.

...

...
Besides, I have good experience with MySQL caching results. So, DB lookup on every SIP request with proper DB

tunning can

...
achieve similar result.

That IS a good point. However, most of the overhead is not in the actual lookup, but in the call to mysql. I'm not sure what that overhead is, if it

On local DB server, the overhead is minimal. Unlike PHP, SER opens the connection at start up and close it at shutdown. Only the request and result go through UNIX socket for each query. This reduce the overhead of connect and disconnect which could be heavy. With remote DB, the over head is TCP. By using MySQL replication, SER can query local DB and leave the TCP overhead to MySQL itself. Going via slow WAN link is another issue.

...

is very low, you're point hits the "let each system do what it's best at..."

More or less.

Zeus

Alex Vishnev

11:26 a.m.

New subject: [Serusers] SER usrloc caching, was: SER Reports "out of memory"

All,

Not sure if this was discussed before, however have you considered MRU cache implementation. I don't see a reason why the entire userloc needs to be cached. MRU cache is allocated once during initialization. The size can be driven via user defined parameter or be statically set to maximum shared memory segment configured in the system. MRU will be populated over time and not at once when the system initializes. Therefore, initial startup time will be short, however a number of incoming requests will need to perform db queries to retrieve the information and cache it. Comments?

Alex -----Original Message----- From: serusers-bounces@iptel.org [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Zeus Ng Sent: Tuesday, May 31, 2005 5:49 AM To: 'Greger V. Teigre'; 'Java Rockx' Cc: 'serusers' Subject: RE: [Serusers] SER usrloc caching, was: SER Reports "out of memory"

...

-----Original Message-----

...

...
If we are to use pure DB only usrloc. That problem will not happen.

Yes, I believe this is the same problem Karl pointed out? In itself it poses an issue, but I think the broader scalability impact is more important.

For me, integrity is much more important than scalability.

...

...
a UA does a REGISTER request every 5 minutes. The REGISTER

request will

...
always behave the same.

Well, if the cached usrloc is the same as in the REGISTER request, you don't need to write, do you? This should have a huge impact on the number of necessary writes as without a cache, you must write every time.

...

...
For other requests, there is always a DB read within that 5 minutes interval to populate the cache. Only subsequence requests go through the cache. You may have a difference call patten

but I find rarely

...
two INVITEs to the same UA within 5 minutes interval. After that 5 minutes, the cache becomes invalid and a DB read is required. Using

cache saves

...
nothing on DB read here. (Obviously you can argue about longer time between REGISTER request or contact rarely change with hardphone

but the same

...
principle applies)

I'm not sure I understand. The point here is to look at the lifetime of the usrloc, which arguably is (in most cases) far longer than five minutes. I don't follow you in that the cache is invalid after five minutes?!

...

...
Besides, I have good experience with MySQL caching results. So, DB lookup on every SIP request with proper DB

tunning can

...
achieve similar result.

That IS a good point. However, most of the overhead is not in the actual lookup, but in the call to mysql. I'm not sure what that overhead is, if it

...

is very low, you're point hits the "let each system do what it's best at..."

More or less.

Zeus

_______________________________________________ Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Greger V. Teigre

8:57 p.m.

New subject: [Serusers] SER usrloc caching, was: SER Reports "out of memory"

I believe this was what Jan sketched out, though he hasn't defined how entries should be discarded. Most recently used can be one, most often accesed another. g-)

Alex Vishnev wrote:

...

All,

Not sure if this was discussed before, however have you considered MRU cache implementation. I don't see a reason why the entire userloc needs to be cached. MRU cache is allocated once during initialization. The size can be driven via user defined parameter or be statically set to maximum shared memory segment configured in the system. MRU will be populated over time and not at once when the system initializes. Therefore, initial startup time will be short, however a number of incoming requests will need to perform db queries to retrieve the information and cache it. Comments?

Alex -----Original Message----- From: serusers-bounces@iptel.org [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Zeus Ng Sent: Tuesday, May 31, 2005 5:49 AM To: 'Greger V. Teigre'; 'Java Rockx' Cc: 'serusers' Subject: RE: [Serusers] SER usrloc caching, was: SER Reports "out of memory"

If implemented properly, usrloc cache will no doubt help scaling SER. My arguments mainly focus on the DB RW issue. People can decide if pure DB usrloc is best for them or not. See inline for comments.

...
-----Original Message-----

...
...
If we are to use pure DB only usrloc. That problem will not happen.

Yes, I believe this is the same problem Karl pointed out? In itself it poses an issue, but I think the broader scalability impact is more important.

For me, integrity is much more important than scalability.

...
...
a UA does a REGISTER request every 5 minutes. The REGISTER

request will

...
always behave the same.

Well, if the cached usrloc is the same as in the REGISTER request, you don't need to write, do you? This should have a huge impact on the number of necessary writes as without a cache, you must write every time.

How about expire time? Even though contact and receive IP are the same, expire time will almost always be different. You can't ignore that, can you? Also, if a user has two UA and they register to two different SER instances, how are we going to sync the cache without writing to DB.

...
...
For other requests, there is always a DB read within that 5 minutes interval to populate the cache. Only subsequence requests go through the cache. You may have a difference call patten

but I find rarely

...
two INVITEs to the same UA within 5 minutes interval. After that 5 minutes, the cache becomes invalid and a DB read is required. Using

cache saves

...
nothing on DB read here. (Obviously you can argue about longer time between REGISTER request or contact rarely change with hardphone

but the same

...
principle applies)

I'm not sure I understand. The point here is to look at the lifetime of the usrloc, which arguably is (in most cases) far longer than five minutes. I don't follow you in that the cache is invalid after five minutes?!

The life time of UA and it's contact IP may be very long but the life time of a register request is limited. The 5 minutes is just an example to illustrate that once that register expired, the record should not be trust anymore. Only the new register request, with new expire time should be used. This is what I mean invalid cache.

...
...
Besides, I have good experience with MySQL caching results. So, DB lookup on every SIP request with proper DB

tunning can

...
achieve similar result.

That IS a good point. However, most of the overhead is not in the actual lookup, but in the call to mysql. I'm not sure what that overhead is, if it

On local DB server, the overhead is minimal. Unlike PHP, SER opens the connection at start up and close it at shutdown. Only the request and result go through UNIX socket for each query. This reduce the overhead of connect and disconnect which could be heavy. With remote DB, the over head is TCP. By using MySQL replication, SER can query local DB and leave the TCP overhead to MySQL itself. Going via slow WAN link is another issue.

...
is very low, you're point hits the "let each system do what it's best at..."

More or less.

Zeus

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Greger V. Teigre

9:17 p.m.

New subject: [Serusers] SER usrloc caching, was: SER Reports "out of memory"

Zeus Ng wrote:

...

If implemented properly, usrloc cache will no doubt help scaling SER. My arguments mainly focus on the DB RW issue. People can decide if pure DB usrloc is best for them or not. See inline for comments.

...
-----Original Message-----

...
...
If we are to use pure DB only usrloc. That problem will not happen.

Yes, I believe this is the same problem Karl pointed out? In itself it poses an issue, but I think the broader scalability impact is more important.

For me, integrity is much more important than scalability.

Yes, one have to start with integrity as it cannot be sacrificed...

...

...
...
a UA does a REGISTER request every 5 minutes. The REGISTER

request will

...
always behave the same.

Well, if the cached usrloc is the same as in the REGISTER request, you don't need to write, do you? This should have a huge impact on the number of necessary writes as without a cache, you must write every time.

How about expire time? Even though contact and receive IP are the same, expire time will almost always be different. You can't ignore that, can you? Also, if a user has two UA and they register to two different SER instances, how are we going to sync the cache without writing to DB.

Good points! I forgot about expire time.

...

...
...
For other requests, there is always a DB read within that 5 minutes interval to populate the cache. Only subsequence requests go through the cache. You may have a difference call patten

but I find rarely

...
two INVITEs to the same UA within 5 minutes interval. After that 5 minutes, the cache becomes invalid and a DB read is required. Using

cache saves

...
nothing on DB read here. (Obviously you can argue about longer time between REGISTER request or contact rarely change with hardphone

but the same

...
principle applies)

I'm not sure I understand. The point here is to look at the lifetime of the usrloc, which arguably is (in most cases) far longer than five minutes. I don't follow you in that the cache is invalid after five minutes?!

The life time of UA and it's contact IP may be very long but the life time of a register request is limited. The 5 minutes is just an example to illustrate that once that register expired, the record should not be trust anymore. Only the new register request, with new expire time should be used. This is what I mean invalid cache.

Yes, I get it now that I was reminded about the expire time. Valid arguments.

...

...
...
Besides, I have good experience with MySQL caching results. So, DB lookup on every SIP request with proper DB

tunning can

...
achieve similar result.

That IS a good point. However, most of the overhead is not in the actual lookup, but in the call to mysql. I'm not sure what that overhead is, if it

On local DB server, the overhead is minimal. Unlike PHP, SER opens the connection at start up and close it at shutdown. Only the request and result go through UNIX socket for each query. This reduce the overhead of connect and disconnect which could be heavy. With remote DB, the over head is TCP. By using MySQL replication, SER can query local DB and leave the TCP overhead to MySQL itself. Going via slow WAN link is another issue.

Yes, I guess the iptel guys have some real-life measures they can use in deciding upon the best scheme.

Hopefully you're already developed code can be combined with Jan's work. Seems like you have spent some time thinking about this. :-D g-)

Java Rockx

30 May 30 May

1:35 p.m.

Greger,

We would let SER use the save() function which would update the local machine's cache and eventually persist the record to MySQL. Once persisted the other MySQL servers would then see the updated usrloc record, thus letting other ser routers "see" then if needed.

The only real difference here is that t_replicate() would not ever be called.

On 5/30/05, Greger V. Teigre greger@teigre.com wrote:

...

...
Actually, a minute delay would be a bad thing because replicated usrloc records, using t_replicate() would not make it in to peer SER server caches when the server is starting up.

Yeah, I forgot about that scheme...

...
Given this fact, and given the fact that most SER modules do not hash data upon server startup [like group.so, etc, etc] we are starting to see little value in caching usrloc. Our MySQL server is hit 12 times for an INVITE message and so complete caching of usrloc is of minimal performace gain.

Anyhow, we're not in process of modifying SER so that:

when ser starts up usrloc is "lazy-loaded"

if a usrloc record is looked up in cache and is __NOT__ found, then

MySQL will be queried. If found in MySQL then the usrloc record will be put in to cache for future lookups

By doing these two things we should not have a problem we excessively large subscriber bases.

Thoughts?

Makes sense. This is how Berkeley DB and many other DBs work. In fact, the best would be to build an abstraction cache layer around all the query functions that have data in the DB. This way you would get the optimum performance/scalability. However, there is one more thing: You need to decide on an algorithm for selecting a usrloc record to replace when the cache is full. Do you store extra info in memory for each usrloc to make the right decision (ex. based on the number of lookups). Also, what to do when you are storing a new location with save: Do you put it in the cache as well? Today this happens automatically. As you have continous registrations, you will fill up the cache with registered clients (and push out the ones having been called). What you really want is to keep the user locations you need (those required by lookup) in the cache. So I would suggest that in save(), you only write to the DB (and of course update the record if its in the cache) and that lookup() is the function that will control the activation and replacement of the records in the cache. I think this approach to caching is of interest also to those who do not have a mysql cluster, but do regular replication, for example to reduce start-up time. I believe an implementation may get pretty involved (in terms of important functions you need to touch). However, I cannot see that you will need to touch the core. g-)

...
Paul

On 5/29/05, Greger V. Teigre greger@teigre.com wrote: Interesting discussion. I believe most large-scale deployments (there aren't really that many...) divide the user base across several servers. I believe they use 20K users is a "good number" per server. So, one ser having to load that many records, is only if you have a cluster with no server divide. Loading all the contacts into memory is impossible to scale, at one point it will take too long time and take too much memory. So, a better architecture *for such a deployment scenario* would be a cache of some size and then a lookup of records in DB if not present in cache. Loading 330 records per second, you can load about 20,000 contacts in a minute, which probably is ok. g-)

Zeus Ng wrote:

...
See inline comment.

...
Thanks for the info. I did change that config.h define and now it works well.

Great to hear that the little change solve your problem.

...
My newest problem is the ser start time. In my very non-scientific test it took ser about 25 minutes before it began serving requests because it was loading usrloc information.

That was using 500000 records in the location table. The MySQL server was running on the same box as SER, which is also my workstation, so stuff like Firefox, X, etc, were in use.

But this does bring up an interesting problem namely - how can ser service SIP clients while loading large number of usrloc records? I'm kind of thinking that this could be a big

No, you can't. In fact, you will experience a temporary slow down when a hugh number of UA is un-registering because the table was locked during that period of time. I once use sipsak to register 5000 users in 15s. When they all expired about the same time, SER hang for a while for locking the table to release the record from memory.

...
problem. When dealing with massive user bases there is no such thing as a "quick restart".

Well, that's the trade-off of memory base db. You need to balance the startup time verse runtime performance.

...
We do have LVS fully "sip-aware" so we are doing true UDP load balancing based on the Call-ID header, but this is still a problem [potentially] with replication ucontact records while the server is starting up.

I wonder if it is possible to modify the behaviour of usrloc so that it loads in the background while ser is processing SIP messages. And when lookup("location") is executed, usrloc searching the the ser cache and then MySQL if no hit is found in cache -- or something like that.

This triggers me to bring up the common question asked on this list before. Can SER use just MySQL for usrloc? A similar concept has been done on the speeddial module. It would help load distribution, faster startup time and better redundancy. Of course, slower lookup as tradeoff.

I once consider replacing the build-in memory base DB with MySQL memory db. However, that idea was dropped due to time constrain and compatability (postgresql) issue.

...
Can anyone on serusers give some tips as to how to get ser to load usrloc entries optimized? I know the usual stuff like faster MySQL disks, faster network connection, dedicated app servers, etc, etc. But I'm looking for ser and/or MySQL tweaking hacks.

Good luck on your search.

...
Regards, Paul

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers

Erik Versaevel - Infopact Netwerkdiensten BV

25 May 25 May

6:55 a.m.

Java Rockx wrote:

...

We do have LVS fully "sip-aware" so we are doing true UDP load balancing based on the Call-ID header, but this is still a problem [potentially] with replication ucontact records while the server is starting up.

How did you get LVS to do that? 3rd party addon or some inhouse solution?

Kind regards,

E. V.

Java Rockx

11:37 a.m.

We did our own in-house SIP scheduler, which did force us to modify about 20 lines of LVS code but at least we didn't have to do any special kernel patching.

Regards, Paul

On 5/25/05, Erik Versaevel - Infopact Netwerkdiensten BV erik@infopact.nl wrote:

...

Java Rockx wrote:

...
We do have LVS fully "sip-aware" so we are doing true UDP load balancing based on the Call-ID header, but this is still a problem [potentially] with replication ucontact records while the server is starting up.

How did you get LVS to do that? 3rd party addon or some inhouse solution?

Kind regards,

E. V.

7349

Age (days ago)

7356

Last active (days ago)

sr-users@lists.kamailio.org

34 comments

9 participants

tags (0)

participants (9)

Alex Vishnev
Erik Versaevel - Infopact Netwerkdiensten BV
Greger V. Teigre
Jan Janak
Java Rockx
Jiri Kuthan
Juha Heinanen
Karl H. Putz
Zeus Ng