Thanks, how could I have missed that!
Does it load the usrloc table piece-wise? And if so, why does it do so only for mysql?
And also, it says that the old version would have increased memory consumption with TCP/TLS -- does that mean it does not increase with the new version?
-jiri
At 14:32 30/11/2006, Ovidiu Sas wrote:
Maybe you are looking for this one: http://www.openser.org/index.php?option=com_content&task=view&id=48&...
Regards, Ovidiu Sas
On 11/30/06, Jiri Kuthan jiri@iptel.org wrote:
At 12:53 22/11/2006, Weiter Leiter wrote:
I know that OpenSER loads (only?) faster.
Can folks share with me what the fast-usrloc-loading feature is about? I was not successful in finding it out.
Thanks!
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
-- Jiri Kuthan http://iptel.org/~jiri/
Both mysql and postgres have this feature (dev version). I don't think it was implemented in unixodbc ...
-ovi
On 11/30/06, Jiri Kuthan jiri@iptel.org wrote:
Thanks, how could I have missed that!
Does it load the usrloc table piece-wise? And if so, why does it do so only for mysql?
And also, it says that the old version would have increased memory consumption with TCP/TLS -- does that mean it does not increase with the new version?
-jiri
On Nov 30, 2006 at 14:44, Jiri Kuthan jiri@iptel.org wrote:
Thanks, how could I have missed that!
Does it load the usrloc table piece-wise? And if so, why does it do so only for mysql?
And also, it says that the old version would have increased memory consumption with TCP/TLS -- does that mean it does not increase with the new version?
I think they mean the numbers listed on the web page were obtained with tcp & tls disabled; if they had enabled them, you would get higher numbers due to tcp & tls internal memory use (unrelated to usrloc).
I haven't looked at the details, but it sounds good to me. It is much more convenient not to have to manually increase the private memory size and recompile if you have a large usrloc and keep it in the database. A nice side effect is that it probably leads to faster startup times in such situations.
However, there is a mistake: it does not decrease memory usage by the numbers claimed (very, very far from them). This is because usrloc loads the database content _only_ from the main ser process (when the registrar fixup functions for the save & lookup family are called -- from the main process, before forking). All the child processes will inherit all the memory of their parent; however, memory pages that are not modified will point to the parent's memory space and will not take _any_ system memory (this is because most unixes have a copy-on-write policy: after a fork, a page is allocated & copied only if somebody attempts to write to it). So the children will not use any more memory than normal (the initial db loading and the initial private memory size do not cause extra memory to be used/allocated by the system for the child processes).

To get the real memory used by a process, look at the resident set size (RSS) and not at the virtual size (VSZ), which at least in ser's case tells you the maximum memory that ser would ever use (and is thus not that useful).

I will try to give a correct example, supposing that the fetch fix doesn't use any private memory at all on startup (or uses negligible amounts). Let's take the numbers published on the web page: non-fixed usrloc would need 32Mb of private memory to load the whole location table => in the end, the fetch-fixed usrloc version would use 32Mb less system memory than the non-fixed version (because of the main process, which really used the private memory to load the db usrloc). That is a big difference from the 1Gb claimed...
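To illustrate the copy-on-write effect, here is a minimal Linux-only sketch (not SER code; it watches MemFree from /proc/meminfo, which is noisy but shows the trend):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

/* read MemFree (in kB) from /proc/meminfo */
static long mem_free_kb(void)
{
    char line[128];
    long kb = -1;
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) return -1;
    while (fgets(line, sizeof line, f))
        if (sscanf(line, "MemFree: %ld", &kb) == 1)
            break;
    fclose(f);
    return kb;
}

int main(void)
{
    size_t len = 32 * 1024 * 1024;   /* the 32Mb from the example above */
    char *buf = malloc(len);
    memset(buf, 1, len);             /* the "main process" really uses it */
    printf("free before forking:       %ld kB\n", mem_free_kb());

    for (int i = 0; i < 4; i++)
        if (fork() == 0) {           /* children only read -> pages stay shared */
            sleep(2);
            _exit(0);
        }
    sleep(1);
    printf("free with 4 idle children: %ld kB (barely changed)\n",
           mem_free_kb());

    if (fork() == 0) {               /* a writer forces real page copies */
        memset(buf, 2, len);
        printf("free after a child wrote: %ld kB (~32 MB lower)\n",
               mem_free_kb());
        _exit(0);
    }
    while (wait(NULL) > 0)
        ;
    return 0;
}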
Andrei
At 16:00 30/11/2006, Andrei Pelinescu-Onciul wrote:
I think they mean the numbers listed on the web page were obtained with tcp & tls disabled; if they had enabled them, you would get higher numbers due to tcp & tls internal memory use (unrelated to usrloc).
I see.
I haven't looked at the details, but it sounds good to me. It is much more convenient not to have to manually increase the private memory size and recompile if you have a large usrloc and keep it in the database.
Agreed, that's where my curiosity came from.
A nice side effect is that it probably leads to faster startup times in such situations.
That's what I'm struggling with.
I mean, in the end there is some quantity of data which needs to be downloaded, so either you spend a long time downloading it in advance to cut later request-processing latency, or you start quickly and download the data on demand, with some latency penalty for cache misses.
Where is the time saving coming from then?
However, there is a mistake: it does not decrease memory usage by the numbers claimed (very, very far from them). This is because usrloc loads the database content _only_ from the main ser process (when the registrar fixup functions for the save & lookup family are called -- from the main process, before forking). All the child processes will inherit all the memory of their parent; however, memory pages that are not modified will point to the parent's memory space and will not take _any_ system memory (this is because most unixes have a copy-on-write policy: after a fork, a page is allocated & copied only if somebody attempts to write to it). So the children will not use any more memory than normal (the initial db loading and the initial private memory size do not cause extra memory to be used/allocated by the system for the child processes).

To get the real memory used by a process, look at the resident set size (RSS) and not at the virtual size (VSZ), which at least in ser's case tells you the maximum memory that ser would ever use (and is thus not that useful).

I will try to give a correct example, supposing that the fetch fix doesn't use any private memory at all on startup (or uses negligible amounts). Let's take the numbers published on the web page: non-fixed usrloc would need 32Mb of private memory to load the whole location table => in the end, the fetch-fixed usrloc version would use 32Mb less system memory than the non-fixed version (because of the main process, which really used the private memory to load the db usrloc). That is a big difference from the 1Gb claimed...
Thanks!
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
Where is the time saving coming from then?
I think the idea behind it was the following: the use case is big providers with lots of entries in the usrloc database. A restart in such a situation might lead to a stop in service for quite a few minutes (I don't recall the numbers) while the server is loading the data. If you split the data into chunks and load them sequentially, you can start serving without interruption...
Please, can somebody confirm this assumption (I'm not 100% sure)?
Samuel.
Hello,
some confusion was created around this subject. The news item that was pointed to relates to an improvement (fetch support) which brought a memory usage optimization to usrloc (its use is going to be expanded to other modules, like lcr, presence ...), not a usrloc loading/lookup optimization. It has not been announced as news yet, since the work is not fully finished/well tested. The news about these improvements will come in the near future. The work started in summer, with:
http://openser.org/pipermail/devel/2006-July/003469.html
In short, usrloc records are no longer loaded by the main process, but by the first child. All the other child processes can handle other events/SIP messages in parallel. Previously, at startup, OpenSER was blocked until all records were loaded, which could take quite a long time with big numbers of active users.
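To sketch the idea (an illustration only, not OpenSER's actual code; preload_usrloc() and serve_sip() are made-up placeholders):

#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

static void preload_usrloc(void)
{
    /* placeholder: fetch location rows from the DB in chunks and
       insert them into the shared-memory cache */
}

static void serve_sip(int rank)
{
    /* placeholder: the normal worker loop handling SIP messages */
    printf("worker %d serving\n", rank);
}

int main(void)
{
    const int n_workers = 4;
    for (int rank = 0; rank < n_workers; rank++) {
        if (fork() == 0) {
            if (rank == 0)
                preload_usrloc();  /* only the first child blocks on the DB */
            serve_sip(rank);       /* its siblings serve right from the start */
            _exit(0);
        }
    }
    while (wait(NULL) > 0)         /* main process just supervises */
        ;
    return 0;
}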
Cheers, Daniel
On 11/30/06 18:58, samuel wrote:
Where is the time saving coming from then?
I think the idea behind it was the following: the use case is big providers with lots of entries in the usrloc database. A restart in such a situation might lead to a stop in service for quite a few minutes (I don't recall the numbers) while the server is loading the data. If you split the data into chunks and load them sequentially, you can start serving without interruption...
Please, can somebody confirm this assumption (I'm not 100% sure)?
Samuel.
Hi guys
This was supposed to be a forward to the serusers list, and not a bounce.
Sorry Daniel.
-Atle
That's quite a nifty concept.
Something I have noted during all this back and forth is that both these projects (SER, OpenSER) have some incredible innovations in them.
I find the whole thing very exciting. I mean... bits of both SER and OpenSER are part of our core business services model, so it's easy for me to salivate at all these neat things going on in the development on both sides.
N.
At 18:22 30/11/2006, Daniel-Constantin Mierla wrote:
In short, usrloc records are no longer loaded by the main process, but by the first child. All the other child processes can handle other events/SIP messages in parallel. Previously, at startup, OpenSER was blocked until all records were loaded, which could take quite a long time with big numbers of active users.
What's going to happen then if a child process x wants to read a usrloc entry while child 1 has not finished loading yet?
Thanks!
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
At 10:19 04/12/2006, Jiri Kuthan wrote:
What's going to happen then if a child process x wants to read a usrloc entry while child 1 has not finished loading yet?
Can someone help with these questions?
- retransmission: can't there be inconsistency of data during the initial DB loading?
- performance: can someone confirm for me that the claimed memory savings are not achievable, as suggested in Andrei's email?
The bottom line is I'm trying to learn what it actually is, and based on these assumptions it appears to be one-process-loading + others-as-normal -- is that what it does?
Thanks!
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
samuel wrote:
Where is the time saving coming from then?
I think the idea behind it was the following: the use case is big providers with lots of entries in the usrloc database. A restart in such a situation might lead to a stop in service for quite a few minutes (I don't recall the numbers) while the server is loading the data.
Numbers obviously depend on your hardware and on whether you have a local database. But they are somewhere in the range of 20 seconds for 50,000 entries, 2 minutes for 100,000 entries and 10 minutes for 500,000 entries.
Part of the problem, and also of the memory usage problem, is that the database interface of SER requires that the entire table be slurped into SER's process memory instead of fetched and processed row by row. This can cause funny behaviour during start-up and a near heart attack for the sysadmin.
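For comparison, the raw MySQL C API already allows row-by-row processing; a sketch (the column names are assumed here, not necessarily SER's actual schema):

#include <stdio.h>
#include <mysql.h>

/* stream the location table row by row instead of slurping it:
   mysql_use_result() keeps only one row at a time on the client,
   while mysql_store_result() would allocate the whole result set */
static void load_location(MYSQL *conn)
{
    MYSQL_RES *res;
    MYSQL_ROW row;

    if (mysql_query(conn, "SELECT username, contact, expires FROM location"))
        return;
    res = mysql_use_result(conn);
    if (!res)
        return;
    while ((row = mysql_fetch_row(res)) != NULL)
        printf("aor=%s contact=%s expires=%s\n", row[0], row[1], row[2]);
    mysql_free_result(res);
}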
If you split the data into chunks and load them sequentially, you can start serving without interruption...
As far as I understand the announcement (haven't looked at the actual code), the idea is to load everything inside an extra process. The problem with that kind of speed-up is that your responses will not be correct during the loading phase. I am not sure if this is better than being down as it may cause support calls and false problem alerts. If you are in a phase of troubles and have to restart often, this wrong behaviour can go on for hours.
But anyways, in my experience with large scale installations, the whole caching thing in usrloc is unnecessary. I have it on good authority that a modern PC can handle more than 100,000 subscribers with a cacheless usrloc and a local database. I once wrote a replacement module that did lookup() directly to the database without any usrloc. It was able to serve substantially more than 100,000 subscribers. (Disclaimer: This actually depends on your usage patterns. I can't provide CPS values, though.)
This leaves the registrar stuff. But that is writing to the database anyway. What would be more important here is to have it transactional in a sensible way. The way it works now is that if you have database problems, you delay your response, which makes your UAs re-send the request, which causes more database trouble. (This, BTW, is true for INVITE processing as well -- here you process your request with all the checks and database lookups and whatnot, only to find out upon t_relay() that, oops, re-sent INVITE, needs to be dropped, all for nothing.) True, this is not a problem if you use the right db_mode.
But there is another issue and that is reliability. At a certain point, you need to have a second SIP server because your superiors read about the five-nine thing. IMHO the easiest way to set this up is by having several servers doing the exact same thing and then load balancing traffic between them. This is only possible if you have a cacheless usrloc and if registrations are written to the database ASAP.
So, I do think that this cache is one of those optimizations that look good on paper but in practice miss the point. That, of course, is just my sixteen øre. And just in case someone cares to know, we are using Andreas' usrloc-cl in production and, apart from a segfault I introduced while porting in our changes, it runs very smoothly.
Regards, Martin
Hi Martin,
A couple of points inline, mostly an academic type of discussion, as I largely agree that this type of optimization misses the point and that one can do things in many different ways. IMO the real point is a reasonable cluster design (which includes DB processing too); how to tune usrloc is eventually marginal.
-jiri
At 23:33 30/11/2006, Martin Hoffmann wrote:
samuel wrote:
Where is the time saving coming from then?
I think the idea behind it was the following: the use case is big providers with lots of entries in the usrloc database. A restart in such a situation might lead to a stop in service for quite a few minutes (I don't recall the numbers) while the server is loading the data.
Numbers obviously depend on your hardware and on whether you have a local database. But they are somewhere in the range of 20 seconds for 50,000 entries, 2 minutes for 100,000 entries and 10 minutes for 500,000 entries.
Part of the problem, and also of the memory usage problem, is that the database interface of SER requires that the entire table be slurped into SER's process memory instead of fetched and processed row by row. This can cause funny behaviour during start-up and a near heart attack for the sysadmin.
it's a trade-off. I recall quite a few providers who would have had a heart attack if usrloc were not cached. (Think what happens when a popular IAD vendor sets its IADs to re-register at 3am.) The problem may not appear on the SIP side but on the DB side, though.
Basically, you can preload (which is what we do), not cache (which under some circumstances may cause a real bad heart attack), or perhaps something in between (less than a 100% cache). Given the other bottlenecks and the price of memory, preloading seems feasible; the only downside is the loading time. That can be compensated for by a reasonable network design with redundancy.
If you split the data into chunks and load them sequentially, you can start serving without interruption...
As far as I understand the announcement (haven't looked at the actual code), the idea is to load everything inside an extra process. The problem with that kind of speed-up is that your responses will not be correct during the loading phase. I am not sure if this is better than being down as it may cause support calls and false problem alerts. If you are in a phase of troubles and have to restart often, this wrong behaviour can go on for hours.
But anyways, in my experience with large scale installations, the whole caching thing in usrloc is unnecessary. I have it on good authority that a modern PC can handle more than 100,000 subscribers with a cacheless usrloc and a local database.
I agree with you on sunny days. The problem is there are rainy days too, and usrloc becomes a bad bottleneck with significantly fewer subs.
I once wrote a replacement module that did lookup() directly to the database without any usrloc. It was able to serve substantially more than 100,000 subscribers. (Disclaimer: This actually depends on your usage patterns. I can't provide CPS values, though.)
This leaves the registrar stuff. But that is writing to the database anyway. What would be more important here is to have it transactional in a sensible way. The way it works now is that if you have database problems, you delay your response, which makes your UAs re-send the request, which causes more database trouble. (This, BTW, is true for INVITE processing as well -- here you process your request with all the checks and database lookups and whatnot, only to find out upon t_relay() that, oops, re-sent INVITE, needs to be dropped, all for nothing.) True, this is not a problem if you use the right db_mode.
I think this is a good place for improvement indeed. We have been thinking of some aggregation of delayed writes but haven't moved forward on this yet.
But there is another issue and that is reliability. At a certain point, you need to have a second SIP server because your superiors read about the five-nine thing.
I would add that if they don't read about it, they may find themselves being written about in popular magazines :-)
IMHO the easiest way to set this up is by having several servers doing the exact same thing and then load balancing traffic between them. This is only possible if you have a cacheless usrloc and if registrations are written to the database ASAP.
Well -- it is certainly possible, but you actually just push the problem from the SER cluster to a DB cluster, which may bring you another type of headache.
So, I do think that this cache is one of those optimizations that look good on paper but in practice miss the point. That, of course, is just my sixteen øre. And just in case someone cares to know, we are using Andreas' usrloc-cl in production and, apart from a segfault I introduced while porting in our changes, it runs very smoothly.
Regards, Martin
-- Jiri Kuthan http://iptel.org/~jiri/
Salut,
Jiri Kuthan wrote:
At 23:33 30/11/2006, Martin Hoffmann wrote:
Part of the problem, and also of the memory usage problem, is that the database interface of SER requires that the entire table be slurped into SER's process memory instead of fetched and processed row by row. This can cause funny behaviour during start-up and a near heart attack for the sysadmin.
it's a trade-off. I recall quite a few providers who would have had a heart attack if usrloc were not cached.
This comment wasn't about the caching per se. The database interface allows you to access all rows as an array. This is rarely, if ever, needed. If the interface instead had a function a la dbf->get_next_row(), you wouldn't need to slurp a table of thousands of rows into pkg_mem first.
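Something along these lines (every name here is hypothetical, sketching the proposed interface rather than SER's real db API):

typedef struct db_con  db_con_t;    /* opaque connection handle */
typedef struct db_row  db_row_t;    /* one fetched row */
typedef struct db_iter db_iter_t;   /* cursor over an open result set */

typedef struct db_func {
    db_iter_t *(*query_iter)(db_con_t *con, const char *query);
    db_row_t  *(*get_next_row)(db_iter_t *it);   /* NULL when exhausted */
    void       (*free_iter)(db_iter_t *it);
} db_func_t;

/* hypothetical helper that stores one contact in the usrloc cache */
extern void insert_ucontact_from_row(db_row_t *row);

/* thousands of rows, but only one row's worth of pkg_mem at a time */
void preload_location(db_func_t *dbf, db_con_t *con)
{
    db_iter_t *it = dbf->query_iter(con, "SELECT * FROM location");
    db_row_t *row;
    if (!it)
        return;
    while ((row = dbf->get_next_row(it)) != NULL)
        insert_ucontact_from_row(row);
    dbf->free_iter(it);
}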
Another shortcoming of the database API is that you can't do a "where expires < now()". This, however, is only a problem if you teach SER not to delete expired rows from the database and then forget to run the cron job that does it (reminds me that I owe Atle a cookie for that one).
(Think what happens when a popular IAD vendor sets its IADs to re-register at 3am.)
If you have enough of those, the only thing you can do here is start to 503 them. Just an idea: the problem really is that all UDP processes get stuck waiting for the database and new requests don't get handled (which causes a re-send storm that eventually kills you). If one counts the processes that are stuck, one can write a function that sends a 503 back if only one or two processes are left.
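As a rough sketch of that idea (all names are hypothetical; in SER the counter would have to live in shared memory, since the workers are separate processes):

#include <stdatomic.h>

#define N_WORKERS 8

/* hypothetical: in SER this counter would sit in shared memory */
static atomic_int workers_stuck_in_db;

extern int reply_503(void *msg);      /* hypothetical helpers */
extern int save_to_db(void *msg);

int handle_register(void *msg)
{
    /* if almost everyone is blocked on the database, shed load with
       a 503 instead of queueing yet another request behind it */
    if (atomic_load(&workers_stuck_in_db) >= N_WORKERS - 2)
        return reply_503(msg);

    atomic_fetch_add(&workers_stuck_in_db, 1);
    int ret = save_to_db(msg);        /* the part that can get stuck */
    atomic_fetch_sub(&workers_stuck_in_db, 1);
    return ret;
}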
The problem may not appear on the SIP side but on the DB side, though.
Basically, you can preload (which is what we do), not cache (which under some circumstances may cause a real bad heart attack), or perhaps something in between (less than a 100% cache). Given the other bottlenecks and the price of memory, preloading seems feasible; the only downside is the loading time. That can be compensated for by a reasonable network design with redundancy.
What you forget here is that your database has a query cache (or should have). That one is much better suited for this, because it can cope with changes to the database from somewhere else. (We had to use serctl to update aliases, which sometimes didn't work. The resulting script that tries to insert the alias and then checks whether it is actually there is quite impressive.)
Plus, usrloc is actually only one out of the two or three queries you do per INVITE: does_uri_exist() is probably done on every one (at least if you have call forwarding) and avp_load() is likely to be done for all incoming calls (that's 0.9, of course; dunno about 0.10 yet).
What killed me once wasn't usrloc but avp_load(). And that was only because the indexes on the table were screwed and the select did a full table scan every time.
This leaves the registrar stuff. But that is writing to the database anyway. What would be more important here is to have it transactional in a sensible way. The way it works now is that if you have database problems, you delay your response, which makes your UAs re-send the request, which causes more database trouble. (This, BTW, is true for INVITE processing as well -- here you process your request with all the checks and database lookups and whatnot, only to find out upon t_relay() that, oops, re-sent INVITE, needs to be dropped, all for nothing.) True, this is not a problem if you use the right db_mode.
I think this is a good place for improvement indeed. We have been thinking of some aggregation of delayed writes but haven't moved forward on this yet.
I think a function "t_go_stateful()" might be enough (and using t_reply() in the registrar). The function checks if a transaction for the request exists and, if so, ends processing right away. Otherwise it creates a transaction in a preliminary state.
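In pseudo-C, roughly (again, all names here are hypothetical -- this is the proposed semantics, not an existing tm export):

extern int transaction_exists(void *msg);              /* hypothetical */
extern int create_preliminary_transaction(void *msg);  /* hypothetical */

/* proposed: absorb retransmissions before any DB work is done */
int t_go_stateful(void *msg)
{
    if (transaction_exists(msg))
        return 0;   /* retransmission: tm re-sends the stored reply,
                       the script stops here and no DB query is made */
    return create_preliminary_transaction(msg) ? 1 : -1;
}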
Well -- it is certainly possible, but you actually just push the problem from the SER cluster to a DB cluster, which may bring you another type of headache.
Probably, but in this scenario I have several options to solve this, depending on my actual load. I can start with a central database that is accessed over the net, later switch to an elaborate scheme with replication and finally switch to a MySQL cluster-esque solution. High-performance databases are necessary in other applications, too, and do exist.
I am a follower of the old Unix strategy that everything does one thing and one thing only. Providing that data fast enough is the job of the database.
Regards, Martin
PS: Should we move this to serdev?
Comment regarding the tm change in ser 0.10 inline. Michal
On Tue, 2006-12-05 at 10:54 +0100, Martin Hoffmann wrote:
I think a function "t_go_stateful()" might be enough (and using t_reply() in the registrar). The function checks if a transaction for the request exists and, if so, ends processing right away. Otherwise it creates a transaction in a preliminary state.
With Ottendorf you can use t_newtran() to start the transaction and further on in the script use tm module functions like t_relay() and t_reply() -- it no longer complains that the transaction was started from the script earlier.
So it is up to you how you manage the trade-off of losing CPU cycles, either on transaction lookups or on handling retransmissions.
Michal Matyska wrote:
On Tue, 2006-12-05 at 10:54 +0100, Martin Hoffmann wrote:
I think a function "t_go_stateful()" might be enough (and using t_reply() in the registrar). The function checks if a transaction for the request exists and, if so, ends processing right away. Otherwise it creates a transaction in a preliminary state.
With Ottendorf you can use t_newtran() to start the transaction and further on in the script use tm module functions like t_relay() and t_reply() -- it no longer complains that the transaction was started from the script earlier.
That takes care of INVITEs. Now a version of save() that uses t_reply() instead of sl_send_reply() is needed. I seem to remember that that isn't too hard.
Regards, Martin
Oh yes.... that's already part of ser 0.10 as well.... just use
save_noreply("location");
t_reply("$code", "$reason");
Michal
Martin Hoffmann wrote:
I think a function "t_go_stateful()" might be enough (and using t_reply() in the registrar). The function checks if a transaction for the request exists and, if so, ends processing right away. Otherwise it creates a transaction in a preliminary state.
Doesn't t_lookup_request() help you?
regards klaus
Klaus Darilion wrote:
Martin Hoffmann wrote:
I think a function "t_go_stateful()" might be enough (and using t_reply() in the registrar). The function checks if a transaction for the request exists and, if so, ends processing right away. Otherwise it creates a transaction in a preliminary state.
Doesn't t_lookup_request() help you?
I seem to remember that there were some problems with that. I do not claim to understand tm, but during my feeble attempts to do so, I found issues with t_lookup_request() and maybe even t_newtran(). Probably something to do with ACKs and CANCELs.
Regards, Martin