Hi all!
I may not know what I am talking about: I only just saw this message and have not followed the entire discussion, but here goes.
I am working on implementing 3GPP's IMS using ser, and as part of that I am implementing the 3GPP AAA scenarios, which, as I understand it, will evolve into IETF's "Diameter SIP authentication" (currently a draft): http://www.potaroo.net/ietf/idref/draft-ietf-aaa-diameter-sip-app/ .
Usrloc, although I use it locally, is no longer needed on a large distributed scale, because the AAA servers keep track of where users are registered, and you can query them with standard requests (Location Information Request, User Authorization Request) to find out where to forward messages.
1. More users, more servers. I think this should be the way to do it. I never liked replication because it is too complicated a solution to work reliably. I really think that Diameter SIP auth will scale very well. Keep in mind that it was designed to work in large mobile networks (3G, right?). The network should be split into SIP proxies with different functions, in this case:
   - interrogating (finds where a user is or should be registered and forwards the message there)
   - serving (actually services the user).
2. The replication problem has just moved from ser to the AAA server. But the AAA server can be distributed so that each one holds a limited number of users, and a special AAA routing node can route requests toward the right AAA, which will respond to them. The AAA server holding users could be a point of failure, so you may still need to duplicate its functionality, but I think it is better to replicate just to ensure availability/fail-safety than to replicate in order to implement functionality. At the SIP level, the distribution can be done using the interrogating nodes: first to redirect REGISTERs to different ser servers (with load balancing, if you want), and then to find users by interrogating the AAA servers, which track the users' registration status.
3. Provisioning needs to be done in the AAA server only, and all the ser servers in the farm will obey. I would suggest a "farm" architecture for ser, where all nodes are configured the same, so that you could dynamically add or remove servers depending on load. All this would be done by provisioning on the AAA.
4. Actually, point 1 answers this. Load balancing is controlled in the interrogating SIP proxies and in the AAA servers, not by "patches" like DNS SRV (please don't flame me on this one ;-) ).
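To make points 1 and 2 concrete, here is a minimal sketch of how an AAA routing node might map each user to one AAA server and how an interrogating proxy could then ask that server where the user is registered. Everything in it is an assumption made for illustration: the class names, the hash-based mapping, and the example addresses are hypothetical and are not taken from ser, from the 3GPP specifications, or from the Diameter SIP draft.

```python
import hashlib


class AAANode:
    """Hypothetical AAA server holding registration state for a subset of users."""

    def __init__(self, name):
        self.name = name
        self.registrations = {}  # user URI -> address of the serving proxy

    def register(self, user, serving_proxy):
        self.registrations[user] = serving_proxy

    def locate(self, user):
        # Stand-in for a Location Information Request; None if not registered.
        return self.registrations.get(user)


class AAARouter:
    """Hypothetical AAA routing node: picks the AAA server responsible for a user."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def node_for(self, user):
        # Assumption: a stable hash of the user URI selects the AAA server.
        digest = hashlib.sha256(user.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]


def interrogate(router, user):
    """What an interrogating proxy would do: find the user's serving proxy."""
    return router.node_for(user).locate(user)
```

In this sketch only one AAA server ever holds a given user, so adding AAA servers spreads the users out, while availability of each server would still have to be handled separately, as point 2 says.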
I personally think that ser should be limited to what it does best - SIP - and that other components should do provisioning, replication and user tracking. Anyway, user location is more of an AAA job, don't you think? This is my personal opinion and I might be wrong :) . It is based on the patterns of 3GPP and diameter-sip-app. Although I have reached the testing phase and it works, I have not yet run large stress tests, because my target is to build a working 3GPP IMS, not a diameter-sip application as described in the draft. So if there is interest in this, please let me know.
Dragos
Greger V. Teigre wrote:
Let me try to sort out the issues we are discussing here, so we can at least see whether we agree on the goals:
- Reliability and scalability issues
Scenario: Tens of thousands or hundreds of thousands of users require a reliable and scalable infrastructure
Goal: Find a good reference scenario for building a reliable and scalable infrastructure of ser servers.
Problems: Everybody tries to solve this their own way, and most keep their solutions secret because it is a competitive advantage not to tell anybody.
*** I think that your solution to #1 will dominate the discussions on the issues below. Using RADIUS (and possibly LDAP back-ends) for everything but usrloc is one solution that seems to be Juha's scenario (and mine). Andreas uses mysql for subscriber info as well. Do you have one server center with load balancing or geographically distributed server centers? It will influence your needs. So, let's sort out our scenarios before we discuss what is the "best" solution.
- Usrloc replication across standalone ser servers.
Scenario: Independent servers with independent databases run either with some sort of load balancing or DNS SRV.
Goal: Make sure that all ser servers have updated usrloc information, so each can handle any SIP message.
Problems: Distribute REGISTER messages to all servers; make sure that server unavailability does not corrupt the usrloc DB state
*** We all have this issue. It is my understanding that t_replicate:
a) uses SIP messages
b) uses a best-effort algorithm (haven't looked at the code...)
c) can be used between several servers, but when you introduce a new server, you need to change each server's ser.cfg
My suggestion for a simple solution based on the discussion so far: Extend t_replicate with a guaranteed mode of replication. mysql can be used as a queue with replication states (or even a text file, for that matter). Whether to use SIP messages or a TCP/IP-based FIFO really depends on an estimate of the network traffic. Result: The least work, and the code is an integrated part of ser.
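As a rough illustration of the "queue with replication states" idea, the following sketch keeps one state per peer for every queued message and retries until each peer has acknowledged it. It is only a sketch under assumptions: the in-memory list stands in for the mysql table or text file, `send` is a hypothetical callback that returns True when a peer accepts the message, and none of the names correspond to existing ser code.

```python
from dataclasses import dataclass, field

PENDING, DONE = "pending", "done"


@dataclass
class ReplicationQueue:
    """Hypothetical guaranteed-delivery queue; a DB table would replace `entries`."""

    peers: list
    entries: list = field(default_factory=list)

    def enqueue(self, message):
        # Every message starts out pending for every peer.
        self.entries.append(
            {"msg": message, "state": {peer: PENDING for peer in self.peers}}
        )

    def flush(self, send):
        """Try to deliver pending messages; send(peer, msg) returns True on success."""
        unreachable = set()
        for entry in self.entries:
            for peer in self.peers:
                if entry["state"][peer] != PENDING or peer in unreachable:
                    continue
                if send(peer, entry["msg"]):
                    entry["state"][peer] = DONE
                else:
                    # Skip this peer for the rest of the pass to keep its order.
                    unreachable.add(peer)
        # Drop entries that every peer has acknowledged.
        self.entries = [e for e in self.entries if PENDING in e["state"].values()]

    def pending_for(self, peer):
        return [e["msg"] for e in self.entries if e["state"][peer] == PENDING]
```

A periodic timer calling `flush` would then catch up any peer that was down for a while, which is the part a best-effort scheme does not cover.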
- Network-based provisioning of new users, aliases, etc
Scenario: One server needs to be provisioned from a web server or process running on a remote server
Goal: Allow ser to receive TCP/IP-based provisioning messages
Problems: ser's FIFO does not have a TCP/IP interface
*** I think this is an extension to ser that would benefit many people. I also believe that a provisioning interface should be SOAP-based, due to the sheer number of projects that will probably use the interface for provisioning.
- Replication of user database, aliases, etc across standalone ser servers.
Scenario: Independent servers with independent databases run either with some sort of load balancing or DNS SRV, and subscriber information is stored in sql tables
Goal: Make sure that each server recognizes all subscribers, aliases, etc
Problems: Make sure that all servers have updated database tables
*** RADIUS/LDAP solutions do not need to do this, as RADIUS servers, LDAP replication etc take care of both reliability and scalability. However, I think ser support more than one RADIUS server. A defined secondary server would be useful. With SQL-based scenarios, however, I see three natural solutions:
a) Rely on sql-based replication. Without checking this, I believe ser always writes such FIFO commands directly to the DB, so sql-level replication should work
b) Extend ser's FIFO to also have a replication configuration, i.e. in ser.cfg you define the peer servers that need replication. If the extension to t_replicate uses a TCP/IP-based FIFO, the code can be re-used.
c) Implement provisioning systems so that each ser server is updated through the TCP/IP-based FIFO
To be honest, I'm not sure I see the value of such an effort (b). Also, as using sql for storage is just one of several modes, it is probably not right to integrate such code into the FIFO. a) and c) are more natural choices.
My summary and conclusions:
- I believe a TCP/IP-based FIFO (#3) is a core feature that we all can agree would be useful and natural to implement;
- I don't know the details of how t_replicate functions, but Juha's opinion is that it takes care of all the issues Andreas points out except one: the amount of traffic SIP messages create. I will not interfere with this discussion; of course, if t_replicate can handle unavailable servers etc, that would be great. Anyway, reliable replication of usrloc is essential to a carrier-grade architecture;
- After this discussion, I now believe we should keep provisioning (#3) and the two types of replication (#2 and #4) separate in implementation as well.
Well, that was my attempt at sorting out the issues. Any success, you think? ;-)
g-)
Andreas Granig wrote:
Juha Heinanen wrote:
you can have any number of proxies participating in replication.
What method are you thinking of? t_replicate() reports
ERROR: t_newtran: transaction already in process 0x4054d5ec
if you call it twice, like
    t_replicate("foohost", "5060");
    t_replicate("barhost", "5060");
Or do you mean something like
    forward_tcp("foohost", "5060");
    forward_tcp("barhost", "5060");
and on the receiving hosts
    if (/* register from replicating host */) save_noreply("location");
which would be a possibility, indeed...
Besides that, the domain tables (location etc) get out of sync if one of the SERs is down for a moment, because retransmission is only tried a few times.
i don't see why this needs to be the case with db mode 2. when ser comes back up, it updates its location table from database.
I think mode 1 (Write-Through) should be used because the SER could start up while some of the contacts aren't flushed to DB yet.
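The difference between the two modes under discussion can be sketched with a toy model. This is an illustration only, on the assumption that mode 2 behaves as a write-back cache that flushes contacts to the database periodically; the class and its fields are hypothetical and do not reflect the real usrloc code.

```python
class UsrlocCache:
    """Toy model of an in-memory contact cache backed by a database table."""

    def __init__(self, db, write_through):
        self.db = db                  # dict standing in for the location table
        self.write_through = write_through
        self.memory = dict(db)        # at startup, load contacts from the DB
        self.dirty = set()            # users whose contacts are not yet in the DB

    def save(self, user, contact):
        self.memory[user] = contact
        if self.write_through:
            self.db[user] = contact   # written to the DB immediately
        else:
            self.dirty.add(user)      # written only at the next flush

    def flush(self):
        # Periodic timer in the write-back case: push dirty contacts to the DB.
        for user in self.dirty:
            self.db[user] = self.memory[user]
        self.dirty.clear()
```

If the write-back instance goes away before `flush` runs, a new instance started from the same table does not see the contact, whereas the write-through instance has already stored it.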
However, how would you set up your database connections here? Using a common usrloc database for all hosts (-> single point of failure)? This is the main point. _How_ do you share the contacts as reliably as possible, so that a host can go down for a while without getting out of sync regarding the contacts?
Andy
Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers