Seems our last two messages were posted at the same time... :-)
Is anybody out there using LVS for SIP load balancing yet? I read the
documentation of the several methods and came to the following
conclusions. Please correct me if I'm wrong.
I have not yet seen a confirmed LVS setup. Paul (Java Rockxx) is using LVS
and claims that LVS code was not touched in setting that up. His setup
mostly uses ATAs with public IPs, so maybe they have not addressed the
call-id stickiness.
Usually, everybody processes transactions statefully. This is required
for accounting, forking... As the SIP proxies in our cluster do not
share transaction states, all responses have to hit the same proxy
that forwarded the request. Thus, the load balancer also has to
remember clients and send them to the same proxy. This should be done
at IP:port level to avoid parsing SIP messages.
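
As an illustration of the IP:port stickiness described above, here is a minimal sketch in Python. It is not how LVS implements persistence; the class name, the proxy addresses, and the round-robin choice for new clients are assumptions made only for the example. A real balancer would also need entry timeouts and handling for a proxy that goes down.

```python
import itertools


class StickyBalancer:
    """Remember each client (source IP, source port) and always
    return the proxy that was chosen for it the first time."""

    def __init__(self, proxies):
        # New clients are handed out round-robin (an assumption for this sketch).
        self._next_proxy = itertools.cycle(proxies)
        self._table = {}

    def route(self, src_ip, src_port):
        key = (src_ip, src_port)
        if key not in self._table:
            self._table[key] = next(self._next_proxy)
        return self._table[key]


# Hypothetical private addresses of the SIP proxies behind the balancer.
balancer = StickyBalancer(["10.0.0.11", "10.0.0.12"])
first = balancer.route("192.0.2.10", 5060)
second = balancer.route("192.0.2.20", 5060)
again = balancer.route("192.0.2.10", 5060)  # same client, same proxy as `first`
```

Because the key is only the source IP:port, no SIP parsing is needed, but a response arriving from a different address than the request was sent from will not find its way back to the right proxy.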
As said in my previous email, I have confirmed that LVS IPVS can parse UDP
payload quite efficiently. What the penalty is, I don't know yet.
I don't like LVS/NAT because of NAT: SIP proxies have to put different
IP addresses in the SIP messages than the ones they are binding to,
rtpproxy must be before NAT, ...
In the scenario I described in my last email, LVS is not used for rtp
load balancing. I don't think it's suitable for that. We have so far used
rtpproxy, but are now converting to mediaproxy with load balancing to public
IPs as implemented in the mediaproxy dispatcher. I believe this is a better
approach.
The non-NAT based versions TUN & DR also have problems. If an incoming
INVITE is forwarded to the user by one of the proxies, this will be
done directly, without passing through the load balancer. Thus,
responses will hit the load balancer and the load balancer will send
the response to any SIP proxy (probably to the one the client is
registered with). This is bad with stateful forwarding. This can be avoided
when the proxy sends requests using its real IP instead of the
virtual IP. But then, if the user is behind a symmetric NAT, this won't work.
Yes, that's why I believe Call-id stickiness is necessary. I have looked at
F5 and Cisco and they both do UDP payload parsing and Call-id stickiness.
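
To make the Call-id stickiness idea concrete, here is a minimal sketch in Python of what "peek into the UDP payload" could look like. It is not the F5, Cisco or ipvs implementation; the function names and the hash-modulo choice of proxy are assumptions for illustration only.

```python
import hashlib
import re

# Matches the Call-ID header or its compact form "i" at the start of a line.
CALL_ID_RE = re.compile(rb"^(?:Call-ID|i)[ \t]*:[ \t]*(\S+)",
                        re.IGNORECASE | re.MULTILINE)


def extract_call_id(datagram):
    """Return the Call-ID value from a SIP message, or None if absent."""
    headers = datagram.split(b"\r\n\r\n", 1)[0]  # ignore the body (e.g. SDP)
    match = CALL_ID_RE.search(headers)
    return match.group(1) if match else None


def pick_proxy(datagram, proxies):
    """Map every message of one call to the same proxy by hashing Call-ID."""
    call_id = extract_call_id(datagram)
    if call_id is None:
        return None
    digest = hashlib.sha1(call_id).digest()
    return proxies[int.from_bytes(digest[:4], "big") % len(proxies)]


proxies = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]  # hypothetical addresses
invite = (b"INVITE sip:bob@example.com SIP/2.0\r\n"
          b"Via: SIP/2.0/UDP 192.0.2.10:5060\r\n"
          b"Call-ID: abc123@192.0.2.10\r\n"
          b"CSeq: 1 INVITE\r\n\r\n")
reply = (b"SIP/2.0 200 OK\r\n"
         b"Via: SIP/2.0/UDP 192.0.2.10:5060\r\n"
         b"i: abc123@192.0.2.10\r\n"
         b"CSeq: 1 INVITE\r\n\r\n")
```

With this, the INVITE and its reply end up on the same proxy regardless of which IP:port they arrive from. Plain hash-modulo reshuffles calls when the proxy list changes, so a production scheduler would need something more careful.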
So, anybody using LVS already? Which LVS method do you use?
Not yet...
g-)
regards,
klaus
Greger V. Teigre wrote:
> Klaus,
> Sorry for replying so late...
>
> I have been looking at the scenario you are suggesting, but I have
> never been able to get around the following two issues:
> 1. Restricted and symmetric NATs only allow incoming INVITEs from
> the IP where the client's REGISTER was sent (I believe most clients
> do not contact the outbound proxy until a call is made and the NAT
> thus has no opening for the outbound proxy). You then need some
> logic to find out where the user is registered and forward the
> INVITE to the appropriate SER server (for example each server
> rewriting the Contact of its own registrations before a
> t_replicate).
> 2. As a consequence of #1, you get a single point of failure for a
> given client, i.e. the registration server goes down and the client
> is not reachable until it has re-registered.
>
> Have you found a way around this? I remember one of your previous
> posts with a ser.cfg example for an outbound proxy where you alter
> the Contact and also patched ser to make it work. Is there a
> "clean" (i.e. non-patching) way to do the scenario you describe (I
> guess using Path header)?
> Another way is to keep the time between registrations to x minutes,
> so if a registration server goes down, the client will not be
> unavailable for long.
>
> Of course, you can set each registration server in a Linux HA setup,
> but that's a waste of resources and I would prefer a more LVS type
> setup.
>
> I agree that SRV is a good thing when the client has implemented
> it. As long as you keep the contacts synchronized and can figure out
> which registration server the callee can be found at, it is a nice
> load balancing feature. We use SRV ourselves and t_replicate to
> keep the contacts in sync. We only use the location table as we have
> RADIUS servers for authentication and avpair configuration of
> calling options. The RADIUS servers have LDAP backends and we do
> LDAP-level replication.
>
> However, I like low-level replication better and we're probably
> moving to a scenario closer to what Paul (Java Rockxx) has and Tina
> is working on: Using replication at the database level and load
> balancing in front with Cisco/F5 load balancers. I have looked at F5
> and the annoying thing is that it seems to be a pretty standard
> "peek-into-udp-packet" scheduling and it wouldn't surprise me if
> they use their own modified LVS at the core of their boxes...
>
> So, Matt, a couple of cheap servers set up with Linux HA (let's say
> Ultramonkey setup) would be great, wouldn't it?
> Well, what we lack is the ipvs udp-packet inspection scheduler :-)
> g-)
>
>
>
>
> Klaus Darilion wrote:
>
>> My 2 cents:
>>
>> 1. Use SRV for load balancing. (Yes there are dumb devices, thus
>> also use A records) Probably this will cause problems with clients
>> which do not remember the IP address. The client has to remember
>> the IP address resolved by SRV lookups for all other requests. Once
>> a request fails, the client should repeat the SRV lookup, choose a new
>> server, re-REGISTER and stay with this server till the next failure.
>>
>> 2. Use dedicated outbound proxies which do NAT traversal. Of course
>> you have to be sure that all messages to a client are routed
>> via the same outbound proxy. This can be solved by implementing the
>> Path: header, or by modifying the Contact: header in REGISTER
>> requests to point to the outbound proxy.
>>
>> 3. Use one or more main proxies with the routing logic.
>>
>> I don't like load balancers as they are a single point of failure
>> and SIP is not as easy to handle as HTTP.
>>
>> regards,
>> klaus
>>
>> Greger V. Teigre wrote:
>>
>>> I agree that NAT should be resolved by the peers. I haven't looked
>>> at the forking proxy details; I assume it will do sort of a
>>> redirect for REGISTERs and INVITEs, so that everything thereafter
>>> is handled by each SIP server. I still cannot really see how you
>>> solve the NAT problem, though. The public IP of the SIP server
>>> handling the first REGISTER will be the only IP allowed to send an
>>> INVITE to the UA, so if another UA registered with another server
>>> makes a call, the SIP forking proxy must make sure that the INVITE
>>> is sent through the SIP server having done the initial
>>> registration of callee.
>>> g-)
>>>
>>> ---- Original Message ----
>>> From: Alex Vishnev
>>> To: 'Greger V. Teigre' ; serusers(a)lists.iptel.org
>>> Sent: Tuesday, April 12, 2005 06:20 PM
>>> Subject: RE: LVS, load balancing,and stickness was ==> Re:
>>> [Serusers] more usrloc synchronization
>>>
>>> > Greger,
>>> >
>>> > I am not an expert on anycast either. I just know it exists and
>>> > people are starting to look at it more seriously as an HA option. That
>>> > is why I thought DNS SRV records would be an easier solution.
>>> > Regarding your comments on NAT, I don’t believe it is an issue as it
>>> > relates to forking proxy. Forking proxy should not resolve NAT, it is
>>> > a job for its peers. As for configuring SER as forking proxy, I
>>> > thought I read about it a while back, but now I can’t seem to locate
>>> > it. I hope I was not dreaming ;-).
>>> >
>>> > In any case, I will continue to google around to see if SER has this
>>> > option.
>>> >
>>> > Sincerely,
>>> >
>>> > Alex
>>> >
>>> >
>>> >
>>> >
>>> > From: Greger V. Teigre [mailto:greger@teigre.com]
>>> > Sent: Tuesday, April 12, 2005 6:21 AM
>>> > To: Alex Vishnev; serusers(a)lists.iptel.org
>>> > Subject: Re: LVS, load balancing,and stickness was ==> Re:
>>> > [Serusers] more usrloc synchronization
>>> >
>>> > Alex,
>>> > I'm not really knowledgeable enough about anycast to say anything
>>> > useful. The only thing is that in your described setup, I cannot really
>>> > see how you get around the UA behind restricted (or worse) NAT.
>>> > I have never tried to configure SER as a forking proxy, but I
>>> > wouldn't be surprised if it was possible.
>>> > g-)
>>> >
>>> > ---- Original Message ----
>>> > From: Alex Vishnev
>>> > To: serusers(a)lists.iptel.org
>>> > Sent: Monday, April 11, 2005 02:30 PM
>>> > Subject: RE: LVS, load balancing, and stickness was ==> Re:
>>> > [Serusers] more usrloc synchronization
>>> >
>>> >> Greger and Paul,
>>> >>
>>> >> I think you understood me correctly regarding forking proxy. It is
>>> >> the proxy that will fork out the requests to all available peering
>>> >> proxies. This approach does not require stickiness based on Call-id.
>>> >> AFAIK, once the forking proxy receives an acknowledgement from one of
>>> >> its peers, then the rest of the session will be done directly to that
>>> >> peer without the use of the forking proxy. I am considering 2
>>> >> approaches to resolve availability of forking proxy. 1 – using
>>> >> ANYCAST (good high level article:
>>> >> http://www.kuro5hin.org/story/2003/12/31/173152/86). 2 – using dns
>>> >> srv. I am still trying to determine if ANYCAST is a good solution for
>>> >> creating local RPs with forking proxy. However, I think that dns srv
>>> >> records can easily be implemented to allow simple round robin between
>>> >> multiple forking proxies. Thoughts?
>>> >>
>>> >> Alex
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> From: serusers-bounces(a)lists.iptel.org
>>> >> [mailto:serusers-bounces@lists.iptel.org] On Behalf Of Greger V. Teigre
>>> >> Sent: Monday, April 11, 2005 4:47 AM
>>> >> To: kramarv(a)yahoo.com
>>> >> Cc: serusers(a)lists.iptel.org
>>> >> Subject: LVS, load balancing, and stickness was ==> Re: [Serusers]
>>> >> more usrloc synchronization
>>> >>
>>> >> After my last email, I looked at ktcpvs and realized I ignored a
>>> >> couple of things: ktcpvs only supports tcp (http is obviously
>>> >> tcp-based, but I thought it supported udp for other protocols). I
>>> >> don't know how much work implementing udp would be.
>>> >> Here is a discussion of SIP and LVS that I found useful (though
>>> >> not encouraging).
>>> >>
>>> >> http://www.austintek.com/LVS/LVS-HOWTO/HOWTO/LVS-HOWTO.services_that_dont_w…
>>> >>
>>> >> Paul: I'm starting to get really curious about the standard LVS
>>> >> components used for your stickiness! I'm not aware (also after
>>> >> searching now) of an LVS balancing mechanism that allows correct
>>> >> stickiness on SIP udp...!
>>> >> And I found others too who are looking for it:
>>> >>
>>> >> http://archive.linuxvirtualserver.org/html/lvs-users/2005-02/msg00251.html
>>> >>
>>> >> My understanding is that ipvs must be extended (according to the
>>> >> developer) with a call-id based scheduler and that this work has
>>> >> several people willing to fund development, but that this has not(?)
>>> >> started yet. The problem is that ipvs is based on ip header analysis
>>> >> and extending the hashing algorithms to also include payload-based
>>> >> analysis is not straight-forward.
>>> >> g-)
>>> >>
>>> >>> With regards to stickiness: Have you looked at ktcpvs? SIP is an
>>> >>> "http-like" protocol and I'm pretty sure that you can use the
>>> >>> http-based regex hashing to search for Call-Id. If you cannot use
>>> >>> it right out of the box, I'm pretty sure the modifications are
>>> >>> minimal.
>>> >>> The user location problem: With a cluster back-end, I also
>>> >>> see save_memory() as the only option.
>>> >>> g-)
>>> >>>
>>> >>>> "Greger V. Teigre" <greger(a)teigre.com> wrote:
>>> >>>>> Greger, thanks a lot.
>>> >>>>> The problem with the load balancer is that replies go to the wrong
>>> >>>>> server due to rewriting outgoing a.b.c.d . BTW, as Paul pointed out,
>>> >>>>> if you define some dummy interface with Virtual IP (VIP), there
>>> >>>>> is no need to rewrite outgoing messages (I tested this a little).
>>> >>>>
>>> >>>>
>>> >>>> Yes, if you use LVS with direct routing or tunneling, that is what
>>> >>>> you experience.
>>> >>>> ===Of course. That's why I implemented small "session" stickiness.
>>> >>>> However, it causes additional internal traffic.
>>> >>>>
>>> >>>> What I described was a "generic" SIP-aware load balancer where SIP
>>> >>>> messages would be rewritten and stickiness implemented based on e.g.
>>> >>>> UA IP address (or call-id like vovida's load balancer).
>>> >>>> ====Sure, it's a better solution; I think we'll go this way soon (in
>>> >>>> our next version).
>>> >>>>
>>> >>>>> Why is the DNS approach bad (except restricted NAT - let's say I am
>>> >>>>> solving this)?
>>> >>>>
>>> >>>> Well, IMO, DNS SRV in itself is not bad. It's just that many user
>>> >>>> clients do not support DNS SRV yet. Except that, I like the
>>> >>>> concept and it will give you a geographical redundancy and load
>>> >>>> balancing.
>>> >>>> ===I am trying to build the following architecture:
>>> >>>>
>>> >>>> DNS (returns domain's public IP)->LVS+tunneling (Virtual IP)->ser
>>> >>>> clusters (with private IPs)
>>> >>>>
>>> >>>> DB (MySQL 4.1 cluster)
>>> >>>>
>>> >>>>> I guess Paul utilizes the load-balancer scenario you have described.
>>> >>>>> I believe there are only proprietary solutions for
>>> >>>>> "the-replies-problem". We tried the Vovida call-id-persistence
>>> >>>>> package; unfortunately it didn't work for us.
>>> >>>>
>>> >>>> Are you referring to the load balancer proxy? IMHO, the SIP-aware
>>> >>>> load balancer makes things a bit messy. It sounds to me that the
>>> >>>> LVS + tunneling/direct routing + virtual IP on dummy adapter is a
>>> >>>> better solution.
>>> >>>>
>>> >>>>> In my configuration I use a shared remote DB cluster (with
>>> >>>>> replication). Each ser sees it as one-public-IP (exactly the
>>> >>>>> approach you named for SIP). Maybe it's a good idea to use local DB
>>> >>>>> clusters, but if you have more than 2 servers your replication
>>> >>>>> algorithm is going to be complex. Additional problem - it still
>>> >>>>> doesn't solve usrloc synchronization - you still have to use
>>> >>>>> t_replicate()...
>>> >>>>
>>> >>>>
>>> >>>> I'm not sure if I understand.
>>> >>>> ===Oh, probably I expressed myself not well enough...
>>> >>>>
>>> >>>> So, you have 2 servers at two locations, each location with a shared
>>> >>>> DB and then replication across an IPsec tunnel??
>>> >>>> IMHO, mysql 3.23.x two-way replication is quite shaky and
>>> >>>> dangerous to rely on. With no locking, you will easily get
>>> >>>> overwrites and you have to be very sure that your application
>>> >>>> doesn't mess up the DB. I haven't looked at mysql 4.1 clustering,
>>> >>>> but from the little I have seen, it looks good. Is that what you
>>> >>>> use?
>>> >>>>
>>> >>>> ===We have 2 or more servers with MySQL 4.1 virtual server
>>> >>>> (clusters balanced by LVS). We use MySQL for maintaining
>>> >>>> subscribers' accounts, not for location. User location is still
>>> >>>> in-memory only so far. I am afraid I have to switch to ser 09 in
>>> >>>> order to use save_memory (thanks Paul!) and forward_tcp() for
>>> >>>> replication.
>>> >>>>
>>> >>>>> With regard to t_replicate() - it doesn't work for more than 2
>>> >>>>> servers, so I used exactly forward_tcp() and save_noreply()
>>> >>>>> (you're absolutely right - this works fine so far); all sers are
>>> >>>>> happy. Of course, this causes additional traffic. It would be
>>> >>>>> interesting to know whether Paul's FIFO patch reduces traffic
>>> >>>>> between sers?
>>> >>>>
>>> >>>> I believe Paul uses forward_tcp() and save_memory() to save the
>>> >>>> location to the replicated server's memory, while the
>>> >>>> save("location") on the primary server will store to the DB (which
>>> >>>> then replicates on the DB level).
>>> >>>> g-)
>>> >>
>>> >>
>>> >>
>>> >> _______________________________________________
>>> >> Serusers mailing list
>>> >> serusers(a)lists.iptel.org
>>> >> http://lists.iptel.org/mailman/listinfo/serusers