[Serusers] Performance tuning

zeusng

23 Mar 2004

1:37 a.m.

Can anyone give me a realistic test case for measuring SER performance? I've been using sipsak to stress my SER server but am not able to interpret the result.

Here are some of the sipsak runs:

[siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 800 -z -vv
[siptest@sipuat siptest]$ sipsak -U -I -e 50000 -s sip:30@mysip.test -r 5060 -n 2 -z -vv
[siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 150 -z -vv
[siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 50 -z -vv

One of the results was as follows:

All usrloc tests completed successful.
received last message 109125.023 ms after first request (test duration).
biggest delay between request and response was 46082.422 ms
10 retransmission(s) received from server.
9 time(s) the timeout of 5000 ms exceeded and request was retransmitted.

I guess a delay of 46 s (46082.422 ms) is definitely not acceptable. What should I set -n to so that the test is comparable to real-world traffic?

And what kind of result should I expect?

Side issue:

During my stress test, I experienced the same problem Andres reported last November:

udp_rcv_loop:recvfrom:[11] Resource temporarily unavailable

After I modified some kernel parameters (Fedora Core 1, Linux 2.4.22), the problem seems to have gone away. Can anyone confirm this?

# echo "8388608" > /proc/sys/net/core/rmem_max
# echo "8388608" > /proc/sys/net/core/wmem_max
# echo "8388608" > /proc/sys/net/core/rmem_default
# echo "8388608" > /proc/sys/net/core/wmem_default
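For anyone repeating the experiment, the same four settings can also be inspected and applied through sysctl, which maps net.core.* names onto the /proc/sys/net/core/ files. This is only a sketch: it assumes a Linux host with the sysctl utility installed and root privileges, and it uses the same 8388608 (8 MiB) value as the echo commands.

```shell
# Show the current limits before changing anything.
sysctl net.core.rmem_max net.core.wmem_max \
       net.core.rmem_default net.core.wmem_default

# Apply the same 8 MiB value to all four settings (needs root).
for key in rmem_max wmem_max rmem_default wmem_default; do
    sysctl -w "net.core.$key=8388608"
done
```

Values set this way are lost at reboot; adding the corresponding net.core.* lines to /etc/sysctl.conf makes them persistent.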

Zeus

********************************************************************** This email and any files transmitted with it are confidential and intended solely for the use of the individual or entity to whom they are addressed.

If you have received this email in error, you are prohibited from reading, copying, distributing and using the information. Please contact the sender immediately by return email and destroy the original message. ******************************************************************

Andrei Pelinescu-Onciul

23 Mar

9:33 a.m.

On Mar 23, 2004 at 12:37, zeusng zeus.ng@isquare.com.au wrote:

> Can anyone give me a realistic test case for measuring SER performance? I've been using sipsak to stress my SER server but am not able to interpret the result.

Depends on what you want to measure. A more general benchmark would be to measure the calls/s. However you need a tool able to generate and terminate calls. Unfortunately I don't know of any such free tool.

You should also tune ser a little. Start by recompiling it. Make sure -DF_MALLOC is turned on and -DDBG_*_MALLOC is off in Makefile.defs. Also turn on -DNO_DEBUG (you could also compile with make EXTRA_DEFS=-DNO_DEBUG). If you don't care about logs, also add -DNO_LOG (not recommended for a production system). Use a recent gcc (3.2-3.3) or icc (Intel's C compiler). Add CPU=your_cpu to your make command line (e.g. make CPU=pentium4; by default ser is optimized for athlon on x86 archs). In your config file, turn off the Warning header (sip_warning=0) and mhomed (it is off by default, so you don't need to do anything unless you have explicitly turned it on).
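Put together, the build advice above might look like the following sketch. It assumes you are in the top-level SER source directory, and pentium4 is only the example value from the message, so substitute your own CPU. The malloc-related defines still have to be checked by hand in Makefile.defs; the grep merely shows where they are.

```shell
# Run from the top of the SER source tree (assumed location).
# Show the malloc-related defines so they can be checked by eye:
# F_MALLOC should be enabled, any DBG_*_MALLOC define disabled.
grep -n -e 'F_MALLOC' -e 'DBG_.*_MALLOC' Makefile.defs

# Build with debug output compiled out and CPU-specific optimization.
# CPU=pentium4 is an example value, not a recommendation.
make CPU=pentium4 EXTRA_DEFS=-DNO_DEBUG

# In the SER config file (not a shell command):
#   sip_warning=0
```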

> Here are some of the sipsak runs:
>
> [siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 800 -z -vv
> [siptest@sipuat siptest]$ sipsak -U -I -e 50000 -s sip:30@mysip.test -r 5060 -n 2 -z -vv
> [siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 150 -z -vv
> [siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 50 -z -vv
>
> One of the results was as follows:
>
> All usrloc tests completed successful.
> received last message 109125.023 ms after first request (test duration).
> biggest delay between request and response was 46082.422 ms
> 10 retransmission(s) received from server.
> 9 time(s) the timeout of 5000 ms exceeded and request was retransmitted.
>
> I guess a delay of 46 s (46082.422 ms) is definitely not acceptable. What should I set -n to so that the test is comparable to real-world traffic?

I leave this for someone who knows sipsak better than me.

> And what kind of result should I expect?
>
> Side issue:
>
> During my stress test, I experienced the same problem Andres reported last November:
>
> udp_rcv_loop:recvfrom:[11] Resource temporarily unavailable

On Linux, for a blocking socket, this means a UDP checksum error: the kernel moves the skb into the socket receive buffer, wakes up the process blocked in receive on the socket, and only then verifies the checksum; if the checksum is bad, it returns EAGAIN. Try capturing the traffic with tcpdump (with -s 1514) and see if you can spot bad checksums.
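A capture along these lines could be used to look for bad checksums. It is a sketch under assumptions: eth0 and port 5060 are placeholders for your interface and SIP port, and the grep pattern relies on tcpdump printing "bad udp cksum" for failed UDP checksums in verbose mode, which may differ between tcpdump versions.

```shell
# Capture SIP-over-UDP traffic with full Ethernet frames (-s 1514) and
# verbose decoding (-vv) so that tcpdump verifies UDP checksums.
# -n disables name resolution, -l line-buffers output for the pipe.
# eth0 and port 5060 are assumptions; adjust to your setup. Needs root.
tcpdump -n -l -i eth0 -s 1514 -vv udp port 5060 | grep -i 'bad udp cksum'
```

Packets sent by the capturing host itself can show bad checksums when the network card computes them in hardware (checksum offload), so the inbound packets are the ones to look at.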

> After I modified some kernel parameters (Fedora Core 1, Linux 2.4.22), the problem seems to have gone away. Can anyone confirm this?
>
> # echo "8388608" > /proc/sys/net/core/rmem_max
> # echo "8388608" > /proc/sys/net/core/wmem_max
> # echo "8388608" > /proc/sys/net/core/rmem_default
> # echo "8388608" > /proc/sys/net/core/wmem_default

I don't know why increasing the socket receive/send buffer sizes would affect this (I don't see anything relevant in the kernel code).

Andrei

zeusng

12:04 p.m.

Theoretically, I would like to measure the calls/s. Since someone has created the sipsak testing tool, I would like to test the way they have done as well.

On the other hand, real-world statistics, like how many REGISTER, INVITE, and BYE requests are received by iptel or fwd, would help me project my traffic volume. Here, I'm not just talking about call volume, but also packet traffic. Assuming each message is, say, 100 bytes and I am receiving 1G of them every day, my service provider would charge me $$ for that.
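The back-of-the-envelope figure above can be worked out with a few lines of shell. The 100-byte message size and the 1G (10^9) messages per day are the illustrative numbers from the message, not measurements, and the helper name daily_bytes is made up for this sketch. It needs a shell with 64-bit arithmetic.

```shell
# Estimate daily SIP signalling volume from message size and count.
# daily_bytes is a hypothetical helper, not part of any tool.
daily_bytes() {
    # $1 = average message size in bytes, $2 = messages per day
    echo $(( $1 * $2 ))
}

bytes=$(daily_bytes 100 1000000000)
echo "bytes/day: $bytes"
echo "GB/day (10^9 bytes): $(( bytes / 1000000000 ))"
echo "average bit/s: $(( bytes * 8 / 86400 ))"
```

With these inputs the volume comes to 100 GB per day, or a little over 9 Mbit/s averaged across the day, before any IP and UDP header overhead is counted.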

Thanks for the tips on compile options. I already had most of them enabled before the stress test, except NO_DEBUG, as I still need debugging information right now.

As for your explanation of EAGAIN on Linux, I recall a different explanation from another discussion group, which described it as a shortage of resources. The man page also tells a different story:

EAGAIN The socket is marked non-blocking and the receive operation would block, or a receive timeout had been set and the timeout expired before data was received.

I'm not saying you are wrong, but that's not what I understood.

The tuning parameters I changed were based on previous experience with another application, namely SAP. It had a similar problem until I made the changes. I must admit that I also changed /proc/sys/fs/file-max and some other variables as well.

One more interesting thing I observed during the stress test: if I have a lot of REGISTERs (say, a thousand) expiring at around the same time, SER does not process any new requests until all the expired REGISTERs are gone. Is there any explanation for this?

Zeus

Andrei Pelinescu-Onciul

2:52 p.m.

On Mar 23, 2004 at 23:04, zeusng zeus.ng@isquare.com.au wrote: [...]

> As for your explanation of EAGAIN on Linux, I recall a different explanation from another discussion group, which described it as a shortage of resources. The man page also tells a different story:

I've just looked again over the kernel sources and I can't see any other possibility.

> EAGAIN The socket is marked non-blocking and the receive operation would block, or a receive timeout had been set and the timeout expired before data was received.
>
> I'm not saying you are wrong, but that's not what I understood.

The manual page gives the POSIX interpretation of EAGAIN. On Linux, however, there is this exception: a socket in blocking mode, waiting in receive, and a UDP packet with a bad checksum. Check net/ipv4/udp.c in your kernel sources; the function name is udp_recvmsg.

> The tuning parameters I changed were based on previous experience with another application, namely SAP. It had a similar problem until I made the changes. I must admit that I also changed /proc/sys/fs/file-max and some other variables as well.
>
> One more interesting thing I observed during the stress test: if I have a lot of REGISTERs (say, a thousand) expiring at around the same time, SER does not process any new requests until all the expired REGISTERs are gone. Is there any explanation for this?

Andrei

Alex Bligh

8:24 p.m.

--On 23 March 2004 23:04 +1100 zeusng zeus.ng@isquare.com.au wrote:

> Theoretically, I would like to measure the calls/s.

I am doing some work on this in my "copious free time" - it's derived from kphone and the dissipate stack in its current incarnation, rather than sipsak. I will publish it (GPL) when it is in a fit state. However, I am trying to test more components of the system than simply a cps test (which is really only going to exercise the proxy) - I want to test the effect of lots of registrations, and of the call audio (for instance through media proxy). For "just" a SIP cps test you would probably be best off using something low level (like sipsak) or possibly going down even further, just capturing some packets, doing some sprintf's, and using an ncat type approach. As someone pointed out on a similar thread earlier, the overhead of the test stack/tool is potentially as much a factor in determining capacity as the effect of ser for such a simple test.

The reason I'm not doing this type of test is that I suspect that in many common deployment environments, straightforward ser cps isn't going to be your performance bottleneck, but YMMV.

Alex

Nils Ohlmeier

28 Mar

1:43 p.m.

Hello,

I apologize for the late reply.

On Tuesday 23 March 2004 02:37, zeusng wrote:

> Can anyone give me a realistic test case for measuring SER performance? I've been using sipsak to stress my SER server but am not able to interpret the result.

Sipsak is intended to stress your proxy, not to measure the speed (CPS) of a proxy. There are simply too many unknown variables that can influence the results from sipsak.

> Here are some of the sipsak runs:
>
> [siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 800 -z -vv
> [siptest@sipuat siptest]$ sipsak -U -I -e 50000 -s sip:30@mysip.test -r 5060 -n 2 -z -vv
> [siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 150 -z -vv
> [siptest@sipuat siptest]$ sipsak -U -I -e 10000 -s sip:40@mysip.test -r 5060 -n 50 -z -vv
>
> One of the results was as follows:
>
> All usrloc tests completed successful.
> received last message 109125.023 ms after first request (test duration).

This is simply the time from when the first request was sent to the proxy until the last reply was received.

> biggest delay between request and response was 46082.422 ms

As you already noticed, under some circumstances your proxy may not answer very quickly (probably because it is blocked by something else). Because of this, sipsak tries to measure the delay between the time a request was sent out and the time the reply to that request was received. It does not keep all the delays, though, only the biggest delay seen during the complete test run.

> 10 retransmission(s) received from server.

Sometimes you receive replies to a request multiple times, mostly because the proxy answers statelessly (e.g. the registrar) and the initial request was retransmitted because no reply was received within 5000 ms. This simply counts how often sipsak received a retransmitted reply.

> 9 time(s) the timeout of 5000 ms exceeded and request was retransmitted.

Nine times sipsak did not receive a reply to its request within 5000 ms and retransmitted the request.
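To compare several runs, the four figures explained above can be pulled out of a saved summary with a small filter. This is only a sketch: the function name sipsak_summary is made up, the patterns are tied to the exact wording quoted in this thread (other sipsak versions may word the summary differently), and it assumes each statement sits on its own line.

```shell
# Extract the numeric fields from a sipsak usrloc summary read on stdin.
# sipsak_summary is a hypothetical helper, not part of sipsak.
sipsak_summary() {
    sed -n \
        -e 's/.*received last message \([0-9.]*\) ms.*/duration_ms=\1/p' \
        -e 's/.*biggest delay between request and response was \([0-9.]*\) ms.*/max_delay_ms=\1/p' \
        -e 's/^\([0-9][0-9]*\) retransmission(s) received from server.*/retrans_from_server=\1/p' \
        -e 's/^\([0-9][0-9]*\) time(s) the timeout of \([0-9][0-9]*\) ms exceeded.*/timeouts=\1 timeout_ms=\2/p'
}

# Usage with the result quoted in this thread:
sipsak_summary <<'EOF'
All usrloc tests completed successful.
received last message 109125.023 ms after first request (test duration).
biggest delay between request and response was 46082.422 ms
10 retransmission(s) received from server.
9 time(s) the timeout of 5000 ms exceeded and request was retransmitted.
EOF
```

Since only the biggest delay is reported, the extracted max_delay_ms says nothing about typical response times; it is useful mainly for spotting stalls between otherwise identical runs.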

> I guess a delay of 46 s (46082.422 ms) is definitely not acceptable. What should I set -n to so that the test is comparable to real-world traffic?

What is not acceptable for you depends a lot on your definitions ;-) But as I already wrote, sipsak is not intended to produce comparable measurement results. It should be fine for testing, e.g., whether your proxy runs faster with option X turned on or off. But always remember to run such tests in environments where no other actions or processes interfere with your test runs, so that you can definitely exclude bottlenecks like network speed, CPU power, or the speed of services in use, such as DNS.

> And what kind of result should I expect?

That depends too much on your environment. And the output is just a small help to ease your daily configuration work ;-)

Greetings
Nils

Serusers mailing list serusers@lists.iptel.org http://lists.iptel.org/mailman/listinfo/serusers
