On Jun 18, 2010 at 12:42, marius zbihlei <marius.zbihlei(a)1and1.ro> wrote:
[...]
Hello Andrei,
Just performed a couple of tests (I was busy myself), but I think I
have some interesting results. I have tested with 25 UAC/UAS pairs per
test server, each pair generating 500 calls/s, for a total of 12,500
calls/s. The test servers (each running 25 sipp instances as UAC and
25 as UAS on different ports) were two quad-core Xeon machines in the
same LAN (Gigabit Ethernet between them). SER was doing a simple
forward() based on the R-URI of the request, with 8 worker
processes.
Great!
1. SER on a quad core Xeon, kernel 2.6.26.
a. I enabled just one test server for a total of 12,500 calls/s.
In this case the CPU usage was worse with plain UDP sockets
(udp_raw=0) (median values):
"usr", "sys", "idl", "wai",
"hiq", "siq"
13.584, 15.030, 50.713, 0.0, 2.950, 17.723
For raw sockets (udp_raw=1) the values were:
"usr", "sys", "idl", "wai",
"hiq", "siq"
10.396, 4.950, 76.238, 0.0, 2.970, 5.446
So the biggest difference is in software IRQ servicing time (last
column) and in sys. A little weird is the comparable usr CPU; I
expected it to be greater in raw socket mode.
Yes, it's strange that UDP sockets eat more CPU than the raw ones.
For example, on the raw sockets I do the UDP checksum by hand in an
unoptimized way.
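For reference, that checksum is just the RFC 768 one's-complement sum
over an IPv4 pseudo-header plus the UDP header and payload. A minimal,
unoptimized sketch of the idea (an illustration only, not the actual
SER raw-socket code):

#include <stdint.h>
#include <stddef.h>
#include <string.h>
#include <netinet/in.h>

/* One's-complement sum over a buffer (RFC 1071 style); carries are
 * folded by the caller. Illustration only, not SER's raw socket code. */
static uint32_t sum16(const void *buf, size_t len, uint32_t sum)
{
    const uint16_t *p = buf;

    while (len > 1) {
        sum += *p++;
        len -= 2;
    }
    if (len) {                      /* odd trailing byte, zero padded */
        uint16_t last = 0;
        memcpy(&last, p, 1);
        sum += last;
    }
    return sum;
}

/* UDP checksum: sum over the pseudo-header (src, dst, protocol, UDP
 * length) plus the UDP header and payload. saddr/daddr are in network
 * byte order; the result can be stored directly into the UDP header. */
static uint16_t udp_checksum(uint32_t saddr, uint32_t daddr,
                             const void *udp, uint16_t udp_len)
{
    uint32_t sum = 0;
    uint16_t res;

    sum = sum16(&saddr, 4, sum);
    sum = sum16(&daddr, 4, sum);
    sum += htons(IPPROTO_UDP);
    sum += htons(udp_len);
    sum = sum16(udp, udp_len, sum);

    while (sum >> 16)               /* fold the carries */
        sum = (sum & 0xffff) + (sum >> 16);
    res = (uint16_t)~sum;
    return res ? res : 0xffff;      /* 0 is transmitted as all ones (RFC 768) */
}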
b. I enabled both testing machines for a total of 25,000 calls/s.
In this case the CPU usage was almost identical, but mostly because
the sipp instances couldn't sustain 500 reqs/s in UDP mode. I
limited sipp to 20,000 calls per UAC/UAS pair. With raw sockets it
took an average of 55 s (closer to the ideal value of 40 s), but in
UDP mode it took almost 88 s to send the 20,000 calls. The system
load was the same (27% idle).
That's way better than I expected...
2. SER on a dual quad-core Xeon, kernel 2.6.32
I have done only some basic runs, but the results are not consistent
with those on the other SER machine. siq time is the same and the rate
is steady at 500 calls/s, but user CPU is greater in raw socket mode.
I have dug around a bit and came across two interesting patches in 2.6.29:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h…
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h…
The first one has to do with opening new sockets and binding them fast,
and the other one is mostly on the receive side. The second might help,
but not for sending, while the first one should speed up rtpproxy (if it
doesn't pre-bind its sockets on startup).
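Just to illustrate what pre-binding buys here: the per-call path then
only picks an already bound socket instead of doing socket()+bind()
for every new session. This is a hypothetical sketch, not rtpproxy's
code; the pool size and helper names are made up:

#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#define POOL_SIZE 64    /* illustrative only */

static int sock_pool[POOL_SIZE];

/* Pre-bind a pool of UDP sockets at startup, so new calls never hit
 * the socket()+bind() path (the one the first patch above speeds up)
 * while traffic is running. */
static int prebind_pool(const char *ip, uint16_t base_port)
{
    struct sockaddr_in sa;
    int i;

    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    if (inet_pton(AF_INET, ip, &sa.sin_addr) != 1)
        return -1;

    for (i = 0; i < POOL_SIZE; i++) {
        sa.sin_port = htons(base_port + i);
        sock_pool[i] = socket(AF_INET, SOCK_DGRAM, 0);
        if (sock_pool[i] < 0 ||
            bind(sock_pool[i], (struct sockaddr *)&sa, sizeof(sa)) < 0) {
            perror("prebind_pool");
            return -1;
        }
    }
    return 0;
}

/* Per-call path: just hand out one of the pre-bound sockets. */
static int get_media_socket(int idx)
{
    return sock_pool[idx % POOL_SIZE];
}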
The locking-on-send problem is also present in the latest 2.6.35
(lock_sock(sk) in udp_sendmsg()).
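For comparison, a raw-socket send with a hand-built IP+UDP header does
not go through udp_sendmsg() and that per-socket lock at all. Below is
a rough userspace sketch of such a send; whether SER's udp_raw mode
does exactly this (e.g. using IP_HDRINCL as shown) is my assumption,
and it needs root/CAP_NET_RAW:

#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <netinet/in.h>
#include <netinet/ip.h>
#include <netinet/udp.h>
#include <sys/socket.h>

/* Send one UDP datagram from a raw socket with the IP header built by
 * hand. udp->check is left 0 here (legal for IPv4); it would be filled
 * in with the hand-computed checksum from the sketch further above. */
static int raw_udp_send(uint32_t saddr, uint16_t sport,
                        uint32_t daddr, uint16_t dport,
                        const char *payload, uint16_t plen)
{
    char pkt[1500];
    struct iphdr *ip = (struct iphdr *)pkt;
    struct udphdr *udp = (struct udphdr *)(pkt + sizeof(*ip));
    struct sockaddr_in dst;
    int s, n, on = 1;

    if (plen > sizeof(pkt) - sizeof(*ip) - sizeof(*udp))
        return -1;

    s = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
    if (s < 0)
        return -1;
    /* IP_HDRINCL is implied for IPPROTO_RAW; set it anyway for clarity */
    setsockopt(s, IPPROTO_IP, IP_HDRINCL, &on, sizeof(on));

    memset(pkt, 0, sizeof(*ip) + sizeof(*udp));
    ip->version = 4;
    ip->ihl = 5;
    ip->ttl = 64;
    ip->protocol = IPPROTO_UDP;
    ip->saddr = saddr;                   /* network byte order */
    ip->daddr = daddr;
    ip->tot_len = htons(sizeof(*ip) + sizeof(*udp) + plen);

    udp->source = htons(sport);
    udp->dest = htons(dport);
    udp->len = htons(sizeof(*udp) + plen);
    memcpy(pkt + sizeof(*ip) + sizeof(*udp), payload, plen);

    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_addr.s_addr = daddr;

    n = sendto(s, pkt, sizeof(*ip) + sizeof(*udp) + plen, 0,
               (struct sockaddr *)&dst, sizeof(dst));
    close(s);
    return n;
}

(In a real server the raw socket would of course be opened once at
startup and reused, not created per packet as in this sketch.)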
Actually it looks like newer kernels are a bit slower on the receive side,
a problem that is solved in 2.6.35:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h…
The slowdown (memory accounting) was added in 95766ff:
$ git describe --all --contains 95766fff
tags/v2.6.25-rc1~1162^2~899
So it's present from 2.6.25 onwards.
The release notes are here:
http://kernelnewbies.org/Linux_2_6_29#head-612c6b882f705935cc804d4af0b38316…
As time allows, I will rerun some tests and provide graphs if
necessary.
Thanks a lot! I guess I should start thinking about making it more
portable (*BSD support) and then merging it into master.
I might be able to do some testing next week, if I manage
to set up a big testing environment and to finish the TCP & TLS stress
tests by then (kind of a low probability).
Andrei