On Thursday 01 July 2004 14:11, you wrote:
On Jun 30, 2004 at 08:53, Ezequiel Colombo ecolombo@arcotel.net
wrote:
Hi all, I have tested mediaproxy version 1.0.1 on a dual CPU (2.6GHz) machine and got at least 180 simultaneous calls! This is a good and very scalable solution to solve NAT-related problems.
Ezequiel's test result seems to be in the expected range (given our own test results).
In our tests, on a 1GHz Athlon CPU running linux 2.4.26 we got the following results:
mediaproxy reaches 95% CPU load at around 60 simultaneous calls; it gets to 100% CPU load at 80-90 calls.
rtpproxy reaches 95% CPU load at 120 calls; it gets to 100% CPU load at about 150 calls.
Given that we found rtpproxy to be at most twice as fast as mediaproxy, we never bothered to re-implement mediaproxy in C. It is much easier to stack another box (or several) than to spend a lot of time optimizing a program only to end up with something that runs twice as fast.
Given that Andrei's results are very different from what our tests showed, we suspect there were some errors in the testing procedure. There is also an error in his patch (see the comments below for details). I have attached a version of rtpgenerator that was modified to work with rtpproxy, but with this error fixed.
I tested it on a dual Athlon MP 2000. I've also modified rtpgenerator to work with nathelper (patch for the nh version attached).
The patch has a problem. It sends the packets to the same IP as the generator itself (the default is 127.0.0.1) and not to the IP returned by rtpproxy. This means that in its default configuration it sends from 127.0.0.1 to 127.0.0.1, where rtpproxy doesn't listen. We tried it and it gets flooded back with "udp port unreachable". This explains the high load on the generator you've seen and the much lower load on rtpproxy.
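For reference, here is a minimal sketch of the corrected behaviour (illustrative code, not the actual patch; it assumes the reply to the create command is "port" or "port address", which varies between rtpproxy versions, and falls back to the proxy host instead of 127.0.0.1):

    import socket

    def rtp_destination(rtpproxy_reply, proxy_ip):
        # rtpproxy answers with the allocated RTP port, possibly followed
        # by an address; when no address is present, send to the host the
        # proxy listens on - never back to 127.0.0.1
        parts = rtpproxy_reply.split()
        port = int(parts[0])
        ip = parts[1] if len(parts) > 1 else proxy_ip
        return (ip, port)

    # example: the proxy answered "35000" and listens on 10.0.0.1
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rtp_packet = b"\x80\x00" + b"\x00" * 170      # dummy 172-byte G.711 frame
    sock.sendto(rtp_packet, rtp_destination("35000", "10.0.0.1"))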
rtpgenerator has a few drawbacks: it eats more CPU than rtpproxy and cannot create more than 510 calls (the python code seems to internally use select, which doesn't work with more than 1024 file descriptors).
You can call poll3 instead of poll, but it is not designed to generate that many calls. That's why we limited it to 30 max. If it generates more, it loads the CPU too much and the results are not conclusive. Instead, run multiple instances with 30 calls each and they should do better. Or run it on a different host than the one that runs rtpproxy.
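To illustrate the descriptor limit mentioned above (a sketch only, not rtpgenerator's actual code): select.select() cannot handle file descriptors numbered 1024 or higher (FD_SETSIZE), which is roughly where the ~510-call ceiling comes from, while a poll-based loop has no such limit:

    import select, socket

    # a handful of sockets standing in for the generator's per-call sockets
    socks = [socket.socket(socket.AF_INET, socket.SOCK_DGRAM) for _ in range(4)]

    poller = select.poll()                 # no FD_SETSIZE limit, unlike select.select()
    for s in socks:
        poller.register(s, select.POLLIN)  # registered by file descriptor
    ready = poller.poll(20)                # wait up to 20 ms, i.e. one RTP tick
    for fd, event in ready:
        pass                               # read and handle the ready socket here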
Another problem, if you try to generate too many calls with it, is that it won't be able to send the packets at exactly 20 ms intervals. In this case you will see a high load on the generator and a much smaller load on rtpproxy itself. (As we noticed, the thing that generates the high load on both mediaproxy and rtpproxy is the combination of many calls with the small time interval at which the packets arrive at the proxy (10-20 ms). If this interval is increased because the generator cannot push that much data every 20 ms, the load on rtpproxy will decrease very quickly, while the load on the generator will increase sharply.)
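A small sketch of the effect (illustration only, with made-up per-call costs): once the work of building and sending the packets for all calls takes longer than 20 ms, the real interval stretches, the generator's CPU usage climbs, and the proxy sees fewer packets per second than the test intends:

    import time

    INTERVAL = 0.020                      # nominal 20 ms packetization interval

    def send_all_packets(n_calls):
        time.sleep(0.00005 * n_calls)     # stand-in for building/sending one packet per call

    deadline = time.monotonic()
    for _ in range(50):
        send_all_packets(n_calls=500)     # 500 calls -> ~25 ms of work per tick
        deadline += INTERVAL
        delay = deadline - time.monotonic()
        if delay > 0:
            time.sleep(delay)             # on schedule: the proxy gets packets every 20 ms
        else:
            deadline = time.monotonic()   # overloaded: the real interval stretches, the
                                          # proxy's load drops, the generator's load rises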
As a conclusion, I suggest the following to get results that are closer to reality:
1. Preferably run the generator on another machine.
2. Never use a count higher than 50 (at least for a 1GHz machine) for a running instance of the generator. The rule is to not see the generator eat more than 1% of CPU. If you need more calls, run multiple instances of the generator to add up the number of calls. (The attached generator accepts up to 50 calls - increased from 30.)
3. Specify the commands you used to run both the generator and the proxy, so it is easy to spot problems like data being sent to 127.0.0.1, or data being sent in one direction only but never answered (we've seen this sometimes if --ip is not specified).
4. Make sure the data is passing through the proxy. With mediaproxy you can use the sessions.py script or the media_sessions.phtml web page. With rtpproxy use iptraf or tcpdump (see the example after this list).
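For point 4, something along these lines is enough to confirm that RTP is actually flowing through the proxy (the interface name is just an example):

    ./sessions.py                                 (mediaproxy: list the active media sessions)
    tcpdump -n -i eth0 udp and host 10.0.0.1      (rtpproxy: watch the RTP streams on the proxy IP)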
In our tests on a 1GHz Athlon running Linux 2.4.26 we got the results specified at the beginning of this email, running the following commands:
./mediaproxy.py --no-fork --ip=10.0.0.1 --listen=10.0.0.1 --allow=any
./rtpgenerator.py --ip=10.0.0.1 --proxy=10.0.0.1 --g711 --count=30
./rtpgenerator.py --ip=10.0.0.1 --proxy=10.0.0.1 --g711 --count=30
and for rtpproxy:
./rtpproxy -f -l 10.0.0.1 -s /tmp/rtpproxy.sock
./rtp2generator.py --ip=10.0.0.1 --proxy=/tmp/rtpproxy.sock --g711 --count=50
./rtp2generator.py --ip=10.0.0.1 --proxy=/tmp/rtpproxy.sock --g711 --count=50
./rtp2generator.py --ip=10.0.0.1 --proxy=/tmp/rtpproxy.sock --g711 --count=20
In both cases the CPU load of each generator was under 1%, while the proxies reached 95%.
Also, mediaproxy's CPU usage varies a lot (it doesn't stay constant; it has a lot of peaks).
mediaproxy 1.0 eats all the CPU (100%) at around 90-120 simultaneous calls (it starts reaching 100% at 90 calls). rtpproxy (unstable, latest CVS) uses only 60-66% CPU for 500 simultaneous calls (rtpgenerator cannot generate more); rtpgenerator uses 70-81% during this time.
I measured all this 3-4 minutes after starting rtpgenerator (because I was not interested in the initial session creation; for example, in the first minute rtpgenerator uses almost 100%).
Another interesting performance benchmark is how many sessions can be handled per second without any RTP traffic (the equivalent of a SIP call: invite -> request, 200 ok -> lookup, bye -> delete). This shows how much ser is slowed down by sending commands to the RTP proxy.
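For context, that sequence corresponds roughly to the following control commands (a sketch assuming the classic rtpproxy U/L/D syntax and a stream unix control socket; both can differ between rtpproxy versions, so treat the details as illustrative):

    import socket

    def rtpp_cmd(sock, cmd):
        sock.send((cmd + "\n").encode())
        return sock.recv(1024).decode().strip()

    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)    # rtpproxy -s /tmp/rtpproxy.sock
    s.connect("/tmp/rtpproxy.sock")

    callid, fromtag, totag = "bench-1", "tag-a", "tag-b"     # made-up identifiers
    # invite -> request: create the session; the reply is the RTP port rtpproxy allocated
    print(rtpp_cmd(s, "U %s 10.0.0.1 5000 %s" % (callid, fromtag)))
    # 200 ok -> lookup: fill in the second side of the session
    print(rtpp_cmd(s, "L %s 10.0.0.1 5002 %s %s" % (callid, fromtag, totag)))
    # bye -> delete: tear the session down
    print(rtpp_cmd(s, "D %s %s %s" % (callid, fromtag, totag)))
    s.close()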
Unfortunately my benchmark program works only with rtpproxy (different commands). Here are some results (this time on a single-CPU Pentium M 1.6GHz):
maximum 100 simultaneous sessions: 5011 sessions/s
4000 simultaneous sessions: 46 sessions/s
(This happens because poll scales very badly with the number of FDs used; 4000 sessions => 16000 open fds, 4 per session, so rtpproxy spends most of its time in the poll syscall.)
Andrei