Hi,
The bad checksum may be caused by the virtual network adapter not
performing hardware checksumming. I had this issue on a KVM test setup.
The MTU is also worth looking at, since UDP does not handle packet
fragmentation very well (to say the least). Could you reproduce the same
issue using an OpenVPN TCP virtual link between your two servers?
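To make the MTU point concrete, here is a rough sketch of how IPv4 would split an oversized UDP datagram. It assumes a 20-byte IPv4 header without options, an 8-byte UDP header and a 1500-byte MTU; the function name `ipv4_fragments` is made up for illustration.

```python
def ipv4_fragments(udp_payload_len, mtu=1500, ip_header_len=20, udp_header_len=8):
    """Return a list of (offset_in_bytes, data_len, more_fragments) tuples.

    Assumption: plain IPv4 without options, so the header is 20 bytes.
    """
    # The IP payload is the UDP header plus the application data.
    total = udp_payload_len + udp_header_len
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the fragment offset field counts in 8-byte units.
    max_data = (mtu - ip_header_len) // 8 * 8
    fragments = []
    offset = 0
    while offset < total:
        data_len = min(max_data, total - offset)
        more = offset + data_len < total
        fragments.append((offset, data_len, more))
        offset += data_len
    return fragments


if __name__ == "__main__":
    # 2000 bytes of payload gives a UDP length of 2008.
    for offset, data_len, more in ipv4_fragments(2000):
        print("offset=%d len=%d more_fragments=%s" % (offset, data_len, more))
```

Under these assumptions a 2000-byte payload (UDP length 2008) travels as two fragments carrying 1480 and 528 bytes. The receiver only hands the datagram to the application once all fragments have arrived and the checksum over the reassembled datagram is valid.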
On 12/02/2015 12:23, Andrey Utkin wrote:
We are experiencing a strange networking issue, not exactly specific to
Kamailio, but still related to it.
Rtpengine's "ng" interface uses UDP. Protocol messages contain SDP,
and for encrypted video calls those messages exceed 1500 bytes.
Everything works fine within localhost, but when rtpengine and
Kamailio are on different hosts, and those hosts are Amazon-hosted, we
have trouble.
This happens on l3.large and t2.micro instances with Ubuntu 14. I believe
we don't have any special settings beyond the system defaults.
We send a large datagram from a remote host, e.g. with a trivial Python app such as:
import socket
UDP_IP = "123.123.123.123" # remote host IP
UDP_PORT = 33333
MESSAGE = """
.....0010......0020......0030......0040......0050......0060......0070......0080......0090......0100
.....0110......0120......0130......0140......0150......0160......0170......0180......0190......0200
.....0210......0220......0230......0240......0250......0260......0270......0280......0290......0300
.....0310......0320......0330......0340......0350......0360......0370......0380......0390......0400
.....0410......0420......0430......0440......0450......0460......0470......0480......0490......0500
.....0510......0520......0530......0540......0550......0560......0570......0580......0590......0600
.....0610......0620......0630......0640......0650......0660......0670......0680......0690......0700
.....0710......0720......0730......0740......0750......0760......0770......0780......0790......0800
.....0810......0820......0830......0840......0850......0860......0870......0880......0890......0900
.....0910......0920......0930......0940......0950......0960......0970......0980......0990......1000
.....1010......1020......1030......1040......1050......1060......1070......1080......1090......1100
.....1110......1120......1130......1140......1150......1160......1170......1180......1190......1200
.....1210......1220......1230......1240......1250......1260......1270......1280......1290......1300
.....1310......1320......1330......1340......1350......1360......1370......1380......1390......1400
.....1410......1420......1430......1440......1450......1460......1470......1480......1490......1500
.....1510......1520......1530......1540......1550......1560......1570......1580......1590......1600
.....1610......1620......1630......1640......1650......1660......1670......1680......1690......1700
.....1710......1720......1730......1740......1750......1760......1770......1780......1790......1800
.....1810......1820......1830......1840......1850......1860......1870......1880......1890......1900
.....1910......1920......1930......1940......1950......1960......1970......1980......1990......2000"""
print "UDP target IP:", UDP_IP
print "UDP target port:", UDP_PORT
print "message:", MESSAGE
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(MESSAGE, (UDP_IP, UDP_PORT))
Then we listen on that port with a similarly trivial Python app:
import socket
UDP_IP = "172.31.4.102" # local ip
UDP_PORT = 33333
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind((UDP_IP, UDP_PORT))
while True:
    data, addr = sock.recvfrom(0x10000)
    print "received message:", data
Meanwhile, we monitor the traffic with e.g. ngrep:
ngrep -t -e -d any -W byline -O large_udp.pcap port 33333 or '(ip[6:2]' '&' '0x1fff)' '!=' '0'
(the part after "or" catches the non-first fragments of fragmented packets)
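For readers unfamiliar with that filter expression: `ip[6:2]` is the 16-bit word at byte offset 6 of the IPv4 header, which holds three flag bits followed by a 13-bit fragment offset. A minimal sketch of the same decoding in Python follows; the function name `fragment_info` is hypothetical, and it expects the raw IPv4 header bytes.

```python
import struct


def fragment_info(ip_header):
    """Decode the flags / fragment-offset word of a raw IPv4 header.

    Returns (dont_fragment, more_fragments, offset_in_bytes).
    """
    # Bytes 6..7 of the IPv4 header, in network byte order.
    (flags_frag,) = struct.unpack("!H", ip_header[6:8])
    dont_fragment = bool(flags_frag & 0x4000)
    more_fragments = bool(flags_frag & 0x2000)
    # The low 13 bits (mask 0x1fff) give the offset in 8-byte units.
    offset_bytes = (flags_frag & 0x1FFF) * 8
    return dont_fragment, more_fragments, offset_bytes


if __name__ == "__main__":
    header = bytearray(20)
    header[6:8] = b"\x00\xb9"  # offset field 185, i.e. 185 * 8 = 1480 bytes
    print(fragment_info(bytes(header)))
```

A non-zero offset is exactly what the `& 0x1fff != 0` test matches. The first fragment has offset 0, but it still carries the UDP header, so the `port 33333` part of the filter catches it.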
About the host:
# uname -a
Linux hostname 3.16.0-30-generic #40~14.04.1-Ubuntu SMP Thu Jan 15 17:43:14 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
The linux-image-3.13.0-36-generic and linux-image-3.13.0-45-generic
kernels behave the same way.
What we see:
- ngrep shows the packets with correct contents. All fragments are delivered.
- The application doesn't get any data at all.
Occasionally dmesg shows messages like this:
[ 102.161679] UDP: bad checksum. From 123.123.123.124:56439 to 172.31.4.102:33333 ulen 2008
but this is logged only rarely, so it is surely not what happens on
every packet transmission.
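One way to tell whether a captured datagram really carries a wrong checksum is to recompute it by hand. The sketch below is only an illustration, assuming IPv4 and a fully reassembled payload; the helper names `internet_checksum` and `udp_checksum` are made up here. It computes the UDP checksum over the pseudo-header, the UDP header and the data.

```python
import socket
import struct


def internet_checksum(data):
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return ~total & 0xFFFF


def udp_checksum(src_ip, dst_ip, src_port, dst_port, payload):
    """UDP checksum over IPv4 pseudo-header + UDP header + payload."""
    length = 8 + len(payload)  # UDP length includes the 8-byte header
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, socket.IPPROTO_UDP, length))
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    csum = internet_checksum(pseudo + header + payload)
    # A computed value of 0 is sent as 0xFFFF; 0 on the wire means "no checksum".
    return csum or 0xFFFF


if __name__ == "__main__":
    # Placeholder payload; substitute the bytes taken from the capture.
    payload = b"x" * 2000
    print(hex(udp_checksum("123.123.123.124", "172.31.4.102", 56439, 33333, payload)))
```

Comparing this value with the checksum field in the pcap shows whether the datagram arrived with a wrong checksum. Note that a capture taken on the sending host can show unfilled checksums when checksum offload is enabled, so the receiving side is the more telling place to look.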
This test works fine on, e.g., the cheapest DigitalOcean VPS.
I am concerned about this issue because the rtpengine software has a UDP
interface. On Amazon hosts this interface works only within
localhost, so I cannot distribute the software across different nodes.
Any thoughts? What's wrong, and how can it be fixed?