(Sorry for the long post. Unfortunately, not all bugs are simple.)
I've got a crash bug that occurs when freeing To header parameters when freeing a SIP message. It only occurs in certain circumstances. I can reproduce the circumstances on my side consistently, but I suspect it would be difficult for someone else to reproduce them independently.
Those circumstances are:
1. A Kamailio registrar routes an inbound call from the PSTN to a user using a Snom One PBX using serial forking across two different bindings (two different AOR domains) that come from two different registrars (going into the same database).
2. The PBX sends back a 480 Temporarily Unavailable message to Kamailio in response to every INVITE, on all branches, all the time.
3. It takes several attempts (usually 5-10) of calling that PBX to get to Kamailio crash.
4. It only happens when the initial INVITE is processed using a particular sequence of routes in the route script. There is nothing unusual about the routes themselves, and I've eliminated all the side effects and dependencies I can. It just doesn't happen when I have the call zig through those routes as opposed to those routes. That suggests package memory corruption that is tripped over by a particular conjuncture of circumstances.
Here is the 480 Temporarily Unavailable message sent by the PBX. IPs and numbers have been redacted for customer privacy, but the string lengths are the same:
Here's the INVITE we're sending to the PBX:
INVITE sip:16504578199@184.178.234.121:5060;transport=udp;line=eccbc87e SIP/2.0 Record-Route: sip:55.177.31.199;lr=on;ftag=c00080a-13c4-521c7390-2e862c51-6c96b554;vsf=AAAAAB0GDg4NDQUCBwd3A3AKHgEaFwUbCBsIHjEwOjUwNjA7aXN1cC1vbGk9MA--;dlgcor=5af.3bb From: sip:6784841111@192.13.219.87;tag=c00080a-13c4-521c7390-2e862c51-6c96b554 To: sip:16504578199@55.177.31.199:5060 Call-ID: CXC-72-6a0635d8-c00080a-13c4-521c7390-2e862c51-7eeb3166@192.13.219.87 CSeq: 1 INVITE Via: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKcdfc.af721dd5.1 Via: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-3290-521c7390-2e862c51-3a62fcfd Max-Forwards: 32 Content-Disposition: session;handling=required Contact: sip:+16784841111@192.13.219.87:5060;isup-oli=0;maddr=192.13.219.87;transport=udp Content-Type: application/sdp Content-Length: 274 Privacy: off P-Asserted-Identity: "Unavailable" sip:+16784841111@192.13.219.87
v=0 o=Acme_UAS 0 1 IN IP4 175.154.21.89 s=SIP Media Capabilities c=IN IP4 12.131.2.144 t=0 0 m=audio 6698 RTP/AVP 0 18 101 a=rtpmap:0 PCMU/8000 a=rtpmap:18 G729/8000 a=rtpmap:101 telephone-event/8000 a=fmtp:18 annexb=no a=fmtp:101 0-15 a=maxptime:30 a=sendrecv
And here's the 480 Temporarily Unavailable we get back every time we do, on both branches. Nothing about it looks unusual to me:
SIP/2.0 480 Temporarily Unavailable Via: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKcdfc.af721dd5.0 Via: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-3290-521c7390-2e862c51-3a62fcfd Record-Route: sip:55.177.31.199;lr=on;ftag=c00080a-13c4-521c7390-2e862c51-6c96b554;vsf=AAAAAB0GDg4NDQUCBwd3A3AKHgEaFwUbCBsIHjEwOjUwNjA7aXN1cC1vbGk9MA--;dlgcor=5af.3bb From: sip:6784841111@192.13.219.87;tag=c00080a-13c4-521c7390-2e862c51-6c96b554 To: sip:16504578199@55.177.31.199:5060;tag=d3516ef5fa Call-ID: CXC-72-6a0635d8-c00080a-13c4-521c7390-2e862c51-7eeb3166@192.13.219.87 CSeq: 1 INVITE Contact: sip:5282797646@195.105.225.111:5060;transport=udp Supported: 100rel, replaces, norefersub Allow-Events: refer Allow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE Accept: application/sdp User-Agent: snomONE/4.5.1.1107 Zeta Perseids Content-Length: 0
But when this happens a few times, Kamailio crashes:
05:38:24.317781 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 592 E..l..@.4.2....yA.,......X..SIP/2.0 100 Trying -- 05:38:24.333573 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i.$SIP/2.0 480 Temporarily Unavailable -- 05:38:24.333767 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..<A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:24.335593 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1187 E.......@..3A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:24.385873 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i.$SIP/2.0 480 Temporarily Unavailable -- 05:38:24.386126 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..:A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:24.831672 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1187 E.......@..1A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:24.877627 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i.$SIP/2.0 480 Temporarily Unavailable -- 05:38:24.877933 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..8A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:25.831617 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1187 E.......@../A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:25.881560 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i.$SIP/2.0 480 Temporarily Unavailable -- 05:38:25.881731 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..6A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:32.219484 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1183 E.......@..1A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:32.270568 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 588 E..h..@.4.2....yA.,......T..SIP/2.0 100 Trying -- 05:38:32.289120 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 861 E..y..@.4.1....yA.,......e.7SIP/2.0 480 Temporarily Unavailable -- 05:38:32.289284 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@..7A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:32.291180 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1183 E.......@../A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:32.340763 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 861 E..y..@.4.1....yA.,......e.7SIP/2.0 480 Temporarily Unavailable -- 05:38:32.340986 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@..5A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:32.769133 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1183 E.......@..-A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:32.814851 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 861 E..y..@.4.1....yA.,......e.7SIP/2.0 480 Temporarily Unavailable -- 05:38:32.814993 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.... ..@..3A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:33.769113 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1183 E.... -- 05:38:33.815374 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 861 E..y..@.4.1....yA.,......e.7SIP/2.0 480 Temporarily Unavailable -- 05:38:33.815600 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@..1A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:42.371141 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@..$A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:42.422190 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 592 E..l..@.4.2....yA.,......X.jSIP/2.0 100 Trying -- 05:38:42.438849 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......iE SIP/2.0 480 Temporarily Unavailable -- 05:38:42.439005 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 410 ..@..-A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:42.440373 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@.."A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:42.490261 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......iE SIP/2.0 480 Temporarily Unavailable -- 05:38:42.490348 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 410 E.......@..+A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:42.894122 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@.. A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:42.940323 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......iE SIP/2.0 480 Temporarily Unavailable -- 05:38:42.940562 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 410 E.......@..)A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:43.894116 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:43.940482 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......iE SIP/2.0 480 Temporarily Unavailable -- 05:38:43.940604 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 410 E.......@..'A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:50.261551 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:50.313509 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 592 E..l..@.4.2....yA.,......X..SIP/2.0 100 Trying -- 05:38:50.329411 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i..SIP/2.0 480 Temporarily Unavailable -- 05:38:50.329584 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..$A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:50.331250 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:50.379415 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i..SIP/2.0 480 Temporarily Unavailable -- 05:38:50.379584 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@.."A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:50.769124 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:50.817797 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i..SIP/2.0 480 Temporarily Unavailable -- 05:38:50.817936 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@.. A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:51.769119 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:51.815689 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i..SIP/2.0 480 Temporarily Unavailable -- 05:38:51.815844 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:00.197890 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1182 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:00.248713 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 586 E..f..@.4.2....yA.,......RD_SIP/2.0 100 Trying -- 05:39:00.264983 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 859 E..w..@.4.1....yA.,......c-.SIP/2.0 480 Temporarily Unavailable -- 05:39:00.265140 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:00.266675 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1182 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:00.317151 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 859 E..w..@.4.1....yA.,......c-.SIP/2.0 480 Temporarily Unavailable -- 05:39:00.317285 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:00.706627 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1182 E.... ..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:00.752906 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 859 E..w..@.4.1....yA.,......c-.SIP/2.0 480 Temporarily Unavailable -- 05:39:00.753054 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E....!..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:01.706639 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1182 E...."..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:01.753036 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 859 E..w..@.4.1....yA.,......c-.SIP/2.0 480 Temporarily Unavailable -- 05:39:01.753198 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E....#..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:08.088801 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1190 E....$..@.. -- 05:39:08.139841 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 594 E..n..@.4.2....yA.,......ZhVSIP/2.0 100 Trying -- 05:39:08.156036 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 867 E.....@.4.1....yA.,......k4.SIP/2.0 480 Temporarily Unavailable -- 05:39:08.156222 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 412 E....%..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:08.157697 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1190 E....&..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:08.208158 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 867 E.....@.4.1....yA.,......k4.SIP/2.0 480 Temporarily Unavailable -- 05:39:08.208282 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 412 E....'..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:08.644104 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1190 E....(..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:08.690068 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 867 E.....@.4.1....yA.,......k4.SIP/2.0 480 Temporarily Unavailable -- 05:39:08.690212 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 412 E....)..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:09.644112 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1190 E....*..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:09.690201 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 867 E.....@.4.1....yA.,......k4.SIP/2.0 480 Temporarily Unavailable -- 05:39:09.690374 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 412 A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0
Anyway, my core dumps were showing me qm_*() free errors. To get at the problem, I disabled loop unrolling, removed -09 and added the GDB debug flags (-ggdb*) to the compile flags. When I do that, I get one of these two call stacks--mostly the first, and occasionally the second:
#0 0x000000000055dc2e in free_to_params (tb=0x7fcb38f5f750) at parser/parse_to.c:827 #1 0x000000000055dc75 in free_to (tb=0x7fcb38f5f750) at parser/parse_to.c:836 #2 0x000000000053d9df in clean_hdr_field (hf=0x7fcb38f5fb50) at parser/hf.c:169 #3 0x000000000053db49 in free_hdr_field_lst (hf=0x7fcb38f590e0) at parser/hf.c:223 #4 0x0000000000542330 in free_sip_msg (msg=0x7fcb38f5edf0) at parser/msg_parser.c:729 #5 0x000000000049d9cb in receive_msg ( buf=0x905580 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK818c.b6308b63.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-28db-521c65ea-2e50da5e- 8f9ae0d\r\nRecord-Route: <sip"..., len=864, rcv_info=0x7fff2bcad950) at receive.c:294 #6 0x000000000052f5cd in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff2bcadc88) at main.c:2566
----
#0 0x00000000005394e9 in qm_detach_free (qm=0x7f25c3037010, frag=0x7f25c3256af0) at mem/q_malloc.c:266 #1 0x0000000000539744 in qm_malloc (qm=0x7f25c3037010, size=960) at mem/q_malloc.c:386 #2 0x00000000004bba49 in rval_new_empty (extra_size=102) at rvalue.c:236 #3 0x00000000004bbabb in rval_new_str (s=0x7fffbad95460, extra_size=80) at rvalue.c:260 #4 0x00000000004be1b3 in rval_convert (h=0x7fffbad972a0, msg=0x7f25c2587920, type=RV_STR, v=0x7f25c3182ee8, c=0x7fffbad95620) at rvalue.c:1321 #5 0x00000000004bf657 in rval_str_lop2 (h=0x7fffbad972a0, msg=0x7f25c2587920, res=0x7fffbad95ac8, op=RVE_EQ_OP, l=0x7f25c3182ee8, c1=0x7fffbad95620, r=0x7f25c3183668, c2=0x0) at rvalue.c:1752 #6 0x00000000004c028d in rval_expr_eval_int (h=0x7fffbad972a0, msg=0x7f25c2587920, res=0x7fffbad95ac8, rve=0x7f25c3183d40) at rvalue.c:2058 #7 0x0000000000418d5a in do_action (h=0x7fffbad972a0, a=0x7f25c3185190, msg=0x7f25c2587920) at action.c:1050 #8 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c3182d40, msg=0x7f25c2587920) at action.c:1573 #9 0x000000000042047f in do_action (h=0x7fffbad972a0, a=0x7f25c3185530, msg=0x7f25c2587920) at action.c:1374 #10 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c317a2a0, msg=0x7f25c2587920) at action.c:1573 #11 0x0000000000418fa2 in do_action (h=0x7fffbad972a0, a=0x7f25c318ac10, msg=0x7f25c2587920) at action.c:1065 #12 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c318ac10, ---Type <return> to continue, or q <return> to quit--- msg=0x7f25c2587920) at action.c:1573 #13 0x0000000000418ffb in do_action (h=0x7fffbad972a0, a=0x7f25c318ad10, msg=0x7f25c2587920) at action.c:1069 #14 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c3163930, msg=0x7f25c2587920) at action.c:1573 #15 0x0000000000416f3a in do_action (h=0x7fffbad972a0, a=0x7f25c3194fc0, msg=0x7f25c2587920) at action.c:690 #16 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c318eaf0, msg=0x7f25c2587920) at action.c:1573 #17 0x0000000000422231 in run_top_route (a=0x7f25c318eaf0, msg=0x7f25c2587920, c=0x0) at action.c:1658 #18 0x00007f25c233b220 in run_failure_handlers (t=0x7f24be53b2a0, rpl=0x7f25c32554b0, code=480, extra_flags=64) at t_reply.c:1024 #19 0x00007f25c233c39b in t_should_relay_response (Trans=0x7f24be53b2a0, new_code=480, branch=0, should_store=0x7fffbad97580, should_relay=0x7fffbad97584, cancel_data=0x7fffbad97790, reply=0x7f25c32554b0) at t_reply.c:1300 #20 0x00007f25c233dec4 in relay_reply (t=0x7f24be53b2a0, p_msg=0x7f25c32554b0, branch=0, msg_status=480, cancel_data=0x7fffbad97790, do_put_on_wait=1) at t_reply.c:1703 #21 0x00007f25c2340f46 in reply_received (p_msg=0x7f25c32554b0) at t_reply.c:2370 #22 0x0000000000458861 in do_forward_reply (msg=0x7f25c32554b0, mode=0) at forward.c:799 #23 0x00000000004590d0 in forward_reply (msg=0x7f25c32554b0) at forward.c:882 #24 0x000000000049d8a4 in receive_msg ( ---Type <return> to continue, or q <return> to quit--- buf=0x905580 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK0c.2611fa52.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-293c-521c669b-2e538ce5- 7bec16be\r\nRecord-Route: <sip:"..., len=863, rcv_info=0x7fffbad97af0) at receive.c:270 #25 0x000000000052f5cd in udp_rcv_loop () at udp_server.c:557 #26 0x0000000000467de2 in main_loop () at main.c:1638 #27 0x000000000046ad8b in main (argc=13, argv=0x7fffbad97e28) at main.c:2566
But, 90% of the time, it's the first one. The second one shows up only occasionally. Sometimes, this variant appears:
#0 0x0000000000538d7a in qm_insert_free (qm=0x7f069c7a6010, frag=0x7f069c9c57e0) at mem/q_malloc.c:182 #1 0x0000000000539c3d in qm_free (qm=0x7f069c7a6010, p=0x7f069c9c57f0) at mem/q_malloc.c:527 #2 0x000000000055dc4c in free_to_params (tb=0x7f069c9c73a0) at parser/parse_to.c:828 #3 0x000000000055dc75 in free_to (tb=0x7f069c9c73a0) at parser/parse_to.c:836 #4 0x000000000053d8d5 in clean_hdr_field (hf=0x7f069c7d7da0) at parser/hf.c:113 #5 0x000000000053db49 in free_hdr_field_lst (hf=0x7f069c7d6340) at parser/hf.c:223 #6 0x0000000000542330 in free_sip_msg (msg=0x7f069c9c44b0) at parser/msg_parser.c:729 #7 0x000000000049d9cb in receive_msg ( buf=0x905580 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKeb96.24a5718.0\r\nVia: SIP/2.0/UDP 208.94.157.10:5060;branch=z9hG4bK-368a-521c67f1-3b3bc7c- 5366754e\r\nRecord-Route: sip:55.177.31.199;lr=on;ftag=c00070a-13c4- 521c67f1-3b3bc7c- 3af08849;vsf=AAAAAB0GDg4NDQUCBwd3A3AKHgEaFwUbBhsGHjEwOjUwNjA7aXN1cC1vbGk 9MA--;dlgcor=789.3d01\r\nFrom: sip:6784841111@208.94.157.10;tag=c00070a-13c4-521c67f1-3b3bc7c- 3af08849\r\nTo: sip:16504578199@55.177.31.199:5060;tag=3af8238437\r\nCall-ID: CXC-408- 6bb35b50-c00070a-13c4-521c67f1-3b3bc7c-3c339c13@208.94.157.10\r\nCSeq: 1 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=862, rcv_info=0x7fff87c23070)
So, I focused on the first one. The crash is on pkg_free() in this loop in parser/parse_to.c:
void free_to_params(struct to_body* const tb) { struct to_param *tp=tb->param_lst; struct to_param *foo; while (tp){ foo = tp->next; pkg_free(tp); tp=foo; } }
Removing the pkg_free() definitely eliminates the crash. Or, at least, I haven't been able to get the proxy to crash in that scenario.
In poking at the values of the to_body,
(gdb) frame 2 #2 0x000000000055dc4c in free_to_params (tb=0x7f069c9c73a0) at parser/parse_to.c:828 828 pkg_free(tp); (gdb) print tb $30 = (struct to_body * const) 0x7f069c9c73a0 (gdb) print tb->uri->s $31 = 0x9056e6 "sip:6784841111@208.94.157.10>;tag=c00070a-13c4-521c67f1- 3b3bc7c-3af08849\r\nTo: sip:16504578199@55.177.31.199:5060;tag=3af8238437\r\nCall-ID: CXC-408- 6bb35b50-c00070a-13c4-521c67f1-3b3bc7c-3c339c13@208.94.157.10\r\nCSeq: 1 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n" (gdb) print tb->param_lst $32 = (struct to_param *) 0x7f069c9c57f0 (gdb) print tb->param_lst->next $33 = (struct to_param *) 0x0 (gdb) print tb->param_lst->value $34 = { s = 0x905708 "c00070a-13c4-521c67f1-3b3bc7c-3af08849\r\nTo: sip:16504578199@55.177.31.199:5060;tag=3af8238437\r\nCall-ID: CXC-408- 6bb35b50-c00070a-13c4-521c67f1-3b3bc7c-3c339c13@208.94.157.10\r\nCSeq: 1 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len = 38} (gdb) print tb->param_lst->name $35 = {s = 0x3e653834323335 <Address 0x3e653834323335 out of bounds>, len = 3}
I eventually noticed that I can't seem to get at the 'name' value in the first (and only) parameter list node:
(gdb) print tb->param_lst->name $35 = {s = 0x3e653834323335 <Address 0x3e653834323335 out of bounds>, len = 3}
I guess that should be 'tag', right? It's the only parameter, unsurprisingly:
(gdb) print tb->param_lst->next $33 = (struct to_param *) 0x0
Anyway, I can also say that accessing the pointer into the message buffer to which it corresponds crashes it, but only for that 480 -- not for any other SIP message that comes through the proxy:
while (tp){ foo = tp->next;
LM_ERR("Freeing up the hinterland: <%s>\n", tp->name.s);
pkg_free(tp); tp=foo; }
(gdb) where #0 0x0000003dbae480ac in vfprintf () from /lib64/libc.so.6 #1 0x0000003dbaee4fe0 in __vsyslog_chk () from /lib64/libc.so.6 #2 0x0000003dbaee5210 in syslog () from /lib64/libc.so.6 #3 0x000000000055dd47 in free_to_params (tb=0x7fc1def09610)
So, I really get the sense that there's something wrong with that 480 that's corrupting memory. Is that an accurate conclusion? If so, what it could be, and what's the next step to try to diagnose further?
Thank you!
-- Alex
Hello,
can you try this patch? - http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=14835f89...
One reason for such crash could be double-free, which could eventually happen because the pointer to params was not reset after freeing the list.
Cheers, Daniel
On 8/27/13 12:17 PM, Alex Balashov wrote:
(Sorry for the long post. Unfortunately, not all bugs are simple.)
I've got a crash bug that occurs when freeing To header parameters when freeing a SIP message. It only occurs in certain circumstances. I can reproduce the circumstances on my side consistently, but I suspect it would be difficult for someone else to reproduce them independently.
Those circumstances are:
- A Kamailio registrar routes an inbound call from the PSTN to a user
using a Snom One PBX using serial forking across two different bindings (two different AOR domains) that come from two different registrars (going into the same database).
- The PBX sends back a 480 Temporarily Unavailable message to
Kamailio in response to every INVITE, on all branches, all the time.
- It takes several attempts (usually 5-10) of calling that PBX to get
to Kamailio crash.
- It only happens when the initial INVITE is processed using a
particular sequence of routes in the route script. There is nothing unusual about the routes themselves, and I've eliminated all the side effects and dependencies I can. It just doesn't happen when I have the call zig through those routes as opposed to those routes. That suggests package memory corruption that is tripped over by a particular conjuncture of circumstances.
Here is the 480 Temporarily Unavailable message sent by the PBX. IPs and numbers have been redacted for customer privacy, but the string lengths are the same:
Here's the INVITE we're sending to the PBX:
INVITE sip:16504578199@184.178.234.121:5060;transport=udp;line=eccbc87e SIP/2.0 Record-Route: sip:55.177.31.199;lr=on;ftag=c00080a-13c4-521c7390-2e862c51-6c96b554;vsf=AAAAAB0GDg4NDQUCBwd3A3AKHgEaFwUbCBsIHjEwOjUwNjA7aXN1cC1vbGk9MA--;dlgcor=5af.3bb From: sip:6784841111@192.13.219.87;tag=c00080a-13c4-521c7390-2e862c51-6c96b554 To: sip:16504578199@55.177.31.199:5060 Call-ID: CXC-72-6a0635d8-c00080a-13c4-521c7390-2e862c51-7eeb3166@192.13.219.87 CSeq: 1 INVITE Via: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKcdfc.af721dd5.1 Via: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-3290-521c7390-2e862c51-3a62fcfd Max-Forwards: 32 Content-Disposition: session;handling=required Contact: sip:+16784841111@192.13.219.87:5060;isup-oli=0;maddr=192.13.219.87;transport=udp Content-Type: application/sdp Content-Length: 274 Privacy: off P-Asserted-Identity: "Unavailable" sip:+16784841111@192.13.219.87
v=0 o=Acme_UAS 0 1 IN IP4 175.154.21.89 s=SIP Media Capabilities c=IN IP4 12.131.2.144 t=0 0 m=audio 6698 RTP/AVP 0 18 101 a=rtpmap:0 PCMU/8000 a=rtpmap:18 G729/8000 a=rtpmap:101 telephone-event/8000 a=fmtp:18 annexb=no a=fmtp:101 0-15 a=maxptime:30 a=sendrecv
And here's the 480 Temporarily Unavailable we get back every time we do, on both branches. Nothing about it looks unusual to me:
SIP/2.0 480 Temporarily Unavailable Via: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKcdfc.af721dd5.0 Via: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-3290-521c7390-2e862c51-3a62fcfd Record-Route: sip:55.177.31.199;lr=on;ftag=c00080a-13c4-521c7390-2e862c51-6c96b554;vsf=AAAAAB0GDg4NDQUCBwd3A3AKHgEaFwUbCBsIHjEwOjUwNjA7aXN1cC1vbGk9MA--;dlgcor=5af.3bb From: sip:6784841111@192.13.219.87;tag=c00080a-13c4-521c7390-2e862c51-6c96b554 To: sip:16504578199@55.177.31.199:5060;tag=d3516ef5fa Call-ID: CXC-72-6a0635d8-c00080a-13c4-521c7390-2e862c51-7eeb3166@192.13.219.87 CSeq: 1 INVITE Contact: sip:5282797646@195.105.225.111:5060;transport=udp Supported: 100rel, replaces, norefersub Allow-Events: refer Allow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE Accept: application/sdp User-Agent: snomONE/4.5.1.1107 Zeta Perseids Content-Length: 0
But when this happens a few times, Kamailio crashes:
05:38:24.317781 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 592 E..l..@.4.2....yA.,......X..SIP/2.0 100 Trying -- 05:38:24.333573 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i.$SIP/2.0 480 Temporarily Unavailable -- 05:38:24.333767 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..<A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:24.335593 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1187 E.......@..3A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:24.385873 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i.$SIP/2.0 480 Temporarily Unavailable -- 05:38:24.386126 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..:A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:24.831672 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1187 E.......@..1A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:24.877627 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i.$SIP/2.0 480 Temporarily Unavailable -- 05:38:24.877933 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..8A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:25.831617 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1187 E.......@../A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:25.881560 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i.$SIP/2.0 480 Temporarily Unavailable -- 05:38:25.881731 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..6A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:32.219484 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1183 E.......@..1A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:32.270568 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 588 E..h..@.4.2....yA.,......T..SIP/2.0 100 Trying -- 05:38:32.289120 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 861 E..y..@.4.1....yA.,......e.7SIP/2.0 480 Temporarily Unavailable -- 05:38:32.289284 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@..7A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:32.291180 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1183 E.......@../A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:32.340763 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 861 E..y..@.4.1....yA.,......e.7SIP/2.0 480 Temporarily Unavailable -- 05:38:32.340986 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@..5A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:32.769133 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1183 E.......@..-A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:32.814851 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 861 E..y..@.4.1....yA.,......e.7SIP/2.0 480 Temporarily Unavailable -- 05:38:32.814993 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.... ..@..3A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:33.769113 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1183 E.... -- 05:38:33.815374 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 861 E..y..@.4.1....yA.,......e.7SIP/2.0 480 Temporarily Unavailable -- 05:38:33.815600 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@..1A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:42.371141 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@..$A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:42.422190 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 592 E..l..@.4.2....yA.,......X.jSIP/2.0 100 Trying -- 05:38:42.438849 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......iE SIP/2.0 480 Temporarily Unavailable -- 05:38:42.439005 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 410 ..@..-A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:42.440373 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@.."A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:42.490261 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......iE SIP/2.0 480 Temporarily Unavailable -- 05:38:42.490348 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 410 E.......@..+A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:42.894122 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@.. A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:42.940323 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......iE SIP/2.0 480 Temporarily Unavailable -- 05:38:42.940562 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 410 E.......@..)A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:43.894116 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:43.940482 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......iE SIP/2.0 480 Temporarily Unavailable -- 05:38:43.940604 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 410 E.......@..'A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:50.261551 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:50.313509 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 592 E..l..@.4.2....yA.,......X..SIP/2.0 100 Trying -- 05:38:50.329411 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i..SIP/2.0 480 Temporarily Unavailable -- 05:38:50.329584 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@..$A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:50.331250 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:50.379415 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i..SIP/2.0 480 Temporarily Unavailable -- 05:38:50.379584 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@.."A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:50.769124 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:50.817797 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i..SIP/2.0 480 Temporarily Unavailable -- 05:38:50.817936 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@.. A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:38:51.769119 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1188 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:38:51.815689 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 865 E..}..@.4.1....yA.,......i..SIP/2.0 480 Temporarily Unavailable -- 05:38:51.815844 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 411 E.......@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:00.197890 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1182 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:00.248713 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 586 E..f..@.4.2....yA.,......RD_SIP/2.0 100 Trying -- 05:39:00.264983 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 859 E..w..@.4.1....yA.,......c-.SIP/2.0 480 Temporarily Unavailable -- 05:39:00.265140 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:00.266675 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1182 E.......@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:00.317151 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 859 E..w..@.4.1....yA.,......c-.SIP/2.0 480 Temporarily Unavailable -- 05:39:00.317285 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E.......@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:00.706627 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1182 E.... ..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:00.752906 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 859 E..w..@.4.1....yA.,......c-.SIP/2.0 480 Temporarily Unavailable -- 05:39:00.753054 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E....!..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:01.706639 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1182 E...."..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:01.753036 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 859 E..w..@.4.1....yA.,......c-.SIP/2.0 480 Temporarily Unavailable -- 05:39:01.753198 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 408 E....#..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:08.088801 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1190 E....$..@.. -- 05:39:08.139841 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 594 E..n..@.4.2....yA.,......ZhVSIP/2.0 100 Trying -- 05:39:08.156036 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 867 E.....@.4.1....yA.,......k4.SIP/2.0 480 Temporarily Unavailable -- 05:39:08.156222 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 412 E....%..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:08.157697 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1190 E....&..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:08.208158 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 867 E.....@.4.1....yA.,......k4.SIP/2.0 480 Temporarily Unavailable -- 05:39:08.208282 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 412 E....'..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:08.644104 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1190 E....(..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:08.690068 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 867 E.....@.4.1....yA.,......k4.SIP/2.0 480 Temporarily Unavailable -- 05:39:08.690212 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 412 E....)..@...A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0 -- 05:39:09.644112 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 1190 E....*..@...A.,....y........INVITE sip:16504578199@195.105.225.111:5060;transport=udp;line=eccbc87e SIP/2.0 -- 05:39:09.690201 IP 195.105.225.111.5060 > 55.177.31.199.5060: SIP, length: 867 E.....@.4.1....yA.,......k4.SIP/2.0 480 Temporarily Unavailable -- 05:39:09.690374 IP 55.177.31.199.5060 > 195.105.225.111.5060: SIP, length: 412 A.,....y........ACK sip:16504578199@195.105.225.111:5060;transport=udp;line=c81e728d SIP/2.0
Anyway, my core dumps were showing me qm_*() free errors. To get at the problem, I disabled loop unrolling, removed -09 and added the GDB debug flags (-ggdb*) to the compile flags. When I do that, I get one of these two call stacks--mostly the first, and occasionally the second:
#0 0x000000000055dc2e in free_to_params (tb=0x7fcb38f5f750) at parser/parse_to.c:827 #1 0x000000000055dc75 in free_to (tb=0x7fcb38f5f750) at parser/parse_to.c:836 #2 0x000000000053d9df in clean_hdr_field (hf=0x7fcb38f5fb50) at parser/hf.c:169 #3 0x000000000053db49 in free_hdr_field_lst (hf=0x7fcb38f590e0) at parser/hf.c:223 #4 0x0000000000542330 in free_sip_msg (msg=0x7fcb38f5edf0) at parser/msg_parser.c:729 #5 0x000000000049d9cb in receive_msg ( buf=0x905580 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK818c.b6308b63.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-28db-521c65ea-2e50da5e- 8f9ae0d\r\nRecord-Route: <sip"..., len=864, rcv_info=0x7fff2bcad950) at receive.c:294 #6 0x000000000052f5cd in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff2bcadc88) at main.c:2566
#0 0x00000000005394e9 in qm_detach_free (qm=0x7f25c3037010, frag=0x7f25c3256af0) at mem/q_malloc.c:266 #1 0x0000000000539744 in qm_malloc (qm=0x7f25c3037010, size=960) at mem/q_malloc.c:386 #2 0x00000000004bba49 in rval_new_empty (extra_size=102) at rvalue.c:236 #3 0x00000000004bbabb in rval_new_str (s=0x7fffbad95460, extra_size=80) at rvalue.c:260 #4 0x00000000004be1b3 in rval_convert (h=0x7fffbad972a0, msg=0x7f25c2587920, type=RV_STR, v=0x7f25c3182ee8, c=0x7fffbad95620) at rvalue.c:1321 #5 0x00000000004bf657 in rval_str_lop2 (h=0x7fffbad972a0, msg=0x7f25c2587920, res=0x7fffbad95ac8, op=RVE_EQ_OP, l=0x7f25c3182ee8, c1=0x7fffbad95620, r=0x7f25c3183668, c2=0x0) at rvalue.c:1752 #6 0x00000000004c028d in rval_expr_eval_int (h=0x7fffbad972a0, msg=0x7f25c2587920, res=0x7fffbad95ac8, rve=0x7f25c3183d40) at rvalue.c:2058 #7 0x0000000000418d5a in do_action (h=0x7fffbad972a0, a=0x7f25c3185190, msg=0x7f25c2587920) at action.c:1050 #8 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c3182d40, msg=0x7f25c2587920) at action.c:1573 #9 0x000000000042047f in do_action (h=0x7fffbad972a0, a=0x7f25c3185530, msg=0x7f25c2587920) at action.c:1374 #10 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c317a2a0, msg=0x7f25c2587920) at action.c:1573 #11 0x0000000000418fa2 in do_action (h=0x7fffbad972a0, a=0x7f25c318ac10, msg=0x7f25c2587920) at action.c:1065 #12 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c318ac10, ---Type <return> to continue, or q <return> to quit--- msg=0x7f25c2587920) at action.c:1573 #13 0x0000000000418ffb in do_action (h=0x7fffbad972a0, a=0x7f25c318ad10, msg=0x7f25c2587920) at action.c:1069 #14 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c3163930, msg=0x7f25c2587920) at action.c:1573 #15 0x0000000000416f3a in do_action (h=0x7fffbad972a0, a=0x7f25c3194fc0, msg=0x7f25c2587920) at action.c:690 #16 0x0000000000421aa7 in run_actions (h=0x7fffbad972a0, a=0x7f25c318eaf0, msg=0x7f25c2587920) at action.c:1573 #17 0x0000000000422231 in run_top_route (a=0x7f25c318eaf0, msg=0x7f25c2587920, c=0x0) at action.c:1658 #18 0x00007f25c233b220 in run_failure_handlers (t=0x7f24be53b2a0, rpl=0x7f25c32554b0, code=480, extra_flags=64) at t_reply.c:1024 #19 0x00007f25c233c39b in t_should_relay_response (Trans=0x7f24be53b2a0, new_code=480, branch=0, should_store=0x7fffbad97580, should_relay=0x7fffbad97584, cancel_data=0x7fffbad97790, reply=0x7f25c32554b0) at t_reply.c:1300 #20 0x00007f25c233dec4 in relay_reply (t=0x7f24be53b2a0, p_msg=0x7f25c32554b0, branch=0, msg_status=480, cancel_data=0x7fffbad97790, do_put_on_wait=1) at t_reply.c:1703 #21 0x00007f25c2340f46 in reply_received (p_msg=0x7f25c32554b0) at t_reply.c:2370 #22 0x0000000000458861 in do_forward_reply (msg=0x7f25c32554b0, mode=0) at forward.c:799 #23 0x00000000004590d0 in forward_reply (msg=0x7f25c32554b0) at forward.c:882 #24 0x000000000049d8a4 in receive_msg ( ---Type <return> to continue, or q <return> to quit--- buf=0x905580 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK0c.2611fa52.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-293c-521c669b-2e538ce5- 7bec16be\r\nRecord-Route: <sip:"..., len=863, rcv_info=0x7fffbad97af0) at receive.c:270 #25 0x000000000052f5cd in udp_rcv_loop () at udp_server.c:557 #26 0x0000000000467de2 in main_loop () at main.c:1638 #27 0x000000000046ad8b in main (argc=13, argv=0x7fffbad97e28) at main.c:2566
But, 90% of the time, it's the first one. The second one shows up only occasionally. Sometimes, this variant appears:
#0 0x0000000000538d7a in qm_insert_free (qm=0x7f069c7a6010, frag=0x7f069c9c57e0) at mem/q_malloc.c:182 #1 0x0000000000539c3d in qm_free (qm=0x7f069c7a6010, p=0x7f069c9c57f0) at mem/q_malloc.c:527 #2 0x000000000055dc4c in free_to_params (tb=0x7f069c9c73a0) at parser/parse_to.c:828 #3 0x000000000055dc75 in free_to (tb=0x7f069c9c73a0) at parser/parse_to.c:836 #4 0x000000000053d8d5 in clean_hdr_field (hf=0x7f069c7d7da0) at parser/hf.c:113 #5 0x000000000053db49 in free_hdr_field_lst (hf=0x7f069c7d6340) at parser/hf.c:223 #6 0x0000000000542330 in free_sip_msg (msg=0x7f069c9c44b0) at parser/msg_parser.c:729 #7 0x000000000049d9cb in receive_msg ( buf=0x905580 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKeb96.24a5718.0\r\nVia: SIP/2.0/UDP 208.94.157.10:5060;branch=z9hG4bK-368a-521c67f1-3b3bc7c- 5366754e\r\nRecord-Route: sip:55.177.31.199;lr=on;ftag=c00070a-13c4- 521c67f1-3b3bc7c- 3af08849;vsf=AAAAAB0GDg4NDQUCBwd3A3AKHgEaFwUbBhsGHjEwOjUwNjA7aXN1cC1vbGk 9MA--;dlgcor=789.3d01\r\nFrom: sip:6784841111@208.94.157.10;tag=c00070a-13c4-521c67f1-3b3bc7c- 3af08849\r\nTo: sip:16504578199@55.177.31.199:5060;tag=3af8238437\r\nCall-ID: CXC-408- 6bb35b50-c00070a-13c4-521c67f1-3b3bc7c-3c339c13@208.94.157.10\r\nCSeq: 1 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=862, rcv_info=0x7fff87c23070)
So, I focused on the first one. The crash is on pkg_free() in this loop in parser/parse_to.c:
void free_to_params(struct to_body* const tb) { struct to_param *tp=tb->param_lst; struct to_param *foo; while (tp){ foo = tp->next; pkg_free(tp); tp=foo; } }
Removing the pkg_free() definitely eliminates the crash. Or, at least, I haven't been able to get the proxy to crash in that scenario.
In poking at the values of the to_body,
(gdb) frame 2 #2 0x000000000055dc4c in free_to_params (tb=0x7f069c9c73a0) at parser/parse_to.c:828 828 pkg_free(tp); (gdb) print tb $30 = (struct to_body * const) 0x7f069c9c73a0 (gdb) print tb->uri->s $31 = 0x9056e6 "sip:6784841111@208.94.157.10>;tag=c00070a-13c4-521c67f1- 3b3bc7c-3af08849\r\nTo: sip:16504578199@55.177.31.199:5060;tag=3af8238437\r\nCall-ID: CXC-408- 6bb35b50-c00070a-13c4-521c67f1-3b3bc7c-3c339c13@208.94.157.10\r\nCSeq: 1 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n" (gdb) print tb->param_lst $32 = (struct to_param *) 0x7f069c9c57f0 (gdb) print tb->param_lst->next $33 = (struct to_param *) 0x0 (gdb) print tb->param_lst->value $34 = { s = 0x905708 "c00070a-13c4-521c67f1-3b3bc7c-3af08849\r\nTo: sip:16504578199@55.177.31.199:5060;tag=3af8238437\r\nCall-ID: CXC-408- 6bb35b50-c00070a-13c4-521c67f1-3b3bc7c-3c339c13@208.94.157.10\r\nCSeq: 1 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len = 38} (gdb) print tb->param_lst->name $35 = {s = 0x3e653834323335 <Address 0x3e653834323335 out of bounds>, len = 3}
I eventually noticed that I can't seem to get at the 'name' value in the first (and only) parameter list node:
(gdb) print tb->param_lst->name $35 = {s = 0x3e653834323335 <Address 0x3e653834323335 out of bounds>, len = 3}
I guess that should be 'tag', right? It's the only parameter, unsurprisingly:
(gdb) print tb->param_lst->next $33 = (struct to_param *) 0x0
Anyway, I can also say that accessing the pointer into the message buffer to which it corresponds crashes it, but only for that 480 -- not for any other SIP message that comes through the proxy:
while (tp){ foo = tp->next; LM_ERR("Freeing up the hinterland: <%s>\n", tp->name.s); pkg_free(tp); tp=foo; }
(gdb) where #0 0x0000003dbae480ac in vfprintf () from /lib64/libc.so.6 #1 0x0000003dbaee4fe0 in __vsyslog_chk () from /lib64/libc.so.6 #2 0x0000003dbaee5210 in syslog () from /lib64/libc.so.6 #3 0x000000000055dd47 in free_to_params (tb=0x7fc1def09610)
So, I really get the sense that there's something wrong with that 480 that's corrupting memory. Is that an accurate conclusion? If so, what it could be, and what's the next step to try to diagnose further?
Thank you!
-- Alex
Hi Daniel,
On 08/27/2013 08:47 AM, Daniel-Constantin Mierla wrote:
Hello,
can you try this patch?
http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=14835f89...
One reason for such crash could be double-free, which could eventually happen because the pointer to params was not reset after freeing the list.
I will certainly try it, thank you.
However, it is curious that this crash occurs only in this exact situation, only when calling this PBX, only when it has two registrants to fork among, only when I use this combination of request routes/subroutines.
Hi Daniel,
With your patch applied (setting param list head to NULL), it now crashes in a different place:
Program terminated with signal 11, Segmentation fault. #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 827 foo = tp->next; Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64 glibc-2.12-1.107.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.2.x86_64 libcom_err-1.41.12-14.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 nspr-4.9.2-1.el6.x86_64 nss-3.14.0.0-12.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 nss-util-3.14.0.0-2.el6.x86_64 openldap-2.4.23-32.el6_4.1.x86_64 openssl-1.0.0-27.el6_4.2.x86_64 postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64 zlib-1.2.3-29.el6.x86_64 (gdb) where #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 #1 0x000000000055e658 in free_to (tb=0x7f31fee421a0) at parser/parse_to.c:838 #2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f31fee23bc0) at parser/hf.c:113 #3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f31fee20a60) at parser/hf.c:223 #4 0x0000000000542d04 in free_sip_msg (msg=0x7f31fee40df0) at parser/msg_parser.c:729 #5 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKbe3a.dab6345.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1a97-521d9f57-331967d3-3174bfdc\r\nRecord-Route: <sip"..., len=866, rcv_info=0x7fff34138bd0) at receive.c:296 #6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff34138f08) at main.c:2566
-- Alex
On 08/27/2013 08:49 AM, Alex Balashov wrote:
Hi Daniel,
On 08/27/2013 08:47 AM, Daniel-Constantin Mierla wrote:
Hello,
can you try this patch?
http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=14835f89...
One reason for such crash could be double-free, which could eventually happen because the pointer to params was not reset after freeing the list.
I will certainly try it, thank you.
However, it is curious that this crash occurs only in this exact situation, only when calling this PBX, only when it has two registrants to fork among, only when I use this combination of request routes/subroutines.
With the patch applied, I sometimes get this crash, too:
(gdb) where #0 0x0000000000539ed9 in qm_detach_free (qm=0x7f516c197010, frag=0x7f516c3b6dc0) at mem/q_malloc.c:268 #1 0x000000000053a118 in qm_malloc (qm=0x7f516c197010, size=960) at mem/q_malloc.c:386 #2 0x00000000004bc41d in rval_new_empty (extra_size=102) at rvalue.c:236 #3 0x00000000004bc48f in rval_new_str (s=0x7ffff3194e70, extra_size=80) at rvalue.c:260 #4 0x00000000004beb87 in rval_convert (h=0x7ffff3196cb0, msg=0x7f516b6e7920, type=RV_STR, v=0x7f516c2e3728, c=0x7ffff3195030) at rvalue.c:1321 #5 0x00000000004c002b in rval_str_lop2 (h=0x7ffff3196cb0, msg=0x7f516b6e7920, res=0x7ffff31954d8, op=RVE_EQ_OP, l=0x7f516c2e3728, c1=0x7ffff3195030, r=0x7f516c2e3ea8, c2=0x0) at rvalue.c:1752 #6 0x00000000004c0c61 in rval_expr_eval_int (h=0x7ffff3196cb0, msg=0x7f516b6e7920, res=0x7ffff31954d8, rve=0x7f516c2e4580) at rvalue.c:2058 #7 0x0000000000418d5a in do_action (h=0x7ffff3196cb0, a=0x7f516c2e59d0, msg=0x7f516b6e7920) at action.c:1050 #8 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2e3580, msg=0x7f516b6e7920) at action.c:1573 #9 0x000000000042047f in do_action (h=0x7ffff3196cb0, a=0x7f516c2e5d70, msg=0x7f516b6e7920) at action.c:1374 #10 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2daae0, msg=0x7f516b6e7920) at action.c:1573 #11 0x0000000000418fa2 in do_action (h=0x7ffff3196cb0, a=0x7f516c2eb450, msg=0x7f516b6e7920) at action.c:1065 #12 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2eb450, msg=0x7f516b6e7920) at action.c:1573 #13 0x0000000000418ffb in do_action (h=0x7ffff3196cb0, a=0x7f516c2eb550, msg=0x7f516b6e7920) at action.c:1069 #14 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2c4170, msg=0x7f516b6e7920) at action.c:1573 #15 0x0000000000416f3a in do_action (h=0x7ffff3196cb0, a=0x7f516c2f5800, ---Type <return> to continue, or q <return> to quit--- msg=0x7f516b6e7920) at action.c:690 #16 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2ef330, msg=0x7f516b6e7920) at action.c:1573 #17 0x0000000000422231 in run_top_route (a=0x7f516c2ef330, msg=0x7f516b6e7920, c=0x0) at action.c:1658 #18 0x00007f516b49b220 in run_failure_handlers (t=0x7f506769a3b0, rpl=0x7f516c3b5df0, code=480, extra_flags=64) at t_reply.c:1024 #19 0x00007f516b49c39b in t_should_relay_response (Trans=0x7f506769a3b0, new_code=480, branch=0, should_store=0x7ffff3196f90, should_relay=0x7ffff3196f94, cancel_data=0x7ffff31971a0, reply=0x7f516c3b5df0) at t_reply.c:1300 #20 0x00007f516b49dec4 in relay_reply (t=0x7f506769a3b0, p_msg=0x7f516c3b5df0, branch=0, msg_status=480, cancel_data=0x7ffff31971a0, do_put_on_wait=1) at t_reply.c:1703 #21 0x00007f516b4a0f46 in reply_received (p_msg=0x7f516c3b5df0) at t_reply.c:2370 #22 0x0000000000458861 in do_forward_reply (msg=0x7f516c3b5df0, mode=0) at forward.c:799 #23 0x00000000004590d0 in forward_reply (msg=0x7f516c3b5df0) at forward.c:882 #24 0x000000000049e276 in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK25c.fe28da07.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1cc0-521da21e-332440e0-d482ab\r\nRecord-Route: <sip:6"..., len=862, rcv_info=0x7ffff3197520) at receive.c:272 #25 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #26 0x0000000000467de2 in main_loop () at main.c:1638 #27 0x000000000046ad8b in main (argc=13, argv=0x7ffff3197858) at main.c:2566
I've seen it before in this scenario, but so infrequently that I didn't think it was worth mentioning.
On 08/28/2013 03:01 AM, Alex Balashov wrote:
Hi Daniel,
With your patch applied (setting param list head to NULL), it now crashes in a different place:
Program terminated with signal 11, Segmentation fault. #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 827 foo = tp->next; Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64 glibc-2.12-1.107.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.2.x86_64 libcom_err-1.41.12-14.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 nspr-4.9.2-1.el6.x86_64 nss-3.14.0.0-12.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 nss-util-3.14.0.0-2.el6.x86_64 openldap-2.4.23-32.el6_4.1.x86_64 openssl-1.0.0-27.el6_4.2.x86_64 postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64 zlib-1.2.3-29.el6.x86_64 (gdb) where #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 #1 0x000000000055e658 in free_to (tb=0x7f31fee421a0) at parser/parse_to.c:838 #2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f31fee23bc0) at parser/hf.c:113 #3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f31fee20a60) at parser/hf.c:223 #4 0x0000000000542d04 in free_sip_msg (msg=0x7f31fee40df0) at parser/msg_parser.c:729 #5 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKbe3a.dab6345.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1a97-521d9f57-331967d3-3174bfdc\r\nRecord-Route: <sip"..., len=866, rcv_info=0x7fff34138bd0) at receive.c:296 #6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff34138f08) at main.c:2566
-- Alex
On 08/27/2013 08:49 AM, Alex Balashov wrote:
Hi Daniel,
On 08/27/2013 08:47 AM, Daniel-Constantin Mierla wrote:
Hello,
can you try this patch?
http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=14835f89...
One reason for such crash could be double-free, which could eventually happen because the pointer to params was not reset after freeing the list.
I will certainly try it, thank you.
However, it is curious that this crash occurs only in this exact situation, only when calling this PBX, only when it has two registrants to fork among, only when I use this combination of request routes/subroutines.
Could be a buffer overflow somewhere.
First, do a 'bt full' for both cores and send the output, just to see if something is strage inside the structures.
Then, can you compile with MEMDBG=1 in Makefile.defs, reinstall and test again. Check the logs for memory related error messages and see if you get head/tail overwritten.
Cheers, Daniel
On 8/28/13 9:11 AM, Alex Balashov wrote:
With the patch applied, I sometimes get this crash, too:
(gdb) where #0 0x0000000000539ed9 in qm_detach_free (qm=0x7f516c197010, frag=0x7f516c3b6dc0) at mem/q_malloc.c:268 #1 0x000000000053a118 in qm_malloc (qm=0x7f516c197010, size=960) at mem/q_malloc.c:386 #2 0x00000000004bc41d in rval_new_empty (extra_size=102) at rvalue.c:236 #3 0x00000000004bc48f in rval_new_str (s=0x7ffff3194e70, extra_size=80) at rvalue.c:260 #4 0x00000000004beb87 in rval_convert (h=0x7ffff3196cb0, msg=0x7f516b6e7920, type=RV_STR, v=0x7f516c2e3728, c=0x7ffff3195030) at rvalue.c:1321 #5 0x00000000004c002b in rval_str_lop2 (h=0x7ffff3196cb0, msg=0x7f516b6e7920, res=0x7ffff31954d8, op=RVE_EQ_OP, l=0x7f516c2e3728, c1=0x7ffff3195030, r=0x7f516c2e3ea8, c2=0x0) at rvalue.c:1752 #6 0x00000000004c0c61 in rval_expr_eval_int (h=0x7ffff3196cb0, msg=0x7f516b6e7920, res=0x7ffff31954d8, rve=0x7f516c2e4580) at rvalue.c:2058 #7 0x0000000000418d5a in do_action (h=0x7ffff3196cb0, a=0x7f516c2e59d0, msg=0x7f516b6e7920) at action.c:1050 #8 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2e3580, msg=0x7f516b6e7920) at action.c:1573 #9 0x000000000042047f in do_action (h=0x7ffff3196cb0, a=0x7f516c2e5d70, msg=0x7f516b6e7920) at action.c:1374 #10 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2daae0, msg=0x7f516b6e7920) at action.c:1573 #11 0x0000000000418fa2 in do_action (h=0x7ffff3196cb0, a=0x7f516c2eb450, msg=0x7f516b6e7920) at action.c:1065 #12 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2eb450, msg=0x7f516b6e7920) at action.c:1573 #13 0x0000000000418ffb in do_action (h=0x7ffff3196cb0, a=0x7f516c2eb550, msg=0x7f516b6e7920) at action.c:1069 #14 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2c4170, msg=0x7f516b6e7920) at action.c:1573 #15 0x0000000000416f3a in do_action (h=0x7ffff3196cb0, a=0x7f516c2f5800, ---Type <return> to continue, or q <return> to quit--- msg=0x7f516b6e7920) at action.c:690 #16 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2ef330, msg=0x7f516b6e7920) at action.c:1573 #17 0x0000000000422231 in run_top_route (a=0x7f516c2ef330, msg=0x7f516b6e7920, c=0x0) at action.c:1658 #18 0x00007f516b49b220 in run_failure_handlers (t=0x7f506769a3b0, rpl=0x7f516c3b5df0, code=480, extra_flags=64) at t_reply.c:1024 #19 0x00007f516b49c39b in t_should_relay_response (Trans=0x7f506769a3b0, new_code=480, branch=0, should_store=0x7ffff3196f90, should_relay=0x7ffff3196f94, cancel_data=0x7ffff31971a0, reply=0x7f516c3b5df0) at t_reply.c:1300 #20 0x00007f516b49dec4 in relay_reply (t=0x7f506769a3b0, p_msg=0x7f516c3b5df0, branch=0, msg_status=480, cancel_data=0x7ffff31971a0, do_put_on_wait=1) at t_reply.c:1703 #21 0x00007f516b4a0f46 in reply_received (p_msg=0x7f516c3b5df0) at t_reply.c:2370 #22 0x0000000000458861 in do_forward_reply (msg=0x7f516c3b5df0, mode=0) at forward.c:799 #23 0x00000000004590d0 in forward_reply (msg=0x7f516c3b5df0) at forward.c:882 #24 0x000000000049e276 in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK25c.fe28da07.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1cc0-521da21e-332440e0-d482ab\r\nRecord-Route: <sip:6"..., len=862, rcv_info=0x7ffff3197520) at receive.c:272 #25 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #26 0x0000000000467de2 in main_loop () at main.c:1638 #27 0x000000000046ad8b in main (argc=13, argv=0x7ffff3197858) at main.c:2566
I've seen it before in this scenario, but so infrequently that I didn't think it was worth mentioning.
On 08/28/2013 03:01 AM, Alex Balashov wrote:
Hi Daniel,
With your patch applied (setting param list head to NULL), it now crashes in a different place:
Program terminated with signal 11, Segmentation fault. #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 827 foo = tp->next; Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64 glibc-2.12-1.107.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.2.x86_64 libcom_err-1.41.12-14.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 nspr-4.9.2-1.el6.x86_64 nss-3.14.0.0-12.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 nss-util-3.14.0.0-2.el6.x86_64 openldap-2.4.23-32.el6_4.1.x86_64 openssl-1.0.0-27.el6_4.2.x86_64 postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64 zlib-1.2.3-29.el6.x86_64 (gdb) where #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 #1 0x000000000055e658 in free_to (tb=0x7f31fee421a0) at parser/parse_to.c:838 #2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f31fee23bc0) at parser/hf.c:113 #3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f31fee20a60) at parser/hf.c:223 #4 0x0000000000542d04 in free_sip_msg (msg=0x7f31fee40df0) at parser/msg_parser.c:729 #5 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKbe3a.dab6345.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1a97-521d9f57-331967d3-3174bfdc\r\nRecord-Route:
<sip"..., len=866, rcv_info=0x7fff34138bd0) at receive.c:296 #6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff34138f08) at main.c:2566
-- Alex
On 08/27/2013 08:49 AM, Alex Balashov wrote:
Hi Daniel,
On 08/27/2013 08:47 AM, Daniel-Constantin Mierla wrote:
Hello,
can you try this patch?
http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=14835f89...
One reason for such crash could be double-free, which could eventually happen because the pointer to params was not reset after freeing the list.
I will certainly try it, thank you.
However, it is curious that this crash occurs only in this exact situation, only when calling this PBX, only when it has two registrants to fork among, only when I use this combination of request routes/subroutines.
Hi Daniel,
Here is the 'bt full' from the first core:
I should add that this inbound routing scheme iterates through an array of AVPs within failure_routes. I wonder if there is some sort of buffer overflow issue there.
-- Alex
On 08/28/2013 03:31 AM, Daniel-Constantin Mierla wrote:
Could be a buffer overflow somewhere.
First, do a 'bt full' for both cores and send the output, just to see if something is strage inside the structures.
Then, can you compile with MEMDBG=1 in Makefile.defs, reinstall and test again. Check the logs for memory related error messages and see if you get head/tail overwritten.
Cheers, Daniel
On 8/28/13 9:11 AM, Alex Balashov wrote:
With the patch applied, I sometimes get this crash, too:
(gdb) where #0 0x0000000000539ed9 in qm_detach_free (qm=0x7f516c197010, frag=0x7f516c3b6dc0) at mem/q_malloc.c:268 #1 0x000000000053a118 in qm_malloc (qm=0x7f516c197010, size=960) at mem/q_malloc.c:386 #2 0x00000000004bc41d in rval_new_empty (extra_size=102) at rvalue.c:236 #3 0x00000000004bc48f in rval_new_str (s=0x7ffff3194e70, extra_size=80) at rvalue.c:260 #4 0x00000000004beb87 in rval_convert (h=0x7ffff3196cb0, msg=0x7f516b6e7920, type=RV_STR, v=0x7f516c2e3728, c=0x7ffff3195030) at rvalue.c:1321 #5 0x00000000004c002b in rval_str_lop2 (h=0x7ffff3196cb0, msg=0x7f516b6e7920, res=0x7ffff31954d8, op=RVE_EQ_OP, l=0x7f516c2e3728, c1=0x7ffff3195030, r=0x7f516c2e3ea8, c2=0x0) at rvalue.c:1752 #6 0x00000000004c0c61 in rval_expr_eval_int (h=0x7ffff3196cb0, msg=0x7f516b6e7920, res=0x7ffff31954d8, rve=0x7f516c2e4580) at rvalue.c:2058 #7 0x0000000000418d5a in do_action (h=0x7ffff3196cb0, a=0x7f516c2e59d0, msg=0x7f516b6e7920) at action.c:1050 #8 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2e3580, msg=0x7f516b6e7920) at action.c:1573 #9 0x000000000042047f in do_action (h=0x7ffff3196cb0, a=0x7f516c2e5d70, msg=0x7f516b6e7920) at action.c:1374 #10 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2daae0, msg=0x7f516b6e7920) at action.c:1573 #11 0x0000000000418fa2 in do_action (h=0x7ffff3196cb0, a=0x7f516c2eb450, msg=0x7f516b6e7920) at action.c:1065 #12 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2eb450, msg=0x7f516b6e7920) at action.c:1573 #13 0x0000000000418ffb in do_action (h=0x7ffff3196cb0, a=0x7f516c2eb550, msg=0x7f516b6e7920) at action.c:1069 #14 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2c4170, msg=0x7f516b6e7920) at action.c:1573 #15 0x0000000000416f3a in do_action (h=0x7ffff3196cb0, a=0x7f516c2f5800, ---Type <return> to continue, or q <return> to quit--- msg=0x7f516b6e7920) at action.c:690 #16 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2ef330, msg=0x7f516b6e7920) at action.c:1573 #17 0x0000000000422231 in run_top_route (a=0x7f516c2ef330, msg=0x7f516b6e7920, c=0x0) at action.c:1658 #18 0x00007f516b49b220 in run_failure_handlers (t=0x7f506769a3b0, rpl=0x7f516c3b5df0, code=480, extra_flags=64) at t_reply.c:1024 #19 0x00007f516b49c39b in t_should_relay_response (Trans=0x7f506769a3b0, new_code=480, branch=0, should_store=0x7ffff3196f90, should_relay=0x7ffff3196f94, cancel_data=0x7ffff31971a0, reply=0x7f516c3b5df0) at t_reply.c:1300 #20 0x00007f516b49dec4 in relay_reply (t=0x7f506769a3b0, p_msg=0x7f516c3b5df0, branch=0, msg_status=480, cancel_data=0x7ffff31971a0, do_put_on_wait=1) at t_reply.c:1703 #21 0x00007f516b4a0f46 in reply_received (p_msg=0x7f516c3b5df0) at t_reply.c:2370 #22 0x0000000000458861 in do_forward_reply (msg=0x7f516c3b5df0, mode=0) at forward.c:799 #23 0x00000000004590d0 in forward_reply (msg=0x7f516c3b5df0) at forward.c:882 #24 0x000000000049e276 in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK25c.fe28da07.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1cc0-521da21e-332440e0-d482ab\r\nRecord-Route: <sip:6"..., len=862, rcv_info=0x7ffff3197520) at receive.c:272 #25 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #26 0x0000000000467de2 in main_loop () at main.c:1638 #27 0x000000000046ad8b in main (argc=13, argv=0x7ffff3197858) at main.c:2566
I've seen it before in this scenario, but so infrequently that I didn't think it was worth mentioning.
On 08/28/2013 03:01 AM, Alex Balashov wrote:
Hi Daniel,
With your patch applied (setting param list head to NULL), it now crashes in a different place:
Program terminated with signal 11, Segmentation fault. #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 827 foo = tp->next; Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64 glibc-2.12-1.107.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.2.x86_64 libcom_err-1.41.12-14.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 nspr-4.9.2-1.el6.x86_64 nss-3.14.0.0-12.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 nss-util-3.14.0.0-2.el6.x86_64 openldap-2.4.23-32.el6_4.1.x86_64 openssl-1.0.0-27.el6_4.2.x86_64 postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64 zlib-1.2.3-29.el6.x86_64 (gdb) where #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 #1 0x000000000055e658 in free_to (tb=0x7f31fee421a0) at parser/parse_to.c:838 #2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f31fee23bc0) at parser/hf.c:113 #3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f31fee20a60) at parser/hf.c:223 #4 0x0000000000542d04 in free_sip_msg (msg=0x7f31fee40df0) at parser/msg_parser.c:729 #5 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKbe3a.dab6345.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1a97-521d9f57-331967d3-3174bfdc\r\nRecord-Route:
<sip"..., len=866, rcv_info=0x7fff34138bd0) at receive.c:296 #6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff34138f08) at main.c:2566
-- Alex
On 08/27/2013 08:49 AM, Alex Balashov wrote:
Hi Daniel,
On 08/27/2013 08:47 AM, Daniel-Constantin Mierla wrote:
Hello,
can you try this patch?
http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=14835f89...
One reason for such crash could be double-free, which could eventually happen because the pointer to params was not reset after freeing the list.
I will certainly try it, thank you.
However, it is curious that this crash occurs only in this exact situation, only when calling this PBX, only when it has two registrants to fork among, only when I use this combination of request routes/subroutines.
Hello,
one more thing, in frame 0, do:
p *prev
Cheers, Daniel
On 8/28/13 9:40 AM, Alex Balashov wrote:
Hi Daniel,
Here is the 'bt full' from the first core:
I should add that this inbound routing scheme iterates through an array of AVPs within failure_routes. I wonder if there is some sort of buffer overflow issue there.
-- Alex
On 08/28/2013 03:31 AM, Daniel-Constantin Mierla wrote:
Could be a buffer overflow somewhere.
First, do a 'bt full' for both cores and send the output, just to see if something is strage inside the structures.
Then, can you compile with MEMDBG=1 in Makefile.defs, reinstall and test again. Check the logs for memory related error messages and see if you get head/tail overwritten.
Cheers, Daniel
On 8/28/13 9:11 AM, Alex Balashov wrote:
With the patch applied, I sometimes get this crash, too:
(gdb) where #0 0x0000000000539ed9 in qm_detach_free (qm=0x7f516c197010, frag=0x7f516c3b6dc0) at mem/q_malloc.c:268 #1 0x000000000053a118 in qm_malloc (qm=0x7f516c197010, size=960) at mem/q_malloc.c:386 #2 0x00000000004bc41d in rval_new_empty (extra_size=102) at rvalue.c:236 #3 0x00000000004bc48f in rval_new_str (s=0x7ffff3194e70, extra_size=80) at rvalue.c:260 #4 0x00000000004beb87 in rval_convert (h=0x7ffff3196cb0, msg=0x7f516b6e7920, type=RV_STR, v=0x7f516c2e3728, c=0x7ffff3195030) at rvalue.c:1321 #5 0x00000000004c002b in rval_str_lop2 (h=0x7ffff3196cb0, msg=0x7f516b6e7920, res=0x7ffff31954d8, op=RVE_EQ_OP, l=0x7f516c2e3728, c1=0x7ffff3195030, r=0x7f516c2e3ea8, c2=0x0) at rvalue.c:1752 #6 0x00000000004c0c61 in rval_expr_eval_int (h=0x7ffff3196cb0, msg=0x7f516b6e7920, res=0x7ffff31954d8, rve=0x7f516c2e4580) at rvalue.c:2058 #7 0x0000000000418d5a in do_action (h=0x7ffff3196cb0, a=0x7f516c2e59d0, msg=0x7f516b6e7920) at action.c:1050 #8 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2e3580, msg=0x7f516b6e7920) at action.c:1573 #9 0x000000000042047f in do_action (h=0x7ffff3196cb0, a=0x7f516c2e5d70, msg=0x7f516b6e7920) at action.c:1374 #10 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2daae0, msg=0x7f516b6e7920) at action.c:1573 #11 0x0000000000418fa2 in do_action (h=0x7ffff3196cb0, a=0x7f516c2eb450, msg=0x7f516b6e7920) at action.c:1065 #12 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2eb450, msg=0x7f516b6e7920) at action.c:1573 #13 0x0000000000418ffb in do_action (h=0x7ffff3196cb0, a=0x7f516c2eb550, msg=0x7f516b6e7920) at action.c:1069 #14 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2c4170, msg=0x7f516b6e7920) at action.c:1573 #15 0x0000000000416f3a in do_action (h=0x7ffff3196cb0, a=0x7f516c2f5800, ---Type <return> to continue, or q <return> to quit--- msg=0x7f516b6e7920) at action.c:690 #16 0x0000000000421aa7 in run_actions (h=0x7ffff3196cb0, a=0x7f516c2ef330, msg=0x7f516b6e7920) at action.c:1573 #17 0x0000000000422231 in run_top_route (a=0x7f516c2ef330, msg=0x7f516b6e7920, c=0x0) at action.c:1658 #18 0x00007f516b49b220 in run_failure_handlers (t=0x7f506769a3b0, rpl=0x7f516c3b5df0, code=480, extra_flags=64) at t_reply.c:1024 #19 0x00007f516b49c39b in t_should_relay_response (Trans=0x7f506769a3b0, new_code=480, branch=0, should_store=0x7ffff3196f90, should_relay=0x7ffff3196f94, cancel_data=0x7ffff31971a0, reply=0x7f516c3b5df0) at t_reply.c:1300 #20 0x00007f516b49dec4 in relay_reply (t=0x7f506769a3b0, p_msg=0x7f516c3b5df0, branch=0, msg_status=480, cancel_data=0x7ffff31971a0, do_put_on_wait=1) at t_reply.c:1703 #21 0x00007f516b4a0f46 in reply_received (p_msg=0x7f516c3b5df0) at t_reply.c:2370 #22 0x0000000000458861 in do_forward_reply (msg=0x7f516c3b5df0, mode=0) at forward.c:799 #23 0x00000000004590d0 in forward_reply (msg=0x7f516c3b5df0) at forward.c:882 #24 0x000000000049e276 in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK25c.fe28da07.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1cc0-521da21e-332440e0-d482ab\r\nRecord-Route:
<sip:6"..., len=862, rcv_info=0x7ffff3197520) at receive.c:272 #25 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #26 0x0000000000467de2 in main_loop () at main.c:1638 #27 0x000000000046ad8b in main (argc=13, argv=0x7ffff3197858) at main.c:2566
I've seen it before in this scenario, but so infrequently that I didn't think it was worth mentioning.
On 08/28/2013 03:01 AM, Alex Balashov wrote:
Hi Daniel,
With your patch applied (setting param list head to NULL), it now crashes in a different place:
Program terminated with signal 11, Segmentation fault. #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 827 foo = tp->next; Missing separate debuginfos, use: debuginfo-install cyrus-sasl-lib-2.1.23-13.el6_3.1.x86_64 glibc-2.12-1.107.el6.x86_64 keyutils-libs-1.4-4.el6.x86_64 krb5-libs-1.10.3-10.el6_4.2.x86_64 libcom_err-1.41.12-14.el6.x86_64 libselinux-2.0.94-5.3.el6_4.1.x86_64 nspr-4.9.2-1.el6.x86_64 nss-3.14.0.0-12.el6.x86_64 nss-softokn-freebl-3.12.9-11.el6.x86_64 nss-util-3.14.0.0-2.el6.x86_64 openldap-2.4.23-32.el6_4.1.x86_64 openssl-1.0.0-27.el6_4.2.x86_64 postgresql92-libs-9.2.4-1PGDG.rhel6.x86_64 zlib-1.2.3-29.el6.x86_64 (gdb) where #0 0x000000000055e602 in free_to_params (tb=0x7f31fee421a0) at parser/parse_to.c:827 #1 0x000000000055e658 in free_to (tb=0x7f31fee421a0) at parser/parse_to.c:838 #2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f31fee23bc0) at parser/hf.c:113 #3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f31fee20a60) at parser/hf.c:223 #4 0x0000000000542d04 in free_sip_msg (msg=0x7f31fee40df0) at parser/msg_parser.c:729 #5 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKbe3a.dab6345.0\r\nVia: SIP/2.0/UDP 192.13.219.87:5060;branch=z9hG4bK-1a97-521d9f57-331967d3-3174bfdc\r\nRecord-Route:
<sip"..., len=866, rcv_info=0x7fff34138bd0) at receive.c:296 #6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff34138f08) at main.c:2566
-- Alex
On 08/27/2013 08:49 AM, Alex Balashov wrote:
Hi Daniel,
On 08/27/2013 08:47 AM, Daniel-Constantin Mierla wrote:
Hello,
can you try this patch?
http://git.sip-router.org/cgi-bin/gitweb.cgi/sip-router/?a=commit;h=14835f89...
One reason for such crash could be double-free, which could eventually happen because the pointer to params was not reset after freeing the list.
I will certainly try it, thank you.
However, it is curious that this crash occurs only in this exact situation, only when calling this PBX, only when it has two registrants to fork among, only when I use this combination of request routes/subroutines.
On 08/28/2013 03:43 AM, Daniel-Constantin Mierla wrote:
Hello,
one more thing, in frame 0, do:
p *prev
In the core dump whose 'bt full' output I put into pastebin, right?
In the case of this crash (with the head/tail message with MEMDBG=1), the backtrace was a bit more conventional:
(gdb) where #0 0x0000003dbae328a5 in raise () from /lib64/libc.so.6 #1 0x0000003dbae34085 in abort () from /lib64/libc.so.6 #2 0x000000000053c4c1 in qm_debug_frag (qm=0x7f99b6e80010, f=0x7f99b713d008) at mem/q_malloc.c:161 #3 0x000000000053de76 in qm_free (qm=0x7f99b6e80010, p=0x7f99b713d038, file=0x6165b1 "<core>: parser/parse_to.c", func=0x617e40 "free_to", line=839) at mem/q_malloc.c:462 #4 0x00000000005657fd in free_to (tb=0x7f99b713d038) at parser/parse_to.c:839 #5 0x0000000000544ee5 in clean_hdr_field (hf=0x7f99b6eea038) at parser/hf.c:113 #6 0x000000000054515a in free_hdr_field_lst (hf=0x7f99b6ee94f0) at parser/hf.c:223 #7 0x00000000005499e5 in free_sip_msg (msg=0x7f99b713b778) at parser/msg_parser.c:729 #8 0x000000000049f89d in receive_msg ( buf=0x910e20 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK5d98.917ccf34.0\r\nVia: SIP/2.0/UDP 208.94.157.10:5060;branch=z9hG4bK-2d1a-521da9d0-89ce197-48a5d43\r\nRecord-Route: <sip:"..., len=860, rcv_info=0x7fffd1c69db0) at receive.c:296 #9 0x0000000000532665 in udp_rcv_loop () at udp_server.c:557 #10 0x00000000004688a1 in main_loop () at main.c:1638 #11 0x000000000046b84a in main (argc=13, argv=0x7fffd1c6a0e8) at main.c:2566
I did wander into frame 3 here and printed the dereferenced value of 'prev' as requested:
(gdb) frame 3 #3 0x000000000053de76 in qm_free (qm=0x7f99b6e80010, p=0x7f99b713d038, file=0x6165b1 "<core>: parser/parse_to.c", func=0x617e40 "free_to", line=839) at mem/q_malloc.c:462 462 qm_debug_frag(qm, f); (gdb) print *prev $1 = {size = 16160473784116415304, u = {nxt_free = 0x48bf7500e07d8348, is_free = 5242037137909318472}, file = 0x1c880c748d8458b <Address 0x1c880c748d8458b out of bounds>, func = 0x8348000000000000 <Address 0x8348000000000000 out of bounds>, line = 9892250880904472772, check = 9892183985613198309}
-- Alex
As expected, a buffer overflow somewhere.
Look in logs for a message like:
BUG: qm_*: prev. fragm. tail overwritten ...
and give also the output of 'bt full'.
Cheers, Daniel
On 8/28/13 9:47 AM, Alex Balashov wrote:
On 08/28/2013 03:43 AM, Daniel-Constantin Mierla wrote:
Hello,
one more thing, in frame 0, do:
p *prev
In the core dump whose 'bt full' output I put into pastebin, right?
In the case of this crash (with the head/tail message with MEMDBG=1), the backtrace was a bit more conventional:
(gdb) where #0 0x0000003dbae328a5 in raise () from /lib64/libc.so.6 #1 0x0000003dbae34085 in abort () from /lib64/libc.so.6 #2 0x000000000053c4c1 in qm_debug_frag (qm=0x7f99b6e80010, f=0x7f99b713d008) at mem/q_malloc.c:161 #3 0x000000000053de76 in qm_free (qm=0x7f99b6e80010, p=0x7f99b713d038, file=0x6165b1 "<core>: parser/parse_to.c", func=0x617e40 "free_to", line=839) at mem/q_malloc.c:462 #4 0x00000000005657fd in free_to (tb=0x7f99b713d038) at parser/parse_to.c:839 #5 0x0000000000544ee5 in clean_hdr_field (hf=0x7f99b6eea038) at parser/hf.c:113 #6 0x000000000054515a in free_hdr_field_lst (hf=0x7f99b6ee94f0) at parser/hf.c:223 #7 0x00000000005499e5 in free_sip_msg (msg=0x7f99b713b778) at parser/msg_parser.c:729 #8 0x000000000049f89d in receive_msg ( buf=0x910e20 "SIP/2.0 480 Temporarily Unavailable\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK5d98.917ccf34.0\r\nVia: SIP/2.0/UDP 208.94.157.10:5060;branch=z9hG4bK-2d1a-521da9d0-89ce197-48a5d43\r\nRecord-Route: <sip:"..., len=860, rcv_info=0x7fffd1c69db0) at receive.c:296 #9 0x0000000000532665 in udp_rcv_loop () at udp_server.c:557 #10 0x00000000004688a1 in main_loop () at main.c:1638 #11 0x000000000046b84a in main (argc=13, argv=0x7fffd1c6a0e8) at main.c:2566
I did wander into frame 3 here and printed the dereferenced value of 'prev' as requested:
(gdb) frame 3 #3 0x000000000053de76 in qm_free (qm=0x7f99b6e80010, p=0x7f99b713d038, file=0x6165b1 "<core>: parser/parse_to.c", func=0x617e40 "free_to", line=839) at mem/q_malloc.c:462 462 qm_debug_frag(qm, f); (gdb) print *prev $1 = {size = 16160473784116415304, u = {nxt_free = 0x48bf7500e07d8348, is_free = 5242037137909318472}, file = 0x1c880c748d8458b <Address 0x1c880c748d8458b out of bounds>, func = 0x8348000000000000 <Address 0x8348000000000000 out of bounds>, line = 9892250880904472772, check = 9892183985613198309}
-- Alex
On 08/28/2013 03:55 AM, Daniel-Constantin Mierla wrote:
As expected, a buffer overflow somewhere.
Look in logs for a message like:
BUG: qm_*: prev. fragm. tail overwritten ...
and give also the output of 'bt full'.
Here's the 'bt full' from the core dump whose backtrace I last pasted (the one that crashes in free_to()) and to which the head/tail overwritten message corresponds:
-- Alex
Where are the log messages? They are important too.
Daniel
On 8/28/13 10:00 AM, Alex Balashov wrote:
On 08/28/2013 03:55 AM, Daniel-Constantin Mierla wrote:
As expected, a buffer overflow somewhere.
Look in logs for a message like:
BUG: qm_*: prev. fragm. tail overwritten ...
and give also the output of 'bt full'.
Here's the 'bt full' from the core dump whose backtrace I last pasted (the one that crashes in free_to()) and to which the head/tail overwritten message corresponds:
-- Alex
On 08/28/2013 04:03 AM, Daniel-Constantin Mierla wrote:
Where are the log messages? They are important too.
The only log message I got from Kamailio itself was:
Aug 28 03:42:09 gw1 /usr/local/sbin/kamailio[7913]: : <core> [mem/q_malloc.c:159]: qm_debug_frag(): BUG: qm_*: prev. fragm. tail overwritten(3630643536313a64, 2d303634622d3965 [0x7f99b713d008:0x7f99b713d038]!
Can you set memdbg and memlog lower than debug and reproduce the case again? That results in logging all memory operations, as I need to know what part of code allocated previous fragment, because it does a overflow write, going over the head of next fragment, thus corrupting it structure.
All logs from start to stop are required in this case, so it can be quite big file. You can email directly to me or provide some temporary access to troubleshoot the system (if that is a more convenient option).
Cheers, Daniel
On 8/28/13 10:04 AM, Alex Balashov wrote:
On 08/28/2013 04:03 AM, Daniel-Constantin Mierla wrote:
Where are the log messages? They are important too.
The only log message I got from Kamailio itself was:
Aug 28 03:42:09 gw1 /usr/local/sbin/kamailio[7913]: : <core> [mem/q_malloc.c:159]: qm_debug_frag(): BUG: qm_*: prev. fragm. tail overwritten(3630643536313a64, 2d303634622d3965 [0x7f99b713d008:0x7f99b713d038]!
Daniel,
On 08/28/2013 05:26 AM, Daniel-Constantin Mierla wrote:
Can you set memdbg and memlog lower than debug and reproduce the case again? That results in logging all memory operations, as I need to know what part of code allocated previous fragment, because it does a overflow write, going over the head of next fragment, thus corrupting it structure.
All logs from start to stop are required in this case, so it can be quite big file. You can email directly to me or provide some temporary access to troubleshoot the system (if that is a more convenient option).
In the instances I provided you, I had MEMDBG set to 1, and debug set to 2. Should I set to debug level 4, or is 3 sufficient?
Might be enough, you can send me the log and if I need more I'll ask afterwards.
Daniel
On 8/28/13 11:45 AM, Alex Balashov wrote:
Daniel,
On 08/28/2013 05:26 AM, Daniel-Constantin Mierla wrote:
Can you set memdbg and memlog lower than debug and reproduce the case again? That results in logging all memory operations, as I need to know what part of code allocated previous fragment, because it does a overflow write, going over the head of next fragment, thus corrupting it structure.
All logs from start to stop are required in this case, so it can be quite big file. You can email directly to me or provide some temporary access to troubleshoot the system (if that is a more convenient option).
In the instances I provided you, I had MEMDBG set to 1, and debug set to 2. Should I set to debug level 4, or is 3 sufficient?
Daniel,
Sorry that I haven't provided these logs yet. I'm working on it. There are only limited hours during a maintenance window to reproduce the problem freely.
One thing I can say is that the problem is definitely tied to the 480 Temporarily Unavailable sent from the PBX somehow. If I repoint that DID to my own Asterisk 1.6.x PBX and send some other feedback, OR send a 480 (like Do Not Disturb from a handset), the crash does not present.
I've managed to reproduce this scenario in a more general case. Even when I disable my "problematic" route sequence, a 404 Not Found response from this same PBX causes Kamailio to crash:
(gdb) frame 2 #2 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK022f.7350a5c7.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK0cBc7d5138cbe6ccf3f\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=gK0c0e4941;vsf=", 'A' <repeats 38 times>, "--;dlgcor=fae.b321>\r\nFrom: sip:16784841111@68.68.120.41;tag=gK0c0e4941\r\nTo: sip:12066513489@55.177.31.199;tag=6011e34882\r\nCall-ID: 1963749434_98793621@68.68.120.41\r\nCSeq: 4418 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=717, rcv_info=0x7fff96469750) at receive.c:296 296 free_sip_msg(msg);
This time, it's not in dealing with the To bits, but freeing the SIP message in general:
(gdb) where #0 0x0000000000435908 in free_lump_list (l=0x7f4f81626730) at data_lump.c:510 #1 0x0000000000542d6a in free_sip_msg (msg=0x7f4f8180e970) at parser/msg_parser.c:731 #2 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK022f.7350a5c7.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK0cBc7d5138cbe6ccf3f\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=gK0c0e4941;vsf=", 'A' <repeats 38 times>, "--;dlgcor=fae.b321>\r\nFrom: sip:16784841111@68.68.120.41;tag=gK0c0e4941\r\nTo: sip:12066513489@55.177.31.199;tag=6011e34882\r\nCall-ID: 1963749434_98793621@68.68.120.41\r\nCSeq: 4418 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=717, rcv_info=0x7fff96469750) at receive.c:296 #3 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #4 0x0000000000467de2 in main_loop () at main.c:1638 #5 0x000000000046ad8b in main (argc=13, argv=0x7fff96469a88) at main.c:2566
That leads me to believe that something in these replies from the PBX in general is causing memory corruption.
But, sometimes I get this crash (in the same scenario as below), too:
(gdb) where #0 0x000000000055e602 in free_to_params (tb=0x7f95ab472950) at parser/parse_to.c:827 #1 0x000000000055e658 in free_to (tb=0x7f95ab472950) at parser/parse_to.c:838 #2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f95ab4722a0) at parser/hf.c:113 #3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f95ab46f1c0) at parser/hf.c:223 #4 0x0000000000542d04 in free_sip_msg (msg=0x7f95ab471970) at parser/msg_parser.c:729 #5 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKa744.4c8811f1.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK02B15f46caff804796d\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=g"..., len=715, rcv_info=0x7fff05e5dbc0) at receive.c:296 #6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff05e5def8) at main.c:2566
-- Alex
On 09/05/2013 06:25 AM, Alex Balashov wrote:
I've managed to reproduce this scenario in a more general case. Even when I disable my "problematic" route sequence, a 404 Not Found response from this same PBX causes Kamailio to crash:
(gdb) frame 2 #2 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK022f.7350a5c7.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK0cBc7d5138cbe6ccf3f\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=gK0c0e4941;vsf=", 'A' <repeats 38 times>, "--;dlgcor=fae.b321>\r\nFrom: sip:16784841111@68.68.120.41;tag=gK0c0e4941\r\nTo: sip:12066513489@55.177.31.199;tag=6011e34882\r\nCall-ID: 1963749434_98793621@68.68.120.41\r\nCSeq: 4418 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=717, rcv_info=0x7fff96469750) at receive.c:296 296 free_sip_msg(msg);
This time, it's not in dealing with the To bits, but freeing the SIP message in general:
(gdb) where #0 0x0000000000435908 in free_lump_list (l=0x7f4f81626730) at data_lump.c:510 #1 0x0000000000542d6a in free_sip_msg (msg=0x7f4f8180e970) at parser/msg_parser.c:731 #2 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK022f.7350a5c7.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK0cBc7d5138cbe6ccf3f\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=gK0c0e4941;vsf=", 'A' <repeats 38 times>, "--;dlgcor=fae.b321>\r\nFrom: sip:16784841111@68.68.120.41;tag=gK0c0e4941\r\nTo: sip:12066513489@55.177.31.199;tag=6011e34882\r\nCall-ID: 1963749434_98793621@68.68.120.41\r\nCSeq: 4418 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=717, rcv_info=0x7fff96469750) at receive.c:296 #3 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #4 0x0000000000467de2 in main_loop () at main.c:1638 #5 0x000000000046ad8b in main (argc=13, argv=0x7fff96469a88) at main.c:2566
That leads me to believe that something in these replies from the PBX in general is causing memory corruption.
On 09/05/2013 06:27 AM, Alex Balashov wrote:
But, sometimes I get this crash (in the same scenario as below), too:
(gdb) where #0 0x000000000055e602 in free_to_params (tb=0x7f95ab472950) at parser/parse_to.c:827 #1 0x000000000055e658 in free_to (tb=0x7f95ab472950) at parser/parse_to.c:838 #2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f95ab4722a0) at parser/hf.c:113 #3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f95ab46f1c0) at parser/hf.c:223 #4 0x0000000000542d04 in free_sip_msg (msg=0x7f95ab471970) at parser/msg_parser.c:729 #5 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKa744.4c8811f1.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK02B15f46caff804796d\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=g"..., len=715, rcv_info=0x7fff05e5dbc0) at receive.c:296 #6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff05e5def8) at main.c:2566
This is the crash I get ~80-90% of the time, though. This leads me to believe that the To params probably have more to do with it than anything else, unless the structure of the memory corruption is such that it just happens to explode there.
Don't forget that what I actually asked are the log messages with memory operations. It was an indication of buffer overflow.
Cheers, Daniel
On 9/5/13 12:36 PM, Alex Balashov wrote:
On 09/05/2013 06:27 AM, Alex Balashov wrote:
But, sometimes I get this crash (in the same scenario as below), too:
(gdb) where #0 0x000000000055e602 in free_to_params (tb=0x7f95ab472950) at parser/parse_to.c:827 #1 0x000000000055e658 in free_to (tb=0x7f95ab472950) at parser/parse_to.c:838 #2 0x000000000053e2a9 in clean_hdr_field (hf=0x7f95ab4722a0) at parser/hf.c:113 #3 0x000000000053e51d in free_hdr_field_lst (hf=0x7f95ab46f1c0) at parser/hf.c:223 #4 0x0000000000542d04 in free_sip_msg (msg=0x7f95ab471970) at parser/msg_parser.c:729 #5 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bKa744.4c8811f1.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK02B15f46caff804796d\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=g"..., len=715, rcv_info=0x7fff05e5dbc0) at receive.c:296 #6 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 #7 0x0000000000467de2 in main_loop () at main.c:1638 #8 0x000000000046ad8b in main (argc=13, argv=0x7fff05e5def8) at main.c:2566
This is the crash I get ~80-90% of the time, though. This leads me to believe that the To params probably have more to do with it than anything else, unless the structure of the memory corruption is such that it just happens to explode there.
On 09/05/2013 07:00 AM, Daniel-Constantin Mierla wrote:
Don't forget that what I actually asked are the log messages with memory operations. It was an indication of buffer overflow.
Yeah, I know. They're just hard to get in this situation. I will do my best to get them to you ASAP.
Hello Daniel,
I sent you the ultra-high-verbosity memory log privately, since it's 95 MB (uncompressed). I also sent you the backtrace and the full backtrace, but here they are again for edification:
sasha@tambourine:~$ cat x2 | sed 's/208.94.159.10/192.13.219.87/g' | sed 's/6789540671/6784841111/g' | sed 's/192.168.47.72/175.154.21.89/g' | sed 's/67.231.4.102/12.131.2.144/g' | sed 's/65.254.44.194/55.177.31.199/g' | sed 's/9189211710/6504578199/g' | sed 's/184.178.234.121/195.105.225.111/g' (gdb) where #0 0x0000003dbae328a5 in raise () from /lib64/libc.so.6 #1 0x0000003dbae34085 in abort () from /lib64/libc.so.6 #2 0x000000000053c112 in qm_debug_frag (qm=0x7fa2e71d2010, f=0x7fa2e7484060) at mem/q_malloc.c:142 #3 0x000000000053de76 in qm_free (qm=0x7fa2e71d2010, p=0x7fa2e7484090, file=0x6165b1 "<core>: parser/parse_to.c", func=0x617e48 "free_to_params", line=828) at mem/q_malloc.c:462 #4 0x0000000000565799 in free_to_params (tb=0x7fa2e7456b70) at parser/parse_to.c:828 #5 0x00000000005657d7 in free_to (tb=0x7fa2e7456b70) at parser/parse_to.c:838 #6 0x0000000000544fef in clean_hdr_field (hf=0x7fa2e723c030) at parser/hf.c:169 #7 0x000000000054515a in free_hdr_field_lst (hf=0x7fa2e723bbe0) at parser/hf.c:223 #8 0x00000000005499e5 in free_sip_msg (msg=0x7fa2e74824a8) at parser/msg_parser.c:729 #9 0x000000000049f89d in receive_msg ( buf=0x910e20 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK11f7.06fc4f95.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK04B41bd5b77895524b3\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=g"..., len=716, rcv_info=0x7fffd0c14960) at receive.c:296 #10 0x0000000000532665 in udp_rcv_loop () at udp_server.c:557 #11 0x00000000004688a1 in main_loop () at main.c:1638 #12 0x000000000046b84a in main (argc=13, argv=0x7fffd0c14c98) at main.c:2566
#0 0x0000003dbae328a5 in raise () from /lib64/libc.so.6 No symbol table info available. #1 0x0000003dbae34085 in abort () from /lib64/libc.so.6 No symbol table info available. #2 0x000000000053c112 in qm_debug_frag (qm=0x7fa2e71d2010, f=0x7fa2e7484060) at mem/q_malloc.c:142 __FUNCTION__ = "qm_debug_frag" #3 0x000000000053de76 in qm_free (qm=0x7fa2e71d2010, p=0x7fa2e7484090, file=0x6165b1 "<core>: parser/parse_to.c", func=0x617e48 "free_to_params", line=828) at mem/q_malloc.c:462 f = 0x7fa2e7484060 size = 2825200 next = 0x7fa2e723ac58 prev = 0x157 __FUNCTION__ = "qm_free" #4 0x0000000000565799 in free_to_params (tb=0x7fa2e7456b70) at parser/parse_to.c:828 tp = 0x7fa2e7484090 foo = 0x0 __FUNCTION__ = "free_to_params" #5 0x00000000005657d7 in free_to (tb=0x7fa2e7456b70) at parser/parse_to.c:838 __FUNCTION__ = "free_to" #6 0x0000000000544fef in clean_hdr_field (hf=0x7fa2e723c030) at parser/hf.c:169 h_parsed = 0x7fa2e723c060 __FUNCTION__ = "clean_hdr_field" #7 0x000000000054515a in free_hdr_field_lst (hf=0x7fa2e723bbe0) at parser/hf.c:223 foo = 0x7fa2e723c030 __FUNCTION__ = "free_hdr_field_lst" #8 0x00000000005499e5 in free_sip_msg (msg=0x7fa2e74824a8) at parser/msg_parser.c:729 __FUNCTION__ = "free_sip_msg" #9 0x000000000049f89d in receive_msg ( buf=0x910e20 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK11f7.06fc4f95.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK04B41bd5b77895524b3\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=gK045cb2b5;vsf=", 'A' <repeats 38 times>, "--;dlgcor=48a.e5>\r\nFrom: sip:16784841111@68.68.120.41;tag=gK045cb2b5\r\nTo: sip:12066513489@55.177.31.199;tag=8200ff18e3\r\nCall-ID: 184818744_124682822@68.68.120.41\r\nCSeq: 25781 INVITE\r\nContact: <sip:528---Type <return> to continue, or q <return> to quit--- 2797646@195.105.225.111:5060;transport=udp>\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=716, rcv_info=0x7fffd0c14960) at receive.c:296 msg = 0x7fa2e74824a8 ctx = {rec_lev = 8793288, run_flags = 0, last_retcode = 0, jmp_env = {{ __jmpbuf = {0, 0, 0, 265124110288, 1, 0, 140337638875928, 1}, __mask_was_saved = -508184336, __saved_mask = {__val = { 140337641517152, 3502328080, 1024, 7797295296, 140333253245168, 140736695716032, 5433321, 0, 140333253245168, 44055, 140333253245168, 140736695716296, 8093911720, 140736695716112, 5433449, 7}}}}} ret = 32674 inb = { s = 0x910e20 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK11f7.06fc4f95.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK04B41bd5b77895524b3\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=gK045cb2b5;vsf=", 'A' <repeats 38 times>, "--;dlgcor=48a.e5>\r\nFrom: sip:16784841111@68.68.120.41;tag=gK045cb2b5\r\nTo: sip:12066513489@55.177.31.199;tag=8200ff18e3\r\nCall-ID: 184818744_124682822@68.68.120.41\r\nCSeq: 25781 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len = 716} __FUNCTION__ = "receive_msg" #10 0x0000000000532665 in udp_rcv_loop () at udp_server.c:557 len = 716 buf = "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK11f7.06fc4f95.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK04B41bd5b77895524b3\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=gK045cb2b5;vsf=", 'A' <repeats 38 times>, "--;dlgcor=48a.e5>\r\nFrom: sip:16784841111@68.68.120.41;tag=gK045cb2b5\r\nTo: sip:12066513489@55.177.31.199;tag=8200ff18e3\r\nCall-ID: 184818744_124682822@68.68.120.41\r\nCSeq: 25781 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n\000dp\r\n\r\nv=0\r\no=Sonus_UAC 828 24583 IN IP4 68.68.120.41\r\ns=SIP Media Capabilities\r\nc=IN IP4 68.68.120.35\r\nt=0 0\r\nm=audio 28632 RTP/AVP 0 8 101 13\r\na=rtpmap:0 PCMU/8000\r\na=rtpmap:8 PCMA/8000\r\na=rtpmap:101 telephone-event/8000\r\na=fmtp:101 0-15\r\na=rtpmap:13 CN/8000\r\na=sendrecv\r\na=ptime:20\r\na=rtcp:28633\n", '\000' <repeats 64520 times> tmp = 0x5d0648 "udp receiver child=%d sock=%s:%s" from = 0x7fa2e7445908 fromlen = 16 ri = {src_ip = {af = 2, len = 4, u = {addrl = {2045424312, 140337641517152}, addr32 = {2045424312, 0, 3880087648, 32674}, addr16 = {45752, 31210, 0, 0, 28768, 59205, 32674, 0}, addr = "\270\262\352y\000\000\000\000`pE\347\242\177\000"}}, dst_ip = {af = 2, len = 4, u = {addrl = {3257728577, 0}, addr32 = { 3257728577, 0, 0, 0}, addr16 = {65089, 49708, 0, 0, 0, 0, 0, 0}, addr = "A\376,\302", '\000' <repeats 11 times>}}, src_port = 5060, dst_port = 5060, proto_reserved1 = 0, proto_reserved2 = 0, src_su = {s = {sa_family = 2, sa_data = "\023ĸ\262\352y\000\000\000\000\000\000\000"}, sin = { sin_family = 2, sin_port = 50195, sin_addr = { s_addr = 2045424312}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = { sin6_family = 2, sin6_port = 50195, sin6_flowinfo = 2045424312, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}, bind_address = 0x7fa2e7456ef8, proto = 1 '\001'} __FUNCTION__ = "udp_rcv_loop" #11 0x00000000004688a1 in main_loop () at main.c:1638 i = 7 pid = 0 si = 0x7fa2e7456ef8 si_desc = "udp receiver child=7 sock=55.177.31.199:5060\000\177\000\000\200\261D\347\242\177\000\000\026\000\000\000\000\000\000\000\360v]\000\000\000\000\000\207\000\000\000\000\000\000\000pDA\000\000\000\000\000\220L\301\320\377\177", '\000' <repeats 18 times>"\320, J\301\320\377\177\000\000d;K\000\000\000\000" nrprocs = 8 __FUNCTION__ = "main_loop" #12 0x000000000046b84a in main (argc=13, argv=0x7fffd0c14c98) at main.c:2566 cfg_stream = 0x13b6010 c = -1 r = 0 tmp_len = 0 port = 0 proto = 0 options = 0x5d0a40 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:" ret = -1 seed = 3436287552 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x3dbae0fb88 p = 0x5bb480 "H\211l$\330L\211d$\340H\215-\273s*" __FUNCTION__ = "main"
Here's the head/tail overwritten message:
Sep 6 04:09:28 gw1 /usr/local/sbin/kamailio[18694]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7fa2e7484060 (address 0x7fa2e7484090) beginning overwritten(6975753a6e72753c)!
Grepping for the fragment address:
Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 168) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=320) on 1 -th hit Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from sqlops: sql_api.c: sql_do_query(315) Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 320) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=320) on 1 -th hit Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from <core>: db_row.c: db_allocate_row(114) Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from <core>: db_res.c: db_allocate_rows(179) Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 24) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from <core>: db_val.c: db_str2val(122) Sep 6 04:09:19 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:19 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from <core>: parser/msg_parser.c: get_hdr_field(152) Sep 6 04:09:22 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:22 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from <core>: parser/msg_parser.c: get_hdr_field(152) Sep 6 04:09:24 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:24 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from <core>: parser/msg_parser.c: get_hdr_field(152) Sep 6 04:09:25 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:25 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from <core>: parser/msg_parser.c: get_hdr_field(152) Sep 6 04:09:27 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:27 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7fa2e7484060 alloc'ed from <core>: parser/msg_parser.c: get_hdr_field(152) Sep 6 04:09:28 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:28 gw1 /usr/local/sbin/kamailio[18694]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7fa2e7484060 (address 0x7fa2e7484090) beginning overwritten(6975753a6e72753c)! Sep 6 04:09:28 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: qm_status: 6781. N address=0x7fa2e7484090 frag=0x7fa2e7484060 size=48 used=1
And for the address itself:
Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 168) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=320) on 1 -th hit Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from sqlops: sql_api.c: sql_reset_result(237) Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 320) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=320) on 1 -th hit Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [db_row.c:119]: db_allocate_row(): allocate 320 bytes for row values at 0x7fa2e7484090 Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [db_row.c:97]: db_free_row(): freeing row values at 0x7fa2e7484090 Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: db_row.c: db_free_row(98) Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [db_res.c:184]: db_allocate_rows(): allocate 48 bytes for rows at 0x7fa2e7484090 Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [db_res.c:62]: db_free_rows(): freeing rows at 0x7fa2e7484090 Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: db_res.c: db_free_rows(63) Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 24) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [db_val.c:127]: db_str2val(): allocate 23 bytes memory for STRING at 0x7fa2e7484090 Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [db_row.c:68]: db_free_row(): free VAL_STRING[13] 'uloc-5215a229-4e26-dc1' at 0x7fa2e7484090 Sep 6 04:09:18 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: db_row.c: db_free_row(69) Sep 6 04:09:19 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:19 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: parser/parse_cseq.c: free_cseq(105) Sep 6 04:09:22 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:22 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: parser/parse_cseq.c: free_cseq(105) Sep 6 04:09:24 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:24 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: parser/parse_cseq.c: free_cseq(105) Sep 6 04:09:25 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:25 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: parser/parse_cseq.c: free_cseq(105) Sep 6 04:09:27 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:27 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: parser/parse_cseq.c: free_cseq(105) Sep 6 04:09:28 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7fa2e71d2010, 48) returns address 0x7fa2e7484090 frag. 0x7fa2e7484060 (size=48) on 1 -th hit Sep 6 04:09:28 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7fa2e71d2010, 0x7fa2e7484090), called from <core>: parser/parse_to.c: free_to_params(828) Sep 6 04:09:28 gw1 /usr/local/sbin/kamailio[18694]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7fa2e7484060 (address 0x7fa2e7484090) beginning overwritten(6975753a6e72753c)! Sep 6 04:09:28 gw1 /usr/local/sbin/kamailio[18694]: DEBUG: qm_status: 6781. N address=0x7fa2e7484090 frag=0x7fa2e7484060 size=48 used=1
-- Alex
Just for fun, I caused another crash, and this time got this backtrace:
#0 0x000000000054274d in qm_status (qm=0x7f7a3a24a010) at mem/q_malloc.c:775 #1 0x000000000053c10d in qm_debug_frag (qm=0x7f7a3a24a010, f=0x7f7a3a4fc918) at mem/q_malloc.c:141 #2 0x000000000053d32a in qm_malloc (qm=0x7f7a3a24a010, size=952, file=0x5e5434 "<core>: rvalue.c", func=0x5e9fb4 "rval_new_empty", line=236) at mem/q_malloc.c:384 #3 0x00000000004bddee in rval_new_empty (extra_size=102) at rvalue.c:236 #4 0x00000000004bde65 in rval_new_str (s=0x7fffcb1c9110, extra_size=80) at rvalue.c:260 #5 0x00000000004c057d in rval_convert (h=0x7fffcb1caf50, msg=0x7f7a39799920, type=RV_STR, v=0x7f7a3a3fd3d0, c=0x7fffcb1c92d0) at rvalue.c:1321 #6 0x00000000004c1a31 in rval_str_lop2 (h=0x7fffcb1caf50, msg=0x7f7a39799920, res=0x7fffcb1c9778, op=RVE_EQ_OP, l=0x7f7a3a3fd3d0, c1=0x7fffcb1c92d0, r=0x7f7a3a3fdbd0, c2=0x0) at rvalue.c:1752 #7 0x00000000004c2667 in rval_expr_eval_int (h=0x7fffcb1caf50, msg=0x7f7a39799920, res=0x7fffcb1c9778, rve=0x7f7a3a3fe2e8) at rvalue.c:2058 #8 0x0000000000418dca in do_action (h=0x7fffcb1caf50, a=0x7f7a3a4001e8, msg=0x7f7a39799920) at action.c:1050 #9 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a3fd1a8, msg=0x7f7a39799920) at action.c:1573 #10 0x00000000004206af in do_action (h=0x7fffcb1caf50, a=0x7f7a3a4006c0, msg=0x7f7a39799920) at action.c:1374 #11 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a3f3958, msg=0x7f7a39799920) at action.c:1573 #12 0x0000000000419012 in do_action (h=0x7fffcb1caf50, a=0x7f7a3a405f80, msg=0x7f7a39799920) at action.c:1065 #13 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a405f80, msg=0x7f7a39799920) at action.c:1573 #14 0x000000000041906b in do_action (h=0x7fffcb1caf50, a=0x7f7a3a4060c0, msg=0x7f7a39799920) at action.c:1069 #15 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a3da6a8, msg=0x7f7a39799920) at action.c:1573 #16 0x0000000000416f5a in do_action (h=0x7fffcb1caf50, a=0x7f7a3a4125d0, msg=0x7f7a39799920) at action.c:690 #17 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a40b828, msg=0x7f7a39799920) at action.c:1573 #18 0x0000000000422471 in run_top_route (a=0x7f7a3a40b828, msg=0x7f7a39799920, c=0x0) at action.c:1658 #19 0x00007f7a3954ce90 in run_failure_handlers (t=0x7f7935779ad8, rpl=0x7f7a3a4fa4a8, code=404, extra_flags=64) at t_reply.c:1024 #20 0x00007f7a3954e00b in t_should_relay_response (Trans=0x7f7935779ad8, new_code=404, branch=0, should_store=0x7fffcb1cb230, should_relay=0x7fffcb1cb234, cancel_data=0x7fffcb1cb440, reply=0x7f7a3a4fa4a8) at t_reply.c:1300 #21 0x00007f7a3954fb98 in relay_reply (t=0x7f7935779ad8, p_msg=0x7f7a3a4fa4a8, branch=0, msg_status=404, cancel_data=0x7fffcb1cb440, do_put_on_wait=1) at t_reply.c:1703 #22 0x00007f7a39552cce in reply_received (p_msg=0x7f7a3a4fa4a8) at t_reply.c:2370 #23 0x00000000004591b1 in do_forward_reply (msg=0x7f7a3a4fa4a8, mode=0) at forward.c:799 #24 0x0000000000459a40 in forward_reply (msg=0x7f7a3a4fa4a8) at forward.c:882 #25 0x000000000049f776 in receive_msg ( buf=0x910e20 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK80fd.c731a837.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK0cB61aa236182c0cf4d\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=gK0c275042;vsf=", 'A' <repeats 38 times>, "--;dlgcor=a9.aca2>\r\nFrom: sip:16784841111@68.68.120.41;tag=gK0c275042\r\nTo: sip:12066513489@55.177.31.199;tag=ed7d54e69a\r\nCall-ID: 185377618_113020350@68.68.120.41\r\nCSeq: 15495 INVITE\r\nContact: sip:5282797646@195.105.225.111:5060;transport=udp\r\nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=717, rcv_info=0x7fffcb1cb7c0) at receive.c:272 #26 0x0000000000532665 in udp_rcv_loop () at udp_server.c:557 #27 0x00000000004688a1 in main_loop () at main.c:1638 #28 0x000000000046b84a in main (argc=13, argv=0x7fffcb1cbaf8) at main.c:2566
This time, the overwritten fragment was:
Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19629]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7f7a3a4fc918 (address 0x7f7a3a4fc948) beginning overwritten(3e653834323335)!
This time, the pedigree of the fragment address was much shorter:
Sep 6 04:47:50 gw1 /usr/local/sbin/kamailio[19631]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 56) returns address 0x7f7a3a4fc918 frag. 0x7f7a3a4fc8e8 (size=96) on 1 -th hit Sep 6 04:47:50 gw1 /usr/local/sbin/kamailio[19631]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7f7a3a24a010, 0x7f7a3a4fc918), called from <core>: data_lump.c: free_duped_lump_list(626) Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 48) returns address 0x7f7a3a4fc918 frag. 0x7f7a3a4fc8e8 (size=48) on 1 -th hit Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [db_res.c:184]: db_allocate_rows(): allocate 48 bytes for rows at 0x7f7a3a4fc918 Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [db_res.c:62]: db_free_rows(): freeing rows at 0x7f7a3a4fc918 Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7f7a3a24a010, 0x7f7a3a4fc918), called from <core>: db_res.c: db_free_rows(63) Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 48) returns address 0x7f7a3a4fc918 frag. 0x7f7a3a4fc8e8 (size=48) on 1 -th hit Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [db_val.c:127]: db_str2val(): allocate 48 bytes memory for STRING at 0x7f7a3a4fc918 Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [db_row.c:68]: db_free_row(): free VAL_STRING[14] 'urn:uuid:d3bcbc0b-7fde-4db5-9871-79efb9b07aab' at 0x7f7a3a4fc918 Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7f7a3a24a010, 0x7f7a3a4fc918), called from <core>: db_row.c: db_free_row(69) Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 48) returns address 0x7f7a3a4fc918 frag. 0x7f7a3a4fc8e8 (size=48) on 1 -th hit Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19629]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 952) returns address 0x7f7a3a4fc948 frag. 0x7f7a3a4fc918 (size=952) on 1 -th hit Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19629]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7f7a3a4fc918 alloc'ed from <core>: rvalue.c: rval_new_empty(236) Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[19629]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7f7a3a4fc918 (address 0x7f7a3a4fc948) beginning overwritten(3e653834323335)! Sep 6 04:48:06 gw1 /usr/local/sbin/kamailio[19627]: DEBUG: qm_status: 6792. N address=0x7f7a3a4fc918 frag=0x7f7a3a4fc8e8 size=48 used=1
But the common factor is database involvement beforehand.
Am I right to suppose that this points to a buffer overflow in db_postgres or libsrdb1 somewhere?
-- Alex
Thanks, I'll analyze the logs as I get a chance very soon.
Cheers, Daniel
On Fri, Sep 6, 2013 at 10:52 AM, Alex Balashov abalashov@evaristesys.comwrote:
Just for fun, I caused another crash, and this time got this backtrace:
#0 0x000000000054274d in qm_status (qm=0x7f7a3a24a010) at mem/q_malloc.c:775 #1 0x000000000053c10d in qm_debug_frag (qm=0x7f7a3a24a010, f=0x7f7a3a4fc918) at mem/q_malloc.c:141 #2 0x000000000053d32a in qm_malloc (qm=0x7f7a3a24a010, size=952, file=0x5e5434 "<core>: rvalue.c", func=0x5e9fb4 "rval_new_empty", line=236) at mem/q_malloc.c:384 #3 0x00000000004bddee in rval_new_empty (extra_size=102) at rvalue.c:236 #4 0x00000000004bde65 in rval_new_str (s=0x7fffcb1c9110, extra_size=80) at rvalue.c:260 #5 0x00000000004c057d in rval_convert (h=0x7fffcb1caf50, msg=0x7f7a39799920, type=RV_STR, v=0x7f7a3a3fd3d0, c=0x7fffcb1c92d0) at rvalue.c:1321 #6 0x00000000004c1a31 in rval_str_lop2 (h=0x7fffcb1caf50, msg=0x7f7a39799920, res=0x7fffcb1c9778, op=RVE_EQ_OP, l=0x7f7a3a3fd3d0, c1=0x7fffcb1c92d0, r=0x7f7a3a3fdbd0, c2=0x0) at rvalue.c:1752 #7 0x00000000004c2667 in rval_expr_eval_int (h=0x7fffcb1caf50, msg=0x7f7a39799920, res=0x7fffcb1c9778, rve=0x7f7a3a3fe2e8) at rvalue.c:2058 #8 0x0000000000418dca in do_action (h=0x7fffcb1caf50, a=0x7f7a3a4001e8, msg=0x7f7a39799920) at action.c:1050 #9 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a3fd1a8, msg=0x7f7a39799920) at action.c:1573 #10 0x00000000004206af in do_action (h=0x7fffcb1caf50, a=0x7f7a3a4006c0, msg=0x7f7a39799920) at action.c:1374 #11 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a3f3958, msg=0x7f7a39799920) at action.c:1573 #12 0x0000000000419012 in do_action (h=0x7fffcb1caf50, a=0x7f7a3a405f80, msg=0x7f7a39799920) at action.c:1065 #13 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a405f80, msg=0x7f7a39799920) at action.c:1573 #14 0x000000000041906b in do_action (h=0x7fffcb1caf50, a=0x7f7a3a4060c0, msg=0x7f7a39799920) at action.c:1069 #15 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a3da6a8, msg=0x7f7a39799920) at action.c:1573 #16 0x0000000000416f5a in do_action (h=0x7fffcb1caf50, a=0x7f7a3a4125d0, msg=0x7f7a39799920) at action.c:690 #17 0x0000000000421ce7 in run_actions (h=0x7fffcb1caf50, a=0x7f7a3a40b828, msg=0x7f7a39799920) at action.c:1573 #18 0x0000000000422471 in run_top_route (a=0x7f7a3a40b828, msg=0x7f7a39799920, c=0x0) at action.c:1658 #19 0x00007f7a3954ce90 in run_failure_handlers (t=0x7f7935779ad8, rpl=0x7f7a3a4fa4a8, code=404, extra_flags=64) at t_reply.c:1024 #20 0x00007f7a3954e00b in t_should_relay_response (Trans=0x7f7935779ad8, new_code=404, branch=0, should_store=0x7fffcb1cb230, should_relay=0x7fffcb1cb234, cancel_data=0x7fffcb1cb440, reply=0x7f7a3a4fa4a8) at t_reply.c:1300 #21 0x00007f7a3954fb98 in relay_reply (t=0x7f7935779ad8, p_msg=0x7f7a3a4fa4a8, branch=0, msg_status=404, cancel_data=0x7fffcb1cb440, do_put_on_wait=1) at t_reply.c:1703 #22 0x00007f7a39552cce in reply_received (p_msg=0x7f7a3a4fa4a8) at t_reply.c:2370 #23 0x00000000004591b1 in do_forward_reply (msg=0x7f7a3a4fa4a8, mode=0) at forward.c:799 #24 0x0000000000459a40 in forward_reply (msg=0x7f7a3a4fa4a8) at forward.c:882 #25 0x000000000049f776 in receive_msg ( buf=0x910e20 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199 ;branch=z9hG4bK80fd.c731a837.**0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060 ;branch=**z9hG4bK0cB61aa236182c0cf4d\r**nRecord-Route: <sip:55.177.31.199 ;lr=on;ftag=gK0c275042;**vsf=", 'A' <repeats 38 times>, "--;dlgcor=a9.aca2>\r\nFrom: sip:16784841111@68.68.120.41**;tag=gK0c275042\r\nTo: sip:12066513489@55.177.31.199**;tag=ed7d54e69a\r\nCall-ID: 185377618_113020350@68.68.120.**41 185377618_113020350@68.68.120.41\r\nCSeq: 15495 INVITE\r\nContact: sip:5282797646@195.105.225.** 111:5060;transport=udp\r**nSupported: 100rel, replaces, norefersub\r\nAllow-Events: refer\r\nAllow: INVITE, ACK, CANCEL, BYE, REFER, PRACK, INFO, UPDATE\r\nAccept: application/sdp\r\nUser-Agent: snomONE/4.5.1.1107 Zeta Perseids\r\nContent-Length: 0\r\n\r\n", len=717, rcv_info=0x7fffcb1cb7c0) at receive.c:272 #26 0x0000000000532665 in udp_rcv_loop () at udp_server.c:557 #27 0x00000000004688a1 in main_loop () at main.c:1638 #28 0x000000000046b84a in main (argc=13, argv=0x7fffcb1cbaf8) at main.c:2566
This time, the overwritten fragment was:
Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19629]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7f7a3a4fc918 (address 0x7f7a3a4fc948) beginning overwritten(3e653834323335)!
This time, the pedigree of the fragment address was much shorter:
Sep 6 04:47:50 gw1 /usr/local/sbin/kamailio[**19631]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 56) returns address 0x7f7a3a4fc918 frag. 0x7f7a3a4fc8e8 (size=96) on 1 -th hit Sep 6 04:47:50 gw1 /usr/local/sbin/kamailio[**19631]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7f7a3a24a010, 0x7f7a3a4fc918), called from <core>: data_lump.c: free_duped_lump_list(626) Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 48) returns address 0x7f7a3a4fc918 frag. 0x7f7a3a4fc8e8 (size=48) on 1 -th hit Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [db_res.c:184]: db_allocate_rows(): allocate 48 bytes for rows at 0x7f7a3a4fc918 Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [db_res.c:62]: db_free_rows(): freeing rows at 0x7f7a3a4fc918 Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7f7a3a24a010, 0x7f7a3a4fc918), called from <core>: db_res.c: db_free_rows(63) Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 48) returns address 0x7f7a3a4fc918 frag. 0x7f7a3a4fc8e8 (size=48) on 1 -th hit Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [db_val.c:127]: db_str2val(): allocate 48 bytes memory for STRING at 0x7f7a3a4fc918 Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [db_row.c:68]: db_free_row(): free VAL_STRING[14] 'urn:uuid:d3bcbc0b-7fde-4db5-**9871-79efb9b07aab' at 0x7f7a3a4fc918 Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [mem/q_malloc.c:437]: qm_free(): qm_free(0x7f7a3a24a010, 0x7f7a3a4fc918), called from <core>: db_row.c: db_free_row(69) Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 48) returns address 0x7f7a3a4fc918 frag. 0x7f7a3a4fc8e8 (size=48) on 1 -th hit Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19629]: DEBUG: <core> [mem/q_malloc.c:413]: qm_malloc(): qm_malloc(0x7f7a3a24a010, 952) returns address 0x7f7a3a4fc948 frag. 0x7f7a3a4fc918 (size=952) on 1 -th hit Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19629]: DEBUG: <core> [mem/q_malloc.c:472]: qm_free(): qm_free: freeing frag. 0x7f7a3a4fc918 alloc'ed from <core>: rvalue.c: rval_new_empty(236) Sep 6 04:47:54 gw1 /usr/local/sbin/kamailio[**19629]: : <core> [mem/q_malloc.c:140]: qm_debug_frag(): BUG: qm_*: fragm. 0x7f7a3a4fc918 (address 0x7f7a3a4fc948) beginning overwritten(3e653834323335)! Sep 6 04:48:06 gw1 /usr/local/sbin/kamailio[**19627]: DEBUG: qm_status: 6792. N address=0x7f7a3a4fc918 frag=0x7f7a3a4fc8e8 size=48 used=1
But the common factor is database involvement beforehand.
Am I right to suppose that this points to a buffer overflow in db_postgres or libsrdb1 somewhere?
-- Alex
-- Alex Balashov - Principal Evariste Systems LLC 235 E Ponce de Leon Ave Suite 106 Decatur, GA 30030 United States Tel: +1-678-954-0670 Web: http://www.evaristesys.com/, http://www.alexbalashov.com/
______________________________**_________________ sr-dev mailing list sr-dev@lists.sip-router.org http://lists.sip-router.org/**cgi-bin/mailman/listinfo/sr-**devhttp://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
Hi Daniel,
I don't suppose you've had a chance to take a look at this?
I don't mean to pester, and you certainly are under no obligation to look at it at all, least of all on a schedule not of your choosing. However, it is a critical crash bug here (Kamailio can be crashed by calling a certain phone number at any time) and it would just help me out a lot of it were not forgotten. :-)
Cheers,
-- Alex
Hi Alex,
it's on my list for today, as I got my computer back after few days in service.
Cheers, Daniel
On 9/13/13 11:19 AM, Alex Balashov wrote:
Hi Daniel,
I don't suppose you've had a chance to take a look at this?
I don't mean to pester, and you certainly are under no obligation to look at it at all, least of all on a schedule not of your choosing. However, it is a critical crash bug here (Kamailio can be crashed by calling a certain phone number at any time) and it would just help me out a lot of it were not forgotten. :-)
Cheers,
-- Alex
Hello, a question not to look at wrong version, is it latest 4.0? I saw that master has changes in To parser done by Hugh...
Cheers, Daniel
On 9/13/13 11:27 AM, Daniel-Constantin Mierla wrote:
Hi Alex,
it's on my list for today, as I got my computer back after few days in service.
Cheers, Daniel
On 9/13/13 11:19 AM, Alex Balashov wrote:
Hi Daniel,
I don't suppose you've had a chance to take a look at this?
I don't mean to pester, and you certainly are under no obligation to look at it at all, least of all on a schedule not of your choosing. However, it is a critical crash bug here (Kamailio can be crashed by calling a certain phone number at any time) and it would just help me out a lot of it were not forgotten. :-)
Cheers,
-- Alex
On 09/13/2013 07:07 AM, Daniel-Constantin Mierla wrote:
Hello, a question not to look at wrong version, is it latest 4.0? I saw that master has changes in To parser done by Hugh...
[root@gw1 sip-router]# git branch * 4.0 [root@gw1 sip-router]# git log | head -n 4 commit 56e808078325810c7c8c0d436a0f4f51350680b5 Merge: 10da17d 2a224a5 Author: root root@gw1.sip.us Date: Tue Aug 27 05:54:20 2013 -0400
-- Alex
What is the last merge? The backport of the commit I did last time for this issue? Just checking to see if there is some custome development/modules around.
Daniel
On 9/13/13 1:09 PM, Alex Balashov wrote:
On 09/13/2013 07:07 AM, Daniel-Constantin Mierla wrote:
Hello, a question not to look at wrong version, is it latest 4.0? I saw that master has changes in To parser done by Hugh...
[root@gw1 sip-router]# git branch
- 4.0
[root@gw1 sip-router]# git log | head -n 4 commit 56e808078325810c7c8c0d436a0f4f51350680b5 Merge: 10da17d 2a224a5 Author: root root@gw1.sip.us Date: Tue Aug 27 05:54:20 2013 -0400
-- Alex
On 09/13/2013 07:12 AM, Daniel-Constantin Mierla wrote:
What is the last merge? The backport of the commit I did last time for this issue? Just checking to see if there is some custome development/modules around.
The patch you made (setting To param list to NULL) was manually applied. Here's the git diff for the entire tree:
[root@gw1 sip-router]# git diff diff --git a/Makefile.defs b/Makefile.defs index 15f42ae..c17992b 100644 --- a/Makefile.defs +++ b/Makefile.defs @@ -172,7 +172,7 @@ MEMMNG ?= 1 # memory debugger switcher # 0 - off (no-debug mode) # 1 - on (debug mode) -MEMDBG ?= 0 +MEMDBG = 1
SER_VER = $(shell expr $(VERSION) * 1000000 + $(PATCHLEVEL) * 1000 + \ $(SUBLEVEL) ) @@ -976,7 +976,7 @@ ifeq ($(ARCH), x86_64) ifeq ($(CC_NAME), gcc) C_DEFS+=-DCC_GCC_LIKE_ASM #common stuff - CFLAGS=-g -O9 -funroll-loops -Wcast-align $(PRO + CFLAGS=-g -ggdb3 -ggdb -Wcast-align $(PROFILE) #if gcc 4.5+ # don't add '-mtune=$(CPU)' - gcc failure ifeq ($(CC_SHORTVER), 4.5+) diff --git a/parser/parse_to.c b/parser/parse_to.c index b97f8b0..93b792c 100644 --- a/parser/parse_to.c +++ b/parser/parse_to.c @@ -828,6 +828,8 @@ void free_to_params(struct to_body* const tb) pkg_free(tp); tp=foo; } + + tb->param_lst = NULL; }
Here's the recent git log:
commit 56e808078325810c7c8c0d436a0f4f51350680b5 Merge: 10da17d 2a224a5 Author: root root@gw1.sip.us Date: Tue Aug 27 05:54:20 2013 -0400
Merge branch '4.0' of git://git.sip-router.org/sip-router into 4.0
commit 2a224a569cea270d8db84438f163b9f309569df9 Author: Daniel-Constantin Mierla miconda@gmail.com Date: Thu Aug 22 00:14:53 2013 +0200
core: print src address details if initial message parsing fails
- reported by Juha Heinanen
(cherry picked from commit 3ccf4b43e81bd2654cb306a3c2cc21b97cb51f62)
commit fd4a2dde96a692c165f382839c3bef8636dfd9e2 Author: Juha Heinanen jh@tutpro.com Date: Thu Aug 22 08:20:52 2013 +0300
modules/lcr: added some linefeed chars missing from syslog messages
- Patch provided by Kevin Scott Adams. (cherry picked from commit d03651fb4c3a6b50923029e121eed201fb1ff550)
commit 10da17d5c4261775bd6c354faa17159971df1332 Merge: 9a90ca2 090035c Author: root root@gw1.sip.us Date: Sun Aug 18 10:44:58 2013 -0400
Merge branch '4.0' of git://git.sip-router.org/sip-router into 4.0
commit 090035cafa01bf54c41bc9f1b145fc7b4dbf0e4b Author: Daniel-Constantin Mierla miconda@gmail.com Date: Thu Aug 15 15:38:09 2013 +0200
ChangeLog: updated for v4.0.3
commit a75a8a140e53ef8cee58d670b3954e7ac4cdb1b8 Author: Daniel-Constantin Mierla miconda@gmail.com Date: Thu Aug 15 15:33:55 2013 +0200
Makefile.defs: version set to 4.0.3
-- Alex
Daniel,
If it helps, I did try the master at the time that I submitted those core dumps in the last week or two. It exhibited the same crash behaviour, and that incorporated Hugh's changes (which we made back in March 2013).
-- Alex
Hello,
I wrote few days ago as reply to your private email with logs, asking for few more details from the core. Pinging here to be sure you haven't missed them.
Cheers, Daniel
On 9/13/13 4:26 PM, Alex Balashov wrote:
Daniel,
If it helps, I did try the master at the time that I submitted those core dumps in the last week or two. It exhibited the same crash behaviour, and that incorporated Hugh's changes (which we made back in March 2013).
-- Alex
Hi Daniel,
On 09/16/2013 04:01 AM, Daniel-Constantin Mierla wrote:
I wrote few days ago as reply to your private email with logs, asking for few more details from the core. Pinging here to be sure you haven't missed them.
Thank you, I haven't missed them. I should be able to have them to you today. To get the information you requested, I need to go back and correlate it to the original core dump (I have dozens) I quoted at the time, or reproduce the problem again. It's going to take me a bit, and I'm sorry for the delay.
Cheers,
-- Alex
Hello,
On 9/16/13 10:02 AM, Alex Balashov wrote:
Hi Daniel,
On 09/16/2013 04:01 AM, Daniel-Constantin Mierla wrote:
I wrote few days ago as reply to your private email with logs, asking for few more details from the core. Pinging here to be sure you haven't missed them.
Thank you, I haven't missed them. I should be able to have them to you today. To get the information you requested, I need to go back and correlate it to the original core dump (I have dozens) I quoted at the time, or reproduce the problem again. It's going to take me a bit, and I'm sorry for the delay.
no problem.
If you can't corelate with old logs and plan reproduce it, then let me know. I may give some extra instructions to get even more information.
Cheers, Daniel
On 09/16/2013 04:05 AM, Daniel-Constantin Mierla wrote:
Hello,
On 9/16/13 10:02 AM, Alex Balashov wrote:
Hi Daniel,
On 09/16/2013 04:01 AM, Daniel-Constantin Mierla wrote:
I wrote few days ago as reply to your private email with logs, asking for few more details from the core. Pinging here to be sure you haven't missed them.
Thank you, I haven't missed them. I should be able to have them to you today. To get the information you requested, I need to go back and correlate it to the original core dump (I have dozens) I quoted at the time, or reproduce the problem again. It's going to take me a bit, and I'm sorry for the delay.
no problem.
If you can't corelate with old logs and plan reproduce it, then let me know. I may give some extra instructions to get even more information.
I think reproducing it is easier at this point. Please let me know what other info you'd like, beyond what you requested in the private e-mail.
On 9/16/13 10:06 AM, Alex Balashov wrote:
On 09/16/2013 04:05 AM, Daniel-Constantin Mierla wrote:
[...]
If you can't corelate with old logs and plan reproduce it, then let me know. I may give some extra instructions to get even more information.
I think reproducing it is easier at this point. Please let me know what other info you'd like, beyond what you requested in the private e-mail.
The issue seems to be a write of data before the allocated pointer or more than allocated. From the logs, the chunk before is used for $var(...) and the sources doesn't reveal any bug, furthermore, the chunk with issue has its beginning ok, thus it is very likely to be a write before the pointer. The chuck with issues is from the To header parser, also with low chances for issues, because it just contain pointers, so a write will be at the addresses pointed from here. The next chunk is from db_postgres and might be an issue to write at invalid row index, but I couldn't spot where that can happen.
Anyhow, my plan was to replace memcpy, strcpy and strncpy function to write in logs the pointers they work with, in order to see what code is overwriting the chunk head. (I hope is not a memove or some internal copy function)
The procedure is not that complex. Attached is a file crepl.c, copy it on the same system and compile it with:
gcc -shared -ldl -fPIC crepl.c -o libcrepl.so
You have to start kamailio from command line, also with log_stderror=yes and stderr redirected to a file:
LD_PRELOAD=/path/to/libcrepl.so /path/to/kamailio -f /path/to/kamailio.cfg -E -ddd 2>/tmp/kamailio.log
(-f, -E, -ddd are optional, as they can be default value or what is in config file). I haven't made the functions to write to syslog, thus you have to configure kamailio to write to stderror and save the output in a file. Or you change the crepl.c file to write to syslog.
You should see in logs a lot of messages with mem copy operations, prefixed with '======...'.
Send me all the logs, full backtrace as well as the other details I asked for.
Cheers, Daniel
Daniel,
Thank you; should this be run in high debug (3 or 4) mode?
-- Alex
On 09/16/2013 04:26 AM, Daniel-Constantin Mierla wrote:
On 9/16/13 10:06 AM, Alex Balashov wrote:
On 09/16/2013 04:05 AM, Daniel-Constantin Mierla wrote:
[...]
If you can't corelate with old logs and plan reproduce it, then let me know. I may give some extra instructions to get even more information.
I think reproducing it is easier at this point. Please let me know what other info you'd like, beyond what you requested in the private e-mail.
The issue seems to be a write of data before the allocated pointer or more than allocated. From the logs, the chunk before is used for $var(...) and the sources doesn't reveal any bug, furthermore, the chunk with issue has its beginning ok, thus it is very likely to be a write before the pointer. The chuck with issues is from the To header parser, also with low chances for issues, because it just contain pointers, so a write will be at the addresses pointed from here. The next chunk is from db_postgres and might be an issue to write at invalid row index, but I couldn't spot where that can happen.
Anyhow, my plan was to replace memcpy, strcpy and strncpy function to write in logs the pointers they work with, in order to see what code is overwriting the chunk head. (I hope is not a memove or some internal copy function)
The procedure is not that complex. Attached is a file crepl.c, copy it on the same system and compile it with:
gcc -shared -ldl -fPIC crepl.c -o libcrepl.so
You have to start kamailio from command line, also with log_stderror=yes and stderr redirected to a file:
LD_PRELOAD=/path/to/libcrepl.so /path/to/kamailio -f /path/to/kamailio.cfg -E -ddd 2>/tmp/kamailio.log
(-f, -E, -ddd are optional, as they can be default value or what is in config file). I haven't made the functions to write to syslog, thus you have to configure kamailio to write to stderror and save the output in a file. Or you change the crepl.c file to write to syslog.
You should see in logs a lot of messages with mem copy operations, prefixed with '======...'.
Send me all the logs, full backtrace as well as the other details I asked for.
Cheers, Daniel
sr-dev mailing list sr-dev@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
On 09/16/2013 04:28 AM, Alex Balashov wrote:
Daniel,
Thank you; should this be run in high debug (3 or 4) mode?
Sorry, I guess -ddd answers that.
Hi Daniel,
I sent you the requested information. However, this time the crash was on the processing of a REGISTER (after the familiar steps taken to reproduce it), so I'm going to try to get another one that follows the more familiar pattern of crashing on To params free.
Hello,
for the records, the issue was fixed (master and 4.0). It was related to running lookup location again in failure route for records that have instance value.
Many thanks to Alex for testing and providing lot and useful logs, runing custom patches.
Cheers, Daniel
On 9/16/13 10:54 AM, Alex Balashov wrote:
Hi Daniel,
I sent you the requested information. However, this time the crash was on the processing of a REGISTER (after the familiar steps taken to reproduce it), so I'm going to try to get another one that follows the more familiar pattern of crashing on To params free.
To note (from 'bt full'):
#0 0x0000000000435908 in free_lump_list (l=0x7f4f81626730) at data_lump.c:510 t = 0x0 r = 0x3731396430392d64 foo = 0x3731396430392d64 crt = 0x7f4f81626730
And nothing seems to be out of bounds or unaddressable here:
(gdb) bt full #0 0x0000000000435908 in free_lump_list (l=0x7f4f81626730) at data_lump.c:510 t = 0x0 r = 0x3731396430392d64 foo = 0x3731396430392d64 crt = 0x7f4f81626730 #1 0x0000000000542d6a in free_sip_msg (msg=0x7f4f8180e970) at parser/msg_parser.c:731 No locals. #2 0x000000000049e39d in receive_msg ( buf=0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK022f.7350a5c7.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK0cBc7d5138cbe6ccf3f\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=g"..., len=717, rcv_info=0x7fff96469750) at receive.c:296 msg = 0x7f4f8180e970 ctx = {rec_lev = 8750200, run_flags = 0, last_retcode = 0, jmp_env = {{ __jmpbuf = {0, 0, 0, 265124110288, 1, 0, 167503724545, 9463168}, __mask_was_saved = -1773758632, __saved_mask = {__val = {1, 12884901901, 139979451761824, 4277328, 140735714597504, 140735714596560, 5426752, 140735714596544, 5423489, 50195, 169583417968, 9463168, 140735714596688, 80, 5423617, 4277328}}}}} ret = 32591 inb = { s = 0x9065c0 "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK022f.7350a5c7.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK0cBc7d5138cbe6ccf3f\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=g"..., len = 717} __FUNCTION__ = "receive_msg" #3 0x000000000052ffa1 in udp_rcv_loop () at udp_server.c:557 len = 717 buf = "SIP/2.0 404 Not Found\r\nVia: SIP/2.0/UDP 55.177.31.199;branch=z9hG4bK022f.7350a5c7.0\r\nVia: SIP/2.0/UDP 68.68.120.41:5060;branch=z9hG4bK0cBc7d5---Type <return> to continue, or q <return> to quit--- 138cbe6ccf3f\r\nRecord-Route: <sip:55.177.31.199;lr=on;ftag=g"... tmp = 0x906580 "96.237.173.61" from = 0x7f4f817e8510 fromlen = 16 ri = {src_ip = {af = 2, len = 4, u = {addrl = {2045424312, 139979451761824}, addr32 = {2045424312, 0, 2172617888, 32591}, addr16 = {45752, 31210, 0, 0, 33952, 33151, 32591, 0}, addr = "\270\262\352y\000\000\000\000\240\204\177\201O\177\000"}}, dst_ip = {af = 2, len = 4, u = {addrl = {3257728577, 0}, addr32 = { 3257728577, 0, 0, 0}, addr16 = {65089, 49708, 0, 0, 0, 0, 0, 0}, addr = "A\376,\302", '\000' <repeats 11 times>}}, src_port = 5060, dst_port = 5060, proto_reserved1 = 0, proto_reserved2 = 0, src_su = {s = {sa_family = 2, sa_data = "\023ĸ\262\352y\000\000\000\000\000\000\000"}, sin = { sin_family = 2, sin_port = 50195, sin_addr = { s_addr = 2045424312}, sin_zero = "\000\000\000\000\000\000\000"}, sin6 = { sin6_family = 2, sin6_port = 50195, sin6_flowinfo = 2045424312, sin6_addr = {__in6_u = {__u6_addr8 = '\000' <repeats 15 times>, __u6_addr16 = {0, 0, 0, 0, 0, 0, 0, 0}, __u6_addr32 = {0, 0, 0, 0}}}, sin6_scope_id = 0}}, bind_address = 0x7f4f817f8370, proto = 1 '\001'} __FUNCTION__ = "udp_rcv_loop" #4 0x0000000000467de2 in main_loop () at main.c:1638 i = 4 pid = 0 si = 0x7f4f817f8370 si_desc = "udp receiver child=4 sock=55.177.31.199:5060\000\177\000\000(\335~\201O\177\000\000\270\334\302*\000\000\000\000\220\230F\226\377\177\000\000\270\334\302*\000\000\000\000PDA\000\000\000\000\000\200\232F\226\377\177", '\000' <repeats 18 times>"\300, \230F\226\377\177\000\000\337"K\000\000\000\000" nrprocs = 8 ---Type <return> to continue, or q <return> to quit--- __FUNCTION__ = "main_loop" #5 0x000000000046ad8b in main (argc=13, argv=0x7fff96469a88) at main.c:2566 cfg_stream = 0x1b10010 c = -1 r = 0 tmp = 0x7fff9646b414 "" tmp_len = 0 port = 0 proto = 0 options = 0x5c86f8 ":f:cm:M:dVIhEeb:l:L:n:vKrRDTN:W:w:t:u:g:P:G:SQ:O:a:A:" ret = -1 seed = 4063171243 rfd = 4 debug_save = 0 debug_flag = 0 dont_fork_cnt = 0 n_lst = 0x3dbae0fb88 p = 0x5b3450 "H\211l$\330L\211d$\340H\215-\237K*" __FUNCTION__ = "main"
-- Alex
Hi Daniel,
A single head/tail message does appear:
Aug 28 03:42:09 gw1 /usr/local/sbin/kamailio[7913]: : <core> [mem/q_malloc.c:159]: qm_debug_frag(): BUG: qm_*: prev. fragm. tail overwritten(3630643536313a64, 2d303634622d3965)[0x7f99b713d008:0x7f99b713d038]!
Shortly before a crash.
-- Alex