On Mittwoch, 30. September 2009, Alex Balashov wrote:
BTW, for what it's worth: these effects have not
been observed when
using db_postgres instead.
Hi Alex,
ok, this is perhaps because the postgres code uses another escape function
which is probably more stable against invalid input, or the escape is not used
in this case at all. (It looks like your crash was in the function
mysql_real_escape_string)
One reason that it needs some time to crash could be that every process has
its own database handle open, and because of the blocking different processes
gets the request from time to time, until it find the "right" one that cause a
crash.
Do you use allow_trusted in caching mode, or does it query the db for every
invocation? If the latter, i'd take a closer look to the failure handling
logic in the function, perhaps it lets some variables staying in a unclean
state after a failure.
Cheers,
Henning
> I've run into a bug in Kamailio 1.4.4 where,
if using the permissions
> module with db_mysql the database becomes unavailable, Kamailio will
> crash (SIGSEGV) somewhat arbitrarily after about 30-45 seconds. The
> process is a little random; sometimes it can take longer.
>
> At first, I suspected the problem may be with 1.4.x, but got the exact
> same effect when trying 1.5.2. I also assumed the problem may be with
> db_mysql or even libmysqlclient, but when I remove the permissions
> functionality from the configuration the process does not crash. This
> is a configuration from the 1.4.x era with a lot of custom DB queries
> via avp_db_query() (pre-sqlops); when the database disappears, Kamailio
> will complain loudly in the logs, of course, but it will not crash
> unless I am calling allow_trusted() as part of the configuration and am
> using the permissions module with database backing.
>
> The test system is CentOS 4.7 64-bit, but I've reproduced the problem on
> Debian lenny and unstable just as easily with libmysqlclient15.
>
> Here are my steps for producing the bug:
>
> 1) Initialise Kamailio (the only modules that use database are
> permissions and avpops);
>
> 2) Firewall MySQL database off or shut it down;
>
> 3) Place a call - even one will do;
>
> 4) Kamailio will not crash immediately; it usually takes several
> request retransmissions or successive attempts to contact the database
> for this to happen, and it can take close to a minute. Eventually,
> though, there is a segfault that appears to be bubbling up from
> allow_trusted() in the call stack.
>
> Here is the backtrace. IP addresses and phone numbers are sanitised to
> protect confidentiality, but have been substituted consistently:
>
> (gdb) where
> 0 0x00000030eba21b70 in mysql_real_escape_string ()
> from /usr/lib64/mysql/libmysqlclient.so.15
> 1 0x0000002a95be64c1 in db_mysql_val2str (_c=0x686e60, _v=0x7fbfffd900,
> _s=0x6331a7 "222.22.222.22'", _len=0x7fbfffd2cc) at val.c:233
> 0000002 0x00000000004b2f14 in db_print_where (_c=0x686e60,
> _b=0x63319f "src_ip='222.22.222.22'", _l=65473,
_k=0x7fbfffd440,
> _o=0x0,
> _v=0x7fbfffd900, _n=1, val2str=0x2a95be62ca <db_mysql_val2str>)
> at db/db_ut.c:275
> 0000003 0x00000000004affb8 in db_do_query (_h=0x686e60, _k=0x7fbfffd440,
> _op=0x0,
> _v=0x7fbfffd900, _c=Variable "_c" is not available.
> ) at db/db_query.c:78
> 0000004 0x0000002a95be2dbf in db_mysql_query (_h=0x0, _k=0x6331a7,
> _op=0x2a962492a0, _v=0xe, _c=0x7fbfffd900, _n=6844000, _nc=0, _o=0x0,
> _r=0x0) at dbase.c:249
> 0000005 0x0000002a961428c0 in allow_trusted (msg=0x687038,
> src_ip=0x2a962492a0 "222.22.223.23", proto=1) at trusted.c:422
> 0000006 0x0000002a96143356 in allow_trusted_0 (_msg=0x687038,
> str1=Variable "str1" is not available.
> )
> at ../../parser/../ip_addr.h:398
> 0000007 0x000000000040d4df in do_action (a=0x6566f8, msg=0x687038) at
> action.c:850
> 0000008 0x000000000040ec2b in run_action_list (a=Variable "a" is not
> available.
> ) at action.c:138
> 0000009 0x00000000004523dc in eval_expr (e=0x6567c8, msg=0x687038,
> val=0x0) at route.c:1116
> 0000010 0x00000000004527c2 in eval_expr (e=0x656810, msg=0x687038,
> val=0x0) at route.c:1429
> 11 0x0000000000452159 in eval_expr (e=0x656858, msg=0x687038, val=0x0)
> at route.c:1434
> 12 0x000000000040c933 in do_action (a=0x656e70, msg=0x687038) at
> action.c:705
> 13 0x000000000040ec2b in run_action_list (a=Variable "a" is not
> available. ) at action.c:138
> 14 0x000000000040e95f in do_action (a=0x657f00, msg=0x687038) at
> action.c:728
> 15 0x000000000040ec2b in run_action_list (a=Variable "a" is not
> available. ) at action.c:138
> 16 0x000000000040e95f in do_action (a=0x657fd0, msg=0x687038) at
> action.c:728
> 17 0x000000000040ec2b in run_action_list (a=Variable "a" is not
> available. ) at action.c:138
> 18 0x000000000040ef30 in run_top_route (a=0x651008, msg=0x687038)
> at action.c:118
> ---Type <return> to continue, or q <return> to quit---
> 19 0x00000000004467ef in receive_msg (
> buf=0x622ac0 "INVITE sip:8005551212@123.132.132.7:5060;transport=udp
> SIP/2.0\r\nRecord-Route:
> <sip:222.22.223.23;lr;rpdicor=VPSF506071629460;vsf=AAAAAB0GDgULBQIABgJ3AW
>4CFhgKGwIbARoJNDg->\r\nRecord-Route: <sip:90.148.216.254"..., len=922,
> rcv_info=0x7fbffff700) at receive.c:165 0000020 0x0000000000480c44
> in udp_rcv_loop () at udp_server.c:449 21 0x0000000000422b4f in main
> (argc=Variable "argc" is not available. ) at main.c:692