Dear All,
It looks like there is a serious error inside db_mongodb module in the latest versions of
Kamailio 4.4.7 and 5.1.2.
During query operations string fields inside result structure db1_res_t could be
corrupted because of incorrect usage of bson memory.
The issue is located in the function db_mongodb_convert_bson() around line 614.
This issue is reproduced when we need to read more than 150 documents from the database.
So that is why this issue still exists.
Firstly bson_iter_utf8 function returns pointer to memory which is allocated inside bson
memory and which is used multiple times.
So the data must be copied into Kamailio memory instead storing that pointer. The pointer
becomes invalid after some time.
Secondly, this pointer must not be freed by the caller of bson_iter_utf8 . At the current
version bson allocated pointer is freed by Kamailio function pkg_free like it is was
allocated by pkg_malloc .
So I am going to create ticket in the github and propose the fix.
It looks like nobody uses mongodb with Kamailio in db_mode=1 :)
Best regards
Konstantin
----------------------------------------------------------------------
Message: 1
Date: Sat, 21 Apr 2018 23:37:26 +0300
From: Konstantin Polyakov < piligrim_pk(a)mail.ru >
To: sr-users(a)lists.kamailio.org
Subject: [SR-Users] Kamailio doesn't start with db_mongodb and usrloc.
Message-ID: < 1524343046.238272820(a)f356.i.mail.ru >
Content-Type: text/plain; charset="utf-8"
Hello!
Similar issue but with different symptoms.
kamailio 4.4.7 (tried to use db_mongodb module from 5.1.2)
usrloc is configured to use db_mode=1 and db_insert_null=1.
MongoDB 3.6.3
mongo-c-driver was tested 1.9.2 and 1.9.4.
location table contains 300 records.
The issue was not reproduced since February when we started a new feature. Now it is
reproduced constantly, but on some machines.
Logs:
[0;39;49m[0;39;49m 1(13958) DEBUG: usrloc [udomain.c:313]: dbrow2info(): non-local
socket <all.com>...ignoring
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:476]: preload_udomain(): empty
username record in table location...skipping
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:256]: dbrow2info(): bad callid
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [udomain.c:484]: preload_udomain(): skipping
record for agent in table location
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:476]: preload_udomain(): empty
username record in table location...skipping
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:231]: dbrow2info(): bad contact
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [udomain.c:484]: preload_udomain(): skipping
record for cflags in table location
[0;39;49m[0;39;49m 1(13958) DEBUG: usrloc [udomain.c:313]: dbrow2info(): non-local
socket <��keepalive>...ignoring
[0;39;49m[0;35;49m 1(13958) WARNING: usrloc [../../ut.h:717]: shm_str_dup(): shm_str_dup
fallback; dup called for src->s == NULL or src->len < 0
[0;39;49m[0;35;49m 1(13958) WARNING: usrloc [../../ut.h:731]: shm_str_dup(): shm_str_dup
fallback; skip memcpy for src->s == NULL
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:476]: preload_udomain(): empty
username record in table location...skipping
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:231]: dbrow2info(): bad contact
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [udomain.c:484]: preload_udomain(): skipping
record for cseq in table location
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:231]: dbrow2info(): bad contact
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [udomain.c:484]: preload_udomain(): skipping
record for 5.17.xxx in table location
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:231]: dbrow2info(): bad contact
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [udomain.c:484]: preload_udomain(): skipping
record for .17.x in table location
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:476]: preload_udomain(): empty
username record in table location...skipping
[0;39;49m[0;39;49m 1(13958) DEBUG: usrloc [udomain.c:313]: dbrow2info(): non-local
socket <nce>...ignoring
[0;39;49m[0;35;49m 1(13958) WARNING: usrloc [../../ut.h:717]: shm_str_dup(): shm_str_dup
fallback; dup called for src->s == NULL or src->len < 0
[0;39;49m[0;35;49m 1(13958) WARNING: usrloc [../../ut.h:731]: shm_str_dup(): shm_str_dup
fallback; skip memcpy for src->s == NULL
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:476]: preload_udomain(): empty
username record in table location...skipping
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:476]: preload_udomain(): empty
username record in table location...skipping
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:476]: preload_udomain(): empty
username record in table location...skipping
[0;39;49m[0;39;49m 1(13958) DEBUG: usrloc [udomain.c:313]: dbrow2info(): non-local
socket <c-89-5ada654b-35a2-42>...ignoring
[0;39;49m[0;39;49m 1(13958) DEBUG: usrloc [udomain.c:313]: dbrow2info(): non-local
socket <>...ignoring
[0;39;49m[0;33;49m 1(13958) CRITICAL: usrloc [udomain.c:231]: dbrow2info(): bad contact
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [udomain.c:484]: preload_udomain(): skipping
record for 38.146:5360 in table location
[0;39;49m[0;39;49m 1(13958) DEBUG: usrloc [udomain.c:313]: dbrow2info(): non-local
socket <ified>...ignoring
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [ucontact.c:94]: new_ucontact(): no ruid for
aor: lQ.com@5.17.60@8/Ղusername
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [urecord.c:149]: mem_insert_ucontact(): failed
to create new contact
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [udomain.c:517]: preload_udomain(): inserting
contact failed
[0;39;49m[0;31;49m 1(13958) ERROR: usrloc [ul_mod.c:457]: child_init(): child(1): failed
to preload domain 'location'
[0;39;49m[0;31;49m 1(13958) ERROR: <core> [sr_module.c:923]: init_mod_child():
Error while initializing module usrloc (/usr/ecsp/lib64/kamailio/modules/usrloc.so)
[0;39;49m[0;31;49m 1(13958) ERROR: <core> [pt.c:329]: fork_process(): init_child
failed for process 1, pid 13958, "udp receiver child=0 sock=5.17.138.137:5060
(5.17.138.137:5060)"
[0;39;49m[0;33;49m 1(13958) CRITICAL: <core> [main.c:1615]: main_loop(): Cannot
fork
[0;39;49m[1;31;49m 0(13956) ALERT: <core> [main.c:740]: handle_sigs(): child
process 13958 exited normally, status=255
[0;39;49m[0;36;49m 0(13956) INFO: <core> [main.c:768]: handle_sigs(): terminating
due to SIGCHLD
[0;39;49m[0;39;49m 0(13956) DEBUG: <core> [main.c:770]: handle_sigs(): terminating
due to SIGCHLD
Could you please help us to resolve the issue?
Thank you in advance.
Best regards
Konstantin