I've got some more debug information,
but it almost looks like a race condition.

When I turn of all debugging the problem occurs more often.
When I turn on debugging (level 4 or 5) I almost never see the error.

But I finally have some debugging info:

Oct 18 19:40:53 localhost /sbin/kamailio[29227]: DEBUG: <core> [sr_module.c:828]: DEBUG: init_mod_child (96): permissions
Oct 18 19:40:53 localhost /sbin/kamailio[29227]: DEBUG: <core> [db.c:294]: connection 0x958390 found in pool
Oct 18 19:40:53 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_val.c:163]: PQescapeStringConn: in: 7 chars, out: 7 chars
Oct 18 19:40:53 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_dbase.c:149]: 0x959aa0 PQsendQuery(select table_version from version where table_name='trusted')
Oct 18 19:40:53 localhost /sbin/kamailio[29227]: DEBUG: <core> [db_res.c:118]: allocate 48 bytes for result set at 0x94d790

Then 47 seconds later (in the mean time, nothing is reported for pid 29227):

Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_dbase.c:403]: 0x959aa0 PQresultStatus(PGRES_TUPLES_OK) PQgetResult(0x139aca20)
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_res.c:108]: 1 columns returned from the query
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: <core> [db_res.c:155]: allocate 8 bytes for result names at 0x94b1f0
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: <core> [db_res.c:165]: allocate 4 bytes for result types at 0x942140
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_res.c:126]: allocate 16 bytes for RES_NAMES[0] at 0x9599b0
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_res.c:133]: RES_NAMES(0x9599b0)[0]=[password]
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_res.c:166]: use DB1_STRING result type
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_res.c:222]: allocate for 1 columns 8 bytes in row buffer at 0x959538
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: <core> [db_res.c:184]: allocate 16 bytes for rows at 0x9598d8
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_res.c:243]: PQgetvalue(0x959aa0,0,0)=[mAC4QMyW262E]
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_res.c:252]: [0][0] Column[password]=[mAC4QMyW262E]
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: <core> [db_row.c:119]: allocate 32 bytes for row values at 0x94d820
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: <core> [db_val.c:117]: converting STRING [mAC4QMyW262E]
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: <core> [db_val.c:127]: allocate 13 bytes memory for STRING at 0x9596d0
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_res.c:267]: freeing row buffer at 0x959538
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: DEBUG: db_postgres [km_dbase.c:302]: PQclear(0x139aca20) result set
Oct 18 19:41:40 localhost /sbin/kamailio[29227]: ERROR: <core> [db.c:387]: invalid type (3) or nul (0) version columns for trusted

I have more debugging lines if someone has an idea.

But it appears that pid 29227 is sending a query, and when it continues it's working with the results from a completely different query.

Regards,
Robert

On 10/12/2010 04:30 PM, Robert Verspuy wrote:
I've got a strange issue with the latest kamailio. I also had this problem with kamailio 3.0.3 downloaded from the webpage.

I can start kamailio, but after a while (sometimes within one minuten, sometimes after an hour),
I get the following log errors:

Oct 12 15:46:24 sp01 /sbin/kamailio[26621]: ERROR: <core> [db.c:387]: invalid type (3) or nul (0) version columns for trusted
Oct 12 15:46:24 sp01 /sbin/kamailio[26621]: ERROR: <core> [db.c:405]: querying version for table trusted
Oct 12 15:46:24 sp01 /sbin/kamailio[26621]: ERROR: permissions [trusted.c:249]: error during table version check.
Oct 12 15:46:24 sp01 /sbin/kamailio[26621]: ERROR: <core> [sr_module.c:832]: init_mod_child(): Error while initializing module permissions (//lib/kamailio/modules_k/permissions.so)
Oct 12 15:46:24 sp01 /sbin/kamailio[26621]: ERROR: <core> [pt.c:481]: ERROR: fork_tcp_process(): init_child failed for process 10, pid 26621, "tcp receiver child=0"
Oct 12 15:46:24 sp01 /sbin/kamailio[26621]: ERROR: <core> [tcp_main.c:4811]: ERROR: tcp_main: fork failed: Success
Oct 12 15:46:24 sp01 /sbin/kamailio[26625]: : <core> [pass_fd.c:103]: ERROR: recv_all: 1st recv on 21 failed: Connection reset by peer
Oct 12 15:46:24 sp01 /sbin/kamailio[26625]: : <core> [tcp_main.c:3323]: ERROR: handle_tcp_child: read from tcp child 0  (pid 26621, no 10) Connection reset by peer [104]
Oct 12 15:46:24 sp01 /sbin/kamailio[26625]: : <core> [pass_fd.c:293]: ERROR: receive_fd: EOF on 19
Oct 12 15:46:24 sp01 /sbin/kamailio[26611]: ALERT: <core> [main.c:738]: child process 26621 exited normally, status=255
Oct 12 15:46:24 sp01 /sbin/kamailio[26611]: INFO: <core> [main.c:756]: INFO: terminating due to SIGCHLD
Oct 12 15:46:24 sp01 /sbin/kamailio[26625]: INFO: <core> [main.c:807]: INFO: signal 15 received

Looking at the error message, it thinks that the column type of table_version in the table version is a string (type 3), and it should be an int (type 1).
But in the database it really is an int.

When I run kamailio with debugging on level 5, I see that it's trying to run the "select table_version from version where table_name='trusted'" several times when kamailio starts (for every child thread).
And then minutes, of even an hour later, it suddenly continues with that check, but based on a database response from another query.
This is the "select password from subscriber ....etc...".
And that column is indeed an string.

I'm using PostgreSQL 8.4 as server,
but maybe someone has seen this before, or now where I have to start looking?
I first saw it yesterday when started to use the permissions module.
I only see this with the trusted table,
And always it uses the "select password from subscriber ... " response when giving this error.

So is this an error in the permissions module / trusted.c file?
or something in auth_db that's not freeing the response correctly?

Strange thing is, that is sometimes happens very quick, and sometimes it can take an half hour or hour.

I can mail the full log if anybody is interested and thinks he/she can help me.

With kind regards,
Robert Verspuy


--
Exa-Omicron
Patroonsweg 10
3892 DB Zeewolde
Tel.: 088-OMICRON (66 427 66)
http://www.exa-omicron.nl
_______________________________________________ sr-dev mailing list sr-dev@lists.sip-router.org http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev


--
Exa-Omicron
Patroonsweg 10
3892 DB Zeewolde
Tel.: 088-OMICRON (66 427 66)
http://www.exa-omicron.nl