An update regarding this item:

I have tested releases 5.1.x and 5.2.x, and neither resolves the issue.

However, I did notice that the master branch has new code related to this issue.


The code associated with issue 1681 allows Kamailio to start even if a database connection cannot be established. Queries run against the offline database fail gracefully, and once the database is back online, a connection is established and queries against it succeed.

However, if at some later point I shut down the database, we're back to the original issue that I reported: Kamailio crashes with the same output as listed before. The only difference is timing. In the master branch, the first query attempted against the offline db causes the crash. Previously (branches 5.0.x, 5.1.x, 5.2.x), the first query failed, the reconnect attempt failed, and the second query caused the crash. Regardless, the output is more or less the same and Kamailio is down.
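
As a rough illustration of the behavior described above (start without a database, fail gracefully while it is offline, recover once it is back, and survive a later shutdown), here is a minimal sketch. It is not Kamailio or db_unixodbc code and does not use ODBC; `FakeDriver`, `ResilientConnection`, `DatabaseOffline` and `demo` are hypothetical names made up for this example.

```python
class DatabaseOffline(Exception):
    """Raised by the hypothetical driver when the server is unreachable."""


class FakeDriver:
    """Stand-in for a real database driver; `online` simulates the server state."""

    def __init__(self):
        self.online = False

    def connect(self):
        if not self.online:
            raise DatabaseOffline("connect failed")
        return object()  # opaque connection handle

    def execute(self, handle, sql):
        if not self.online:
            raise DatabaseOffline("communication link failure")
        return [("row",)]


class ResilientConnection:
    """Never raises to the caller: a query returns rows, or None on failure."""

    def __init__(self, driver):
        self.driver = driver
        self.handle = None   # None means "currently disconnected"
        self._try_connect()  # startup proceeds even if this fails

    def _try_connect(self):
        try:
            self.handle = self.driver.connect()
        except DatabaseOffline:
            self.handle = None
        return self.handle is not None

    def query(self, sql):
        # Reconnect lazily if a previous failure dropped the handle.
        if self.handle is None and not self._try_connect():
            return None
        try:
            return self.driver.execute(self.handle, sql)
        except DatabaseOffline:
            # Drop the stale handle so it is never reused after a failure.
            self.handle = None
            return None


def demo():
    driver = FakeDriver()
    conn = ResilientConnection(driver)      # database offline at startup
    results = [conn.query("SELECT 1")]      # fails gracefully
    driver.online = True
    results.append(conn.query("SELECT 1"))  # reconnects and succeeds
    driver.online = False
    results.append(conn.query("SELECT 1"))  # first failure after shutdown
    results.append(conn.query("SELECT 1"))  # second failure, still no crash
    driver.online = True
    results.append(conn.query("SELECT 1"))  # recovers again
    return [r is not None for r in results]
```

Calling `demo()` walks through the same sequence as the test described here and reports, per query, whether rows came back. The sketch only models the desired outcome; it makes no claim about what actually causes the crash in the module.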

I suspect the behavior might be the same even if one is not using an ODBC driver, but maybe not.

Anyway, I will open an issue on GitHub for this, and hopefully the code change to resolve it is relatively straightforward.

Henning, thanks again for your feedback on this.

Karthik




On Mon, Jan 21, 2019 at 9:09 AM Karthik Srinivasan <ksriniva2002@gmail.com> wrote:
Henning,

Thank you for the response.

I will open an issue and test out the latest releases.

Thanks again for the feedback.

Karthik

On Sun, Jan 20, 2019 at 9:31 AM Henning Westerholt <hw@kamailio.org> wrote:
On Friday, 18 January 2019, 18:28:09 CET, Karthik Srinivasan wrote:
> I am testing how Kamailio reacts to various database conditions. One such
> condition is if the database engine is simply shut down (that is, database
> server process no longer running, TCP listening socket closed, etc.)
>
> I am utilizing the db_unixodbc module to connect to an Informix database
> engine.
>
> I am currently running on Kamailio version 5.0.
>
> I have a test query that executes against the database engine every 10
> seconds.
>
> Here is what I have noticed if I shut down the database engine at some
> point after I start Kamailio.
>
> The first test query that attempts to run against the db engine fails; it
> tries to reconnect and fails.
>
> The second test query (10 seconds after the first) results in a SIGCHLD and
> shuts down the entire Kamailio process.
>
> Has anyone experienced this?  Is there a solution to this?   Ideally the
> second query should also fail and return gracefully; and ideally queries
> continue to fail until the database engine is back up.

Hello Karthik,

Kamailio should not crash because of this error. The db_unixodbc module is not
that widely used (compared to db_mysql), but nevertheless it shouldn't crash.

Can you create an issue in our tracker on github for this:
https://github.com/kamailio/kamailio/issues

It would be great if you could also try the latest stable version of 5.1.x
or 5.2.x; there have been some changes in the db_unixodbc module since the
release of 5.0.

Best regards,

Henning

> See logs below:
>
> Jan 17 20:07:25 [29297]: INFO: (s)  SQL query: FIRST TEST QUERY
> Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [dbase.c:135]:
> db_unixodbc_submit_query(): rv=-1. Query= FIRST TEST QUERY
> Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [connection.c:220]:
> db_unixodbc_extract_error():
> unixodbc:SQLExecDirect=08S01:1:-11020:[Informix][Informix ODBC
> Driver]Communication link failure.
> Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [dbase.c:59]: reconnect():
> Attempting DB reconnect
> Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [dbase.c:74]: reconnect():
> failed to connect
> Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [connection.c:220]:
> db_unixodbc_extract_error():
> unixodbc:SQLDriverConnect=08002:1:0:[unixODBC][Driver Manager]Connection
> name in use
> Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [connection.c:220]:
> db_unixodbc_extract_error():
> unixodbc:SQLDriverConnect=HY010:2:-11067:[Informix][Informix ODBC
> Driver]Function sequence error.
> Jan 17 20:07:25 [29297]: ERROR: <core> [db_query.c:181]: db_do_raw_query():
> error while submitting query
> Jan 17 20:07:25 [29297]: ERROR: sqlops [sql_api.c:265]: sql_do_query():
> cannot do the query FIRST TEST QUERY
> Jan 17 20:07:25 [29297]: INFO: (s) [123] SQL ret: fail (-1)
> Jan 17 20:07:25 [29297]: INFO: (s) [123] SQL res: no rows
> Jan 17 20:07:35 [29297]: INFO: (s) [123] SQL query: 10 seconds later the
> SECOND TEST QUERY (it's the same query as the first one)
> Jan 17 20:07:35 [29301]: CRITICAL: <core> [core/pass_fd.c:277]:
> receive_fd(): EOF on 28
> Jan 17 20:07:35 [29283]: ALERT: <core> [main.c:744]: handle_sigs(): child
> process 29297 exited by a signal 11
> Jan 17 20:07:35 [29283]: ALERT: <core> [main.c:747]: handle_sigs(): core
> was not generated
> Jan 17 20:07:35 [29283]: INFO: <core> [main.c:759]: handle_sigs():
> terminating due to SIGCHLD
> Jan 17 20:07:35 [29301]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29295]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29291]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29288]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29300]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29284]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29286]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29293]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29289]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29287]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29292]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29296]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29298]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29299]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29285]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29294]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29290]: INFO: <core> [main.c:814]: sig_usr(): signal 15
> received
> Jan 17 20:07:35 [29283]: INFO: <core> [core/sctp_core.c:53]:
> sctp_core_destroy(): SCTP API not initialized


--
Henning Westerholt - https://skalatan.de/blog/
Kamailio services - https://skalatan.de/services
Kamailio security assessment - https://skalatan.de/de/assessment