Hello Henning,

Regarding your question about whether the auto_reconnect parameter was enabled: I had auto_reconnect enabled, but unfortunately it made no difference.
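
For reference, this is roughly how the parameter is set in kamailio.cfg. It is a minimal sketch rather than a copy of my actual config; as far as I understand the db_unixodbc module documentation, a value of 1 enables automatic reconnection and 0 disables it.

```
# Sketch only: load the ODBC database driver module and enable auto-reconnect.
loadmodule "db_unixodbc.so"
modparam("db_unixodbc", "auto_reconnect", 1)
```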

On another note, I received your comment on GitHub about the core file and gdb backtrace. I will look into getting that info to you.

Thank you.

Karthik

On Fri, Jan 25, 2019 at 3:16 AM Henning Westerholt <hw@kamailio.org> wrote:
On Wednesday, 23 January 2019, 18:13:26 CET, Karthik Srinivasan wrote:
> An update regarding this item:
>
> I have tested release 5.1.x and 5.2.x and neither release resolves the
> issue.
>
> However I did notice in the master branch that there is new code that is
> related to this issue.
>
> https://github.com/kamailio/kamailio/issues/1681

Hello Karthik,

Indeed, in current git master there is an extension in the sqlops module that
allows startup with an offline database.

> In issue 1681 there is code that allows Kamailio to start even if a
> database connection cannot be established. Queries attempting to run
> against the offline database fail gracefully. Once the database is
> back online, a connection is established and queries against it are
> successful.
>
> However, if at some later point I shut down the database, we're back to the
> original issue that I reported. Kamailio crashes with the same output as
> listed before. The difference is that in the master branch the first query
> attempted against the offline db causes the crash, whereas previously
> (branches 5.0.x, 5.1.x, 5.2.x) the first attempt failed, tried again and
> failed, and the second attempt caused the crash. Regardless, the output is
> more or less the same and Kamailio is down.

This is probably because the purpose of this extension was to extend the
startup process; it was not specifically targeted at this reconnection case.
Reconnection is usually handled in the db_* modules.

> I suspect this might be the same behavior even if one is not using an odbc
> driver; but maybe not.

The db_unixodbc module supports the auto_reconnect parameter (as do other
db_* modules). Did you enable or disable this parameter?

> Anyway, I will open an issue on GitHub for this, and hopefully the code
> change to resolve it is relatively straightforward.

Thank you.

Best regards,

Henning

> Henning, thanks again for your feedback on this.
>
> Karthik
>
>
>
>
> On Mon, Jan 21, 2019 at 9:09 AM Karthik Srinivasan <ksriniva2002@gmail.com> wrote:
> > Henning,
> >
> > Thank you for the response.
> >
> > I will open an issue and test out the latest releases.
> >
> > Thanks again for the feedback.
> >
> > Karthik
> >
> > On Sun, Jan 20, 2019 at 9:31 AM Henning Westerholt <hw@kamailio.org> wrote:
> >> On Friday, 18 January 2019, 18:28:09 CET, Karthik Srinivasan wrote:
> >> > I am testing how Kamailio reacts to various database conditions. One such
> >> > condition is the database engine simply being shut down (that is, the
> >> > database server process is no longer running, the TCP listening socket is
> >> > closed, etc.).
> >> >
> >> > I am using the db_unixodbc module to connect to an Informix database
> >> > engine.
> >> >
> >> > I am currently running Kamailio version 5.0.
> >> >
> >> > I have a test query that executes against the database engine every 10
> >> > seconds.
> >> >
> >> > Here is what I have noticed if I shut down the database engine at some
> >> > point after starting Kamailio:
> >> >
> >> > The first test query that attempts to run against the db engine fails; it
> >> > tries to reconnect and fails.
> >> >
> >> > The second test query (10 seconds after the first) results in a SIGCHLD and
> >> > shuts down the entire Kamailio process.
> >> >
> >> > Has anyone experienced this? Is there a solution? Ideally the second query
> >> > should also fail and return gracefully, and queries should continue to
> >> > fail until the database engine is back up.
> >>
> >> Hello Karthik,
> >>
> >> Kamailio should not crash because of this error. The db_unixodbc module is
> >> not that widely used (compared to db_mysql), but nevertheless it shouldn't
> >> crash.
> >>
> >> Can you create an issue in our tracker on GitHub for this:
> >> https://github.com/kamailio/kamailio/issues
> >>
> >> It would be great if you could also try the latest stable version of 5.1.x
> >> or 5.2.x; there have been some changes in the db_unixodbc module since the
> >> release of 5.0.
> >>
> >> Best regards,
> >>
> >> Henning
> >>
> >> > See logs below:
> >> >
> >> > Jan 17 20:07:25 [29297]: INFO: (s)  SQL query: FIRST TEST QUERY
> >> > Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [dbase.c:135]: db_unixodbc_submit_query(): rv=-1. Query= FIRST TEST QUERY
> >> > Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [connection.c:220]: db_unixodbc_extract_error(): unixodbc:SQLExecDirect=08S01:1:-11020:[Informix][Informix ODBC Driver]Communication link failure.
> >> > Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [dbase.c:59]: reconnect(): Attempting DB reconnect
> >> > Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [dbase.c:74]: reconnect(): failed to connect
> >> > Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [connection.c:220]: db_unixodbc_extract_error(): unixodbc:SQLDriverConnect=08002:1:0:[unixODBC][Driver Manager]Connection name in use
> >> > Jan 17 20:07:25 [29297]: ERROR: db_unixodbc [connection.c:220]: db_unixodbc_extract_error(): unixodbc:SQLDriverConnect=HY010:2:-11067:[Informix][Informix ODBC Driver]Function sequence error.
> >> > Jan 17 20:07:25 [29297]: ERROR: <core> [db_query.c:181]: db_do_raw_query(): error while submitting query
> >> > Jan 17 20:07:25 [29297]: ERROR: sqlops [sql_api.c:265]: sql_do_query(): cannot do the query FIRST TEST QUERY
> >> > Jan 17 20:07:25 [29297]: INFO: (s) [123] SQL ret: fail (-1)
> >> > Jan 17 20:07:25 [29297]: INFO: (s) [123] SQL res: no rows
> >> > Jan 17 20:07:35 [29297]: INFO: (s) [123] SQL query: 10 seconds later the SECOND TEST QUERY (it's the same query as the first one)
> >> > Jan 17 20:07:35 [29301]: CRITICAL: <core> [core/pass_fd.c:277]: receive_fd(): EOF on 28
> >> > Jan 17 20:07:35 [29283]: ALERT: <core> [main.c:744]: handle_sigs(): child process 29297 exited by a signal 11
> >> > Jan 17 20:07:35 [29283]: ALERT: <core> [main.c:747]: handle_sigs(): core was not generated
> >> > Jan 17 20:07:35 [29283]: INFO: <core> [main.c:759]: handle_sigs(): terminating due to SIGCHLD
> >> > Jan 17 20:07:35 [29301]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29295]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29291]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29288]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29300]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29284]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29286]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29293]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29289]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29287]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29292]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29296]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29298]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29299]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29285]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29294]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29290]: INFO: <core> [main.c:814]: sig_usr(): signal 15 received
> >> > Jan 17 20:07:35 [29283]: INFO: <core> [core/sctp_core.c:53]: sctp_core_destroy(): SCTP API not initialized

--
Henning Westerholt - https://skalatan.de/blog/
Kamailio services - https://skalatan.de/services
Kamailio security assessment - https://skalatan.de/de/assessment