### Description
Started from the [minimal proxy config](https://github.com/kamailio/kamailio/blob/master/misc/examples/mixed/kamaili...), we tried to run **more than 1** process using TLS, and, unforntunately it crashed.
Everything is working fine **using TLS with 1 child**. Everything is working fine with **more than 1 child process without TLS**.
To ease the reproduction, we set an env var called `TARGET` at container instanciation, its value is set to a reachable SIP address using TLS (without DNS resolution).
To reproduce issue you will need pjsip/pjsua with SSL/TLS enabled: * install OpenSSL (`sudo apt-get install openssl` for ubuntu) * [install pjsip using autoconf](https://trac.pjsip.org/repos/wiki/Getting-Started/Autoconf) * [manually configure TLS support if necessary](https://trac.pjsip.org/repos/wiki/TLS) * [build and install pjsua](https://trac.pjsip.org/repos/wiki/Python_SIP/Build_Install)
We've seen on some issues that a low memory allocation could be an issue using TLS, so we are running kamailio using more memory `kamailio -DD -E -M 256 -m 1024` (still crashing).
### Troubleshooting
#### Reproduction
Here are the steps to reproduce the issue: * start from [Kamailio minimal proxy config](https://github.com/kamailio/kamailio/blob/master/misc/examples/mixed/kamaili...) * enable TLS by adding/uncommenting following lines:
```diff @@ -1,4 +1,5 @@ #!KAMAILIO +#!define WITH_TLS # # Kamailio (OpenSER) SIP Server v5.2 - default configuration script # - web: https://www.kamailio.org @@ -28,7 +29,7 @@ ####### Global Parameters #########
### LOG Levels: 3=DBG, 2=INFO, 1=NOTICE, 0=WARN, -1=ERR -debug=2 +debug=3 log_stderror=no
memdbg=5 @@ -38,10 +39,13 @@ log_facility=LOG_LOCAL0 log_prefix="{$mt $hdr(CSeq) $ci} "
/* number of SIP routing processes */ -children=2 +children=1
/* uncomment the next line to disable TCP (default on) */ # disable_tcp=yes +#!ifdef WITH_TLS +enable_tls=yes +#!endif
/* uncomment the next line to disable the auto discovery of local aliases * based on reverse DNS on IPs (default on) */ @@ -65,6 +69,9 @@ children=2 ####### Modules Section ########
/* set paths to location of modules */ +#!ifdef WITH_TLS +loadmodule "tls.so" +#!endif loadmodule "jsonrpcs.so" loadmodule "kex.so" loadmodule "corex.so" @@ -85,6 +92,9 @@ loadmodule "counters.so"
# ----------------- setting module-specific parameters ---------------
+#!ifdef WITH_TLS +modparam("tls", "config", "/etc/kamailio/tls.cfg") +#!endif
# ----- jsonrpcs params ----- modparam("jsonrpcs", "pretty_format", 1) @@ -180,7 +190,7 @@ request_route { }
# update $du to set the destination address for proxying - $du = "sip:" + $rd + ":9"; + $du = $env(TARGET); route(RELAY); exit; } ```
* use a proper `tls.cfg` file with self-signed certificates: ```c [server:default] method = TLSv1 verify_certificate = no require_certificate = no private_key = /etc/kamailio/privkey.pem certificate = /etc/kamailio/server-cert.pem
[client:default] #method = TLSv1 verify_certificate = no require_certificate = no ```
* use following `Dockerfile`: ```docker FROM kamailio/kamailio-ci:master-alpine
# Override default kamailio configuration COPY kamailio.cfg /etc/kamailio/kamailio.cfg COPY tls.cfg /etc/kamailio/tls.cfg COPY privkey.pem /etc/kamailio/privkey.pem COPY server-cert.pem /etc/kamailio/server-cert.pem ``` * build docker image `docker build -f Dockerfile -t kamailio-tls-issue .` * launch docker container using your own `TARGET`: ```sh docker run -it -p 5061:5061/tcp -e "TARGET=sip:AA.BB.CC.DD;transport=tls" kamailio-tls-issue ``` * make a SIP call with pjsua: ```sh pjsua --stun-srv=stun.l.google.com:19302 --use-tls --no-tcp --no-udp --use-srtp=2 --local-port=5062 --id=sip:alice@localhost "sip:bob@localhost;transport=tls" ```
**Everything should work fine because we only used 1 child process**
* stop everything * update your `kamailio.cfg` by setting `children=2` (or above) * reproduce steps (build docker image, launch container, make a SIP call) * kamailio should crash with specific logs (see `Log messages` section)
#### Log Messages
Log messages from kamailio proxy: ```c 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <script>: new branch [0] to sip:bob@localhost;transport=tls 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/msg_translator.c:2933]: create_via_hf(): id added: <;i=1>, rcv proto=3 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/tcp_main.c:1747]: tcp_send(): no open tcp connection found, opening new one 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/ip_addr.c:229]: print_ip(): tcpconn_new: new tcp connection: 34.240.160.168 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/tcp_main.c:999]: tcpconn_new(): on port 5061, type 3 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/tcp_main.c:1305]: tcpconn_add(): hashes: 1752:0:0, 2 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} tls [tls_server.c:199]: tls_complete_init(): completing tls connection initialization 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} tls [tls_server.c:162]: tls_get_connect_server_name(): xavp with outbound server name not found 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} tls [tls_server.c:142]: tls_get_connect_server_id(): xavp with outbound server id not found 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} tls [tls_server.c:228]: tls_complete_init(): Using initial TLS domain TLSc<default> (dom 0x7fe7740bdaf0 ctx 0x55d98553ca40 sn []) 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} tls [tls_domain.c:1155]: tls_lookup_private_key(): Private key lookup for SSL_CTX-0x55d98553ca40: 0 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} tls [tls_domain.c:737]: sr_ssl_ctx_info_callback(): SSL handshake started 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/tcp_main.c:2641]: tcpconn_1st_send(): pending write on new connection 0x7fe7742cd200 (-1/122 bytes written) (err: 11 - Resource temporarily unavailable) 13(18) DEBUG: <core> [core/tcp_main.c:3540]: handle_ser_child(): read response= 7fe7742cd200, 5, fd 29 from 11 (16) 13(18) DEBUG: <core> [core/io_wait.h:380]: io_watch_add(): DBG: io_watch_add(0x55d984d52f20, 29, 2, 0x7fe7742cd200), fd_no=18 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} tm [t_funcs.c:372]: t_relay_to(): new transaction forwarded 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/receive.c:353]: receive_msg(): request-route executed in: 10827 usec 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/usr_avp.c:636]: destroy_avp_list(): destroying list 0 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/usr_avp.c:636]: destroy_avp_list(): destroying list 0 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/usr_avp.c:636]: destroy_avp_list(): destroying list 0 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/usr_avp.c:636]: destroy_avp_list(): destroying list 0 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/usr_avp.c:636]: destroy_avp_list(): destroying list 0 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/usr_avp.c:636]: destroy_avp_list(): destroying list 0 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/xavp.c:495]: xavp_destroy_list(): destroying xavp list 0 11(16) DEBUG: {1 32554 INVITE 1k3Hcf7IVCnjmhjZIJXU5S0x-LRhyB2x} <core> [core/receive.c:457]: receive_msg(): cleaning up 13(18) DEBUG: <core> [core/io_wait.h:777]: io_watch_chg(): DBG: io_watch_chg (0x55d984d52f20, 29, 0x1, 0xffffffff) fd_no=19 called 13(18) DEBUG: <core> [core/io_wait.h:602]: io_watch_del(): DBG: io_watch_del (0x55d984d52f20, 29, -1, 0x0) fd_no=19 called 13(18) DEBUG: <core> [core/tcp_main.c:4196]: handle_tcpconn_ev(): sending to child, events 1 13(18) DEBUG: <core> [core/tcp_main.c:3875]: send2child(): selected tcp worker 1 12(17) for activity on [tls:172.17.0.2:5061], 0x7fe7742cd200 12(17) DEBUG: <core> [core/tcp_read.c:1767]: handle_io(): received n=8 con=0x7fe7742cd200, fd=7 13(18) DEBUG: <core> [core/tcp_main.c:3279]: handle_tcp_child(): dead tcp child 1 (pid 17, no 12) (shutting down?) 13(18) DEBUG: <core> [core/io_wait.h:602]: io_watch_del(): DBG: io_watch_del (0x55d984d52f20, 27, -1, 0x0) fd_no=18 called 13(18) CRITICAL: <core> [core/pass_fd.c:277]: receive_fd(): EOF on 24 13(18) DEBUG: <core> [core/tcp_main.c:3512]: handle_ser_child(): dead child 12, pid 17 (shutting down?) 13(18) DEBUG: <core> [core/io_wait.h:602]: io_watch_del(): DBG: io_watch_del (0x55d984d52f20, 24, -1, 0x0) fd_no=17 called 0(1) ALERT: <core> [main.c:740]: handle_sigs(): child process 17 exited by a signal 11 0(1) ALERT: <core> [main.c:743]: handle_sigs(): core was generated 0(1) INFO: <core> [main.c:766]: handle_sigs(): terminating due to SIGCHLD 0(1) DEBUG: <core> [main.c:768]: handle_sigs(): terminating due to SIGCHLD 10(15) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 1(6) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 8(13) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 7(12) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 6(11) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 5(10) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 4(9) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 3(8) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 13(18) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 9(14) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 11(16) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 2(7) INFO: <core> [main.c:821]: sig_usr(): signal 15 received 0(1) DEBUG: tm [t_funcs.c:85]: tm_shutdown(): start 0(1) DEBUG: tm [t_funcs.c:88]: tm_shutdown(): emptying hash table 0(1) DEBUG: tm [h_table.c:133]: free_cell_helper(): freeing transaction 0x7fe7742c9580 from h_table.c:465 0(1) DEBUG: tm [t_funcs.c:90]: tm_shutdown(): removing semaphores 0(1) DEBUG: tm [t_funcs.c:92]: tm_shutdown(): destroying tmcb lists 0(1) DEBUG: tm [t_funcs.c:95]: tm_shutdown(): done ```
### Additional Information
* **Kamailio Version** : Kamailio (OpenSER) SIP Server v5.2
* **Operating System**:
Kamailio was running on Linux Alpine from [kamailio/kamailio-ci:master-alpine](https://hub.docker.com/r/kamailio/kamailio-ci/) docker image.
As you use master branch (v5.2), this may be due to the latest changes I pushed to tls module in the past few weeks.
The log message is saying that a core file was generated, can you get the backtrace from it with gdb and post it here?
Not being familiar with Alpine, I tried to reproduce on Debian testing using master branch and all goes fine. I looked at the docker files for Alpine repo and it seems it uses libressl library, not openssl (libssl).
The tls module is developed for and tested only with libssl (openssl library).
Can you test with libssl library and report the results?
@miconda
I'm sorry for the non-anwer for your previous question, I did not have the time to check the logs, I'll try to test it with openssl and give you a feedback as soon as possible.
Thanks !
@miconda
I just tried with openssl and it outputs the same log messages...
Do you have any other idea where to look ?
I have also have kamailio crash on Alpien when used TLS. in my case i get crash with one client in case SIP client switch off WiFi/gprs network and after TCP connection timeout kamailio close connection.
This is reproduced on 5.0, 5.1 branches two. For me it is old bug, but i not reported it before.
Hello @amenella About four days ago Alpine is switched to used openssl instread of libressl. We also updated kamailio docker image `kamailio/kamailio-ci` to use openssl. Could you retest with current `kamailio/kamailio-ci:master-alpine.debug` docker image. Need to execute `docker pull kamailio/kamailio-ci:master-alpine.debug` before test.
This docker container supports creation of core files. If core files is generated ned to create backtrace using `gdb`. To get core file need start container using commands like ``` docker run -it --entrypoint=sh kamailio/kamailio-ci:master-alpine.debug kamailio -DD -E ```
Hi @sergey-safarov, thanks for the info !
I'll try to do it this week and let you know...
Hi @sergey-safarov @miconda
I just tried with `kamailio/kamailio-ci:master-alpine.debug` and I can not reproduce the issue (kamailio does not crash anymore).
However I can not reach the other party but I think it's because of wrong/misused certificates. I'll try to get it working properly and I'll let you know.
Thanks again !
Thanks for update, I am going to close this issue.
If you have troubles using the module, the right place to discuss about is on sr-users mailing list.
Closed #1670.
@miconda I have one questions. On OpenBSD dist also used `LibreSSL`. On this dist also crash happens. Is required to open other ticket to fix crash with `LibreSSL` on OpenBSD dist?
@sergey-safarov As Daniel mentioned, the TLS module supports currently the openssl library. If you observe another crash with libressl, you could open an issue about that. But then somebody from the community (e.g. as OpenBSD user) should support us here in the debugging process and ideally also with an eventual fix for the problem.
@henningw - maybe you didn't understand, but there was no fix for libressl, so your remark "If you observe another crash with libressl,..." is pointless, the crash reported initially in this issue is still there, nothing to wait for something else.
But libressl has diverged quite a lot from openssl -- I looked a bit like 2 years ago or so and iirc, they do not allow setting custom memory manager. That makes this case like implementing a new module, something similar to implemeting a tls module using gnutls. From this perspective, it is not a bug, it can be eventually a feature request. The request for gnutls was moved to wiki, this can be added there if someone wants to implement at some point:
* https://www.kamailio.org/wiki/devel/new-features-requests
Otherwise, to my rather limited knowledge about that OS, one can still install openssl on openbsd, so that should be the choice of use.
Thank you @henningw , @miconda I updated feature request page about OpenBSD with LibreSSL.
Thank you Daniel for the clarification