(Apologies for the long wall of text)
Ideas for developer meeting 2: Rearchitect TLS
Background: Kamailio forks worker processes for load balancing, and each worker uses OpenSSL to handle TLS followed by SIP. The TLS state must be in shared memory so that each worker can pick up where the previous worker left off.
Over the years this has caused much friction: only OpenSSL 3 and wolfSSL can be used in this manner, because they include hooks for memory management.
Problems with current architecture:
The original OpenSSL (3+) has gone heavily down the path of using pthreads, especially thread-local variables.
OpenSSL lookalikes - boringSSL, libreSSL, AWS-LC - don't support memory management hooks for redirecting allocations to shared memory pools.
A pure GPL library like GnuTLS also does not provide memory management hooks.
Today Kamailio is "fighting" its primary TLS provider, namely OpenSSL 3.
We require pthread symbol overrides (which is a code smell) and have to do thread-local tricks for the workers to have a clean state.
With OpenSSL 3, Kamailio users have an unhappy relationship with memory management. Structures such as SSL_CTX are duplicated per worker to avoid unexpected failures when an SSL object derived from worker A is used in worker B.
Using a single SSL_CTX in the main process - in an attempt to conserve memory - has not been solved yet (my assessment is that this problem can be solved within the current architecture).
History:
It seems that the Kamailio code base has vestiges of a time when TLS was handled in core (possibly in the TCP manager process?).
Proposal 2:
The spirit of this proposal is that we should not be fighting the libraries we use, especially when the library is a key component of many distributions/containers, with many eyes on the way we use and abuse OpenSSL.
The Zen of OpenSSL (I just made that up) would suggest we embrace their path forward and have Kamailio work within that boundary.
The GPL of Kamailio would also suggest that we should validate this approach with the ability to use something like GnuTLS.
This proposal is to perform TLS in a thread pool in the TCP manager so that all TLS-related operations are confined to a single process.
When the TCP manager terminates or initiates TLS, it should perform TLS over a hairpin socketpair (managed by a thread pool). The socket fd passed to the worker via sendmsg/recvmsg is this internal proxied socket, but it carries metadata about the original connection.
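
A rough sketch of the hand-off described above, in C, using the standard SCM_RIGHTS mechanism to pass one end of the hairpin socketpair to a worker together with metadata. The struct conn_meta and the functions send_fd_with_meta and recv_fd_with_meta are hypothetical; Kamailio's actual fd-passing code and its connection structures differ, and the TLS thread that would sit on the other end of the hairpin is left out.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical description of the original (outside) connection. */
struct conn_meta {
    char peer_ip[46];          /* textual peer address */
    unsigned short peer_port;
    int was_tls;               /* 1 if the outside leg is TLS */
};

/* Manager side: send fd plus metadata over the Unix socket chan. */
int send_fd_with_meta(int chan, int fd, const struct conn_meta *meta)
{
    struct iovec iov = { .iov_base = (void *)meta, .iov_len = sizeof(*meta) };
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } ctrl;
    struct msghdr msg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(chan, &msg, 0) == (ssize_t)sizeof(*meta) ? 0 : -1;
}

/* Worker side: returns the received fd (>= 0) and fills *meta, or -1. */
int recv_fd_with_meta(int chan, struct conn_meta *meta)
{
    struct iovec iov = { .iov_base = meta, .iov_len = sizeof(*meta) };
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } ctrl;
    struct msghdr msg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(chan, &msg, 0) != (ssize_t)sizeof(*meta))
        return -1;
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET
            || cmsg->cmsg_type != SCM_RIGHTS
            || cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In this sketch the manager would create the hairpin with socketpair(AF_UNIX, SOCK_STREAM, 0, hp), keep hp[0] for the TLS thread, and send hp[1] to the worker. The worker then reads and writes plain SIP on the received fd and consults conn_meta when it needs to know that the original connection was TLS.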
In other words: the Kamailio TCP manager internally implements a TLS/TCP bridge, like Nginx/HAProxy do with HTTPS/HTTP or TLS/TCP. These proxies send headers so the worker is informed about the nature of the original connection.
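
One existing format for such a header is version 1 of HAProxy's PROXY protocol, a single text line sent ahead of the payload. Whether an internal Kamailio bridge would use it is an assumption made only for this sketch, and the names build_proxy_v1, parse_proxy_v1 and struct orig_conn are hypothetical. Note that the v1 line carries addresses and ports only; it does not say whether the original connection was TLS, so that would need extra metadata.

```c
#include <stdio.h>

/* Hypothetical holder for what the worker learns from the header. */
struct orig_conn {
    char family[8];            /* "TCP4" or "TCP6" */
    char src_ip[46];
    char dst_ip[46];
    unsigned src_port;
    unsigned dst_port;
};

/* Bridge side: write "PROXY <family> <src> <dst> <sport> <dport>\r\n"
 * into out. Returns the header length, or -1 if it does not fit. */
int build_proxy_v1(char *out, size_t outlen, const char *family,
                   const char *src_ip, const char *dst_ip,
                   unsigned src_port, unsigned dst_port)
{
    int n = snprintf(out, outlen, "PROXY %s %s %s %u %u\r\n",
                     family, src_ip, dst_ip, src_port, dst_port);
    if (n < 0 || (size_t)n >= outlen)
        return -1;
    return n;
}

/* Worker side: parse the header line. Returns 0 on success, -1 on error. */
int parse_proxy_v1(const char *line, struct orig_conn *oc)
{
    if (sscanf(line, "PROXY %7s %45s %45s %u %u", oc->family,
               oc->src_ip, oc->dst_ip, &oc->src_port, &oc->dst_port) != 5)
        return -1;
    if (oc->src_port > 65535 || oc->dst_port > 65535)
        return -1;
    return 0;
}
```

The bridge would write this line once on the inner socket before relaying decrypted bytes, and the worker would consume it before handing the rest of the stream to the SIP parser.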
Benefits:
- use OpenSSL/AWS-LC/boringSSL/libreSSL in the way they are intended to be used; even GnuTLS
- no pthreads hackery
- a well-defined boundary for TLS operations, certificate and key management: allowing for easier scrutiny and audit
Offloading both TLS and SIP to the workers dates from the days when threading was poor on Linux. Today that is no longer the case, so the TLS Manager could handle all TLS and the worker should handle SIP/TCP.
In fact this proposal was inspired by the recent work on UDP enhancement.
Optional riff: we may be able to off-load TLS entirely to HAProxy (or any other TLS/TCP bridge). Instead of handling TLS ourselves, Kamailio creates a new type of listener/speaker: haproxy-in, haproxy-out. Kamailio "knows" that these sockets are not SIP traffic but internal hairpins to decrypt/encrypt the streams via HAProxy et al. Think of Kamailio as using HAProxy as a filter object: the sendmsg/recvmsg will use the haproxy-in socket (which is what the worker will see - TCP only). See the sidenote below.
t_relay would have to be taught that it doesn't need to look for "real" TLS sockets; a proxy socket should suffice instead.
Sidenote:
Today TLS can be solved with external TLS/TCP bridges. The main issue with this approach is that the config file occasionally needs to force TCP, otherwise Kamailio will look for a non-existent TLS socket - so it is not an entirely happy experience for users.
Richard (Shih-Ping)