None of my changes to the way I invoke boto3 avoided the issue. Even setting use_ssl=False doesn't stop it happening. When the Kamailio tls module is not loaded I haven't seen the issue, so far at any rate.

So if there is a conflict between boto3 and the Kamailio TLS module, one option is to move the AWS client to a separate service. Not ideal, but I might have to go that route in the short term. I have used a Node app in the past for this purpose but had hoped to simplify by invoking the AWS API directly from the routing script.
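
A minimal sketch of the "separate service" route, assuming a helper process on the same host owns the boto3 client and the routing script only hands it a small JSON datagram. The address, port, field names and function names here are hypothetical, not something from my current setup:

```python
import json
import socket

# Hypothetical address of a local helper service that owns the boto3 client.
HELPER_ADDR = ("127.0.0.1", 9999)


def encode_notification(topic_arn, message):
    """Serialise one notification as UTF-8 JSON bytes."""
    payload = {"topic_arn": topic_arn, "message": message}
    return json.dumps(payload, sort_keys=True).encode("utf-8")


def send_notification(topic_arn, message, addr=HELPER_ADDR):
    """Fire-and-forget: hand the notification to the helper over UDP.

    No TLS or AWS code runs in the calling (Kamailio) process.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(encode_notification(topic_arn, message), addr)
    finally:
        sock.close()
```

The routing script would call send_notification() where it currently calls the SNS client. UDP gives no delivery guarantee, so this only suits notifications that can occasionally be lost.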

Any suggestions on more diagnostic info I could collect?
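
One thing that could be logged from the routing script is which OpenSSL build the embedded Python reports, together with the PID, so it can be compared against the libssl/libcrypto shown in the backtraces. A minimal sketch using only the standard library (the function names are made up for illustration):

```python
import os
import ssl
import sys


def tls_diagnostics():
    """Collect facts about this interpreter and the OpenSSL it links against."""
    return {
        "pid": os.getpid(),
        "python": sys.version.split()[0],
        "openssl_version": ssl.OPENSSL_VERSION,
        "openssl_version_info": ssl.OPENSSL_VERSION_INFO,
    }


def format_diagnostics(info):
    """Render the dict as one log-friendly line with sorted keys."""
    return " ".join("%s=%r" % (k, info[k]) for k in sorted(info))
```

The resulting line can be passed to whatever logging call the script already uses, once per child process.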

Here are the latest crashes, even with use_ssl=False; it seems it's still setting up an SSL context for a connection pool.

Cheers
Mike

Core was generated by `/usr/local/sbin/kamailio -P /usr/local/kamailio/run/kamailio.pid -f /usr/local/'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __GI___pthread_rwlock_wrlock (rwlock=0x0) at pthread_rwlock_wrlock.c:100
100 pthread_rwlock_wrlock.c: No such file or directory.

(gdb) bt
#0  __GI___pthread_rwlock_wrlock (rwlock=0x0) at pthread_rwlock_wrlock.c:100
#1  0x00007f0b908eaee9 in CRYPTO_THREAD_write_lock () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#2  0x00007f0b90bf8e2c in SSL_CTX_flush_sessions () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#3  0x00007f0b90bf236a in SSL_CTX_free () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#4  0x00007f0b802fa436 in context_dealloc.lto_priv.1 (self=0x7f0b805eea28) at ./Modules/_ssl.c:2224
#5  0x00007f0b931ded32 in subtype_dealloc (self=<SSLContext at remote 0x7f0b805eea28>) at ../Objects/typeobject.c:1050
#6  0x00007f0b9326e8db in dict_dealloc.lto_priv.401 (mp=0x7f0b8059f910) at ../Objects/dictobject.c:1040
#7  0x00007f0b9326e8db in dict_dealloc.lto_priv.401 (mp=0x7f0b805ea7f8) at ../Objects/dictobject.c:1040
#8  0x00007f0b931dee1e in subtype_dealloc (
    self=<PoolManager(connection_pool_kw={'cert_file': None, 'ssl_context': <SSLContext at remote 0x7f0b805eea28>, 'strict': True, 'maxsize': 10, 'timeout': 2, 'socket_options': [], 'key_file': None}, headers={}, pools=<RecentlyUsedContainer(_maxsize=10, dispose_func=<function at remote 0x7f0b805eeaa0>, _container=<OrderedDict(_OrderedDict__root=[[...], [...], None], _OrderedDict__map={}) at remote 0x7f0b805a6050>, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f0b7fb2dcd0>, _RLock__count=0) at remote 0x7f0b8059d2d0>) at remote 0x7f0b8059d210>, key_fn_by_scheme={'http': <functools.partial at remote 0x7f0b7f737050>, 'https': <functools.partial at remote 0x7f0b7f7370a8>}, pool_classes_by_scheme={'http': <type at remote 0x5583c2e23ba0>, 'https': <type at remote 0x5583c2e23f60>}) at remote 0x7f0b8059d1d0>) at ../Objects/typeobject.c:1035
#9  0x00007f0b9326e8db in dict_dealloc.lto_priv.401 (mp=0x7f0b8059fe88) at ../Objects/dictobject.c:1040
#10 0x00007f0b931dee1e in subtype_dealloc (
    self=<URLLib3Session(_verify=True, _cert_file=None, _proxy_config=<ProxyConfiguration at remote 0x7f0b8059d190>, _socket_options=[], _pool_classes_by_scheme={'http': <type at remote 0x5583c2e23ba0>, 'https': <type at remote 0x5583c2e23f60>}, _max_pool_connections=10, _key_file=None, _proxy_managers={}, _timeout=2, _manager=<PoolManager(connection_pool_kw={'cert_file': None, 'ssl_context': <SSLContext at remote 0x7f0b805eea28>, 'strict': True, 'maxsize': 10, 'timeout': 2, 'socket_options': [...], 'key_file': None}, headers={}, pools=<RecentlyUsedContainer(_maxsize=10, dispose_func=<function at remote 0x7f0b805eeaa0>, _container=<OrderedDict(_OrderedDict__root=[[...], [...], None], _OrderedDict__map={}) at remote 0x7f0b805a6050>, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f0b7fb2dcd0>, _RLock__count=0) at remote 0x7f0b8059d2d0>) at remote 0x7f0b8059d210>, key_fn_by_scheme={'http': <functools.partial at remote 0x7f0b7f737050>, 'https': <functools.partial at remot...(truncated)) at ../Objects/typeobject.c:1035
#11 0x00007f0b9326e8db in dict_dealloc.lto_priv.401 (mp=0x7f0b805a1050) at ../Objects/dictobject.c:1040
#12 0x00007f0b931dee1e in subtype_dealloc (
    self=<ContainerMetadataFetcher(_session=<URLLib3Session(_verify=True, _cert_file=None, _proxy_config=<ProxyConfiguration at remote 0x7f0b8059d190>, _socket_options=[], _pool_classes_by_scheme={'http': <type at remote 0x5583c2e23ba0>, 'https': <type at remote 0x5583c2e23f60>}, _max_pool_connections=10, _key_file=None, _proxy_managers={}, _timeout=2, _manager=<PoolManager(connection_pool_kw={'cert_file': None, 'ssl_context': <SSLContext at remote 0x7f0b805eea28>, 'strict': True, 'maxsize': 10, 'timeout': 2, 'socket_options': [...], 'key_file': None}, headers={}, pools=<RecentlyUsedContainer(_maxsize=10, dispose_func=<function at remote 0x7f0b805eeaa0>, _container=<OrderedDict(_OrderedDict__root=[[...], [...], None], _OrderedDict__map={}) at remote 0x7f0b805a6050>, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f0b7fb2dcd0>, _RLock__count=0) at remote 0x7f0b8059d2d0>) at remote 0x7f0b8059d210>, key_fn_by_scheme={'http': <functools.partial at remote 0x7f0b7f737050>, '...(truncated)) at ../Objects/typeobject.c:1035
#13 0x00007f0b9326e8db in dict_dealloc.lto_priv.401 (mp=0x7f0b8059f7f8) at ../Objects/dictobject.c:1040
#14 0x00007f0b931dee1e in subtype_dealloc (
    self=<ContainerProvider(_fetcher=<ContainerMetadataFetcher(_session=<URLLib3Session(_verify=True, _cert_file=None, _proxy_config=<ProxyConfiguration at remote 0x7f0b8059d190>, _socket_options=[], _pool_classes_by_scheme={'http': <type at remote 0x5583c2e23ba0>, 'https': <type at remote 0x5583c2e23f60>}, _max_pool_connections=10, _key_file=None, _proxy_managers={}, _timeout=2, _manager=<PoolManager(connection_pool_kw={'cert_file': None, 'ssl_context': <SSLContext at remote 0x7f0b805eea28>, 'strict': True, 'maxsize': 10, 'timeout': 2, 'socket_options': [...], 'key_file': None}, headers={}, pools=<RecentlyUsedContainer(_maxsize=10, dispose_func=<function at remote 0x7f0b805eeaa0>, _container=<OrderedDict(_OrderedDict__root=[[...], [...], None], _OrderedDict__map={}) at remote 0x7f0b805a6050>, lock=<_RLock(_Verbose__verbose=False, _RLock__owner=None, _RLock__block=<thread.lock at remote 0x7f0b7fb2dcd0>, _RLock__count=0) at remote 0x7f0b8059d2d0>) at remote 0x7f0b8059d210>, key_fn_by_scheme={'http': <functools.partial ...(truncated)) at ../Objects/typeobject.c:1035
#15 0x00007f0b93236761 in list_dealloc.lto_priv.384 (op=0x7f0b805a01b8) at ../Objects/listobject.c:309
#16 0x00007f0b9326e8db in dict_dealloc.lto_priv.401 (mp=0x7f0b805a15c8) at ../Objects/dictobject.c:1040



On Wed, 10 Apr 2019 at 09:11, Michael Loughrey <mgloughrey@gmail.com> wrote:
Thanks for the prompt reply. Yes, I do have TLS enabled, and in fact the crash is inside one of the TCP receiver processes when it is handling a REGISTER message.

I will try with non TLS connections, however it is intermittent so the absence of crashes hasn't always been a good guide. Also I really need TLS for the client connections.

I might try something else first though. From the Boto3 docs:

The boto3 module acts as a proxy to the default session

A session manages state about a particular configuration. By default a session is created for you when needed. However it is possible and recommended to maintain your own session(s) in some scenarios. 

It is also possible to manage your own session and create clients or resources from it:

# Creating your own session
session = boto3.session.Session()

sqs = session.client('sqs')
s3 = session.resource('s3')

Multithreading / Multiprocessing
--------------------------------
It is recommended to create a resource instance for each thread / process in a multithreaded or multiprocess application rather than sharing a single instance among the threads / processes. For example:

import boto3
import boto3.session
import threading

class MyTask(threading.Thread):
    def run(self):
        session = boto3.session.Session()
        s3 = session.resource('s3')
        # ... do some work with S3 ...


In the example above, each thread would have its own Boto 3 session and its own instance of the S3 resource. This is a good idea because resources contain shared data when loaded and calling actions, accessing properties, or manually loading or reloading the resource can modify this data.


So I could initialise in __init__() thus:

self.session = None
self.sns_client = None

and on demand from child processes:

if not self.session:
  self.session = boto3.session.Session(region_name=MY_AWS_REGION)

if not self.sns_client:
  self.sns_client = self.session.client('sns', use_ssl=True)

So each child process that needs it gets its own boto3 session.

Another thing I can try is setting use_ssl=False.
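
A minimal sketch of the per-process idea above, assuming the holder is created in __init__() but get() is only ever called from the child processes. PerProcessHolder and make_sns_client are hypothetical names, and the region is a placeholder. As far as I know, use_ssl is an argument of client()/resource() rather than of Session(), so it is passed there:

```python
import os


class PerProcessHolder(object):
    """Lazily build an object once per process, rebuilding after a fork."""

    def __init__(self, factory):
        self._factory = factory  # zero-argument callable that builds the object
        self._obj = None
        self._pid = None         # PID of the process that built self._obj

    def get(self):
        pid = os.getpid()
        if self._obj is None or self._pid != pid:
            # First use in this process, or the cached object came from a parent.
            self._obj = self._factory()
            self._pid = pid
        return self._obj


def make_sns_client():
    """Hypothetical factory; boto3 is imported only when a client is needed."""
    import boto3.session
    session = boto3.session.Session(region_name="us-east-1")  # placeholder region
    return session.client("sns", use_ssl=True)
```

Usage would be self.sns = PerProcessHolder(make_sns_client) in __init__() and self.sns.get() at the point of use. This only guarantees one client per PID; it is not known to avoid the crash itself.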

Cheers
Mike



On Wed, 10 Apr 2019 at 03:34, Daniel-Constantin Mierla <miconda@gmail.com> wrote:

Hello,

do you have tls module loaded in kamailio.cfg? If yes, can you try without it, just to see if there is a conflict between our module and the boto3 client in use of libssl, because libssl creates global contexts per application, not per library/object.

Cheers,
Daniel

On 10.04.19 01:26, Michael Loughrey wrote:
Hi All,

looking for some advice regarding the proper way to initialise an AWS API boto3 client object for sending SNS messages within a KEMI Python 2.7 routing script.

I'm sure process forking has a major impact on how this works - from kamailio.cfg

     fork=yes
     children=4

-  but would greatly appreciate some guidance.
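
To illustrate why forking matters here, a small standalone sketch (not Kamailio code, and assuming the script's __init__() runs before the children are forked): anything built before the fork is inherited by each child as a copy that still belongs, logically, to the parent. The dict below is a stand-in for a client object and the function name is made up:

```python
import os


def creator_pid_seen_by_child(state):
    """Fork once and return (creator_pid, child_pid) as observed in the child."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: 'state' was built before the fork, so it still names the parent.
        os.close(read_fd)
        report = "%d %d" % (state["creator_pid"], os.getpid())
        os.write(write_fd, report.encode("ascii"))
        os.close(write_fd)
        os._exit(0)  # leave immediately without running parent cleanup
    # Parent: read the child's report and reap it.
    os.close(write_fd)
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)
    creator, child = data.decode("ascii").split()
    return int(creator), int(child)


# Stand-in for an object created in __init__() before the children exist.
state = {"creator_pid": os.getpid()}
```

Calling creator_pid_seen_by_child(state) returns two different PIDs: the child sees an object it did not create.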

I have tried various methods to allocate a client object, self.sns_client = boto3.client('sns', region_name=MY_AWS_REGION):

  1. within __init__(), only once from module initialisation
  2. on demand within a function called by any client code, e.g. called from ksr_route_request(), each creating its own client object

and so far all have resulted in intermittent crashes within botocore/client.py, ultimately crashing thus:


Core was generated by `/usr/local/sbin/kamailio -P /usr/local/kamailio/run/kamailio.pid -f /usr/local/'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f8f9734f754 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
(gdb) bt
#0  0x00007f8f9734f754 in ?? () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#1  0x00007f8f9734f82e in X509_VERIFY_PARAM_free () from /usr/lib/x86_64-linux-gnu/libcrypto.so.1.1
#2  0x00007f8f97642f5c in SSL_free () from /usr/lib/x86_64-linux-gnu/libssl.so.1.1
#3  0x00007f8f86b09c64 in PySSL_dealloc.lto_priv.5 () at ./Modules/_ssl.c:1598
#4  0x00007f8f99cbb007 in insertdict_by_entry (mp=0x7f8f841f1e88, key='_sslobj', hash=<optimized out>, ep=<optimized out>, value=<optimized out>) at ../Objects/dictobject.c:519
#5  0x00007f8f99cbe2cf in insertdict (value=None, hash=1051385741686792393, key='_sslobj', mp=0x7f8f841f1e88) at ../Objects/dictobject.c:556
#6  dict_set_item_by_hash_or_entry (value=None, ep=0x0, hash=1051385741686792393, key='_sslobj', 
    op={'server_hostname': u'sns.us-east-1.amazonaws.com', '_connected': True, '_context': <SSLContext at remote 0x7f8f84232398>, 'server_side': False, '_makefile_refs': 0, '_closed': False, '_sslobj': None, 'do_handshake_on_connect': True, 'suppress_ragged_eofs': True}) at ../Objects/dictobject.c:795
#7  PyDict_SetItem (op=<optimized out>, key=<optimized out>, value=<optimized out>) at ../Objects/dictobject.c:848
#8  0x00007f8f99becec1 in _PyObject_GenericSetAttrWithDict (obj=<optimized out>, name='_sslobj', value=None, 
    dict={'server_hostname': u'sns.us-east-1.amazonaws.com', '_connected': True, '_context': <SSLContext at remote 0x7f8f84232398>, 'server_side': False, '_makefile_refs': 0, '_closed': False, '_sslobj': None, 'do_handshake_on_connect': True, 'suppress_ragged_eofs': True}) at ../Objects/object.c:1529
#9  0x00007f8f99bed437 in PyObject_SetAttr (
    v=<SSLSocket(server_hostname=u'sns.us-east-1.amazonaws.com', _connected=True, _context=<SSLContext at remote 0x7f8f84232398>, server_side=False, _makefile_refs=0, _closed=False, _sslobj=None, do_handshake_on_connect=True, suppress_ragged_eofs=True) at remote 0x7f8f841f62a8>, name=<optimized out>, value=None) at ../Objects/object.c:1247

Here's my setup

host : Debian9/stretch

python --version
Python 2.7.13

kamailio -v
version: kamailio 5.2.2 (x86_64/linux) 67f967-dirty
flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES, MEM_JOIN_FREE
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144 MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: 67f967 -dirty
compiled on 23:44:31 Apr  8 2019 with gcc 6.3.0


Cheers
Mike




_______________________________________________
Kamailio (SER) - Users Mailing List
sr-users@lists.kamailio.org
https://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users