### Description
We were running Kamailio 5.1.4 with the websocket module. Unfortunately, our clients don't support the WebSocket keepalive mechanism at all, so I used TCP keepalive instead, with the following parameters:
```
tcp_keepalive=yes
tcp_keepcnt=6
tcp_keepidle=60
tcp_keepintvl=10
```
and set the websocket keepalive mechanism to KEEPALIVE_MECHANISM_NONE:
`modparam("websocket", "keepalive_mechanism", 0)`
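For readers less familiar with these core parameters, here is a rough sketch of what they correspond to at the socket level on Linux. This is not Kamailio code: `enable_tcp_keepalive` and `keepalive_detect_seconds` are hypothetical helper names made up for illustration, and `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` are Linux-specific socket options.

```c
#define _DEFAULT_SOURCE /* expose the Linux TCP_KEEP* options under strict -std modes */
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of Kamailio): apply keepalive settings
 * to an already created TCP socket.
 * Returns 0 on success, -1 if any setsockopt() call fails. */
int enable_tcp_keepalive(int fd, int idle_s, int intvl_s, int cnt)
{
    int on = 1;

    assert(fd >= 0);
    /* turn keepalive probing on for this socket */
    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    /* idle seconds before the first probe (tcp_keepidle) */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_s, sizeof(idle_s)) < 0)
        return -1;
    /* seconds between probes (tcp_keepintvl) */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_s, sizeof(intvl_s)) < 0)
        return -1;
    /* unanswered probes before the connection is dropped (tcp_keepcnt) */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
        return -1;
    return 0;
}

/* Approximate number of seconds a silent peer can go unnoticed. */
int keepalive_detect_seconds(int idle_s, int intvl_s, int cnt)
{
    return idle_s + intvl_s * cnt;
}
```

With the values above (idle 60, interval 10, count 6) a dead peer should be noticed by the kernel after roughly two minutes. That only tears down the TCP side; the websocket module still has to notice that the connection is gone.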
During load testing and debugging, with 8k clients sending registrations, I found that shared memory was not freed after the connections were closed (see the `ws_connection_list_t *wsconn_used_list` variable in ws_conn.c).
### Possible Solutions
I decided to add a new keepalive mechanism that periodically checks the TCP connection behind each websocket connection:
```
enum
{
	KEEPALIVE_MECHANISM_NONE = 0,
	KEEPALIVE_MECHANISM_PING = 1,
	KEEPALIVE_MECHANISM_PONG = 2,
	KEEPALIVE_MECHANISM_TCP_CONN_CHECK = 3
};
```
and added this line to the config:
```
# Enable custom tcp-connection-health based keepalive mechanism (3)
# KEEPALIVE_MECHANISM_NONE = 0,
# KEEPALIVE_MECHANISM_PING = 1,
# KEEPALIVE_MECHANISM_PONG = 2,
# KEEPALIVE_MECHANISM_TCP_CONN_CHECK = 3
modparam("websocket", "keepalive_mechanism", 3)
```
I also implemented the mechanism in the `ws_keepalive` function:
```
void ws_keepalive(unsigned int ticks, void *param)
{
	int check_time = (int)time(NULL)
					 - cfg_get(websocket, ws_cfg, keepalive_timeout);
	ws_connection_t **list = NULL, **list_head = NULL;
	ws_connection_t *wsc = NULL;

	/* get an array of pointers to all ws connections */
	list_head = wsconn_get_list();
	if(!list_head)
		return;

	list = list_head;
	wsc = *list_head;
	while(wsc && wsc->last_used < check_time) {
		if(ws_keepalive_mechanism == KEEPALIVE_MECHANISM_TCP_CONN_CHECK) {
			struct tcp_connection *con = tcpconn_get(wsc->id, 0, 0, 0, 0);
			if(!con) {
				LM_INFO("tcp connection has been lost\n");
				wsc->state = WS_S_CLOSING;
			} else {
				/* release the reference taken by tcpconn_get() */
				tcpconn_put(con);
			}
		}

		if(wsc->state == WS_S_CLOSING || wsc->awaiting_pong) {
			LM_INFO("forcibly closing connection\n");
			wsconn_close_now(wsc);
		} else {
			int opcode = (ws_keepalive_mechanism == KEEPALIVE_MECHANISM_PING)
								 ? OPCODE_PING
								 : OPCODE_PONG;
			ping_pong(wsc, opcode);
		}

		wsc = *(++list);
	}

	wsconn_put_list(list_head);
}
```
I also changed the memory allocation in `wsconn_get_list` and `wsconn_put_list` from pkg to shm because, as it turned out during load testing, using `pkg_malloc` (per-process private memory) in these functions may cause failures under heavy load.
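To make the idea behind the TCP-connection check easier to follow outside the Kamailio tree, here is a minimal self-contained sketch of the same sweep. Everything in it (`mock_tcp_conn_t`, `mock_ws_conn_t`, `mock_tcp_get`, `mock_tcp_put`, `mock_sweep`) is a hypothetical stand-in and not Kamailio's real API. It assumes the list is ordered by `last_used`, oldest first, and that the lookup takes a reference that has to be released again, which as far as I understand is how `tcpconn_get()` behaves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins, NOT Kamailio's real structures. */
typedef struct { int id; int refcnt; } mock_tcp_conn_t;
typedef enum { MOCK_WS_OPEN, MOCK_WS_CLOSING } mock_ws_state_t;
typedef struct { int id; long last_used; mock_ws_state_t state; } mock_ws_conn_t;

/* Look up a TCP connection by id; on a hit, take a reference. */
mock_tcp_conn_t *mock_tcp_get(mock_tcp_conn_t *tbl, size_t n, int id)
{
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].id == id) {
            tbl[i].refcnt++;
            return &tbl[i];
        }
    }
    return NULL;
}

/* Release a reference taken by mock_tcp_get(). */
void mock_tcp_put(mock_tcp_conn_t *c)
{
    assert(c != NULL && c->refcnt > 0);
    c->refcnt--;
}

/* Walk the idle websocket connections (list sorted by last_used, oldest
 * first) and mark those whose TCP connection no longer exists.
 * Returns the number of connections marked as closing. */
size_t mock_sweep(mock_ws_conn_t *ws, size_t nws,
                  mock_tcp_conn_t *tcp, size_t ntcp, long check_time)
{
    size_t closed = 0;

    for (size_t i = 0; i < nws && ws[i].last_used < check_time; i++) {
        mock_tcp_conn_t *c = mock_tcp_get(tcp, ntcp, ws[i].id);
        if (c == NULL) {
            /* TCP side is gone: schedule the websocket side for cleanup */
            ws[i].state = MOCK_WS_CLOSING;
            closed++;
        } else {
            /* still alive: give the reference back so it cannot leak */
            mock_tcp_put(c);
        }
    }
    return closed;
}
```

The point of pairing every successful get with a put is that the sweep runs on a timer, so even one leaked reference per tick would keep connections (and their shared memory) alive forever.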
These modifications solved the problem. However, about a week ago we started switching to version 5.2.1 and found a lot of changes in the websocket module, so I ported my changes in this commit: https://github.com/korizza/kamailio/commit/b3e03d03574ff4ff076005bb8a01d7461... . Please take a look.
### Additional Information
Adding `ws_conn_put_id` in this commit https://github.com/kamailio/kamailio/commit/a975bca1702ea2f3db47f834f7e4da27... did not solve the problem of the ref counter increasing.
I added the TCP connection check mechanism to the websocket module. Can you test with the git master branch and see if all goes fine now?
Hi Daniel! I was on PTO, will look into it asap.
Closing as the commit was pushed a while ago. If there are issues discovered, open a new one on the tracker.
Closed #1892.