Hi Daniel,
Thank you for the prompt response. We are looking into migrating to 5.5.x shortly.
We did add some additional debugging and got an interesting result.
```
Dec 31 18:29:08.819765 eqx-sjc-ecv3 kamailio_edge[19096]: WARNING: websocket
[ws_frame.c:810]: ws_keepalive(): forcibly closing connection
grep "tcp id = 19132"
/var/log/kamailio/kamailio_edge.log
Dec 31 18:29:07.815876 eqx-sjc-ecv3
kamailio_edge[19096]: INFO: websocket [ws_conn.c:371]: wsconn_close_now():
wsconn_close_now: wsc = 0xa2ac94e0, tcp con = 0x9edb0e80, tcp id = 19132
Dec 31 18:29:08.819965 eqx-sjc-ecv3 kamailio_edge[19096]: INFO: websocket [ws_conn.c:371]:
wsconn_close_now(): wsconn_close_now: wsc = 0xa2ac94e0, tcp con = 0x00000000, tcp id =
19132
Dec 31 18:29:09.824694 eqx-sjc-ecv3 kamailio_edge[19096]: INFO: websocket [ws_conn.c:371]:
wsconn_close_now(): wsconn_close_now: wsc = 0xa2ac94e0, tcp con = 0x00000000, tcp id =
19132
Dec 31 18:29:10.833513 eqx-sjc-ecv3 kamailio_edge[19096]: INFO: websocket [ws_conn.c:371]:
wsconn_close_now(): wsconn_close_now: wsc = 0xa2ac94e0, tcp con = 0x00000000, tcp id =
19132
Dec 31 18:29:11.838090 eqx-sjc-ecv3 kamailio_edge[19096]: INFO: websocket [ws_conn.c:371]:
wsconn_close_now(): wsconn_close_now: wsc = 0xa2ac94e0, tcp con = 0x00000000, tcp id =
19132
Dec 31 18:29:12.841246 eqx-sjc-ecv3 kamailio_edge[19096]: INFO: websocket [ws_conn.c:371]:
wsconn_close_now(): wsconn_close_now: wsc = 0xa2ac94e0, tcp con = 0x00000000, tcp id =
19132
Dec 31 18:38:20.299126 eqx-sjc-ecv3 kamailio_edge[19218]: INFO: <core>
[core/tcp_main.c:1584]: _tcpconn_detach(): _tcpconn_detach: tcpconn=0x9edb0e80, tcp id =
19132
```
I did want to get your thoughts on the current implementation of the `ws_keepalive`
functionality.
```
if(wsc->state == WS_S_CLOSING || wsc->awaiting_pong) {
LM_WARN("forcibly closing connection\n");
wsconn_close_now(wsc);
```
When `wsconn_close_now` is called, the connection state is set to `S_CONN_BAD`.
On subsequent `wsconn_close_now` calls, `tcpconn_get` -> `_tcpconn_find` calls will
return 0 due to [this
line](https://github.com/dialpad/kamailio/blob/master/src/core/tcp_main.c#L…: `if
((id==c->id)&&(c->state!=S_CONN_BAD)) {`.
Ie. `_tcpconn_find` skips over connections in the `S_CONN_BAD` state.
Why is `wsconn_close_now` being called multiple times? It looks like `ws_keepalive`
isn't removing the websocket connection from the list of open connections. Instead,
it's [de-referencing an extra
time](https://github.com/kamailio/kamailio/blob/master/src/modules/websocke….
```
void ws_keepalive(unsigned int ticks, void *param)
{
...
while(list_head[i].id!=-1) {
//For each websocket connection, add a refcount
wsc = wsconn_get(list_head[i].id);
if(wsc && wsc->last_used < check_time) {
if(wsc->state == WS_S_CLOSING || wsc->awaiting_pong) {
LM_WARN("forcibly closing connection\n");
//Remove extra refcount?
wsconn_close_now(wsc);
} else if (ws_keepalive_mechanism == KEEPALIVE_MECHANISM_CONCHECK) {
tcp_connection_t *con = tcpconn_get(wsc->id, 0, 0, 0, 0);
if(con==NULL) {
LM_INFO("tcp connection has been lost\n");
wsc->state = WS_S_CLOSING;
} else {
tcpconn_put(con);
}
} else {
int opcode = (ws_keepalive_mechanism == KEEPALIVE_MECHANISM_PING)
? OPCODE_PING
: OPCODE_PONG;
ping_pong(wsc, opcode);
}
}
if(wsc) {
//Remove refcount
wsconn_put_id(list_head[i].id);
}
i++;
}
wsconn_put_list_ids(list_head);
}
```
What about modifying the keepalive to do something like this instead?
```
for(i = 0; list_head[i].id!=-1; i++) {
wsc = wsconn_get(list_head[i].id);
if(wsc->state == WS_S_CLOSING) {
/* Skip closing connections */
continue;
} else if (wsc->awaiting_pong) {
LM_WARN("forcibly closing connection, no pong received\n");
wsconn_close_now(wsc);
continue;
} else if ...
```
On skipping `wsconn_close_now` in the `WS_S_CLOSING` case, should it really be the
responsibility of the keepalive to close the connection in that case?
--
Reply to this email directly or view it on GitHub:
https://github.com/kamailio/kamailio/issues/2990#issuecomment-1006088547
You are receiving this because you are subscribed to this thread.
Message ID: <kamailio/kamailio/issues/2990/1006088547(a)github.com>