<!-- Kamailio Project uses GitHub Issues only for bugs in the code or feature requests. Please use this template only for bug reports.
If you have questions about using Kamailio or related to its configuration file, ask on sr-users mailing list:
* http://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-users
If you have questions about developing extensions to Kamailio or its existing C code, ask on sr-dev mailing list:
* http://lists.kamailio.org/cgi-bin/mailman/listinfo/sr-dev
Please try to fill this template as much as possible for any issue. It helps the developers to troubleshoot the issue.
If there is no content to be filled in a section, the entire section can be removed.
You can delete the comments from the template sections when filling.
You can delete next line and everything above before submitting (it is a comment). -->
### Description
The keepalive (ka*) features of usrloc does not work for TLS connections (and probably TCP). If a TLS connection is registered, the keepalive code attempts to send OPTIONS packets to the endpoint using UDP. Additionally, a TCP/TLS connection is not re-used and the code attempts to create a new one each request.
### Troubleshooting
#### Reproduction
Enable keepalive and register a phone using TLS
#### Log Messages
Kamailio Host: 10.15.5.116 Phone: 10.15.5.117
Phone is registered and seen in the registry
``` kamctl ul show { "jsonrpc": "2.0", "result": { "Domains": [{ "Domain": { "Domain": "location", "Size": 1024, "AoRs": [{ "Info": { "AoR": "anonymous", "HashID": 3819435025, "Contacts": [{ "Contact": { "Address": "sips:anonymous@10.15.5.117:5061", "Expires": 3531, "Q": -1, "Call-ID": "d06e987694afb559", "CSeq": 521595368, "User-Agent": "testclient/1.0", "Received": "[not set]", "Path": "[not set]", "State": "CS_NEW", "Flags": 0, "CFlags": 0, "Socket": "tls:10.15.5.116:5061", "Methods": 7903, "Ruid": "uloc-62c87f5c-1b6b-2", "Instance": "urn:uuid:00000000-0000-1000-8000-0010492e1879", "Reg-Id": 0, "Server-Id": 0, "Tcpconn-Id": 17, "Keepalive": 0, "Last-Keepalive": 1657310753, "KA-Roundtrip": 0, "Last-Modified": 1657310753 } }] } } ], "Stats": { "Records": 1, "Max-Slots": 1 } } }] }, "id": 14871 ```
When the keepalive code runs it shows the following:
``` Jul 8 14:27:52 localhost kamailio[6976]: 33(7012) DEBUG: usrloc [ul_keepalive.c:112]: ul_ka_urecord(): keepalive for aor: anonymous Jul 8 14:27:52 localhost kamailio[6976]: 33(7012) DEBUG: usrloc [ul_keepalive.c:235]: ul_ka_urecord(): keepalive request (len: 295) [[ Jul 8 14:27:52 localhost kamailio[6976]: OPTIONS sips:anonymous@10.15.5.117:5061 SIP/2.0 Jul 8 14:27:52 localhost kamailio[6976]: Via: SIP/2.0/UDP 10.15.5.116:5061;branch=z9hG4bKx.20.1.0 Jul 8 14:27:52 localhost kamailio[6976]: From: sips:10.15.5.116:5061;tag=uloc-62c87f5c-1b71-1-e3a7f411-62c88538-cc68d-14.1 Jul 8 14:27:52 localhost kamailio[6976]: To: sip:anonymous@kamailio.org Jul 8 14:27:52 localhost kamailio[6976]: Call-ID: ksrulka-20.1 Jul 8 14:27:52 localhost kamailio[6976]: CSeq: 80 OPTIONS Jul 8 14:27:52 localhost kamailio[6976]: Content-Length: 0 Jul 8 14:27:52 localhost kamailio[6976]: #015 Jul 8 14:27:52 localhost kamailio[6976]: ]] Jul 8 14:27:52 localhost kamailio[6976]: 33(7012) ERROR: <core> [core/udp_server.c:597]: udp_send(): sendto(sock, buf: 0x7fffe728ece0, len: 295, 0, dst: (10.15.5.117:5061), tolen: 16) - err: Bad file descriptor (9) ```
### Possible Solutions
There are 2 problems here.
First, the address record does not have a transport defined.
https://github.com/kamailio/kamailio/blob/a466d0b73a66b18419555eca437136f1bc... at this point, `sdst.s` is "sips:anonymous@10.15.5.117:5061" so parse_uri returns a proto of `UDP`. If you override the value to "sips:anonymous@10.15.5.117:5061;transport=tls" then the code works as expected. The protocol is detected as TLS and the OPTIONS message generated later has the correct `Via` header.
I'm not sure if the problem is that the address record is missing the transport, or if we need to infer the transport from the socket connection.
The second issue is that the keepalive code doesn't re-use TCP/TLS connections. In the ul_ka_send() function ) all of the socket IDs hardcoded to '0' https://github.com/kamailio/kamailio/blob/a466d0b73a66b18419555eca437136f1bc...
This is easily fixable by adding in ``` if(uc->tcpconn_id) { idst.id = uc->tcpconn_id; } ```
to `ul_ka_urecord()` and removing all of the `kadst->id=0;` lines from the `ul_ka_send()` function. This works for in-memory during testing. I believe the tcpconn_id field needs to be set in ul_ka_db_records if you're returning it from the DB. I don't have an environment setup to test that.
With those two changes made (hardcoding the transport and fixing the connection id) the keepalives are sent and work as expected:
``` 33(19413) DEBUG: usrloc [ul_keepalive.c:112]: ul_ka_urecord(): keepalive for aor: anonymous 33(19413) DEBUG: usrloc [ul_keepalive.c:169]: ul_ka_urecord(): MANUALLY OVERRIDING SDST: sips:anonymous@10.15.5.117:5061;transport=tls 33(19413) DEBUG: usrloc [ul_keepalive.c:247]: ul_ka_urecord(): keepalive request (len: 288) [[ OPTIONS sips:anonymous@10.15.5.117:5061 SIP/2.0 Via: SIP/2.0/TLS 10.15.5.116:5061;branch=z9hG4bKx.33.1.0 From: sips:10.15.5.116:5061;tag=uloc-62c8af83-4bd9-1-e3a7f411-62c8b37f-f2a22-21.1 To: sip:anonymous@kamailio.org Call-ID: ksrulka-33.1 CSeq: 80 OPTIONS Content-Length: 0
]] 33(19413) DEBUG: <core> [core/tcp_main.c:1644]: _tcpconn_find(): found connection by id: 1 33(19413) DEBUG: <core> [core/tcp_main.c:2539]: tcpconn_send_put(): found fd in cache (4, 0x7fbaef193db8, 1) 33(19413) DEBUG: <core> [core/tcp_main.c:2761]: tcpconn_do_send(): sending... 33(19413) DEBUG: <core> [core/tcp_main.c:2794]: tcpconn_do_send(): after real write: c= 0x7fbaef193db8 n=317 fd=4 ```
I assume that we'd want to re-use a TCP/TLS connection but maybe that needs to be an option (ka_reuse_tcp) if other people are expecting the current behavior of creating a new connection for each new keepalive request.
### Additional Information
``` version: kamailio 5.7.0-dev0 (x86_64/linux) a466d0 flags: USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLOCKLIST, HAVE_RESOLV_RES, TLS_PTHREAD_MUTEX_SHARED ADAPTIVE_WAIT_LOOPS 1024, MAX_RECV_BUFFER_SIZE 262144, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: a466d0 compiled on 11:27:35 Jul 8 2022 with gcc 8.3.0 ```
* **Operating System**:
Debian Buster
According [RFC3261](https://datatracker.ietf.org/doc/html/rfc3261#section-19.1.2)
The default transport is scheme dependent. For sip:, it is UDP.
For sips:, it is TCP.
According to the OPTIONS message above, Kamailio tries to use UDP transport for "sips" schema. Another approach, as a workaround, update the Contac header and add "transport=tcp" when using "sips" schema.
According to the OPTIONS message above, Kamailio tries to use UDP transport for "sips" schema.
I started debugging parse_uri() first and couldn't see any place where the proto was set to TLS (or TCP) when the schema was sips. Given how widely used parse_uri() is, I felt that change might be very disruptive without the feedback of more experienced people.
The only place in parse_uri() the proto is changed is based on the transport.
https://github.com/kamailio/kamailio/blob/a466d0b73a66b18419555eca437136f1bc...
Ultimately, it seems like parse_uri() is wrong based on RFC3261. Detecting 'sips' and setting proto to TLS would work for TLS connections. This bug would still exist for TCP connections if the transport is missing.
Another approach, as a workaround, update the Contac header and add "transport=tcp" when using "sips" schema.
I agree, I can work around part of the problem in the config. Not setting the tcpconn_id is still a problem for reusing the connection.
The `sips` schema is quite a mess, specs being sometimes hard to follow, i think there were also suggested to obsolete/remove it. In short, unlike in the web world where https means over tls, in sip, the sips means that the message is sent over a safe/secure medium, which can be also udp over a private link.
If there is a change on the current code related to handing sips, probably it should be made controlled by a parameter, to have a way of keeping current behaviour, it may break things, so a way to revert needs to be around.
UDP is not a valid transport for SIPS based on [RFC 3261](https://datatracker.ietf.org/doc/html/rfc3261#section-26.2.2) but it doesn't mean that it's TCP.
Note that in the SIPS URI scheme, transport is independent of TLS,
and thus "sips:alice@atlanta.com;transport=tcp" and "sips:alice@atlanta.com;transport=sctp" are both valid (although note that UDP is not a valid transport for SIPS). The use of "transport=tls" has consequently been deprecated, partly because it was specific to a single hop of the request. This is a change since [RFC 2543](https://datatracker.ietf.org/doc/html/rfc2543).
I think we have a few options to get the transport.
We could get the transport by calling fix_nated_register() to set `received` but that seems silly to have to include the nathelper module. On my test, that's set to `sip:10.15.5.117:36557;transport=tls` (which also doesn't seem possible to have the scheme set to "sip" when the transport is TLS.). Probably doesn't work for other transports. This seems ugly.
We could determine the transport based on the scheme and the tcpconn-id. - If scheme=sip and tcpconn_id is 0, then transport is UDP - If scheme=sip and tcpconn_id is not 0, then transport is TCP - If scheme=sips and tcpconn_id is not 0, then transport is TLS
Won't work for SCTP. Not sure about WSS.
We could store the transport on the ucontact struct. Probably the cleanest solution to have an explicit storage of the transport but some of the same problems might be pushed upstream to determine the correct transport to save in the first place.
Any thoughts on a go forward direction?
Resolved by https://github.com/kamailio/kamailio/pull/3219
Closed #3178 as completed.