### Description
The force_socket parameter is not always honoured when set. If I understand the module documentation correctly, setting the force_socket parameter should force all nathelper UDP traffic to use that socket definition.
This issue stems from another issue first described [here](http://sip-router.1086192.n5.nabble.com/NatHelper-ignoring-force-socket-modu...) in 2015, and it is still present in the current stable Kamailio v5.0.4.
We have 3 registrars running in "memory only" mode, using dmq_usrloc to replicate registrations to the remaining two nodes. On the remaining two nodes there is no socket parameter set for the AOR, but nathelper still wants to ping these AORs. It is on these two systems that the issue shows up. I will open another ticket for this scenario and try to reference it here, as the two are related.
Each registrar has two interfaces, one is our "admin" lan, the other is our "voice" lan. The default route is set on these hosts and is a gateway on our "admin" lan. See issue #1297
When the registrars that receive the replicated AOR attempt to ping the endpoint (ideally they should not ping it at all), nathelper seems to decide that the best interface to send from is the "admin" lan, even though force_socket is defined.
I would have expected the message to be sent via the socket defined in the "force_socket" parameter.
### Troubleshooting

Module definitions

registrar
```
modparam("registrar", "method_filtering", 1)
modparam("registrar", "case_sensitive", 1)
modparam("registrar", "append_branches", 0)
modparam("registrar", "use_path", 1)
modparam("registrar", "path_mode", 0)
modparam("registrar", "path_use_received", 1)
modparam("registrar", "path_check_local", 1)
modparam("registrar", "max_contacts", 1)
```

usrloc
```
modparam("usrloc", "db_mode", 0)
modparam("usrloc", "use_domain", 1)
modparam("usrloc", "timer_interval", 60)
modparam("usrloc", "timer_procs", 4)
modparam("usrloc", "nat_bflag", 6)
```

nathelper
```
modparam("nathelper", "natping_interval", 20)
modparam("nathelper", "natping_processes", 4)
modparam("nathelper", "ping_nated_only", 0)
modparam("nathelper", "sipping_from", "sip:keepalive@example.com")
modparam("nathelper", "sipping_method", "OPTIONS")
modparam("nathelper", "sipping_bflag", 6)
modparam("nathelper", "force_socket", "10.7.0.189:5060")
modparam("nathelper", "udpping_from_path", 1)
```

Kamailio is listening on the same socket that is defined for the "force_socket" parameter above
```
listen=udp:10.6.0.189:5060
listen=udp:10.7.0.189:5060
listen=tcp:10.6.0.189:80
```

dmq
```
modparam("dmq", "server_address", DMQ_ADDRESS)
modparam("dmq", "notification_address", DMQ_NOTIFY_ADDRESS)
modparam("dmq", "multi_notify", 1)
modparam("dmq", "num_workers", 4)
```

dmq_usrloc
```
modparam("dmq_usrloc", "enable", 1)
```

#### Reproduction
Using the above settings, register a user. Once the registration has been replicated to the registrars that did not service the request, the OPTIONS pings sent from those nodes exhibit this issue.
Example AOR where the registration was serviced (options pings should come from this host, and do, and flow as expected)
```
{
  "jsonrpc": "2.0",
  "result": {
    "Domain": "location",
    "Size": 1024,
    "AoRs": [{
      "Info": {
        "AoR": "example_user@example.com",
        "HashID": -1389656423,
        "Contacts": [{
          "Contact": {
            "Address": "sip:example_user@212.2.172.228:39808;rinstance=ed8aa63e90f53e97;transport=UDP",
            "Expires": 66,
            "Q": -1,
            "Call-ID": "wE-GD4GzkrtuDAJUBJf1Jg..",
            "CSeq": 58,
            "User-Agent": "Z 3.15.40006 rv2.8.20",
            "Received": "sip:212.2.172.228:39808",
            "Path": "sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808",
            "State": "CS_NEW",
            "Flags": 0,
            "CFlags": 64,
            "Socket": "udp:10.7.0.190:5060",
            "Methods": -1,
            "Ruid": "uloc-2-59fa1f9d-714-17",
            "Instance": "[not set]",
            "Reg-Id": 0,
            "Last-Keepalive": 1509615136,
            "Last-Modified": 1509615136
          }
        }]
      }
    }],
    "Stats": {
      "Records": 1,
      "Max-Slots": 1
    }
  },
  "id": 4836
}
```
Example AOR of the above registration on a registrar that was replicated to by dmq/dmq_usrloc. Ideally these AORs should not be pinged at all, but currently they are, and when they are pinged the socket defined in the "force_socket" parameter is not used.
```
{
  "jsonrpc": "2.0",
  "result": {
    "Domain": "location",
    "Size": 1024,
    "AoRs": [{
      "Info": {
        "AoR": "example_user@example.com",
        "HashID": -1389656423,
        "Contacts": [{
          "Contact": {
            "Address": "sip:example_user@212.2.172.228:39808;rinstance=ed8aa63e90f53e97;transport=UDP",
            "Expires": 69,
            "Q": -1,
            "Call-ID": "wE-GD4GzkrtuDAJUBJf1Jg..",
            "CSeq": 91,
            "User-Agent": "Z 3.15.40006 rv2.8.20",
            "Received": "sip:212.2.172.228:39808",
            "Path": "sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808",
            "State": "CS_NEW",
            "Flags": 2,
            "CFlags": 64,
            "Socket": "[not set]",
            "Methods": -1,
            "Ruid": "uloc-2-59fa1f9d-714-17",
            "Instance": "[not set]",
            "Reg-Id": 0,
            "Last-Keepalive": 1509617664,
            "Last-Modified": 1509617664
          }
        }]
      }
    }],
    "Stats": {
      "Records": 1,
      "Max-Slots": 1
    }
  },
  "id": 4571
}
```
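To find the affected contacts on a replicated node without reading the dump by eye, something like the following could help. It is only a rough sketch: it assumes the JSON has the same shape as the two dumps above, the sample is trimmed to the fields it needs, and `contacts_without_socket` is a made-up helper name, not part of Kamailio.

```python
import json

# Trimmed sample with the same shape as the dumps above (values copied from this issue).
SAMPLE = """
{"result": {"Domain": "location", "AoRs": [{"Info": {
  "AoR": "example_user@example.com",
  "Contacts": [{"Contact": {
    "Address": "sip:example_user@212.2.172.228:39808;rinstance=ed8aa63e90f53e97;transport=UDP",
    "Path": "sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808",
    "Socket": "[not set]"
  }}]
}}]}}
"""


def contacts_without_socket(dump):
    """Return (AoR, Address) pairs for contacts whose Socket is "[not set]"."""
    missing = []
    for aor in dump["result"].get("AoRs", []):
        info = aor["Info"]
        for entry in info.get("Contacts", []):
            contact = entry["Contact"]
            # A missing key is treated the same as the "[not set]" placeholder.
            if contact.get("Socket", "[not set]") == "[not set]":
                missing.append((info["AoR"], contact["Address"]))
    return missing


if __name__ == "__main__":
    for aor, address in contacts_without_socket(json.loads(SAMPLE)):
        print(aor, address)
```

Contacts reported by this helper are the ones for which the pinging node has no stored socket and therefore has to pick a sending interface some other way.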
#### Log Messages
There are no apparent error messages in the logs relating to this that I can see.
#### SIP Traffic
Here you can clearly see that the OPTIONS message is being sent over the "10.6.0.189" interface when the module's "force_socket" parameter is set to "10.7.0.189:5060"
```
U 2017/11/02 08:37:59.426608 10.6.0.189:5060 -> 10.7.0.186:5062
OPTIONS sip:example_user@212.2.172.228:39808;rinstance=ed8aa63e90f53e97;transport=UDP SIP/2.0.
Via: SIP/2.0/UDP 10.6.0.189:5060;branch=z9hG4bK8416926.
Route: sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808.
From: sip:keepalive@example.com;tag=uloc-2-59fa1f9d-714-17-9968b2da-13812ff4.
To: sip:example_user@212.2.172.228:39808;rinstance=ed8aa63e90f53e97;transport=UDP.
Call-ID: c0cec5f7-555ac383-68bb313@10.6.0.189.
CSeq: 1 OPTIONS.
Content-Length: 0.
.

U 2017/11/02 08:38:19.431937 10.6.0.189:5060 -> 10.7.0.186:5062
OPTIONS sip:example_user@212.2.172.228:39808;rinstance=ed8aa63e90f53e97;transport=UDP SIP/2.0.
Via: SIP/2.0/UDP 10.6.0.189:5060;branch=z9hG4bK8345318.
Route: sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808.
From: sip:keepalive@example.com;tag=uloc-2-59fa1f9d-714-17-9968b2da-23812ff4.
To: sip:example_user@212.2.172.228:39808;rinstance=ed8aa63e90f53e97;transport=UDP.
Call-ID: c0cec5f7-655ac383-a9bb313@10.6.0.189.
CSeq: 1 OPTIONS.
Content-Length: 0.
.
```
### Possible Solutions
Unknown
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
version: kamailio 5.0.4 (x86_64/linux)
flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, USE_RAW_SOCKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
id: unknown
compiled on 10:57:22 Oct 26 2017 with gcc 4.8.5
```
* **Operating System**:
```
CentOS Linux release 7.4.1708 (Core)
Linux localhost 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
```
Can you try setting core parameter:
```
udp4_raw=0
```
Hi, I just set udp4_raw=0 on all 3 registrars and removed udpping_from_path from the settings, but the same behaviour persists.
registrar that processed the request (ping is successful)
```
U 2017/11/03 13:28:33.770754 10.7.0.190:5060 -> 10.7.0.186:5062
OPTIONS sip:example_user@212.2.172.228:39808;rinstance=ee53f90ba8a5d171;transport=UDP SIP/2.0.
Via: SIP/2.0/UDP 10.7.0.190:5060;branch=z9hG4bK2645058.
Route: sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808.
From: sip:keepalive@example.com;tag=uloc-2-59fc6eab-1ce0-1-9968b2da-3f810117.
To: sip:example_user@212.2.172.228:39808;rinstance=ee53f90ba8a5d171;transport=UDP.
Call-ID: 538b85e1-b0cecc34-ba1f4f1@10.7.0.190.
CSeq: 1 OPTIONS.
Content-Length: 0.
.

U 2017/11/03 13:28:33.810519 10.7.0.186:5062 -> 10.7.0.190:5060
SIP/2.0 200 OK.
Via: SIP/2.0/UDP 10.7.0.190:5060;rport=5060;branch=z9hG4bK2645058.
Record-Route: sip:194.213.29.33:5062;r2=on;lr;ftag=uloc-2-59fc6eab-1ce0-1-9968b2da-3f810117.
Record-Route: sip:10.7.0.186:5062;r2=on;lr;ftag=uloc-2-59fc6eab-1ce0-1-9968b2da-3f810117.
Contact: sip:192.168.1.64:39808.
To: sip:example_user@212.2.172.228:39808;rinstance=ee53f90ba8a5d171;transport=UDP;tag=134de75f.
From: sip:keepalive@example.com;tag=uloc-2-59fc6eab-1ce0-1-9968b2da-3f810117.
Call-ID: 538b85e1-b0cecc34-ba1f4f1@10.7.0.190.
CSeq: 1 OPTIONS.
Accept: application/sdp, application/sdp.
Accept-Language: en.
Allow: INVITE, ACK, CANCEL, BYE, NOTIFY, REFER, MESSAGE, OPTIONS, INFO, SUBSCRIBE.
Supported: replaces, norefersub, extended-refer, timer, outbound, path, X-cisco-serviceuri.
User-Agent: Z 3.15.40006 rv2.8.20.
Allow-Events: presence, kpml, talk.
Content-Length: 0.
.
```
1st registrar replicated to (tries to ping, but wrong interface)
```
U 2017/11/03 13:28:49.675297 10.6.0.189:5060 -> 10.7.0.186:5062
OPTIONS sip:example_user@212.2.172.228:39808;rinstance=ee53f90ba8a5d171;transport=UDP SIP/2.0.
Via: SIP/2.0/UDP 10.6.0.189:5060;branch=z9hG4bK2611596.
Route: sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808.
From: sip:keepalive@example.com;tag=uloc-2-59fc6eab-1ce0-1-9968b2da-d35cdfe1.
To: sip:example_user@212.2.172.228:39808;rinstance=ee53f90ba8a5d171;transport=UDP.
Call-ID: dd9e8af5-a56efaa6-913d294@10.6.0.189.
CSeq: 1 OPTIONS.
Content-Length: 0.
.
```
2nd registrar replicated to (tries to ping, but wrong interface)
```
U 2017/11/03 13:27:37.204775 10.6.0.191:5060 -> 10.7.0.186:5062
OPTIONS sip:example_user@212.2.172.228:39808;rinstance=ee53f90ba8a5d171;transport=UDP SIP/2.0.
Via: SIP/2.0/UDP 10.6.0.191:5060;branch=z9hG4bK4520144.
Route: sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808.
From: sip:keepalive@example.com;tag=uloc-2-59fc6eab-1ce0-1-9968b2da-1052c4a3.
To: sip:example_user@212.2.172.228:39808;rinstance=ee53f90ba8a5d171;transport=UDP.
Call-ID: ca2e8ec2-db7eb9a-63a17d6@10.6.0.191.
CSeq: 1 OPTIONS.
Content-Length: 0.
.
```
Thinking about this issue and #1297 a little, I have a question: when the registrar module has "use_path" enabled, as we do here, is nathelper aware of this? If it is not, then I'm guessing it will try to resolve the best interface to send over based on the "received" parameter, and in our case the OS will tell it to use the interface that has the default gateway. However, if nathelper is aware that we need to use the Path URI as the next hop, then it does not need to decide which interface to use for the received parameter; it just needs to decide which interface to use for the destination in the Path URI. Could this be what is happening here?
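For anyone wanting to see what "the OS will tell it which interface to use" means in practice, here is a minimal sketch. It is not nathelper code, and both function names are invented for illustration. It only relies on the fact that connecting a UDP socket sends no packet but makes the kernel pick a local address from the routing table, whereas binding first pins the local address.

```python
import socket


def os_chosen_source(dest_ip, dest_port=5060):
    """Ask the OS which local IP it would use to reach dest_ip (no packet is sent)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((dest_ip, dest_port))
        return s.getsockname()[0]


def forced_source(local_ip, dest_ip, dest_port=5060):
    """Bind to local_ip first, so the local address is the one we chose ourselves."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind((local_ip, 0))  # port 0: let the OS pick any free port
        s.connect((dest_ip, dest_port))
        return s.getsockname()[0]


if __name__ == "__main__":
    # Loopback is used here so the sketch runs anywhere; on the registrars one
    # would pass the endpoint's received address or the Path hop instead.
    print(os_chosen_source("127.0.0.1"))
    print(forced_source("127.0.0.1", "127.0.0.1"))
```

On a host like ours, running `os_chosen_source` against a public address and against the Path hop would show whether the routing table alone sends the two destinations out of different interfaces.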
Afaik, when sending SIP OPTIONS keepalive, nathelper is always using Path if it is set in the usrloc record. The use_path for registrar is only for lookup("location").
OK, thanks for the clarification. However, if the Path header does exist in the usrloc record, why does nathelper need to resolve the best interface based on the received parameter? Should it not resolve the interface based on the first hop in the Path header, since that is where it sends directly? It is then up to that next hop to decide how best to route the message.
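The idea in this question can be sketched roughly as follows. This is an illustration of the reasoning only, not how nathelper is implemented: `first_hop` and `socket_for` are hypothetical helpers, the listen sockets are the two UDP ones from the config above, and the /24 prefix length is an assumption because the netmasks are not stated in this issue.

```python
import ipaddress
import re

LISTEN = ["udp:10.6.0.189:5060", "udp:10.7.0.189:5060"]  # UDP listen sockets from this issue
ASSUMED_PREFIX = 24  # assumption: the actual netmasks are not given in the issue


def first_hop(path):
    """Extract (host, port) of the first URI in a stored Path value."""
    first = path.split(",")[0].strip().strip("<>")
    match = re.match(r"sips?:(?:[^@;]+@)?([^:;>]+)(?::(\d+))?", first)
    if not match:
        raise ValueError("not a SIP URI: %r" % first)
    return match.group(1), int(match.group(2) or 5060)


def socket_for(dest_ip, listen=LISTEN, prefix=ASSUMED_PREFIX):
    """Return the listen socket on the same (assumed) subnet as dest_ip, or None."""
    dest = ipaddress.ip_address(dest_ip)
    for spec in listen:
        _proto, ip, _port = spec.split(":")
        if dest in ipaddress.ip_network("%s/%d" % (ip, prefix), strict=False):
            return spec
    return None


if __name__ == "__main__":
    host, port = first_hop("sip:10.7.0.186:5062;lr;received=sip:212.2.172.228:39808")
    print(host, port, socket_for(host))      # the Path hop is on the "voice" lan
    print(socket_for("212.2.172.228"))       # the received address matches neither lan
```

Under these assumptions the Path hop maps to the 10.7.0.189 socket, while the received address matches no local subnet and would fall through to whatever the default route dictates.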
I think this once took me by surprise as well. Can you try putting just the IP address in the force_socket modparam, without the port number?
See https://github.com/kamailio/kamailio/blob/master/src/modules/nathelper/nathe...
@tverlaan - good spot in the code that is using the parameter value as host/ip value. I am going to merge #1394, but it needs to be fixed better in order to detect and use the port.
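To illustrate the behaviour described in the comment above (the whole parameter value being used as the host/IP), here is a small sketch. It is not the nathelper C code: `parse_force_socket`, `naive_lookup` and `parsed_lookup` are invented names, the default port of 5060 in the naive path is an assumption, and only IPv4 or hostname values are handled.

```python
LISTEN = {("10.6.0.189", 5060), ("10.7.0.189", 5060)}  # UDP listen sockets from this issue


def parse_force_socket(value, default_port=5060):
    """Split 'host[:port]' into (host, port). IPv4 or hostname only."""
    host, sep, port = value.rpartition(":")
    if not sep:  # no colon at all, so the value is just a host
        return value, default_port
    return host, int(port)


def naive_lookup(value, default_port=5060):
    """Treat the whole string as the host, as the comment above describes."""
    return (value, default_port) in LISTEN


def parsed_lookup(value):
    """Separate host and port before looking for a matching listen socket."""
    return parse_force_socket(value) in LISTEN


if __name__ == "__main__":
    for value in ("10.7.0.189:5060", "10.7.0.189"):
        print(value, naive_lookup(value), parsed_lookup(value))
```

With the naive lookup, "10.7.0.189:5060" matches no listen socket while the bare "10.7.0.189" does, which is consistent with the suggestion to drop the port from the modparam.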
Closed #1298 via #1394.