### Description

I noticed a test failure when validating Kamailio 5.4.x. After looking at the SIP traces, Kamailio does not forward the first ACK. It only forwards the ACK after receiving a 200 OK retransmission.
``` 192.0.2.12:5062 192.0.2.24:5060 ──────────┬───────── ──────────┬───────── │ │ │ │ │ │ 200 coco (SDP │ │ <<<──────────────── 200 coco (SDP) │ <───────────────────────────────────────────── │ ACK │ ─────────────────────────────────────────────> │ ACK │ ───────────────────────────────────────────>>> │ │ │ 200 coco (SDP │ │ <<<──────────────── 200 coco (SDP) │ <───────────────────────────────────────────── │ │ │ ACK │ ─────────────────────────────────────────────> │ │ │ ACK │ │ ──────────────────> ```
### Troubleshooting

I proceeded by isolation and concluded that this commit triggers the behavior. Simply reverting this commit fixes the problem.

```
commit 28049aafc8dd06c160ce5e7b8d5e4fc728441b0c
Author: Semen Darienko semen.darienko@wildix.com
Date:   Sun May 3 12:26:45 2020 +0200

    core: dns - use all NAPTR records

    - enable using of all NAPTR records instead of the first one ordered by priority
    - GH #2290
```
I have not looked at the code change or anything config-specific yet. I will try to find some time to debug.
Thanks for the report. Are you testing 5.4.2? Do you have NAPTR records configured in your setup? I have not noticed anything like this on my setups.
Hi Henning,
I was testing 5.4.2, but I ended up going back as far as May 3.
While trying to reproduce, I just noticed that it now takes 300-400ms to relay the ACK (not enough to trigger a retransmission anymore).
- The first INVITE and the BYE are relayed in 1ms.
- When I roll back the commit, the ACK is also relayed in 1ms.
- The same problem is found on another GitLab runner test box, and the delay seems to be worse.
- No NAPTR resolution should be needed since the RURI is an IP.
```
2020/10/30 03:02:48.442251 192.0.2.23:5060 -> 192.0.2.24:5060
ACK sip:192.0.2.2:5060 SIP/2.0
Record-Route: sip:192.0.2.23;lr
Via: SIP/2.0/UDP 192.0.2.23:5060;branch=z9hG4bKa37.abcde5432ec8393b4ab82cf54fa2d04a.0
Via: SIP/2.0/UDP 192.0.2.7:5777;received=192.0.2.7;rport=5777;branch=z9hG4bKPj0bd8b830-4a10-467a-9984-3a8dceb15141
Max-Forwards: 69
From: sip:4468246@fakecustomer.xyz;tag=5f1e815a-3786-499a-ba48-6b1dfde0544d
To: sip:+11234567890@proxy1;tag=778aaba7-ab0e-4a60-9596-65d28d86a43e
Call-ID: a7319394-6352-4465-b673-d49ea8de8190
CSeq: 10636 ACK
Route: sip:192.0.2.24;lr
Content-Length: 0
```
It seems to be reproducible: it always adds a significant delay, sometimes triggering a 200 OK retransmission. I am not sure whether config params make it harder to reproduce. I will not be able to debug and confirm in more detail until next Wednesday.
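For context on why a delayed ACK only sometimes triggers a retransmission: RFC 3261 (section 13.3.1.4) has the UAS resend the 2xx at an interval that starts at T1, doubles each time, and is capped at T2, giving up after 64*T1. The sketch below computes that schedule. The function name is hypothetical, and the defaults T1 = 500 ms and T2 = 4000 ms are the RFC defaults, assumed here rather than taken from the test setup.

```c
#include <stddef.h>

/* Fill `out` with the offsets (in ms, relative to the first 200 OK) at which
 * a UAS following RFC 3261 section 13.3.1.4 would retransmit the 2xx if no
 * ACK arrives. The interval starts at t1, doubles, and is capped at t2;
 * retransmissions stop once 64*t1 has elapsed.
 * Hypothetical helper for illustration, not Kamailio code. */
static size_t ok_retrans_offsets_ms(unsigned t1, unsigned t2,
                                    unsigned *out, size_t max)
{
    size_t n = 0;
    unsigned interval = t1;      /* delay before the next retransmission */
    unsigned offset = t1;        /* absolute time of the next retransmission */
    unsigned limit = 64 * t1;    /* UAS gives up at this point */

    while (offset < limit && n < max) {
        out[n++] = offset;
        interval = (interval * 2 > t2) ? t2 : interval * 2;
        offset += interval;
    }
    return n;
}
```

With the assumed defaults, the first retransmission fires 500 ms after the initial 200 OK, so an ACK that reaches the UAS 300-400 ms late stays under that threshold, while a slower box can cross it.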
Interesting, I have not seen this in my sngrep observations. Looking forward to more details.
One example with +50ms. I need to figure out why there is not a fluctuating delay; it seems like we are doing more resolution.
```
2020-11-04T17:15:46.613616287Z 1(175) INFO: <script>: [MAIN][ACK][301bf833-a402-4f89-9d10-f3f3bd7822fb] from[4468246] to[+11234567890] ruri[sip:192.0.2.2:5060]
2020-11-04T17:15:46.613918650Z 1(175) INFO: <core> [core/dns_cache.c:3296]: dns_naptr_sip_resolve(): do naptr: dns_get_entry: 192.0.2.2
2020-11-04T17:15:46.660225641Z 1(175) INFO: <core> [core/dns_cache.c:3338]: dns_naptr_sip_resolve(): do naptr: not found: 192.0.2.2
2020-11-04T17:15:46.660247791Z 1(175) INFO: <core> [core/dns_cache.c:3340]: dns_naptr_sip_resolve(): do naptr: not found dns_srv_resolve: 192.0.2.2
```
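The gap between the `dns_get_entry` line and the `not found` line in the log above is where the time goes. A minimal sketch for computing it from two such timestamps follows. The helper name is hypothetical, and it assumes both timestamps fall on the same day and carry exactly nine fractional digits, as these log lines do.

```c
#include <stdio.h>

/* Convert the time-of-day part of a timestamp such as
 * "2020-11-04T17:15:46.613918650Z" into nanoseconds since midnight.
 * Assumes exactly nine fractional digits; returns -1 on parse failure.
 * Hypothetical helper for illustration only. */
static long long tod_ns(const char *ts)
{
    int h, m, s;
    long long frac;

    /* The date fields are matched but discarded (%*d). */
    if (sscanf(ts, "%*d-%*d-%*dT%d:%d:%d.%9lld", &h, &m, &s, &frac) != 4)
        return -1;
    return ((h * 60LL + m) * 60LL + s) * 1000000000LL + frac;
}
```

Applied to the second and third log lines, the difference is 46306991 ns, about 46 ms spent in the NAPTR lookup for an address that is already an IP.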
Are you still observing the error when NAPTR is disabled (it is disabled by default)? `dns_try_naptr = no`
Indeed, I have it enabled: `dns_try_naptr = yes`. With it disabled, this code will not execute. I just double-checked since you asked.
It seems like something is bypassing the cache.
OK, have a look at the DNS queries it actually does at the network level.
Let me try to identify why there is an excessive delay when getting it out of the cache (probably another resolution) and see how this can be avoided.
Or, if this is in fact the expected behavior, confirm that it was not working as expected before.
Closed #2539.
Reopened #2539.
Found it: we disabled the check
```
/* check if it's an ip address */
if ( ((tmp_ip=str2ip(name))!=0) || ((tmp_ip=str2ip6(name))!=0)
```
and always do an
```
if ((e=dns_get_entry(name, T_NAPTR))==0)
```
I will propose an MR
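The idea behind the fix, skipping the NAPTR lookup when the name is already an IP literal, can be sketched in isolation. This is not the Kamailio patch: both function names are hypothetical, and the standard `inet_pton()` stands in for Kamailio's `str2ip()` / `str2ip6()`, which take a different string type.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Return 1 if `host` is an IPv4 or IPv6 literal, 0 otherwise.
 * inet_pton() is used here as a stand-in for str2ip()/str2ip6(). */
static int is_ip_literal(const char *host)
{
    unsigned char buf[sizeof(struct in6_addr)];

    return inet_pton(AF_INET, host, buf) == 1
        || inet_pton(AF_INET6, host, buf) == 1;
}

/* Hypothetical guard: decide whether a NAPTR query is worth issuing.
 * `try_naptr` mirrors the dns_try_naptr setting (non-zero = enabled). */
static int should_query_naptr(const char *host, int try_naptr)
{
    if (!try_naptr)
        return 0;               /* NAPTR disabled in the config */
    if (is_ip_literal(host))
        return 0;               /* nothing to resolve for an IP address */
    return 1;
}
```

With such a guard in front of the lookup, a request URI like `sip:192.0.2.2:5060` would never reach the NAPTR query, whatever `dns_try_naptr` is set to.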
If you are referring to src/core/dns_cache.c, this code is from 2006. It has not been changed in the 28049aafc8dd06c16 commit referred to above.
I did mention the commit in question. Having read the code, I can now confirm that Kamailio is doing DNS NAPTR resolution on IP addresses. This was an unintended side effect of the desired improvement.
```
commit 28049aafc8dd06c160ce5e7b8d5e4fc728441b0c
Author: Semen Darienko semen.darienko@wildix.com
Date:   Sun May 3 12:26:45 2020 +0200

    core: dns - use all NAPTR records

    - enable using of all NAPTR records instead of the first one ordered by priority
    - GH #2290
```
We probably mean different things. But thank you, let's just look at your proposed merge request. :-)
Fix merged into master and cherry-picked to 5.4. Thanks to Semen for the review.
Closed #2539.