Description

The DNS core resolver fails to return a valid IP address when the DNS reply contains too many SRV records.
It behaves as if no records were found, so the request is not relayed and a 478 reply is generated instead (for example, when the DNS name is set in $ru or $du).

Troubleshooting

Reproduction

It is easy to reproduce with DNS failover and NAPTR enabled (see the parameters listed under Additional Information)
and with DNS records such as these:

# dig +short NAPTR ko.sip.provider.com
50 30 "S" "SIP+D2U" "" _sip._udp.ko.sip.provider.com.

# dig +short SRV _sip._udp.ko.sip.provider.com.
10 10 5060 endpoint-01.k0.sip.provider.com.
10 10 5060 endpoint-02.k0.sip.provider.com.
10 10 5060 endpoint-03.k0.sip.provider.com.
10 10 5060 endpoint-04.k0.sip.provider.com.
10 10 5060 endpoint-05.k0.sip.provider.com.
10 10 5060 endpoint-06.k0.sip.provider.com.
10 10 5060 endpoint-07.k0.sip.provider.com.
10 10 5060 endpoint-08.k0.sip.provider.com.
10 10 5060 endpoint-09.k0.sip.provider.com.

# Each SRV result above has a corresponding
# 'A' record, so the command below returns a valid IP:
# dig +short A endpoint-01.k0.sip.provider.com.

To reproduce, relay a request to that domain, for example:
$du="sip:ko.sip.provider.com"

Debugging data

One interesting thing is that Kamailio behaves exactly the same as the sip-dig tool.
However, sip-dig seems to be limited in the DNS reply size it can handle (see my comment below about the RFC).
Does Kamailio have the same kind of limitation in its DNS resolution?
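
One way to check whether the reply is being truncated over UDP is to send a plain SRV query (no EDNS0) and inspect the TC bit in the response header. The following is a minimal sketch using only the Python standard library; the helper names and the resolver address passed to probe() are assumptions for illustration, and it does not reproduce how Kamailio itself performs the lookup.

```python
import socket
import struct

QTYPE_SRV = 33    # SRV record type (shown as 33 in the Kamailio debug log)
TC_FLAG = 0x0200  # "truncated" bit in the DNS header flags word


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    out = b""
    for label in name.rstrip(".").split("."):
        if label:
            raw = label.encode("ascii")
            out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, qtype: int = QTYPE_SRV, query_id: int = 0x1234) -> bytes:
    """Build a minimal recursive query without EDNS0 (512-byte UDP limit applies)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)  # RD=1, 1 question
    question = encode_name(name) + struct.pack("!HH", qtype, 1)    # class IN
    return header + question


def is_truncated(response: bytes) -> bool:
    """Return True if the TC bit is set in a raw DNS response."""
    if len(response) < 12:
        raise ValueError("DNS message shorter than the 12-byte header")
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_FLAG)


def answer_count(response: bytes) -> int:
    """Return the ANCOUNT field of a raw DNS response."""
    (ancount,) = struct.unpack("!H", response[6:8])
    return ancount


def probe(name: str, server: str, timeout: float = 2.0):
    """Send the query over UDP; return (truncated, answer count, reply size)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server, 53))
        response, _ = sock.recvfrom(4096)
    return is_truncated(response), answer_count(response), len(response)
```

For example, probe("_sip._udp.ko.sip.provider.com", "<resolver IP>") could be run against the same resolver Kamailio uses; a True first element would indicate that the server truncated the UDP reply.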

Log Messages

Failure example: with 9 SRV records
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (ko.sip.provider.com(26), 35), h=275
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff6a20f0000, 0x7ff6a27777d8), called from core: core/dns_cache.c: dns_destroy_entry(151)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff6a27777a0 alloc'ed from core: core/dns_cache.c: dns_cache_mk_rd_entry(1110)
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff7234f4010, 58) called from core: core/resolve.c: get_record(862)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff7234f4010, 64) returns address 0x7ff72363d8f8 frag. 0x7ff72363d8c0 (size=64) on 1 -th hit
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff7234f4010, 92) called from core: core/resolve.c: dns_naptr_parser(405)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff7234f4010, 96) returns address 0x7ff72363d9a0 frag. 0x7ff72363d968 (size=96) on 1 -th hit
DEBUG: <core> [core/resolve.c:984]: get_record(): skipping 0 NS (p=0x558fb300dba7, end=0x558fb300dba7)
DEBUG: <core> [core/resolve.c:997]: get_record(): parsing 0 ARs (p=0x558fb300dba7, end=0x558fb300dba7)
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff6a20f0000, 216) called from core: core/dns_cache.c: dns_cache_mk_rd_entry(1110)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff6a20f0000, 216) returns address 0x7ff6a27748a8 frag. 0x7ff6a2774870 (size=232) on 1 -th hit
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff7234f4010, 0x7ff72363d9a0), called from core: core/resolve.c: free_rdata_list(678)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff72363d968 alloc'ed from core: core/resolve.c: dns_naptr_parser(405)
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff7234f4010, 0x7ff72363d8f8), called from core: core/resolve.c: free_rdata_list(679)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff72363d8c0 alloc'ed from core: core/resolve.c: get_record(862)
DEBUG: <core> [core/dns_cache.c:1633]: dns_get_related(): (0x7ff6a27748a8 (ko.sip.provider.com, 35), 35, *(nil)) (0)
DEBUG: <core> [core/dns_cache.c:739]: dns_cache_add_unsafe(): adding ko.sip.provider.com(26) 35 (flags=0) at 275
DEBUG: <core> [core/dns_cache.c:2614]: dns_naptr_sip_iterate(): found a valid sip NAPTR rr _sip._udp.ko.sip.provider.com, proto 1
DEBUG: <core> [core/resolve.c:1182]: naptr_choose(): o:-1 w:-1 p:0, o:50 w:30 p:1
DEBUG: <core> [core/resolve.c:1197]: naptr_choose(): changed
DEBUG: <core> [core/dns_cache.c:2625]: dns_naptr_sip_iterate(): choosed NAPTR rr _sip._udp.ko.sip.provider.com, proto 1 tried: 0x0
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (_sip._udp.ko.sip.provider.com(36), 33), h=989
DEBUG: <core> [core/dns_cache.c:3041]: dns_srv_resolve_ip(): ("_sip._udp.ko.sip.provider.com", 0, 0), ret=-5, ip=
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (_sip._udp.ko.sip.provider.com(36), 33), h=989
DEBUG: <core> [core/dns_cache.c:3041]: dns_srv_resolve_ip(): ("_sip._udp.ko.sip.provider.com", 0, 0), ret=-5, ip=
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (_sip._tcp.ko.sip.provider.com(36), 33), h=772
DEBUG: <core> [core/dns_cache.c:3041]: dns_srv_resolve_ip(): ("_sip._tcp.ko.sip.provider.com", 0, 0), ret=-5, ip=
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (_sips._tcp.ko.sip.provider.com(37), 33), h=786
DEBUG: <core> [core/dns_cache.c:3041]: dns_srv_resolve_ip(): ("_sips._tcp.ko.sip.provider.com", 0, 0), ret=-5, ip=
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (ko.sip.provider.com(26), 1), h=275
DEBUG: <core> [core/dns_cache.c:2803]: dns_a_resolve(): (ko.sip.provider.com, 0) returning -7
DEBUG: <core> [core/dns_cache.c:3167]: dns_srv_sip_resolve(): (ko.sip.provider.com, 0, 0), ip, ret=-7
ERROR: tm [ut.h:284]: uri2dst2(): failed to resolve "ko.sip.provider.com" :unresolvable A or AAAA request (-7)

Comparison with a working example (only 3 SRV records)
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (ok.sip.provider.com(26), 35), h=275
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff7234f4010, 58) called from core: core/resolve.c: get_record(862)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff7234f4010, 64) returns address 0x7ff723613ff8 frag. 0x7ff723613fc0 (size=64) on 1 -th hit
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff7234f4010, 92) called from core: core/resolve.c: dns_naptr_parser(405)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff7234f4010, 96) returns address 0x7ff7236140a0 frag. 0x7ff723614068 (size=96) on 1 -th hit
DEBUG: <core> [core/resolve.c:984]: get_record(): skipping 0 NS (p=0x558fb300dba7, end=0x558fb300dba7)
DEBUG: <core> [core/resolve.c:997]: get_record(): parsing 0 ARs (p=0x558fb300dba7, end=0x558fb300dba7)
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff6a20f0000, 216) called from core: core/dns_cache.c: dns_cache_mk_rd_entry(1110)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff6a20f0000, 216) returns address 0x7ff6a27755b8 frag. 0x7ff6a2775580 (size=376) on 1 -th hit
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff7234f4010, 0x7ff7236140a0), called from core: core/resolve.c: free_rdata_list(678)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff723614068 alloc'ed from core: core/resolve.c: dns_naptr_parser(405)
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff7234f4010, 0x7ff723613ff8), called from core: core/resolve.c: free_rdata_list(679)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff723613fc0 alloc'ed from core: core/resolve.c: get_record(862)
DEBUG: <core> [core/dns_cache.c:1633]: dns_get_related(): (0x7ff6a27755b8 (ok.sip.provider.com, 35), 35, *(nil)) (0)
DEBUG: <core> [core/dns_cache.c:739]: dns_cache_add_unsafe(): adding ok.sip.provider.com(26) 35 (flags=0) at 275
DEBUG: <core> [core/dns_cache.c:2614]: dns_naptr_sip_iterate(): found a valid sip NAPTR rr _sip._udp.ok.sip.provider.com, proto 1
DEBUG: <core> [core/resolve.c:1182]: naptr_choose(): o:-1 w:-1 p:0, o:50 w:30 p:1
DEBUG: <core> [core/resolve.c:1197]: naptr_choose(): changed
DEBUG: <core> [core/dns_cache.c:2625]: dns_naptr_sip_iterate(): choosed NAPTR rr _sip._udp.ok.sip.provider.com, proto 1 tried: 0x0
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (_sip._udp.ok.sip.provider.com(36), 33), h=989
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff7234f4010, 68) called from core: core/resolve.c: get_record(862)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff7234f4010, 72) returns address 0x7ff723613ff8 frag. 0x7ff723613fc0 (size=72) on 1 -th hit
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff7234f4010, 46) called from core: core/resolve.c: dns_srv_parser(318)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff7234f4010, 48) returns address 0x7ff7236140a8 frag. 0x7ff723614070 (size=48) on 1 -th hit
DEBUG: <core> [core/resolve.c:984]: get_record(): skipping 0 NS (p=0x558fb300dbb4, end=0x558fb300dbb4)
DEBUG: <core> [core/resolve.c:997]: get_record(): parsing 0 ARs (p=0x558fb300dbb4, end=0x558fb300dbb4)
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff6a20f0000, 176) called from core: core/dns_cache.c: dns_cache_mk_rd_entry(1110)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff6a20f0000, 176) returns address 0x7ff6a2775900 frag. 0x7ff6a27758c8 (size=176) on 1 -th hit
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff7234f4010, 0x7ff7236140a8), called from core: core/resolve.c: free_rdata_list(678)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff723614070 alloc'ed from core: core/resolve.c: dns_srv_parser(318)
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff7234f4010, 0x7ff723613ff8), called from core: core/resolve.c: free_rdata_list(679)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff723613fc0 alloc'ed from core: core/resolve.c: get_record(862)
DEBUG: <core> [core/dns_cache.c:1633]: dns_get_related(): (0x7ff6a2775900 (_sip._udp.ok.sip.provider.com, 33), 33, *(nil)) (0)
DEBUG: <core> [core/dns_cache.c:739]: dns_cache_add_unsafe(): adding _sip._udp.ok.sip.provider.com(36) 33 (flags=0) at 989
DEBUG: <core> [core/dns_cache.c:2222]: dns_srv_get_nxt_rr(): (0x7ff6a2775900, 0, 0, 1457300027): selected 0/1 in grp. 0 (rand_w=0, rr=0x7ff6a2775968 rd=0x7ff6a2775980 p=10 w=10 rsum=10)
DEBUG: <core> [core/dns_cache.c:527]: _dns_hash_find(): (endpoint.ok.sip.provider.com(38), 1), h=530
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff7234f4010, 70) called from core: core/resolve.c: get_record(862)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff7234f4010, 72) returns address 0x7ff723613ff8 frag. 0x7ff723613fc0 (size=72) on 1 -th hit
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff7234f4010, 4) called from core: core/resolve.c: dns_a_parser(474)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff7234f4010, 8) returns address 0x7ff7236140a8 frag. 0x7ff723614070 (size=8) on 1 -th hit
DEBUG: <core> [core/resolve.c:984]: get_record(): skipping 0 NS (p=0x558fb300db8e, end=0x558fb300db8e)
DEBUG: <core> [core/resolve.c:997]: get_record(): parsing 0 ARs (p=0x558fb300db8e, end=0x558fb300db8e)
DEBUG: <core> [core/mem/q_malloc.c:374]: qm_malloc(): qm_malloc(0x7ff6a20f0000, 136) called from core: core/dns_cache.c: dns_cache_mk_rd_entry(1110)
DEBUG: <core> [core/mem/q_malloc.c:419]: qm_malloc(): qm_malloc(0x7ff6a20f0000, 136) returns address 0x7ff6a2775a18 frag. 0x7ff6a27759e0 (size=136) on 1 -th hit
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff7234f4010, 0x7ff7236140a8), called from core: core/resolve.c: free_rdata_list(678)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff723614070 alloc'ed from core: core/resolve.c: dns_a_parser(474)
DEBUG: <core> [core/mem/q_malloc.c:482]: qm_free(): qm_free(0x7ff7234f4010, 0x7ff723613ff8), called from core: core/resolve.c: free_rdata_list(679)
DEBUG: <core> [core/mem/q_malloc.c:526]: qm_free(): freeing frag. 0x7ff723613fc0 alloc'ed from core: core/resolve.c: get_record(862)
DEBUG: <core> [core/dns_cache.c:1633]: dns_get_related(): (0x7ff6a2775a18 (endpoint.ok.sip.provider.com, 1), 1, *(nil)) (0)
DEBUG: <core> [core/dns_cache.c:739]: dns_cache_add_unsafe(): adding endpoint.ok.sip.provider.com(38) 1 (flags=0) at 530
DEBUG: <core> [core/dns_cache.c:2803]: dns_a_resolve(): (endpoint.ok.sip.provider.com, 0) returning 0
DEBUG: <core> [core/dns_cache.c:3041]: dns_srv_resolve_ip(): ("_sip._udp.ok.sip.provider.com", 0, 0), ret=0, ip=[RESOLVED_IP]
DEBUG: <core> [core/dns_cache.c:3241]: dns_naptr_sip_resolve(): (ok.sip.provider.com, 0, 0), srv0, ret=0
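
To compare the two runs at a glance, the resolver return codes can be pulled out of the debug output. This is a small sketch that relies only on the "ret=" and "returning" patterns visible in the log lines above; the function name resolver_results is made up for illustration.

```python
import re

# Matches e.g. "]: dns_srv_resolve_ip(): (...), ret=-5, ip="
# and          "]: dns_a_resolve(): (...) returning -7"
RET_PATTERN = re.compile(
    r"\]: (?P<func>\w+)\(\): .*?(?:ret=|returning )(?P<ret>-?\d+)"
)


def resolver_results(log_text: str):
    """Return (function name, return code) pairs found in a Kamailio debug log."""
    results = []
    for line in log_text.splitlines():
        match = RET_PATTERN.search(line)
        if match:
            results.append((match.group("func"), int(match.group("ret"))))
    return results
```

Applied to the failing run, this lists the SRV lookups with their negative return codes followed by the final A lookup failure, which makes it easier to see that no SRV record is ever parsed in that case.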

Possible Solutions

I had a quick look at the code and did not find any limit on the maximum number of records.
There are some maximums defined in dns_cache.c, but I did not find any relation between them and my issue.

Could there be a limit on the reply size? Here is what I found on this point while reading the RFCs:

Currently there's a practical limit of 512 bytes for DNS replies.
Until all resolvers can handle larger responses, domain administrators are strongly advised to keep their SRV replies below 512 bytes.

There is also an RFC on how to handle truncated messages:

If a truncated response comes back from an SRV query, the rules described in RFC 2181 (https://tools.ietf.org/html/rfc2181#page-11) shall apply.
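
To judge how close a given SRV set comes to the 512-byte limit quoted above, the wire size of the reply can be estimated from the record names. The sketch below is only a rough estimate under stated assumptions: the owner name of each answer is compressed to a 2-byte pointer (optional), SRV targets are not compressed, and there are no authority or additional records. The names in this report are anonymized, so the estimate should be rerun with the real names.

```python
def encoded_name_len(name: str) -> int:
    """Length in bytes of a domain name in uncompressed DNS wire format."""
    labels = [label for label in name.rstrip(".").split(".") if label]
    # One length byte per label, plus the terminating root byte.
    return sum(len(label) + 1 for label in labels) + 1


def estimate_srv_reply_size(qname, targets, compress_owner=True):
    """Estimate the size of a DNS reply carrying one SRV record per target."""
    header = 12
    question = encoded_name_len(qname) + 4           # QTYPE + QCLASS
    owner = 2 if compress_owner else encoded_name_len(qname)
    rr_fixed = 10                                    # TYPE, CLASS, TTL, RDLENGTH
    srv_fixed = 6                                    # priority, weight, port
    answers = sum(
        owner + rr_fixed + srv_fixed + encoded_name_len(target)
        for target in targets
    )
    return header + question + answers


# Placeholder names from this report; real names may be longer or shorter.
targets = [f"endpoint-{i:02d}.k0.sip.provider.com." for i in range(1, 10)]
size = estimate_srv_reply_size("_sip._udp.ko.sip.provider.com.", targets)
print(size, "bytes; over the 512-byte limit:", size > 512)
```

Because every additional record adds the full length of its target name, a few extra characters per host name or one more record can be enough to push the reply past 512 bytes.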

Additional Information

dns_try_naptr=yes
dns_tcp_pref = 1
dns_udp_pref = 1
dns_tls_pref = 1
dns_srv_lb=yes
use_dns_failover=yes
use_dns_cache=yes
dns_cache_max_ttl=30

Thanks

