It seems I found the problem and I have a fix.
The root cause is probably that the locally generated 408 is not updating
the dialog to-tag.
However, always checking for a to-tag match before falling back to a match
without a to-tag will fix any such issue.
I will prepare a merge request on Monday to start discussing the option of
always matching the to-tag first.
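
To make the idea concrete, here is a minimal sketch of a to-tag-first lookup. It is not the dialog module's code: struct dlg_sketch, lookup_to_tag_first() and the helpers are hypothetical names, and it assumes the candidate dialogs share the same call-id and from-tag, with an empty to-tag on any dialog that never learned one.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified dialog record; NOT the dialog module's real struct. */
struct dlg_sketch {
    const char *callid;
    const char *from_tag;
    const char *to_tag; /* NULL or "" while the dialog has no to-tag yet */
};

static int has_tag(const char *t)
{
    return t != NULL && t[0] != '\0';
}

static int same_leg(const struct dlg_sketch *d, const char *callid,
                    const char *ftag)
{
    return strcmp(d->callid, callid) == 0 && strcmp(d->from_tag, ftag) == 0;
}

/*
 * Two-pass lookup (illustrative only):
 *   pass 1: accept only a dialog whose to-tag equals the request's to-tag;
 *   pass 2: only if pass 1 found nothing, fall back to a dialog that has
 *           no to-tag recorded yet.
 * Returns NULL when neither pass matches.
 */
const struct dlg_sketch *lookup_to_tag_first(const struct dlg_sketch *dlgs,
                                             size_t n, const char *callid,
                                             const char *ftag,
                                             const char *ttag)
{
    size_t i;

    if (has_tag(ttag)) {
        for (i = 0; i < n; i++) {
            if (same_leg(&dlgs[i], callid, ftag) && has_tag(dlgs[i].to_tag)
                    && strcmp(dlgs[i].to_tag, ttag) == 0)
                return &dlgs[i];
        }
    }
    for (i = 0; i < n; i++) {
        if (same_leg(&dlgs[i], callid, ftag) && !has_tag(dlgs[i].to_tag))
            return &dlgs[i];
    }
    return NULL;
}
```

With this ordering, a stale entry that never got its to-tag updated cannot shadow a dialog whose to-tag matches exactly, no matter where the two sit in the list.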
On Fri, Sep 25, 2020 at 11:27 AM Julien Chavanton <jchavanton(a)gmail.com>
wrote:
I did catch the logs, and after looking at the trace, it seems like a dialog
mismatch in a serial forking scenario:
- log line 3 tells us that a no-ACK disconnection should be triggered
- log lines 1-2 tell us what happened when the ACK was received in
dlg_onroute(): oddly enough, state 5 was both old and new. Could it be a
mismatch/confusion with the previous dialog? Looking in this direction ...
1: 2020-09-25T16:30:16.896: dialog [dlg_handlers.c:1273]:
extra_ack_debug_info(): [ACK][1] state not changed >>>
call-id[562419_125824138_2072238224] to-tag[<sip:+14019991904@anon.com
;tag=gK02b68836]
2: 2020-09-25T16:30:16.896: dialog [dlg_handlers.c:1440]: dlg_onroute():
[ACK] state not changed old[5]new[5]
...
3: 2020-09-25T16:32:22.674: dialog [dlg_hash.c:247]: dlg_clean_run():
dialog disconnection no-ACK
call-id[562419_125824138_2072238224][1601051416]<[1601051542 - 60]
After looking at the pcap trace, call-id 562419_125824138_2072238224 was
involved in serial forking :
call attempt #1
X >> INVITE >> Y // no to-tag
X << 100
...
X << 408 // to-tag=594d50c3218065a60bb91fd47a70fbc1-59edef02
(locally generated)
X >> ACK // to-tag=594d50c3218065a60bb91fd47a70fbc1-59edef02
call attempt #2
X >> INVITE >> Z // no to-tag
X << 100
X << 200 << Z // to-tag=gK02b68836
X >> ACK >> Z // to-tag=gK02b68836 (Should be state old[3]new[4], I
wonder how it could possibly be state old[5]new[5])
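
The expected versus observed transition can be sketched as follows. This is an illustration, not the module's state machine: next_state_on_ack() and the enum names are hypothetical, and the meaning attached to the numbers (3 = confirmed but not yet ACKed, 4 = confirmed, 5 = deleted) is an assumption about the dialog state constants.

```c
/* Assumed meaning of the numeric states seen in the logs (hypothetical names). */
enum dlg_state_sketch {
    ST_CONFIRMED_NA = 3, /* 200 OK seen, ACK still pending */
    ST_CONFIRMED = 4,    /* ACK received */
    ST_DELETED = 5       /* dialog terminated */
};

/* Returns the state an ACK would leave the dialog in, under the assumption above. */
int next_state_on_ack(int old_state)
{
    if (old_state == ST_CONFIRMED_NA)
        return ST_CONFIRMED; /* the normal old[3]new[4] case */
    return old_state;        /* e.g. old[5]new[5]: "state not changed" */
}
```

Under that assumption, old[5]new[5] is what an ACK produces when it is matched against a dialog that is already terminated, which would fit the ACK for attempt #2 landing on the dialog left over from attempt #1.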
I looked at several occurrences and there is always a locally generated
408 with a to-tag beforehand, so it seems I have a good lead to investigate
further.