### Description The issue started with this improvement: https://github.com/kamailio/kamailio/commit/839cf89b02f8817156487a960ff62013... (it works fine for single dialog bodies, but make a mess with multiple-dialog bodies)
### Troubleshooting **Test scheme, preconditions:** - SIP server: kamailio + Asterisk - 3 SIP phones. UA-A, UA-B, UA-C
Kamailio is configured with presence+presence_dialoginfo modules. Asterisk sends PUBLISHes regarding changes of dialog's state, kamailio process it (handle_publish), and send NOTIFIYs to all subscribed endpoints
**Bug description** (the issue is considered from the point of view of UA-B's state):
- UA-A calls to UA-B => Asterisk send PUBLISHes (dialog UA-B1), UA-B (states on-the Trying -> Proceeding -> Confirmed) - UA-B makes attended transfer to UA-C (call answered) => Asterisk send PUBLISHes with complex XML (2 dialogs state in a body. UA-B1(Confirmed) and new UA-B2 (Confirmed)) - UA-B completes the transfer, as a result, Asterisk sends (bodies from this 2 PUBLISHes are below, in the SIP traffic section): 1. PUBLISH with multiple dialog body (UA-B1 - terminated, UA-B2 - confirmed) => processed OK 2. PUBLISH with single dialog body (UA-B2 - terminated.) => **Bug is here** so now: ``` presentity.c:1084 checks is_dialog_terminated // check is dialog terminated for event: dialog presentity.c:549 calls get_dialog_state // make a db query with the previous state (that query returnes: (UA-B1 - terminated, UA-B2 - confirmed)) // loop through result returned from DB, call parse_dialog_state_from_body presentity.c:307 - at this function: ... node = xmlNodeGetChildByName(node, "dialog"); // return part of xml with 2 dialogs ... childNode = xmlNodeGetChildByName(node, "state"); // return state of the first! dialog in a list, not the current dialog ... due to obtained state is "terminated", if(is_dialog_terminated(presentity)) returnes TRUE and subsequent code sends silently 200OK. ``` **Result:** - those subscribed to dialog state of UA-B, do not recieve NOTIFY with a valid state (BLFs remain "lighting")
**Attempts of workaround:** - I tried to go to "send_412" logic when is_dialog_terminated (TRUE). - But in the subsequent PUBLISH (state=terminated) Asterisk does not insert "call-id, to-tag, from-tag" parameters, and endpoint fails to recognize which dialog is terminated, and BLF remains to light.
#### SIP Traffic not SIP, but only relevant part of XML bodies. Multiple dialogs body, this one is last to succeed (look at the state of **first** dialog) ``` <?xml version="1.0"?> <dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info" version="36" state="full" entity="252"> <dialog id="3bf1a5de53394e394f7b0241067faccf" call-id="3bf1a5de53394e394f7b0241067faccf" local-tag="as27f0e6f0" remote-tag="1349583034" remote-uri="sip:252@10.100.1.24:5064" local-uri="sip:251@127.0.0.1:6050" direction="recipient"> <note>Ready</note> <remote> <identity>252</identity> <target uri="252"> </target> </remote> <local> <identity>251</identity> <target uri="251"/> </local> <state>terminated</state> </dialog> <dialog id="4070150262@10.100.1.24" call-id="4070150262@10.100.1.24" local-tag="as7c372911" remote-tag="3861741811" remote-uri="sip:252@10.100.1.24:5064" local-uri="sip:254@10.100.1.85:6050" direction="initiator"> <note>On the phone</note> <remote> <identity display="user252">sip:252@10.100.1.85</identity> <target uri="sip:252@10.100.1.85"> </target> </remote> <local> <identity display="user254">sip:254@10.100.1.85</identity> <target uri="sip:254@10.100.1.85"/> </local> <state>confirmed</state> </dialog> </dialog-info> ```
Next PUBLISH is with single dialog, and its update is ignored (as incorrect state evaluated)
``` <?xml version="1.0"?> <dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info" version="37" state="full" entity="252"> <dialog id="4070150262@10.100.1.24" call-id="4070150262@10.100.1.24" local-tag="as7c372911" remote-tag="3861741811" remote-uri="sip:252@10.100.1.24:5064" local-uri="sip:254@10.100.1.85:6050" direction="initiator"> <note>Ready</note> <remote> <identity>252</identity> <target uri="252"> </target> </remote> <local> <identity>254</identity> <target uri="254"/> </local> <state reason="local-bye">terminated</state> </dialog> </dialog-info> ``` ### Possible Solutions
Sorry for the long story, but for now I need your advice, with the logic for the fix (I already tried to start to fix, but get stuck with the logical concept). My current insight is quite cumbersome: - before is_dialog_terminated => get an array of dialogs call-ids from the body of current PUBLISH. Give it as input parameter for is_dialog_terminated - while processing get_dialog_state/parse_dialog_state_from_body = try to match state of matched dialogs - in some way try to update not complete body in a DB, but make some "partial" update. (update only those dialogs whose old state is NOT "terminated")
Thanks in advance for your advice. If the issue unclear from my description, please ask for details.
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
kamailio -v version: kamailio 5.0.3 (i386/linux) flags: STATS: Off, EXTRA_DEBUG, USE_TCP, USE_TLS, USE_SCTP, TLS_HOOKS, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAP, PKG_MALLOC, Q_MALLOC, F_MALLOC, TLSF_MALLOC, DBG_SR_MEMORY, USE_FUTEX, FAST_LOCK-ADAPTIVE_WAIT, USE_DNS_CACHE, USE_DNS_FAILOVER, USE_NAPTR, USE_DST_BLACKLIST, HAVE_RESOLV_RES ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535, DEFAULT PKG_SIZE 8MB poll method support: poll, epoll_lt, epoll_et, sigio_rt, select. id: unknown compiled on 11:27:38 Jan 30 2018 with gcc 4.3.2
@phil-lavin , can you take a look at this thread?
@vance-od - I tried a quick fix with the patches referenced above. Can you try with them and see if all ok? It relies on matching all related dialog states in the same xml document. If one found mismatching, then will not consider it terminated. Likely it can be improved with matching on callid, but this can be a starting point.
Hi, Daniel! thanks for the patch! will give it a try today
I tried proposed changes. in ps_match_dialog_state_from_body there is a loop: while(node != NULL) { .... why do we need there one more while loop? What actually happens: ..... if(strcmp(tmp_state, vstate)!=0) { // here we come only once, when matching first dialog ...
all later dialogs are iterating in the second while loop.
as for me this part of code ``` + do { + if(node->next != NULL && node->next->name != NULL + && xmlStrcmp(node->name, node->next->name) == 0) { + node = node->next; + break; + } + node = node->next; + } while(node != NULL); ```
should look like: ``` node = node->next; ``` maybe that checks for NULL are also necessary, but not while do-while.
General question - is there a way to propose changes in your custom branch?
thanks for your help!
What was the result of testing it? Not working?
The `while()` is there to deal with xml docs that have many `dialog` nodes. The `do { ... } while ()` is to find the next node with `dialog` name. These are to be compliant with xml specs, as there can be other nodes with different names. So just doing `node=node->next` doesn't ensure it will be the next `dialog` node, but the next node in the xml document.
The result: for multiple dialog bodie only first body checked (I added custom logging inside your code to see where it worked)
re-tested it again, it does check only first dialog in a body
your code with my custom logging ``` int ps_match_dialog_state_from_body(str body, int *is_dialog, char *vstate) { xmlDocPtr doc; xmlNodePtr node; xmlNodePtr childNode; char *tmp_state; int rmatch = 0;
*is_dialog = 0;
LM_ERR("****** body to parse %s\n", body);
doc = xmlParseMemory(body.s, body.len); if(doc == NULL || doc->children == NULL) { LM_ERR("failed to parse xml document\n"); return -1; }
node = xmlNodeGetChildByName(doc->children, "dialog");
while(node != NULL) { *is_dialog = 1;
childNode = xmlNodeGetChildByName(node, "state"); tmp_state = (char *)xmlNodeGetContent(childNode);
if (tmp_state != NULL) { LM_ERR("vstate is %s, tmp_state is %s\n", vstate, tmp_state);
if(strcmp(tmp_state, vstate)!=0) { LM_ERR("states are different! goto done \n");
xmlFree(tmp_state); rmatch = 0; goto done; } rmatch = 1; xmlFree(tmp_state); } /* search for next dialog node */ do { LM_ERR("do entered \n"); if(node->next != NULL && node->next->name != NULL && xmlStrcmp(node->name, node->next->name) == 0) { node = node->next; LM_ERR("break it \n"); break; } node = node->next; LM_ERR("node->next \n"); } while(node != NULL); }
done: xmlFreeDoc(doc); LM_ERR("going to return rmatch %d\n", rmatch); return rmatch; } ``` body of PUBLISH, that being processed: ``` <?xml version="1.0"?> <dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info" version="109" state="full" entity="252"> <dialog id="3282208261@10.100.1.24" call-id="3282208261@10.100.1.24" local-tag="as0cf4d423" remote-tag="3673383467" remote-uri="sip:252@10.100.1.24:33035" local-uri="sip:254@10.100.1.85:6050" direction="initiator"> <note>On the phone</note> <remote> <identity display="user252">sip:252@10.100.1.85</identity> <target uri="sip:252@10.100.1.85"> </target> </remote> <local> <identity display="user254">sip:254@10.100.1.85</identity> <target uri="sip:254@10.100.1.85"/> </local> <state>confirmed</state> </dialog> </dialog-info> ``` logs printed by kamailio: ``` Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:321]: ps_match_dialog_state_from_body(): ****** body to parse <?xml version="1.0"?> <dialog-info xmlns="urn:ietf:params:xml:ns:dialog-info" version="108" state="full" entity="252"> <dialog id="0b386c6122b2dabc0fa49b1d4972ca3c@127.0.0.1:6050" call-id="0b386c6122b2dabc0fa49b1d4972ca3c@127.0.0.1:6050" local-tag="as29381132" remote-tag="3958334712" remote-uri="sip:252@10.100.1.24:33035" local-uri="sip:251@127.0.0.1:6050" direction="recipient"> <note>Ready</note> <remote> <identity>252</identity> <target uri="252"> </target> </remote> <local> <identity>251</identity> <target uri="251"/> </local> <state>terminated</state> </dialog> <dialog id="3282208261@10.100.1.24" call-id="3282208261@10.100.1.24" local-tag="as0cf4d423" remote-tag="3673383467" remote-uri="sip:252@10.100.1.24:33035" local-uri="sip:254@10.100.1.85:6050" direction="initiator"> <note>On the phone</note> <remote> <identity display="user252">sip:252@10.100.1.85</identity> <target uri="sip:252@10.100.1.85"> </target> </remote> <local> <identity display="user254">sip:254@10.100.1.85</identity> <target uri="sip:254@10.100.1.85"/> </local> <state>confirmed</state> </dialog> </dialog-info>
Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:341]: ps_match_dialog_state_from_body(): vstate is terminated, tmp_state is terminated Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:355]: ps_match_dialog_state_from_body(): do entered Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:363]: ps_match_dialog_state_from_body(): node->next Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:355]: ps_match_dialog_state_from_body(): do entered Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:363]: ps_match_dialog_state_from_body(): node->next Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:355]: ps_match_dialog_state_from_body(): do entered Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:363]: ps_match_dialog_state_from_body(): node->next Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:355]: ps_match_dialog_state_from_body(): do entered Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:363]: ps_match_dialog_state_from_body(): node->next Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:369]: ps_match_dialog_state_from_body(): going to return rmatch 1 Feb 14 09:25:48 0050562c117c /usr/sbin/kamailio[29343]: ERROR: presence [presentity.c:559]: ps_match_dialog_state(): rmatch is 1 ```
so for some reason in `do { ... } while ()` it did not break.
Can you print `node->name` and `node->next->name` inside the `do { } while()` to see the names that are compared there?
Hi, Daniel! Thank you for your help!
I tried to print that value in logs, but get stuck a bit (I did not found how to print xml elements correctly). I got some sequence of crashes while attempts to add appropriate logs.
What I tried more: I changed: ` && xmlStrcmp(node->name, node->next->name) == 0) { ` to ` && xmlStrcmp(node->next->name, (const xmlChar *)"dialog") == 0) { ` and now second dialog compare works, appropriate logs are (all is the same as in my latest comment): ``` Feb 15 15:15:18 0050562c117c /usr/sbin/kamailio[8030]: ERROR: presence [presentity.c:361]: ps_match_dialog_state_from_body(): vstate is terminated, tmp_state is terminated Feb 15 15:15:18 0050562c117c /usr/sbin/kamailio[8030]: ERROR: presence [presentity.c:375]: ps_match_dialog_state_from_body(): do entered Feb 15 15:15:18 0050562c117c /usr/sbin/kamailio[8030]: ERROR: presence [presentity.c:389]: ps_match_dialog_state_from_body(): node->next Feb 15 15:15:18 0050562c117c /usr/sbin/kamailio[8030]: ERROR: presence [presentity.c:375]: ps_match_dialog_state_from_body(): do entered Feb 15 15:15:18 0050562c117c /usr/sbin/kamailio[8030]: ERROR: presence [presentity.c:385]: ps_match_dialog_state_from_body(): break it Feb 15 15:15:18 0050562c117c /usr/sbin/kamailio[8030]: ERROR: presence [presentity.c:361]: ps_match_dialog_state_from_body(): vstate is terminated, tmp_state is confirmed Feb 15 15:15:18 0050562c117c /usr/sbin/kamailio[8030]: ERROR: presence [presentity.c:364]: ps_match_dialog_state_from_body(): states are different! goto done Feb 15 15:15:18 0050562c117c /usr/sbin/kamailio[8030]: ERROR: presence [presentity.c:395]: ps_match_dialog_state_from_body(): going to return rmatch 0 ``` in my understanding of your proposed patch, we compare: `node->name` - is always the name of current node `node->next->name` - name of next node such compare would match only if there are to consecutive "dialog" nodes. (and that did not happen in my yesterday's tests).
Sorry if I'am writing/supposing something trivial or incorrect, I have never worked before with libxml and XML from C code as well.
Thanks
Put this do...while():
``` /* search for next dialog node */ do { LM_ERR("do entered \n"); if(node->next != NULL && node->next->name != NULL && xmlStrcmp(node->name, node->next->name) == 0) { node = node->next; LM_ERR("break it \n"); break; } if(node->next != NULL && node->next->name != NULL) { LM_ERR("compared [%s] with [%s]\n", node->name, node->next->name); } node = node->next; LM_ERR("node->next \n"); } while(node != NULL); ```
my shame... I did not do NULL check and that was the reason for crash. With your logging added (processing same body and input PUBLISH as before):
``` Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:361]: ps_match_dialog_state_from_body(): vstate is terminated, tmp_state is terminated Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:388]: ps_match_dialog_state_from_body(): do entered Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:396]: ps_match_dialog_state_from_body(): compared [dialog] with [text] Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:399]: ps_match_dialog_state_from_body(): node->next Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:388]: ps_match_dialog_state_from_body(): do entered Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:396]: ps_match_dialog_state_from_body(): compared [text] with [dialog] Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:399]: ps_match_dialog_state_from_body(): node->next Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:388]: ps_match_dialog_state_from_body(): do entered Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:396]: ps_match_dialog_state_from_body(): compared [dialog] with [text] Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:399]: ps_match_dialog_state_from_body(): node->next Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:388]: ps_match_dialog_state_from_body(): do entered Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:399]: ps_match_dialog_state_from_body(): node->next Feb 16 08:54:16 0050562c117c /usr/sbin/kamailio[30264]: ERROR: presence [presentity.c:406]: ps_match_dialog_state_from_body(): going to return rmatch 1 ```
I just pushed patch 29d09b38fab5a1ee8d7f6f82818e3d3446d4dd8b for it.
thanks for the last patch, I tried it. Making the fast check the match of multiple dialog body works as expected. Let me do deeper testing today. Will report results later.
Closed #1427.
Thanks, reopen (if related) or create a new item (if different) when something wrong pops up.
Hi, Daniel! I did more testing and provided change works. What do you think regarding proper solution with comparing call-id (as you mentioned in the beginning)? Because current change improve the situation, but it is not what is expected by RFC.
Anyway - I think this changes should be committed.
Regarding the following work, should we create another task, or leave this thread open? (I have plans to work on the task following weeks, and possibly will need advice/review)
My work was to get it at least work as before of 839cf89. I do not have time for any further work, so anyone else willing to do it is more than welcome.
You can open a PR when you start the work and ask for review/suggestions if you get stuck.