<!-- Kamailio Pull Request Template -->
<!-- IMPORTANT: - for detailed contributing guidelines, read: https://github.com/kamailio/kamailio/blob/master/.github/CONTRIBUTING.md - pull requests must be done to master branch, unless they are backports of fixes from master branch to a stable branch - backports to stable branches must be done with 'git cherry-pick -x ...' - code is contributed under BSD for core and main components (tm, sl, auth, tls) - code is contributed GPLv2 or a compatible license for the other components - GPL code is contributed with OpenSSL licensing exception -->
#### Pre-Submission Checklist <!-- Go over all points below, and after creating the PR, tick all the checkboxes that apply --> <!-- All points should be verified, otherwise, read the CONTRIBUTING guidelines from above--> <!-- If you're unsure about any of these, don't hesitate to ask on sr-dev mailing list --> - [x ] Commit message has the format required by CONTRIBUTING guide - [x ] Commits are split per component (core, individual modules, libs, utils, ...) - [x ] Each component has a single commit (if not, squash them into one commit) - [x ] No commits to README files for modules (changes must be done to docbook files in `doc/` subfolder, the README file is autogenerated)
#### Type Of Change - [ ] Small bug fix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds new functionality) - [x ] Breaking change (fix or feature that would change existing functionality)
#### Checklist: <!-- Go over all points below, and after creating the PR, tick the checkboxes that apply --> - [ ] PR should be backported to stable branches - [x ] Tested changes locally -- tested at 5.1.6 - [ ] Related to issue #XXXX (replace XXXX with an open issue number)
#### Description <!-- Describe your changes in detail --> Problem: Forwarding of 200 OK while sent 488 waits for ACK, destroys the UAS send buffer Scenario: During suspension of 200 OK by ims_qos module function Rx_AAR() at terminating PCSCF, The PCRF sends an AA Answer with result code DIAMETER_TOO_BUSY (3004), which triggers the PCSCF to send a 488 ‘Sorry no QoS available’ to the originating side (ims_dialog module function dlg_terminate()). Afterwards neither the 200 OK nor the ACK(488) is processed correctly by the PCSCF. Solution: The UAS send buffer should not be overwritten during processing of 200 OK, because non-2xx is needed to associate the ACK message in a correct way. 200 OK must be forwarded statelessly. Side-Effect (potentially breaks existing function): Some callbacks cannot be called for the 200 OK, to avoid messing the stored 488.
You can view, comment on, or merge this pull request online at:
https://github.com/kamailio/kamailio/pull/2590
-- Commit Summary --
* tm: 200 OK not processed correctly by Proxy after final non-2xx
-- File Changes --
M src/modules/tm/t_reply.c (72)
-- Patch Links --
https://github.com/kamailio/kamailio/pull/2590.patch https://github.com/kamailio/kamailio/pull/2590.diff
According to e-mail discussion with the author, there seems to be some adaptions necessary. Set to draft for now.
Hi @henningw : we did our internal check now and no change is necessary. Please remove the WIP label
Thanks for the confirmation.
Maybe one question about or potential future contributions: how is the testing planned by the community? I mean, we have an environment for patching stable release of kamailio (currently 5.1.6), where we can test our local (IMS related) use cases. We cannot test general use cases for the master branch. So any change must somehow pass some regression at the community, mustn't it?
@christoph-v-kontron testing is usually done during the development and also in the freeze period before a major release is done. Of course it would be good if you can test your extensions also in master regarding the expected behavior, being the developer of them.
You mention some side effects of this patch due to the changes in transaction call-back behavior in the commit log. This should be probably reviewed and discussed before merging.
[...]This should be probably reviewed and discussed before merging[...] Of course. Should we start a discussion at the mailing list?
@christoph-v-kontron - trying to understand the scenario from sip messages flow point of view ... is here about the case when 200ok is received, then suspended (via `t_suspend()`), and while it is suspended, the 488 arrives and it is forwarded, later the 200ok is resumed (`t_continue()`), being forwarded as well?
@miconda almost correct - your scenario would lead to the same error, but it is a scenario with ims_qos module. The **t_suspend()** is not explicitly called, but it happens implicitly within the function $var(aarret) = **Rx_AAR**("**MT_aar_reply**","term","",-1); switch ($var(aarret)) { case 0: #suspend was successful and we must break exit;
The module sends a DIAMETER AAR message to some peer while the 200 OK is suspended, but the DIAMETER response (AAA) indicates a failure.
The 488 is created by **dlg_terminate()** from ims_dialog module. route[**MT_aar_reply**] { #this is async so to know status we have to check the reply avp switch ($avp(s:aar_return_code)) { case 1: xlog("L_DBG", "Diameter: Orig AAR success on media authorization\n"); break; default: #comment this if you want to allow even if Rx fails if(dlg_get("$avp(CALLID_CUSTOM_AVP)","$avp(FTAG_CUSTOM_AVP)","$avp(TTAG_CUSTOM_AVP)")){ **dlg_terminate("all", "Sorry no QoS available")**; usleep("100000"); #prm2272 exit; } } }
This leads to the situation that the 488 and the 200 OK are sent nearly at the same instance of time (only a few ms distance) and hence the ACK(488) is received AFTER the 200 OK is sent.
I can send you per e-mail a powerpoint from our analysis, OK?
@christoph-v-kontron any updates on this patch, does the (apparently) private discussion clarified it?
@henningw : no private discussion. @miconda asked a public question (see above) and I answered (see above). So I assume, it's his turn.
@henningw : we are fine with the current version of the PR
@christoph-v-kontron - I thought of it couple of time and I could not decide what is the right approach. It is a very particular case it happens in your scenario with a 200ok suspended.
Somehow make sense what the patch tries to do, but the side effects are not completely asserted. I would say that we should add a modparam to control this behaviour at the moment, with a default value making it to behave as the patch proposed. But still to have the option to turn it of and behave as before the patch.
So I am going to merge the PR, then I will add a new modparam to be able to control its behaviour.
Merged #2590 into master.