Hi all
I'm playing with the dialog module to count the calls and limit the number of calls per subscriber.
After configured the proxy I've tested a call and I realize that the dialog list is not empty after the call ends.
Here's a sample call log and the dialog list while the call was being stablished and 3 minutes after that:
root@proxy1a:~# grep "fbrzjkplmvlauty@quenya" /var/log/kamailio.log New request - M=INVITE Dialog set mark TOTAL Proxy authentication failed New request - M=INVITE Dialog set mark TOTAL Dialog set mark user and userout to 'd5a78dafcb93' Dialog set mark type to outgoing Call to PSTN Dialog set mark peer and peerout to '4' Relaying request Request leaving server NAT-Reply - S=100 - Connecting NAT-Reply - S=488 - Not Acceptable Here Failure route for PSTN call Unset dialog mark peer and peerout '4' Filter reply code No failover routing needed for this response code New request - M=ACK
While the call was being stablished or just after it was dropped (quickly) sercmd> dlg.list
hash:3837:638039689 state:1 timestart:0 timeout:0 callid:fbrzjkplmvlauty@quenya from_tag:inobo to_tag: from_uri:sip:123456789@labsip.com to_uri:sip:0043987654321@labsip.com caller_contact:sip:123456789-sbhn1naovm6bc@111.111.111.164:5060;transport=udp caller_cseq:149 caller_route_set:sip:111.111.111.135;r2=on;lr=on;ftag=inobo;nat=yes,sip:111.111.111.100;r2=on;lr=on;ftag=inobo;nat=yes callee_contact: callee_cseq: callee_route_set: caller_bind_addr:udp:111.111.111.138:5062 callee_bind_addr:
hash:3837:638039690 state:5 timestart:0 timeout:0 callid:fbrzjkplmvlauty@quenya from_tag:inobo to_tag: from_uri:sip:123456789@labsip.com to_uri:sip:0043987654321@labsip.com caller_contact:sip:123456789-sbhn1naovm6bc@111.111.111.164:5060;transport=udp caller_cseq:150 caller_route_set: sip:111.111.111.135;r2=on;lr=on;ftag=inobo;nat=yes,sip:111.111.111.100;r2=on;lr=on;ftag=inobo;nat=yes callee_contact: callee_cseq: callee_route_set: caller_bind_addr:udp:111.111.111.138:5062 callee_bind_addr:
After a couple of minutes:
hash:3837:638039689 state:1 timestart:0 timeout:0 callid:fbrzjkplmvlauty@quenya from_tag:inobo to_tag: from_uri:sip:123456789@labsip.com to_uri:sip:0043987654321@labsip.com caller_contact:sip:123456789-sbhn1naovm6bc@111.111.111.164:5060;transport=udp caller_cseq:149 caller_route_set: sip:111.111.111.135;r2=on;lr=on;ftag=inobo;nat=yes,sip:111.111.111.100;r2=on;lr=on;ftag=inobo;nat=yes callee_contact: callee_cseq: callee_route_set: caller_bind_addr:udp:111.111.111.138:5062 callee_bind_addr:
sercmd> dlg.profile_get_size total Profile:total => profile:total value: count:1 sercmd> dlg.profile_get_size user Profile:user => profile:user value: count:0 sercmd> dlg.profile_get_size userout Profile:userout => profile:userout value: count:0 sercmd> dlg.profile_get_size peer Profile:peer => profile:peer value: count:0 sercmd> dlg.profile_get_size peerout Profile:peerout => profile:peerout value: count:0
Obviously, there's no such a call. But my dialog profile TOTOAL say's there's one, which is listed by sercmd. The dialog is persistent, I guess it will be destroyed after 12 hours, but I'd like to know why it's still there if I received a negative response and the ACK for it.
Thanks,
Jon
Hey,
On 16.08.2011 17:43, Jon Bonilla wrote:
After configured the proxy I've tested a call and I realize that the dialog list is not empty after the call ends.
Here's a sample call log and the dialog list while the call was being stablished and 3 minutes after that:
root@proxy1a:~# grep "fbrzjkplmvlauty@quenya" /var/log/kamailio.log New request - M=INVITE Dialog set mark TOTAL Proxy authentication failed New request - M=INVITE Dialog set mark TOTAL
[snip!]
While the call was being stablished or just after it was dropped (quickly) sercmd> dlg.list
hash:3837:638039689 state:1 timestart:0 timeout:0
hash:3837:638039690 state:5 timestart:0 timeout:0
After a couple of minutes:
hash:3837:638039689 state:1 timestart:0 timeout:0
Obviously, there's no such a call. But my dialog profile TOTOAL say's there's one, which is listed by sercmd. The dialog is persistent, I guess it will be destroyed after 12 hours, but I'd like to know why it's still there if I received a negative response and the ACK for it.
Looks pretty much like two dialog entries were created for the same dialog. One of them got cleaned up nicely on call termination while the other one keeps dangling until the dialog timer kicks in (12 hours in your case).
Most notably, this situation may happen when spiraling calls without using the spiral detection feature in the dialog module. (A spiral is a scenario where a proxy is processing the same request twice, e.g., to implement call forwardings.) Did you possibly toggle the default spiral detection setting from enabled to disabled?
If that's not the case: Can you check the Kamailio logs for any suspicious dialog-related log messages, particularly those with WARN or higher log verbosity?
Cheers,
--Timo
El Tue, 16 Aug 2011 18:03:48 +0200 Timo Reimann timo.reimann@1und1.de escribió:
Hey,
root@proxy1a:~# grep "fbrzjkplmvlauty@quenya" /var/log/kamailio.log New request - M=INVITE Dialog set mark TOTAL Proxy authentication failed New request - M=INVITE Dialog set mark TOTAL
[snip!]
While the call was being stablished or just after it was dropped (quickly) sercmd> dlg.list
hash:3837:638039689 state:1 timestart:0 timeout:0
hash:3837:638039690 state:5 timestart:0 timeout:0
After a couple of minutes:
hash:3837:638039689 state:1 timestart:0 timeout:0
Looks pretty much like two dialog entries were created for the same dialog. One of them got cleaned up nicely on call termination while the other one keeps dangling until the dialog timer kicks in (12 hours in your case).
Most notably, this situation may happen when spiraling calls without using the spiral detection feature in the dialog module. (A spiral is a scenario where a proxy is processing the same request twice, e.g., to implement call forwardings.) Did you possibly toggle the default spiral detection setting from enabled to disabled?
No I didn't
Here's the dialog modparams:
modparam("dialog","dlg_flag", 9) modparam("dialog","profiles_no_value","total ;emergency") modparam("dialog","profiles_with_value","peer ; user ; type ; peerout ; userout")
Maybe setting the dialog to TOTAL twice is the problem? I set it before proxy authentication. I asume that dialog is erased when sending the 407 back and a new one is created for the second INVITE.
If that's not the case: Can you check the Kamailio logs for any suspicious dialog-related log messages, particularly those with WARN or higher log verbosity?
I'll increase kamailio's loglevel and reproduce the call. I'll send any WARN or above message related to this.
cheers,
Jon
El Tue, 16 Aug 2011 18:14:12 +0200 Jon Bonilla (Manwe) manwe@aholab.ehu.es escribió:
El Tue, 16 Aug 2011 18:03:48 +0200 Timo Reimann timo.reimann@1und1.de escribió:
If that's not the case: Can you check the Kamailio logs for any suspicious dialog-related log messages, particularly those with WARN or higher log verbosity?
I'll increase kamailio's loglevel and reproduce the call. I'll send any WARN or above message related to this.
WARNING: dialog [dlg_handlers.c:884]: unable to find dialog for ACK with route param '6e4.5b2debb4' [1254:1270796981]
WARNING: dialog [dlg_handlers.c:884]: unable to find dialog for BYE with route param '6e4.5b2debb4' [1254:1270796981]
Looks like these two warnings match other calls. Not the two ones I've sent as example.
Hey Jon,
On 16.08.2011 18:14, Jon Bonilla wrote:
Looks pretty much like two dialog entries were created for the same dialog. One of them got cleaned up nicely on call termination while the other one keeps dangling until the dialog timer kicks in (12 hours in your case).
Most notably, this situation may happen when spiraling calls without using the spiral detection feature in the dialog module. (A spiral is a scenario where a proxy is processing the same request twice, e.g., to implement call forwardings.) Did you possibly toggle the default spiral detection setting from enabled to disabled?
No I didn't
Here's the dialog modparams:
modparam("dialog","dlg_flag", 9) modparam("dialog","profiles_no_value","total ;emergency") modparam("dialog","profiles_with_value","peer ; user ; type ; peerout ; userout")
Maybe setting the dialog to TOTAL twice is the problem? I set it before proxy authentication. I asume that dialog is erased when sending the 407 back and a new one is created for the second INVITE.
That's what spiral detection is supposed to prevent, i.e., the creation of multiple structures for the same dialog. However, it only works if you use the flag to track dialogs, not if you call dlg_manage() explicitly. I see you have actually configured flag #9 as dialog flag but just to make sure: Are you using that flag only, not dlg_manage() ?
Even if you did not use dlg_manage(), however, dialogs would likely not be tracked properly if you started doing so on unauthenticated calls. The reason is that terminated calls do not get removed until the associated transaction is deleted. When that second, authenticated INVITE is handled by the dialog module, it will find a dialog structure already in the DELETED state (as the dialog ID matches) and refuse further processing on it.
This may be considered as a shortcoming of the current implementation. On the other hand, do you actually need to track unauthenticated calls? They could just be SIP attacks trying to disrupt your infrastructure and shouldn't be accounted as real calls whatsoever. If you start tracking dialogs after authentication, things should look better.
Cheers,
--Timo
El Tue, 16 Aug 2011 18:38:17 +0200 Timo Reimann timo.reimann@1und1.de escribió:
Hey Jon,
On 16.08.2011 18:14, Jon Bonilla wrote:
Looks pretty much like two dialog entries were created for the same dialog. One of them got cleaned up nicely on call termination while the other one keeps dangling until the dialog timer kicks in (12 hours in your case).
Most notably, this situation may happen when spiraling calls without using the spiral detection feature in the dialog module. (A spiral is a scenario where a proxy is processing the same request twice, e.g., to implement call forwardings.) Did you possibly toggle the default spiral detection setting from enabled to disabled?
No I didn't
Here's the dialog modparams:
modparam("dialog","dlg_flag", 9) modparam("dialog","profiles_no_value","total ;emergency") modparam("dialog","profiles_with_value","peer ; user ; type ; peerout ; userout")
Maybe setting the dialog to TOTAL twice is the problem? I set it before proxy authentication. I asume that dialog is erased when sending the 407 back and a new one is created for the second INVITE.
That's what spiral detection is supposed to prevent, i.e., the creation of multiple structures for the same dialog. However, it only works if you use the flag to track dialogs, not if you call dlg_manage() explicitly. I see you have actually configured flag #9 as dialog flag but just to make sure: Are you using that flag only, not dlg_manage() ?
I use dlg_manage, not the flag. I don't use the flag at all. What would be the difference using the flag? Any example about how to use the flag instead of dlg_manage?
Even if you did not use dlg_manage(), however, dialogs would likely not be tracked properly if you started doing so on unauthenticated calls. The reason is that terminated calls do not get removed until the associated transaction is deleted. When that second, authenticated INVITE is handled by the dialog module, it will find a dialog structure already in the DELETED state (as the dialog ID matches) and refuse further processing on it.
This may be considered as a shortcoming of the current implementation. On the other hand, do you actually need to track unauthenticated calls? They could just be SIP attacks trying to disrupt your infrastructure and shouldn't be accounted as real calls whatsoever. If you start tracking dialogs after authentication, things should look better.
Only the dialog profile mark or also the dlg_manage?
I set it very soon in my scenario because I also receive unauthenticated calls from trusted ips that I need to track. I'll try to check if I can deal with this and move it somewhere else.
Cheers,
--Timo
Thanks Timo
On 16.08.2011 18:54, Jon Bonilla wrote:
El Tue, 16 Aug 2011 18:38:17 +0200 Timo Reimann timo.reimann@1und1.de escribió:
Hey Jon,
On 16.08.2011 18:14, Jon Bonilla wrote:
Looks pretty much like two dialog entries were created for the same dialog. One of them got cleaned up nicely on call termination while the other one keeps dangling until the dialog timer kicks in (12 hours in your case).
Most notably, this situation may happen when spiraling calls without using the spiral detection feature in the dialog module. (A spiral is a scenario where a proxy is processing the same request twice, e.g., to implement call forwardings.) Did you possibly toggle the default spiral detection setting from enabled to disabled?
No I didn't
Here's the dialog modparams:
modparam("dialog","dlg_flag", 9) modparam("dialog","profiles_no_value","total ;emergency") modparam("dialog","profiles_with_value","peer ; user ; type ; peerout ; userout")
Maybe setting the dialog to TOTAL twice is the problem? I set it before proxy authentication. I asume that dialog is erased when sending the 407 back and a new one is created for the second INVITE.
That's what spiral detection is supposed to prevent, i.e., the creation of multiple structures for the same dialog. However, it only works if you use the flag to track dialogs, not if you call dlg_manage() explicitly. I see you have actually configured flag #9 as dialog flag but just to make sure: Are you using that flag only, not dlg_manage() ?
I use dlg_manage, not the flag. I don't use the flag at all. What would be the difference using the flag? Any example about how to use the flag instead of dlg_manage?
dlg_manage() enables you to start dialog tracking not just on the initial INVITE but on any sequential request too. If you want to use the flag, all you need to do is enable it in the configuration script during the course of an INVITE processing. Practically, you replace a call to dlg_manage() with setflag().
I hate to say this but there's a compelling need to use dlg_manage() in your case: If you want to use profiles from INVITE messages, you cannot use the flag. This is, again, due to deficiencies in the module (sigh) which should probably be fixed. Until then, you're stuck with the spiral-detection-less (read: broken) dlg_manage().
You can solve your dilemma by waiting for SIP-Router 3.2 which improves the dialog module such that spiral detection is employed when calling dlg_manage() too. Alternatively, you can git-checkout my branch treimann/acc-cdr which already contains the fix. (This would also be a cool opportunity to have someone else but me test the code. :) ).
Even if you did not use dlg_manage(), however, dialogs would likely not be tracked properly if you started doing so on unauthenticated calls. The reason is that terminated calls do not get removed until the associated transaction is deleted. When that second, authenticated INVITE is handled by the dialog module, it will find a dialog structure already in the DELETED state (as the dialog ID matches) and refuse further processing on it.
This may be considered as a shortcoming of the current implementation. On the other hand, do you actually need to track unauthenticated calls? They could just be SIP attacks trying to disrupt your infrastructure and shouldn't be accounted as real calls whatsoever. If you start tracking dialogs after authentication, things should look better.
Only the dialog profile mark or also the dlg_manage?
dlg_manage() does the tracking magic; dialog profiling simply assigns dialogs to user-defined slots.
I set it very soon in my scenario because I also receive unauthenticated calls from trusted ips that I need to track. I'll try to check if I can deal with this and move it somewhere else.
Actually, if you have a legitimate reason to start tracking that early, the module should be fixed. Time's restricted on my side at the moment, however, but I'll try to get to it ASAP.
I think it's worth a try to see if my git branch already does better for your use case. If you could give it a shot I'd be glad to know how it works or whether it needs further work on.
Cheers,
--Timo
El Tue, 16 Aug 2011 19:26:25 +0200 Timo Reimann timo.reimann@1und1.de escribió:
I think it's worth a try to see if my git branch already does better for your use case. If you could give it a shot I'd be glad to know how it works or whether it needs further work on.
Hi Timo
I'll try to get the patch (or the whole dialog module) from your branch and merge it into my 3.1.3 build. I'll make some tests tomorrow and I'll report back to you.
thanks,
Jon
On 16.08.2011 19:32, Jon Bonilla wrote:
I think it's worth a try to see if my git branch already does better for your use case. If you could give it a shot I'd be glad to know how it works or whether it needs further work on.
I'll try to get the patch (or the whole dialog module) from your branch and merge it into my 3.1.3 build. I'll make some tests tomorrow and I'll report back to you.
The fix is part of the bigger commit c021559e41. I tried to apply the entire patch to both 3.1.3 and 3.1.4 but it produces too many failures each time.
So you'll either have to dig into the commit yourself and separate the fix, get a hold of my entire branch, or wait for 3.1.5 to release.
Sorry for any inconveniences.
Cheers,
--Timo
On 16.08.2011 19:43, Timo Reimann wrote:
So you'll either have to dig into the commit yourself and separate the fix, get a hold of my entire branch, or wait for 3.1.5 to release.
...or check out master: I just merged treimann/acc-cdr into master which was supposed to happen before the 3.2 feature freeze anyway.
Cheers,
--Timo
Hi Timo,
On 08/16/2011 07:59 PM, Timo Reimann wrote:
On 16.08.2011 19:43, Timo Reimann wrote:
So you'll either have to dig into the commit yourself and separate the fix, get a hold of my entire branch, or wait for 3.1.5 to release.
...or check out master: I just merged treimann/acc-cdr into master which was supposed to happen before the 3.2 feature freeze anyway.
Following up on this, I've seen couple of commits to dialog and merges to 3.1 yesterday. Is my assumption correct that this should actually work in the latest 3.1 branch now?
Andreas
Hey Andreas,
Am 27.08.2011 um 11:39 schrieb Andreas Granig:
On 08/16/2011 07:59 PM, Timo Reimann wrote:
On 16.08.2011 19:43, Timo Reimann wrote:
So you'll either have to dig into the commit yourself and separate the fix, get a hold of my entire branch, or wait for 3.1.5 to release.
...or check out master: I just merged treimann/acc-cdr into master which was supposed to happen before the 3.2 feature freeze anyway.
Following up on this, I've seen couple of commits to dialog and merges to 3.1 yesterday. Is my assumption correct that this should actually work in the latest 3.1 branch now?
Unfortunately not: The patches committed yesterday primarily fix a bug in reference counting management of the dialog module that could lead to segmentation faults in certain scenarios.
The issue which Jon brought up initially requires modifications to the dialog state machine with expectingly larger code changes. I haven't found the time to address this one yet but created a Flyspray entry which I assigned myself to:
http://sip-router.org/tracker/index.php?do=details&task_id=146
If you'd like to be informed when the bug is fixed, please register yourself as a watcher.
Cheers,
--Timo
El Tue, 16 Aug 2011 17:43:05 +0200 Jon Bonilla (Manwe) manwe@aholab.ehu.es escribió:
Another example, which is even worse for my configuration: A call rejected by the proxy itself sending a 480 response to the subscriber:
New request - M=INVITE Dialog set mark TOTAL Proxy authentication failed New request - M=INVITE Dialog set mark TOTAL Dialog set mark user and userout to 'd5a78dafcb93' Destination check failed, reject call with 480
Then, after the call has been rejected I see two dialogs in my list:
sercmd> dlg.profile_get_size total Profile:total => profile:total value: count:2 sercmd> dlg.profile_get_size user Profile:user => profile:user value: count:1 sercmd> dlg.profile_get_size userout Profile:userout => profile:userout value: count:1
sercmd> dlg.list hash:1002:1495821656 state:1 timestart:0 timeout:0 ... hash:2097:256764445 state:1 timestart:0 timeout:0 ...
Any ideas about this? Is it a bug or something I'm doing wrong in my configuration?
thanks,
Jon
El Tue, 16 Aug 2011 18:06:26 +0200 Jon Bonilla (Manwe) manwe@aholab.ehu.es escribió:
Then, after the call has been rejected I see two dialogs in my list:
sercmd> dlg.profile_get_size total Profile:total => profile:total value: count:2 sercmd> dlg.profile_get_size user Profile:user => profile:user value: count:1 sercmd> dlg.profile_get_size userout Profile:userout => profile:userout value: count:1
sercmd> dlg.list hash:1002:1495821656 state:1 timestart:0 timeout:0 ... hash:2097:256764445 state:1 timestart:0 timeout:0 ...
I've rechecked the dialog state of my second example call:
sercmd> dlg.list hash:1002:1495821656 state:1 timestart:0 timeout:0 hash:2097:256764445 state:1 timestart:0 timeout:0
They are still there. But the profile count has changed:
sercmd> dlg.profile_get_size total Profile:total => profile:total value: count:2 sercmd> dlg.profile_get_size user Profile:user => profile:user value: count:0 sercmd> dlg.profile_get_size userout Profile:userout => profile:userout value: count:0