Hi Hugh,
that is indeed the problem, and I have noticed the same pattern in similar
modules. I remember having this issue with the ims_charging module, which I
patched for enhancements and which suspends and resumes transactions in a
similar fashion.
In error conditions the return code can be left unset, causing the crash
you described. To avoid that, I imported the transaction AVP list right
after the transaction lookup and before checking for any kind of error.
I also always set the AVP to a known value, whether there was an error or
not. I have attached a patch (which I haven't tested) that will hopefully
cover all possible scenarios. Could you please try it?
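To make the ordering concrete, here is a minimal, self-contained sketch of
the idea. It is not the attached patch and it does not use the Kamailio API:
avp_list_t, transaction_t, import_avps(), set_return_code() and on_answer()
are hypothetical stand-ins, and the -2 default is simply the value seen in
the logs quoted below.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types, NOT the Kamailio structures: just enough to show ordering. */
typedef struct { int return_code; int has_return_code; } avp_list_t;
typedef struct { avp_list_t avps; } transaction_t;

/* Stand-in for the per-process "current AVP list" that the script reads. */
static avp_list_t *current_avps = NULL;

/* Point the current AVP list at the one saved in the transaction. */
void import_avps(transaction_t *t)
{
    assert(t != NULL);
    current_avps = &t->avps;
}

/* Store the return code in whatever list is currently imported. */
void set_return_code(int rc)
{
    assert(current_avps != NULL);
    current_avps->return_code = rc;
    current_avps->has_return_code = 1;
}

/*
 * Callback run when the answer (or a timeout) arrives.
 * t         : result of the transaction lookup, NULL if it failed
 * answer_rc : result taken from the answer, NULL if there is none
 * Returns 0 if the transaction can be resumed, -1 otherwise.
 */
int on_answer(transaction_t *t, int is_timeout, const int *answer_rc)
{
    int rc = -2;               /* known failure default */

    if (t == NULL)
        return -1;             /* nothing to resume */

    import_avps(t);            /* first thing after the lookup */

    if (!is_timeout && answer_rc != NULL)
        rc = *answer_rc;       /* only the success path overrides it */

    set_return_code(rc);       /* always set, error or not */
    return 0;
}
```

With this shape, every path that resumes the transaction goes through
import_avps() and set_return_code(), so the resumed route never reads an
unset or stale value.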
Regarding the null string when printing the AVP from the configuration
file, I think this is an interpolation problem caused by the AVP being
numeric rather than alphanumeric. If you check the function that creates
the AVP, you will notice AVP_NAME_STR as the type. I also changed this in
the ims_charging module, and here is the function that I used instead
[1].
[1]
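One way to picture the suspected mismatch is the sketch below. It is only
an illustration under assumptions: demo_avp_t, DEMO_VAL_STR and both
functions are hypothetical and are not Kamailio's definitions, and it is
not the function referenced in [1]. It shows a value stored as an integer
looking empty to a reader that only handles strings, while a type-aware
formatter prints it correctly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_VAL_STR 0x1u   /* hypothetical flag: the value holds a string */

/* Hypothetical AVP value: either an integer or a string, tagged by flags. */
typedef struct {
    unsigned flags;
    union { int n; const char *s; } val;
} demo_avp_t;

/* A reader that only understands string values: integers come back as NULL. */
const char *demo_avp_as_string_only(const demo_avp_t *a)
{
    assert(a != NULL);
    return (a->flags & DEMO_VAL_STR) ? a->val.s : NULL;
}

/* A type-aware reader: formats the value as text however it was stored. */
const char *demo_avp_to_text(const demo_avp_t *a, char *buf, size_t len)
{
    assert(a != NULL && buf != NULL && len > 0);
    if (a->flags & DEMO_VAL_STR)
        snprintf(buf, len, "%s", a->val.s ? a->val.s : "<null>");
    else
        snprintf(buf, len, "%d", a->val.n);
    return buf;
}
```

Storing the return code as a string, or reading it with a type-aware
accessor, would both avoid the empty result in this model.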
Regards,
On Tue, Dec 17, 2013 at 2:12 PM, Hugh Waite <hugh.waite(a)crocodile-rcs.com>
wrote:
Hi Carlos, Carsten,
From a bit of code inspection, it looks like this affects the error paths
for the diameter responses.
I've seen these warnings printed from both the s-cscf, and the i-cscf when
there were diameter timeouts (although it didn't cause a crash every time).
Dec 15 12:13:23 kamailio kam-scscf[22542]: ERROR: <script>: We need to do
an UNREG server SAR assignemnt
Dec 15 12:13:23 kamailio kam-scscf[22542]: INFO: ims_registrar_scscf
[cxdx_sar.c:79]: create_return_code(): created AVP successfully :
[saa_return_code] - [-2]
Dec 15 12:13:23 kamailio kam-scscf[22553]: INFO: ims_registrar_scscf
[cxdx_avp.c:138]: cxdx_get_avp(): cxdx_get_experimental_result_code: Failed
finding avp
Dec 15 12:13:23 kamailio kam-scscf[22553]: INFO: ims_registrar_scscf
[cxdx_avp.c:138]: cxdx_get_avp(): cxdx_get_charging_info: Failed finding avp
Dec 15 12:13:23 kamailio kam-scscf[22553]: ERROR: <script>: Unknown return
code from SAR, value is [<null>]
...
Dec 16 17:53:51 kamailio kam-icscf[23653]: INFO: ims_icscf
[cxdx_uar.c:71]: create_uaa_return_code(): created AVP successfully :
[uaa_return_code]
Dec 16 17:53:57 kamailio kam-icscf[23666]: ERROR: ims_icscf
[cxdx_uar.c:107]: async_cdp_uar_callback(): Error timeout when sending
message via CDP
Dec 16 17:53:57 kamailio kam-icscf[23666]: ERROR: <script>: Unknown return
code from UAR, value is [<null>]
I think there are two issues:
1) The return_code avp does not work, causing a NULL value or a crash. I
experimented by restoring the avp lists from the suspended transaction in
the 'error:' section and this seems to work (attached patch): I can now see
the "-2" return code that was set up before the suspend. I'll leave it to
you or others to decide whether the error handling in this function is done
properly and whether my patch is useful.
Dec 17 16:41:07 kamailio kam-scscf[25089]: ERROR: <script>: We need to do
an UNREG server SAR assignemnt
Dec 17 16:41:07 kamailio kam-scscf[25089]: INFO: ims_registrar_scscf
[cxdx_sar.c:79]: create_return_code(): created AVP successfully :
[saa_return_code] - [-2]
Dec 17 16:41:07 kamailio kam-scscf[25099]: INFO: ims_registrar_scscf
[cxdx_avp.c:138]: cxdx_get_avp(): cxdx_get_experimental_result_code: Failed
finding avp
Dec 17 16:41:07 kamailio kam-scscf[25099]: INFO: ims_registrar_scscf
[cxdx_avp.c:138]: cxdx_get_avp(): cxdx_get_charging_info: Failed finding avp
Dec 17 16:41:07 kamailio kam-scscf[25099]: ERROR: <script>: SAR error -
error response sent from module
2) In these error cases, the original transaction is not responded to.
This leaves hanging calls and other requests. Perhaps the example cfgs could
be updated with default replies in the appropriate places.
Let me know if there are patches you want me to try.
Hugh
On 15/12/2013 21:17, Hugh Waite wrote:
Hello,
I am seeing a crash within the latest ims modules using the example cfg
scripts. It also happened in 4.1.
1) The s-cscf receives a request from an application server and runs
'assign_server_unreg' (cfg line 368) because the intended destination is not
registered.
2) The HSS returns an error '5012: Unable to comply' and the suspended
transaction is resumed into the UNREG_SAR_REPLY route (cxdx_sar.c:290)
3) The coredump shows that the AVP lists are nonsensical, so the action to
get $avp(s:saa_return_code) causes a crash.
Do the avp lists need to be re-initialised from the suspended transaction,
like in the 'success/done' section (cxdx_sar.c:252)?
Maybe someone who is more familiar with this code can shed some light on
this?
Also, in this scenario I can't see a code path that will send a response
back to the application server, e.g. '480 Temporarily Unavailable'. Should
this be done in the cfg before calling assign_server_unreg?
Regards,
Hugh
Backtrace:
(gdb) bt
#0 0x000000000053dc89 in match_by_name (avp=0x303630363a6d6f63, id=116,
name=0x7ffff29895f8) at usr_avp.c:391
#1 0x000000000053e411 in search_next_avp (s=0x7ffff29895f0,
val=0x7ffff2989630) at usr_avp.c:507
#2 0x000000000053e120 in search_avp (ident=..., val=0x7ffff2989630,
state=0x7ffff29895f0) at usr_avp.c:475
#3 0x000000000053de09 in search_first_avp (flags=1, name=...,
val=0x7ffff2989630, s=0x7ffff29895f0) at usr_avp.c:427
#4 0x00007fa8de2f5626 in pv_get_avp (msg=0x7ffff298a030,
param=0x7fa8de86b898, res=0x7ffff2989760) at pv_core.c:1475
#5 0x0000000000499270 in pv_get_spec_value (msg=0x7ffff298a030,
sp=0x7fa8de86b880, value=0x7ffff2989760) at pvapi.c:1266
#6 0x00000000004c5f03 in rval_get_int (h=0x7ffff2989ef0,
msg=0x7ffff298a030, i=0x7ffff2989d58, rv=0x7fa8de86b878, cache=0x0) at
rvalue.c:978
#7 0x00000000004c89f5 in rval_expr_eval_int (h=0x7ffff2989ef0,
msg=0x7ffff298a030, res=0x7ffff2989d58, rve=0x7fa8de86b870) at rvalue.c:1918
#8 0x0000000000420648 in do_action (h=0x7ffff2989ef0, a=0x7fa8de86eaa8,
msg=0x7ffff298a030) at action.c:1219
#9 0x0000000000422878 in run_actions (h=0x7ffff2989ef0, a=0x7fa8de86aa30,
msg=0x7ffff298a030) at action.c:1599
#10 0x0000000000423017 in run_top_route (a=0x7fa8de86aa30,
msg=0x7ffff298a030, c=0x0) at action.c:1685
#11 0x00007fa8de59eae3 in t_continue (hash_index=15710, label=170389234,
route=0x7fa8de86aa30) at t_suspend.c:245
#12 0x00007fa8da1ebc98 in async_cdp_callback (is_timeout=0,
param=0x7fa8d5c68f40, saa=0x0, elapsed_msecs=1) at cxdx_sar.c:290
#13 0x00007fa8db23cacb in api_callback (p=0x7fa8d5c24d40,
msg=0x7fa8d5c5aca8, ptr=0x0) at api_process.c:115
#14 0x00007fa8db27ad87 in worker_process (id=2) at worker.c:330
#15 0x00007fa8db257aea in diameter_peer_start (blocking=0) at
diameter_peer.c:309
#16 0x00007fa8db25a02b in cdp_child_init (rank=0) at mod.c:237
#17 0x00000000004f7ec2 in init_mod_child (m=0x7fa8de841158, rank=0) at
sr_module.c:924
#18 0x00000000004f7d65 in init_mod_child (m=0x7fa8de841d00, rank=0) at
sr_module.c:921
#19 0x00000000004f7d65 in init_mod_child (m=0x7fa8de8420a8, rank=0) at
sr_module.c:921
#20 0x00000000004f7d65 in init_mod_child (m=0x7fa8de842458, rank=0) at
sr_module.c:921
#21 0x00000000004f7d65 in init_mod_child (m=0x7fa8de842ae8, rank=0) at
sr_module.c:921
#22 0x00000000004f7d65 in init_mod_child (m=0x7fa8de842f60, rank=0) at
sr_module.c:921
#23 0x00000000004f8048 in init_child (rank=0) at sr_module.c:948
#24 0x000000000046d57c in main_loop () at main.c:1694
#25 0x000000000047030b in main (argc=13, argv=0x7ffff298af78) at
main.c:2533
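The avp=0x303630363a6d6f63 value in frame #0 is worth a closer look. The
small helper below (pointer_as_ascii is a hypothetical name, and it assumes
a little-endian host such as x86-64, which the addresses suggest but the
report does not state) prints the eight bytes of that value in memory
order. They turn out to be printable ASCII, which is consistent with the
AVP list pointing at string data rather than at a valid avp structure.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Write the 8 bytes of `value` into `out`, least significant byte first
 * (the order they occupy in memory on a little-endian machine), replacing
 * non-printable bytes with '.'. `out` must have room for 9 chars.
 */
void pointer_as_ascii(uint64_t value, char out[9])
{
    assert(out != NULL);
    for (int i = 0; i < 8; i++) {
        unsigned char byte = (unsigned char)((value >> (8 * i)) & 0xffu);
        out[i] = (byte >= 0x20 && byte < 0x7f) ? (char)byte : '.';
    }
    out[8] = '\0';
}
```

Calling pointer_as_ascii(0x303630363a6d6f63ULL, text) yields "com:6060",
which looks like the tail of a host:port string.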
--
Hugh Waite
Principal Design Engineer
Crocodile RCS Ltd.
_______________________________________________
sr-dev mailing list
sr-dev(a)lists.sip-router.org
http://lists.sip-router.org/cgi-bin/mailman/listinfo/sr-dev
+595981146623