### Description
I have two JSON-RPC servers configured with different priorities. For testing, I have the servers configured to always delay the response to any request by more than the module's timeout setting.
The (initial) request is sent to the first server. As this one times out, I would expect a retry to go to the second server, but instead all retries are sent to the same server; the backup server is never contacted. This makes the whole "priority" system seem rather useless.
### Troubleshooting
#### Reproduction
```
modparam("janssonrpcc", "server", "conn=test;addr=pc1;port=8081;priority=5;weight=10")
modparam("janssonrpcc", "server", "conn=test;addr=pc1;port=8082;priority=5;weight=10")
```
```
janssonrpc_request("test", "Test.Timeout", '[ { "Timout": 1000} ]', "route=JSONRPC_RESPONSE;retry=10;timeout=1000");
```
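For the failover case from the description, where the two servers have distinct priorities, the configuration would presumably look like the following (the concrete priority values here are illustrative only, meant as two distinct priority levels):
```
modparam("janssonrpcc", "server", "conn=test;addr=pc1;port=8081;priority=10;weight=10")
modparam("janssonrpcc", "server", "conn=test;addr=pc1;port=8082;priority=5;weight=10")
```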
#### Log Messages
Kamailio produces no useful log messages beyond the ones below; I verified the described behavior on the JSON-RPC server itself.
```
2023-02-23T16:59:34.585346+01:00 pc1 proxy1[340870]: INFO: janssonrpcc [janssonrpc_connect.c:361]: bev_connect(): Connecting to server pc1:8081 for conn rating.
2023-02-23T16:59:34.585420+01:00 pc1 proxy1[340870]: INFO: janssonrpcc [janssonrpc_connect.c:361]: bev_connect(): Connecting to server pc1:8082 for conn rating.
2023-02-23T16:59:34.585446+01:00 pc1 proxy1[340870]: INFO: janssonrpcc [janssonrpc_connect.c:290]: bev_connect_cb(): Connected to host pc1:8081
2023-02-23T16:59:34.585462+01:00 pc1 proxy1[340870]: INFO: janssonrpcc [janssonrpc_connect.c:290]: bev_connect_cb(): Connected to host pc1:8082
2023-02-23T17:05:10.903398+01:00 pc1 proxy1[340870]: WARNING: janssonrpcc [janssonrpc_request.c:247]: schedule_retry(): Number of retries exceeded. Failing request.
```
### Possible Solutions
Combining retries with a timeout and priorities is tricky: which server should be tried when? Retrying only within the first priority level makes the lower-priority servers completely useless, while moving to the next priority level on every retry skips possibly still-useful high-priority servers and may exhaust the list of candidate servers very quickly.
The best solution, IMHO, would be to first try every server in the highest priority level before moving on to the next one, without (exponential) backoff between these steps.
If there are still retries remaining after that, wrap around to the highest priority level and apply the exponential backoff delay.
With the above, failover considers all servers, failover between servers is fast, and no single server is overloaded.
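A rough sketch of that retry order, assuming the server list is kept sorted from most- to least-preferred priority (the struct and function names here are hypothetical, not the module's actual internals):
```c
#include <stddef.h>

/* Hypothetical server entry; illustrative only, not janssonrpcc's
 * actual data structure. The list is assumed to be sorted from
 * most- to least-preferred priority. */
typedef struct server {
    int priority;
    struct server *next;
} server_t;

/* Pick the server for the next retry attempt: walk the whole list once,
 * so every server of one priority level is tried before falling through
 * to the next level. Only when the list is exhausted do we wrap around
 * to the head again, and only that wrap-around enables the exponential
 * backoff delay. */
server_t *next_retry_server(server_t *list, server_t *last_tried, int *backoff)
{
    server_t *s = (last_tried != NULL) ? last_tried->next : list;
    if (s != NULL) {
        *backoff = 0;   /* still in the current pass: retry immediately */
        return s;
    }
    *backoff = 1;       /* all servers tried once: wrap around with backoff */
    return list;
}
```
Within one priority level, the weight-based selection could still apply, as long as a server that was already tried in the current pass is skipped.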
BTW, if I configure multiple servers with the same priority, it seems to randomly select one of them for every (re)try; it never selects one from the next priority level.
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
```
5.6.1
```
This issue is stale because it has been open 6 weeks with no activity. Remove stale label or comment or this will be closed in 2 weeks.
Closed #3378 as completed.
Reopened #3378.
Closed #3378 as completed.
Reopened #3378.
This issue is stale because it has been open 6 weeks with no activity. Remove stale label or comment or this will be closed in 2 weeks.
Well, IMHO this is just rude. Reporting bugs takes a lot of time, and threatening to close them without any interaction is an insult to the users trying to help improve the product.
@gaaf: you have to be patient, because this is a process that we are trying to put in place in order to optimise/automate some devel-related tasks, while a couple of people are meeting right now in Dusseldorf at the Kamailio Developers Meeting.
If someone needs to keep an issue or pull request open, they can do so; the message says to remove the stale label.
For the record, the bot apparently removes the stale flag automatically now when there is activity.
This issue is stale because it has been open 6 weeks with no activity. Remove stale label or comment or this will be closed in 2 weeks.
Closed #3378 as not planned.