xadhoom created an issue (kamailio/kamailio#4203)
<!-- Kamailio Project uses GitHub Issues only for bugs in the code or feature requests. Please use this template only for bug reports.
If you have questions about using Kamailio or related to its configuration file, ask on sr-users mailing list:
* https://lists.kamailio.org/mailman3/postorius/lists/sr-users.lists.kamailio....
If you have questions about developing extensions to Kamailio or its existing C code, ask on sr-dev mailing list:
* https://lists.kamailio.org/mailman3/postorius/lists/sr-dev.lists.kamailio.or...
Please try to fill this template as much as possible for any issue. It helps the developers to troubleshoot the issue.
Note that an issue report may be closed automatically after about 2 months if there is no interest from developers or community users on pursuing it, being considered expired. In such case, it can be reopened by writing a comment that includes the token `/notexpired`. About two weeks before considered expired, the issue is marked with the label `stale`, trying to notify the submitter and everyone else that might be interested in it. To remove the label `stale`, write a comment that includes the token `/notstale`. Also, any comment postpone the `expire` timeline, being considered that there is interest in pursuing the issue.
If there is no content to be filled in a section, the entire section can be removed.
You can delete the comments from the template sections when filling.
You can delete next line and everything above before submitting (it is a comment). -->
### Description
Enabled json logging engine, with the following setup: - kam cmd line params: `--log-engine=json:Mp` - prefix: `log_prefix=", "callid": "$ci","srcip": "$si", "ts": $TV(Sn), "method": "$rm", "mt": $mt, "ua": "$(ua{s.escape.common})", "cseq": "$hdr(CSeq)", "status": $rs "` - another prefix (same result): `log_prefix=", "callid": "$ci", "ts": $TV(Sn) "`
What I see that certain json lines are truncated, for example - one line is interrupted by another json entry - one line starts "truncated"
### Troubleshooting What I can observe is that seems to happens always with same "kind" of line, usually internal logs, which don't have callid etc.
#### Log Messages
<!-- Check the syslog file and if there are relevant log messages printed by Kamailio, add them next, or attach to issue, or provide a link to download them (e.g., to a pastebin site). -->
The following first 3 lines are ok, then the latest is corrupted: ``` { "time": "2025-04-04T15:51:44.448427533Z", "idx": 193, "pid": 200, "level": "DEBUG", "module": "core", "file": "core/receive.c", "line": 128, "function": "sip_check_fline", "message": "first line indicates a SIP reply" } { "time": "2025-04-04T15:51:44.448682606Z", "idx": 194, "pid": 201, "level": "DEBUG", "module": "core", "file": "core/receive.c", "line": 128, "function": "sip_check_fline", "message": "first line indicates a SIP reply" } { "time": "2025-04-04T15:52:14.448279053Z", "idx": 195, "pid": 202, "level": "DEBUG", "module": "core", "file": "core/receive.c", "line": 128, "function": "sip_check_fline", "message": "first line indicates a SIP reply" } { "time": "2025-04-04T15:52:14.448505797Z", "idx": 196, "pid": 203, "level": "DEBUG", "module": "core", "file": "core/receive.c"{ "time": "2025-04-04T15:52:14.448508084Z", "idx": 195, "pid": 202, "level": "DEBUG", "module": "core", "file": "core/parser/msg, "line": 128, "function": "sip_check_fline", "message": "first line indicates a SIP reply" } ```
or (notice 2nd line or 5th one)
``` { "time": "2025-04-04T16:13:33.789641936Z", "idx": 221, "pid": 228, "level": "DEBUG", "module": "core", "file": "core/receive.c"{ "time": "2025-04-04T16:13:33.789642978Z", "idx": 222, "pid": 229, "level": "DEBUG", "module": "core", "file": "core/xavp.c", ", "line": 265, "function": "ksr_evrt_pre_routing", "callid": "5715efe40e9b7721-264@127.0.0.1", "ts": 1743783213.789640 , "message": "event route core:pre-routing not defined" } line": 630, "function": "xavp_destroy_list", "callid": "5715efe40e9b7720-264@127.0.0.1", "ts": 1743783213.789527 , "message": "destroying xavp list (nil)" } { "time": "2025-04-04T16:13:33.789646470Z", "idx": 222, "pid": 229, "level": "DEBUG", "module": "core", "file": "core/xavp.c", "line": 630, "function": "xavp_destroy_list", "callid": "5715efe40e9b7720-264@127.0.0.1", "ts": 1743783213.789527 , "message": "destroying xavp list (nil)" } { "time": "2025-04-04T16:13:33.789648201Z", "idx": 221, "pid": 228, "level": "DEBUG", "module": "tm", "file": "t_lookup.c", "line": 1579, "function": "t_check_msg", "callid": "5715efe40e9b7721-264@127.0.0.1", "ts": 1743783213.789640 , "message": "msg (0x7f{ "time": "2025-04-04T16:13:33.789649346Z", "idx": 222, "pid": 229, "level": "DEBUG", "module": "core", "file": "core/receive.c"8f7a693c50) id=2/228 global id=1/228 T start=0xffffffffffffffff" } , "line": 637, "function": "receive_msg", "callid": "5715efe40e9b7720-264@127.0.0.1", "ts": 1743783213.789527 , "message": "cleaning up" } { "time": "2025-04-04T16:13:33.789652839Z", "idx": 221, "pid": 228, "level": "DEBUG", "module": "tm", "file": "t_lookup.c", "line": 1424, "function": "t_reply_matching", "callid": "5715efe40e9b7721-264@127.0.0.1", "ts": 1743783213.789640 , "message": "t_reply_matching: hash 26404 label 0 branch 0" } ```
### Additional Information
* **Kamailio Version** - output of `kamailio -v`
``` 6.0.1 ```
happens also with 5.8.x
* **Operating System**:
<!-- Details about the operating system, the type: Linux (e.g.,: Debian 8.4, Ubuntu 16.04, CentOS 7.1, ...), MacOS, xBSD, Solaris, ...; Kernel details (output of `lsb_release -a` and `uname -a`) -->
``` Debian noble, docker container ```
xadhoom left a comment (kamailio/kamailio#4203)
Additional infos:
- logging happens over stderr: kamailio `-E` flag and `log_stderror=yes` - kamailio forked with `-DD` to have it running as pid 1 in the container
henningw left a comment (kamailio/kamailio#4203)
Thanks for the report, looks like some internal corruption indeed. How often does this happen, it is somehow related to the load etc..?
xadhoom left a comment (kamailio/kamailio#4203)
No load, is just a test system. the only traffic I have are dispatcher pings toward a couple of kamailios.
Happens continuosly, I would say one every 20 logs line (when in debug mode)
xadhoom left a comment (kamailio/kamailio#4203)
Going on with investigation, seems that also normal logs are affected, for example: ``` 130(137) DEBUG: , "callid": "1a0c92885690f77a-264@127.0.0.1", "ts": 1744705480.227433 dispatcher [dispatch.c:3339]: ds_update_st129(136) DEBUG: , "callid": "1a0c92885690f779-264@127.0.0.1", "ts": 1744705480.227412 dispatcher [dispatch.c:3339]: ds_update_state(): update state for sip:172.23.42.2:5060 in group 1 to 8 ate(): update state for sip:172.23.42.3:5060 in group 1 to 8 130(137) DEBUG: , "callid": "1a0c92885690f77a-264@127.0.0.1", "ts": 1744705480.227433 dispatcher [dispatch.c:3420]: ds_update_st129(136) DEBUG: , "callid": "1a0c92885690f779-264@127.0.0.1", "ts": 1744705480.227412 dispatcher [dispatch.c:3420]: ds_update_state(): old state was 0, set new state to 8 ate(): old state was 0, set new state to 8 ```
Or ``` 129(136) DEBUG: , "callid": "1a0c92885690f779-264@127.0.0.1", "ts": 1744705480.227412 tm [t_reply.c:2269]: local_reply(): branch130(137) DEBUG: , "callid": "1a0c92885690f77a-264@127.0.0.1", "ts": 1744705480.227433 tm [t_reply.c:1377]: t_should_relay_respon=0, save=0, winner=0 se(): ->>>>>>>>> T_code=0, new_code=200 129(136) DEBUG: , "callid": "1a0c92885690f779-264@127.0.0.1", "ts": 1744705480.227412 tm [t_reply.c:2308]: local_reply(): local 130(137) DEBUG: , "callid": "1a0c92885690f77a-264@127.0.0.1", "ts": 1744705480.227433 tm [t_reply.c:1633]: t_should_relay_respontransaction completed 200/0 (totag retr: 0/1024) se(): rps completed - uas status: 0 130(137) DEBUG: , "callid": "1a0c92885690f77a-264@127.0.0.1", "ts": 1744705480.227433 tm [t_reply.c:2269]: local_reply(): branch129(136) DEBUG: , "callid": "1a0c92885690f779-264@127.0.0.1", "ts": 1744705480.227412 tm [h_table.c:534]: tm_xdata_swap(): copy =0, save=0, winner=0 X/AVPs from msg context to txdata ```
Note the spurious lines?
xadhoom left a comment (kamailio/kamailio#4203)
Going on with my tests, those errors seems "interleaving" problems, where one thread writes to stderr while another is writing, most of the times you can see the "interrupted" line continuing after the interleaved one.
This should not happen, since fprintf should be atomic according to glibc, but this goes out of my reach.
I've tested rebuilding my container with rocky 9 and voilà, problem fixes. So is definitely something related to glibc/gcc differences between noble and rocky.
Now I'm gonna test it with debian bookworm and see what happens.
xadhoom left a comment (kamailio/kamailio#4203)
Ok a container with debian bookworm works as expected. So is definitely something within ubuntu noble.
Closed #4203 as completed.
miconda left a comment (kamailio/kamailio#4203)
Closing as not related to Kamailio code.