On Thursday 28 February 2008, Sergio Gutierrez wrote:
My OpenSER 1.3 installation running on Solaris SPARC is facing random and
unexpected crashes, apparently related to the timer process.
The last core shows the following backtrace:
#0 0xfe977a04 in get_expired_dlgs (time=4233810208) at dlg_timer.c:194
#1 0xfe977540 in dlg_timer_routine (ticks=7980, attr=0x0) at dlg_timer.c:210
#2 0x000a839c in timer_ticker (timer_list=0x15ec00) at timer.c:275
#3 0x000a80ec in run_timer_process (tpl=0x1b8088, do_jiffies=1) at timer.c:357
#4 0x000a8668 in start_timer_processes () at timer.c:386
#5 0x00035ea8 in main_loop () at main.c:873
#6 0x000397c4 in main (argc=-4195024, argv=0x150e9c) at main.c:1372
Thanks in advance for any hint you can give me.
Hi Sergio,
signal 10 is SIGBUS on Solaris. This can be caused by an invalid address
alignment, a segmentation fault on a physical address, or an object-specific
hardware error (according to Wikipedia).
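To illustrate the alignment case: SPARC requires a 4-byte integer to be read
from a 4-byte aligned address, and a direct load through a misaligned pointer
raises SIGBUS there, while the same code usually appears to work on x86. The
following is only a minimal sketch of the general technique, not code from the
OpenSER sources. The function names is_aligned, load_u32_unaligned and
store_u32_unaligned are hypothetical and made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if p is a multiple of `align` bytes.
 * `align` must be a non-zero power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical helper: read a 32-bit value from an address that may be
 * misaligned. memcpy copies byte by byte as far as the language is
 * concerned, so no aligned 4-byte load is required.
 * A direct "*(const uint32_t *)p" on a misaligned p is what can raise
 * SIGBUS on SPARC. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical helper: the matching store for a possibly misaligned address. */
static void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof v);
}
```

A check like is_aligned(ptr, 4) on a suspicious pointer, or simply looking at
the low bits of the pointer value in the debugger, can help to tell a
misaligned access apart from a pointer into freed or corrupted memory.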
The first crashes were both caused by get_all_ucontact, triggered by a
timer. This crash is now in another timer, the deletion of expired dialogs,
which is strange. Is this machine otherwise stable? When (with which OpenSER
release) did these crashes start?
Have you already inspected the data structures used in the get_expired_dlgs
function with the debugger? Perhaps there is something wrong in there.
Cheers,
Henning