Does anyone know what causes this?
set_timer for 1 list called on a "detached" timer -- ignoring
I also see
set_timer for 3 list called on a "detached" timer -- ignoring
When this happens Openser seems to lock up for 10 seconds or so.
From searching it appears this is caused by a race but I am not sure what the race is or why this results in an unresponsive openser instance for multiple seconds.
Transaction expiration racing reply?
Desperately need to understand how this could be triggered so I can get customer to adjust system.
Any way to adjust?
tried tweaking fr_inv_timer but no joy.
TR
Hi TR,
it is a race between the expire event (from the timer) and inserting again on a timer list. 1 is the final response timer list (fr_timer); 3 is the wait timer list (wt_timer).
I would say there is no way this could lead to any kind of lock.
what version are you using? what makes you say it locks?
regards, bogdan
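The race described in this reply can be illustrated with a small toy model. This is only a sketch under assumptions, not OpenSER's timer code: the class names, the `DETACHED` sentinel and the `expire` helper are all hypothetical, and only the list numbers (1 for fr_timer, 3 for wt_timer) and the warning text come from the thread. It shows one party expiring a timer link while another later tries to put the same link back on a list, and the second call being ignored with a warning.

```python
import threading

# Hypothetical sentinel: the timer process has already taken this link off its list.
DETACHED = object()


class TimerLink:
    def __init__(self):
        self.owner = None      # None = never armed, list id = armed, DETACHED = expired
        self.deadline = None


class TimerLists:
    """Toy stand-in for the TM timer lists; ids 1 and 3 mirror the log lines."""

    def __init__(self):
        self.lock = threading.Lock()
        self.lists = {1: [], 3: []}   # 1: final response (fr_timer), 3: wait (wt_timer)
        self.warnings = []

    def set_timer(self, link, list_id, deadline):
        """Arm or re-arm a link; refuse if the link was already expired."""
        with self.lock:
            if link.owner is DETACHED:
                self.warnings.append(
                    f'set_timer for {list_id} list called on a "detached" timer -- ignoring'
                )
                return False
            if link.owner is not None:
                self.lists[link.owner].remove(link)
            link.owner = list_id
            link.deadline = deadline
            self.lists[list_id].append(link)
            return True

    def expire(self, list_id, now):
        """What the timer process does: pull due links off the list and mark them."""
        with self.lock:
            fired = [tl for tl in self.lists[list_id] if tl.deadline <= now]
            for tl in fired:
                self.lists[list_id].remove(tl)
                tl.owner = DETACHED
            return fired


timers = TimerLists()
link = TimerLink()
timers.set_timer(link, 1, deadline=30)             # transaction starts waiting for a final reply
fired = timers.expire(1, now=30)                   # timer process expires it first
rearmed = timers.set_timer(link, 1, deadline=60)   # late re-insert loses the race
print(timers.warnings)
```

In this model the late `set_timer` call is dropped rather than blocking anything, which matches the claim that the warning by itself should not lock the proxy.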
Devel mailing list Devel@openser.org http://openser.org/cgi-bin/mailman/listinfo/devel
Bogdan,
I have been chasing this for days and done lots of debugging, using 1.1.1. While looking at the network trace at the time of these messages (I usually see at least 5 in a row with differing hex values) I see many incoming packets coming into the box and no response from the proxy for somewhere between 5 and 10 seconds, then a flood of responses from the proxy. I can email you a sample pcap file if you like.
As part of my debugging I forced a 100 reply at the very top of my cfg file. The forced 100 was not sent during the locked-up time, leading me to believe openser was not processing incoming packets.
I have now seen this on multiple servers in different locations. Likely a particular customer call flow is causing this but I have not been able to pin it down to the exact customer. These proxies run pretty fast during the day so finding a pattern leading up to this issue is difficult.
What could I add to the Log output to identify the offending sip-callid? Is sip-callid or branch tag or anything similar easily accessible in any of the data structs in timer.c?
TR
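The symptom described above (requests keep arriving, nothing goes out for several seconds, then a flood of replies) can be located in a trace mechanically. The following is a rough sketch only: it assumes the capture has already been reduced to (timestamp, direction) pairs by some other tool, and the function name, the sample trace and the 5 second threshold are all made up for illustration.

```python
def find_stalls(packets, min_gap):
    """Return (start, end, queued) for each silence in outbound traffic that
    lasts at least min_gap seconds while inbound packets kept arriving.

    packets: (timestamp_seconds, direction) pairs sorted by time,
             where direction is "in" or "out" relative to the proxy.
    """
    stalls = []
    last_out = None
    queued = 0                      # inbound packets seen since the last outbound one
    for ts, direction in packets:
        if direction == "in":
            queued += 1
            continue
        if last_out is not None and queued and ts - last_out >= min_gap:
            stalls.append((last_out, ts, queued))
        last_out = ts
        queued = 0
    return stalls


# Hypothetical trace: normal traffic, then a silent window, then a burst of replies.
trace = [
    (0.0, "in"), (0.1, "out"),
    (1.0, "in"), (1.1, "out"),
    (2.0, "in"), (4.0, "in"), (7.5, "in"),
    (9.3, "out"), (9.3, "out"), (9.4, "out"),
]
stalls = find_stalls(trace, min_gap=5.0)
print(stalls)
```

Lining up the start of each reported window with the proxy log timestamps would show whether the "detached" warnings come before the stall or only after it ends.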
Is it possible the locked state I am seeing with openser leads to the "detached" timer? Since the "detached" timer is a race, it would make sense to see the race condition after openser locks up and messages buffer up in the stack. When a bunch of messages are processed all at once by multiple threads the race condition would occur. Does this make sense?
Maybe I have been focusing on the wrong place.
Ignoring the "detached" timer what could cause openser to hang for a couple seconds then clear every 5 - 10 minutes?
Ideas?
We are seeing this on 3 different production servers.
Thanks
TR
using openser1.1.1
Users mailing list Users@openser.org http://openser.org/cgi-bin/mailman/listinfo/users
FYI All
This turned out to be a database write ( acc ) that was blocking due to a raid card problem.
Actually more likely it has been both. The root problem lies in the timer subsystem and may be amplified by other troubles (or amplify those).
-jiri
At 01:35 30/03/2007, T.R. Missner wrote:
> FYI All
> This turned out to be a database write ( acc ) that was blocking due to a raid card problem.
-- Jiri Kuthan http://iptel.org/~jiri/
wrong again :)
as I mentioned in my previous email, the "detached timer" was more a marker that something else was going wrong - there was no amplification.
and as TR clearly said, the problem was with DB connectivity and had nothing to do with TM timers.
regards, bogdan
At 14:48 30/03/2007, Bogdan-Andrei Iancu wrote:
> wrong again :)
I wish it would be.
The operational experience shows us that in the former versions there have been race conditions which do cause troubles under hard-to-reproduce conditions. Based on surface knowledge, it appears that openser has inherited those from ser before ser's overhaul of those.
> as I mentioned in my previous email, the "detached timer" was more a marker that something else was going wrong - there was no amplification.
lucky those who haven't been affected by the race conditions. My point is though, this particular warning correlates with nondeterminism.
> and as TR clearly said, the problem was with DB connectivity and had nothing to do with TM timers.
Well, as a matter of fact, I have witnessed several failures which coincidentally appeared with this warning. Studying the code will reveal to you and anyone else that actually this warning is just a hack which helps to ignore erroneous conditions and survive those, but doesn't heal the cause of the problem, which may still generate dysfunctional service.
Again -- I don't mean to demonize it, with this -ignore-the-problem-hack things have been running mostly fine.
-jiri
well, the openser-related information you based your statements/opinions on is quite outdated, as a lot of work was done in that area.
please try to update with the progress of the openser project.
bogdan
Hi TR,
that explains it! if acc becomes blocking the transactions will be delayed for certain ops (as TM will block processing in the callbacks triggered by acc), so the probability for the race to show up was increased.
So the "detached timer" was not the source of the problem - as I told you, it can never lead to block or crash -, but it was a side effect because of the acc blocking.
regards, bogdan
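To make the effect of a blocking accounting write concrete, here is a deliberately simplified simulation. It is a sketch under stated assumptions rather than a model of OpenSER itself: a single worker handles requests strictly in order, time is simulated instead of measured, and the 8 second block on one request is an invented figure standing in for the stalled database write.

```python
def simulate_worker(arrivals, service_time, blocking):
    """Single worker handling requests in arrival order.

    arrivals:     sorted arrival times in seconds.
    service_time: normal per-request processing time in seconds.
    blocking:     dict mapping request index -> extra seconds spent blocked
                  (a stand-in for a stalled accounting write).
    Returns the time at which each request is answered.
    """
    replies = []
    free_at = 0.0
    for i, arrival in enumerate(arrivals):
        start = max(arrival, free_at)          # wait until the worker is free
        free_at = start + service_time + blocking.get(i, 0.0)
        replies.append(free_at)
    return replies


arrivals = [0.0, 1.0, 2.0, 3.0, 4.0]
healthy = simulate_worker(arrivals, 0.01, {})
stalled = simulate_worker(arrivals, 0.01, {1: 8.0})   # request 1 blocks for 8 s

delays = [round(r - a, 2) for r, a in zip(stalled, arrivals)]
print(delays)
```

Every request queued behind the blocked one is answered in a burst once the write returns, which is the pattern seen in the network trace. Any transaction timer that keeps running during that window has a much better chance of expiring before the delayed processing touches it again, which would explain the "detached" warnings as a side effect.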
I wouldn't demonize this problem.
This looks like a combination of errors in the old timer subsystem and TM, which has been known to cause race conditions and consequently either blocked functionality or crashes. Fortunately, the race conditions occur rather rarely, their frequency can easily be dealt with using some restart tools, and it is certainly just a matter of time until the overhauled timer system + tm is ported along with tm from SER to openSER.
-jiri
At 17:27 29/03/2007, T.R. Missner wrote:
> Does anyone know what causes this?
> set_timer for 1 list called on a "detached" timer -- ignoring
> I also see
> set_timer for 3 list called on a "detached" timer -- ignoring
> When this happens Openser seems to lock up for 10 seconds or so.
jiri,
that is incorrect information - there was no plan / need / acceptance for the port you are mentioning.
I think you realize that the TM+timer code in SER and OpenSER is following different directions of development, which makes your statement quite unrealistic.
regards, bogdan
Jiri Kuthan wrote:
> ....and it is certainly just matter of time, till the overhauled timer system + tm is ported along with tm from SER to openSER.
On Fri, 2007-03-30 at 15:44 +0300, Bogdan-Andrei Iancu wrote:
> jiri,
> that is incorrect information - there was no plan / need / acceptance for the port you are mentioning.
> I think you realize that the TM+timer code in SER and OpenSER is following different directions of development, which makes your statement quite unrealistic.
HA ! I understand better now, I almost got confused :-)
Hi Jerome,
right, sometimes it is good to have things as clear as possible :).
as an update on the topic, openser 1.2 has a new, improved timer implementation (core and TM) - actually done by myself :) - and part of the performance boost stems from there.
regards, bogdan
On Fri, 2007-03-30 at 16:16 +0300, Bogdan-Andrei Iancu wrote:
> Hi Jerome,
> right, sometimes it is good to have things as clear as possible :).
Yep :-)
> as an update on the topic, openser 1.2 has a new, improved timer implementation (core and TM) - actually done by myself :) - and part of the performance boost stems from there.
Which is, BTW, a great job :-)
For people interested in looking at what SER does, this message led me to take a look at SER 2.0 documentation and bits of code, and it is not immediately evident that they are taking a radically different route ... they also improved timer granularity (down to a resolution of 62.5 ms).
They changed a bit the parameters used to configure various timers from the config file, and they of course retained the ability to change fr_timer and fr_inv_timer (interesting for controlling max ringing duration) on-the-fly on a per-transaction basis.
They are also currently developing a very interesting module called "timer", which provides the ability to set timers on-the-fly, with callbacks implemented as routes called when the custom timers fire. This seems pretty simple in their model, the timer module being only 408 lines long (but I can't tell if this works already or not).
Another puzzling fact is that SER's implementation of timers in the tm module is about half the size of OpenSER's .... I'm not sure we can infer anything from this fact, still it made me curious.
Jerome Martin wrote:
On Fri, 2007-03-30 at 16:16 +0300, Bogdan-Andrei Iancu wrote:
Hi Jerome,
right, sometimes it is good to have things as clear as possible :).
Yep :-)
as an update on the topic, openser 1.2 has a new, improved timer implementation (core and TM) - actually done by myself :) - and part of the performance boost stems from there.
Which is, BTW, a great job :-)
thanks :)
For people interested in looking at what SER does, this message led me to take a look at the SER 2.0 documentation and bits of code, and it is not immediately evident that they are taking a radically different route ... they also improved timer granularity (down to a resolution of 62.5 ms).
openser 1.2 has a granularity of milliseconds - you can adjust it as you want, based on your system requirements. The default value is 100 milliseconds - the tests showed it is enough for high quality retransmissions without any performance penalties.
They slightly changed the parameters used to configure the various timers from the config file, and they of course retained the ability to change fr_timer and fr_inv_timer (interesting for controlling max ringing duration) on-the-fly on a per-transaction basis.
They are also currently developing a very interesting module called "timer", which provides the ability to set timers on-the-fly, with callbacks implemented as routes called when the custom timers fire. This seems pretty simple in their model, the timer module being only 408 lines long (but I can't tell if this works already or not).
I agree with you, there are a lot of things you can build, but the question is about their importance (in terms of usage). as you know, we want to focus more on the hot topics (things really needed) and to avoid wasting resources on things not needed at the moment.
Another puzzling fact is that SER's implementation of timers in the tm module is about half the size of OpenSER's .... I'm not sure we can infer anything from this fact, still it made me curious.
well...size does not matter ;) - also the code structuring may differ. and so far I have found no relation between size and quality for code :)
regards, bogdan
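As a rough illustration of the granularity figures quoted above (100 ms default in openser 1.2, 62.5 ms in SER 2.0), the sketch below shows how a timeout gets rounded up to whole timer ticks. The helper names `ticks_for_timeout` and `rounding_delay_us` are hypothetical and are not taken from either code base; the 500 ms value used in the note after the code is the default SIP T1 from RFC 3261.

```c
#include <assert.h>

/* Hypothetical helper (not from OpenSER or SER): number of timer ticks
 * that must elapse before a timeout of timeout_us microseconds may fire.
 * Rounds up, so the timer never fires early. */
unsigned long ticks_for_timeout(unsigned long timeout_us, unsigned long tick_us)
{
    assert(tick_us > 0);
    return (timeout_us + tick_us - 1) / tick_us;
}

/* Extra delay, in microseconds, introduced by rounding the timeout up
 * to a whole number of ticks. */
unsigned long rounding_delay_us(unsigned long timeout_us, unsigned long tick_us)
{
    return ticks_for_timeout(timeout_us, tick_us) * tick_us - timeout_us;
}
```

Under these assumptions a 500 ms retransmission interval is 5 ticks at 100 ms and 8 ticks at 62.5 ms, with no rounding delay in either case, while a 250 ms interval would be rounded up to 300 ms at a 100 ms tick.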
For people interested in looking at what SER does, this message led me to take a look at the SER 2.0 documentation and bits of code, and it is not immediately evident that they are taking a radically different route ... they also improved timer granularity (down to a resolution of 62.5 ms).
openser 1.2 has a granularity of milliseconds - you can adjust it as you want, based on your system requirements. The default value is 100 milliseconds - the tests showed it is enough for high quality retransmissions without any performance penalties.
I was not arguing, just quoting facts. However it is good that you clarify this for OpenSER; some people might have thought my statement was implying SER is "better" or "worse" in this respect. But this is not the case, things are just different :-)
They are also currently developing a very interesting module called "timer", which provides the ability to set timers on-the-fly, with callbacks implemented as routes called when the custom timers fire. This seems pretty simple in their model, the timer module being only 408 lines long (but I can't tell if this works already or not).
I agree with you, there are a lot of things you can build, but the question is about their importance (in terms of usage). as you know, we want to focus more on the hot topics (things really needed) and to avoid wasting resources on things not needed at the moment.
In fact, I'd say that such a module would be great if someone needs it and wants to code and contribute it :-) Still I found the idea interesting, but only because I have _features_ in the back of my mind ...
Another puzzling fact is that SER's implementation of timers in the tm module is about half the size of OpenSER's .... I'm not sure we can infer anything from this fact, still it made me curious.
well...size does not matter ;) - also the code structuring may differ. and so far I have found no relation between size and quality for code :)
Right. But sometimes, with an identical feature set, length of code (weighted for code density per line) might be an indicator of added complexity. But this cannot really be generalized; a typical counter-example is language idioms, which are often very compact but more error-prone and less readable in general.
Regards, Jerome
At 15:45 30/03/2007, Jerome Martin wrote:
On Fri, 2007-03-30 at 16:16 +0300, Bogdan-Andrei Iancu wrote:
Hi Jerome,
right, sometimes it is good to have things as clear as possible :).
Yep :-)
as an update on the topic, openser 1.2 has a new, improved timer implementation (core and TM) - actually done by myself :) - and part of the performance boost stems from there.
Which is, BTW, a great job :-)
For people interested in looking at what SER does, this message led me to take a look at the SER 2.0 documentation and bits of code, and it is not immediately evident that they are taking a radically different route ... they also improved timer granularity (down to a resolution of 62.5 ms).
Well, actually I think they are. I am not sure what else you could call a radical change in software if not a complete change of the underlying data structures and associated algorithms :-). (referring to the timer subsystem)
They slightly changed the parameters used to configure the various timers from the config file, and they of course retained the ability to change fr_timer and fr_inv_timer (interesting for controlling max ringing duration) on-the-fly on a per-transaction basis.
The key thing (in addition to minor ones) is the elimination of race conditions.
They are also currently developing a very interesting module called "timer", which provides the ability to set timers on-the-fly, with callbacks implemented as routes called when the custom timers fire. This seems pretty simple in their model, the timer module being only 408 lines long (but I can't tell if this works already or not).
Yes, that's a cute thing, but I was previously referring merely to the under-the-hood kind of things.
Another puzzling fact is that SER's implementation of timers in the tm module is about half the size of OpenSER's .... I'm not sure we can infer anything from this fact, still it made me curious.
Neither am I.
-jiri
On Fri, 2007-03-30 at 16:39 +0200, Jiri Kuthan wrote:
Well, actually I think they are. I am not sure what else you could call a radical change in software if not a complete change of the underlying data structures and associated algorithms :-). (referring to the timer subsystem)
Right, after looking at the code. What I was saying is ".... and it is not immediately evident ...", by "immediately" I meant "at first look", "just by looking at the docs" ...
They slightly changed the parameters used to configure the various timers from the config file, and they of course retained the ability to change fr_timer and fr_inv_timer (interesting for controlling max ringing duration) on-the-fly on a per-transaction basis.
The key thing (in addition to minor ones) is the elimination of race conditions.
That is an interesting one. Do you have any pointers to the relevant parts of the code, or to which structural changes enable that ?
They are also currently developing a very interesting module called "timer", which provides the ability to set timers on-the-fly, with callbacks implemented as routes called when the custom timers fire. This seems pretty simple in their model, the timer module being only 408 lines long (but I can't tell if this works already or not).
Yes, that's a cute thing, but I was previously referring merely to the under-the-hood kind of things.
Yes, I understand that now.
Another puzzling fact is that SER's implementation of timers in the tm module is about half the size of OpenSER's .... I'm not sure we can infer anything from this fact, still it made me curious.
Neither am I.
OK, OK, this one was not very insightful of me :-)
At 17:02 30/03/2007, Jerome Martin wrote:
Right, after looking at the code. What I was saying is ".... and it is not immediately evident ...", by "immediately" I meant "at first look", "just by looking at the docs" ...
:-)
One of the hard things with overhauls is they don't belong to the best-sellers -- they don't bring exciting value, sit somewhere well hidden, and become a concern only when things fly apart. Unfortunately the timer overhaul falls under those. That was also one of the flamed topics a while ago, when proponents of SER suggested splitting the work between openser and ser contributors so that SER works on the under-the-hood things and openser on the prioritized applications, to avoid contributors doing the same thing twice. Nevertheless, the reception appeared rather negative.
They slightly changed the parameters used to configure the various timers from the config file, and they of course retained the ability to change fr_timer and fr_inv_timer (interesting for controlling max ringing duration) on-the-fly on a per-transaction basis.
The key thing (in addition to minor ones) is the elimination of race conditions.
That is an interesting one. Do you have any pointers to the relevant parts of the code, or to which structural changes enable that ?
Actually the error in question was specifically one of those.
Another puzzling fact is that SER's implementation of timers in the tm module is about half the size of OpenSER's .... I'm not sure we can infer anything from this fact, still it made me curious.
Neither am I.
OK, OK, this one was not very insightful of me :-)
Well, I'm sometimes inclined to generate guidelines for myself too :-) Just in this specific case, I guess it would be too much of a stretch to do so.
-jiri
jiri,
let us be realistic !!!
the policy (internal - about the code, targets and speed - and external - regarding contributions and user's wishes) was the key factor that made it necessary for us to fork OpenSER.
having this in mind, I see no basis for your "split-work" idea (I'm afraid it is just a diversion/advertising thing)... The success of a piece of code relies on the unity and synchronization of the developers!
so, let us not bore the users of this list....they have better things to learn from it.
regards bogdan
Jiri Kuthan wrote:
That was also one of the flamed topics a while ago, when proponents of SER suggested splitting the work between openser and ser contributors so that SER works on the under-the-hood things and openser on the prioritized applications, to avoid contributors doing the same thing twice. Nevertheless, the reception appeared rather negative.
At 18:58 30/03/2007, Bogdan-Andrei Iancu wrote:
jiri,
let us be realistic !!!
the policy (internal - about the code, targets and speed - and external - regarding contributions and user's wishes) was the key factor that made it necessary for us to fork OpenSER.
I am not at all certain either that this was the factor or that it was necessary. Actually I remember that folks with insight into this were (and I maintain quite rightly) rather concerned. To refresh your memory I recommend this thread: http://lists.iptel.org/pipermail/serdev/2005-June/005120.html
having this in mind, I see no basis for your "split-work" idea (I'm afraid it is just a diversion/advertising thing)... The success of a piece of code relies on the unity and synchronization of the developers!
I agree with the statement, which appears to be in contrast with the fork you apparently consider "necessary".
Not that there would not be good progress -- the 1.2.0 release list seems to draw a great deal of inspiration from Ottendorf; it is just that I don't understand why some folks are upset about fixing TM.
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
On 03/31/07 09:14, Jiri Kuthan wrote:
At 18:58 30/03/2007, Bogdan-Andrei Iancu wrote:
jiri,
let us be realistic !!!
the policy (internal - about the code, targets and speed - and external - regarding contributions and user's wishes) was the key factor that made it necessary for us to fork OpenSER.
I am not at all certain either that this was the factor or that it was necessary. Actually I remember that folks with insight into this were (and I maintain quite rightly) rather concerned. To refresh your memory I recommend this thread: http://lists.iptel.org/pipermail/serdev/2005-June/005120.html
the spread and evolution of the openser project proves the contrary -- knowing how things were then and where openser has got so far, I can say that the fork was a good thing. You should accept the open source environment, where the code can be forked at any moment, whether you like it or not. If you personally don't like it, that doesn't mean it is something bad.
having this in mind, I see no basis for your "split-work" idea (I'm afraid it is just a diversion/advertising thing)... The success of a piece of code relies on the unity and synchronization of the developers!
I agree with the statement, which appears to be in contrast with the fork you apparently consider "necessary".
Not that there would not be good progress -- the 1.2.0 release list seems to draw a great deal of inspiration from Ottendorf; it is just that I don't understand why some folks are upset about fixing TM.
I'm afraid you are trying to spread unrealistic stories -- since you started your activity on the openser mailing lists there has been no constructive conversation from your side, only accusations and claims against the project and the folks here. Really, you are not forced to use openser or participate in the mailing lists if you dislike it.
OpenSER has had its roadmap public all the time (btw, osas pointed out we should update it :-) ); it happened to change when external contributions popped up or when there was strong demand for some feature. When you make such statements, please list some of those great things, and we will let you know when each started and how it evolved (of course, you can dig through the mailing lists and forums if you want a quick answer). I could say it went in the opposite direction, and I may have quite strong arguments, but I won't, because it would end in political discussions without a good, constructive result; so, there was no inspiration from openser to ser.
Regarding porting tm/timers or whatsoever, we appreciate and welcome any contribution to OpenSER; it will be reviewed and accepted if it brings something new or good. So that you do not invest unnecessary effort on your side: ser's tm is very likely to be rejected as it is now, because of its known big vulnerability to DoS. The OpenSER tm module has very good performance and a lot of features which are not in ser.
Daniel
Users mailing list Users@openser.org http://openser.org/cgi-bin/mailman/listinfo/users
At 14:46 05/04/2007, Daniel-Constantin Mierla wrote:
the spread and evolution of the openser project proves the contrary -- knowing how things were then and where openser has got so far, I can say that the fork was a good thing. You should accept the open source environment, where the code can be forked at any moment, whether you like it or not. If you personally don't like it, that doesn't mean it is something bad.
My point is not about where it is, but where it could have been if the split work had not been diluting the result.
Not that there would not be good progress -- the 1.2.0 release list seems to draw a great deal of inspiration from Ottendorf; it is just that I don't understand why some folks are upset about fixing TM.
I'm afraid you are trying to spread unrealistic stories -- since you started your activity on the openser mailing lists there has been no constructive conversation from your side, only accusations and claims against the project and the folks here. Really, you are not forced to use openser or participate in the mailing lists if you dislike it.
OpenSER has had its roadmap public all the time (btw, osas pointed out we should update it :-) ); it happened to change when external contributions popped up or when there was strong demand for some feature. When you make such statements, please list some of those great things, and we will let you know when each started and how it evolved (of course, you can dig through the mailing lists and forums if you want a quick answer). I could say it went in the opposite direction, and I may have quite strong arguments, but I won't, because it would end in political discussions without a good, constructive result; so, there was no inspiration from openser to ser.
This is quite beyond my point. Actually I think it is a good thing if the gap is being narrowed (whatever the process for that might have been, eventually seeing the "diff" getting shorter is the positive point). The real question is then how to effectively stimulate this trend.
Regarding porting tm/timers or whatsoever, we appreciate and welcome any contribution to OpenSER; it will be reviewed and accepted if it brings something new or good. So that you do not invest unnecessary effort on your side: ser's tm is very likely to be rejected as it is now, because of its known big vulnerability to DoS. The OpenSER tm module has very good performance and a lot of features which are not in ser.
Regretfully I have not been very successful in locating the backtraces which IMO show specifically what flies apart, and which point to code that to my knowledge had been copied and pasted. I still have some thin hope of locating those, but it will take time and is uncertain.
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
At 14:54 30/03/2007, Jerome Martin wrote:
On Fri, 2007-03-30 at 15:44 +0300, Bogdan-Andrei Iancu wrote:
jiri,
that is incorrect information - there was no plan / need / acceptance for the port you are mentioning.
I think you realize that the TM+timer code from SER and OpenSER is following different directions of development, which makes your claims quite unrealistic.
HA ! I understand better now, I almost got confused :-)
Well, actually the previous statement got me more confused. I don't see the real point in forking in a different direction, away from the robustness features. Following a different direction is apparently not something that happens on its own; it is something someone must be striving for, and I clearly don't see the point in it.
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
At 14:44 30/03/2007, Bogdan-Andrei Iancu wrote:
jiri,
that is incorrect information - there was no plan / need / acceptance for the port you are mentioning.
ok, I think that's kind of a pity, since this addresses some of the problems in question. I was hopeful we would have the capacity to port, but at the moment I'm not so optimistic about our cycles.
I think you realize that the TM+timer code from SER and OpenSER is following different directions of development, which makes your claims quite unrealistic.
I would say difficult, but not unrealistic. In any case, this has been exactly my concern.
-jiri
-- Jiri Kuthan http://iptel.org/~jiri/
On Thu, 2007-03-29 at 18:51 +0200, Jiri Kuthan wrote:
it is certainly just matter of time, till the overhauled timer system
- tm is ported along with tm from SER to openSER.
Jiri, does this mean that this improved timer system is the one introduced in 1.2.0 or does the SER tree have some more TM + timer improvements that still need to be integrated in OpenSER ?
Hi Jerome,
probably you got my email too late to answer your question :)
the TM+timer code from SER and OpenSER followed / follows different directions (totally unrelated) of development/improvements .....and this makes any kind of similarities or ports very unrealistic.
regards, bogdan
PS: sorry for repeating myself, but I like to have things clear.
On Fri, 2007-03-30 at 16:37 +0300, Bogdan-Andrei Iancu wrote:
Hi Jerome,
probably you got my email too late to answer your question :)
the TM+timer code from SER and OpenSER followed / follows different directions (totally unrelated) of development/improvements .....and this makes any kind of similarities or ports very unrealistic.
Yes, messages crossed each other :-) I even just sent another related one 5 seconds ago, about the path SER takes :-)
At 14:50 30/03/2007, Jerome Martin wrote:
On Thu, 2007-03-29 at 18:51 +0200, Jiri Kuthan wrote:
it is certainly just matter of time, till the overhauled timer system
- tm is ported along with tm from SER to openSER.
Jiri, does this mean that this improved timer system is the one introduced in 1.2.0 or does the SER tree have some more TM + timer improvements that still need to be integrated in OpenSER ?
I'm hearing that the interest in porting those is rather low, I just expressed my opinion that it would do good to openser if someone took the time to do it. The overhauls have greatly improved robustness of SER after openser forked off.
-jiri
On Fri, 2007-03-30 at 16:43 +0200, Jiri Kuthan wrote:
I'm hearing that the interest in porting those is rather low, I just expressed my opinion that it would do good to openser if someone took the time to do it. The overhauls have greatly improved robustness of SER after openser forked off.
But can't we assume that bogdan's work on 1.2 might give OpenSER the same level of robustness as SER's TM/timers overhaul, and give this code time to stabilize before thinking of porting ?
I mean, what makes you think that 1.2 TM/timers is less robust than SER's 2.0 modifications ?
At 16:58 30/03/2007, Jerome Martin wrote:
On Fri, 2007-03-30 at 16:43 +0200, Jiri Kuthan wrote:
I'm hearing that the interest in porting those is rather low, I just expressed my opinion that it would do good to openser if someone took the time to do it. The overhauls have greatly improved robustness of SER after openser forked off.
But can't we assume that bogdan's work on 1.2 might give OpenSER the same level of robustness as SER's TM/timers overhaul, and give this code time to stabilize before thinking of porting ?
I'm sure I will quickly receive an opposite answer to this; it just happens to be my respectful opinion, based on studies of the code, that this would be an unsafe assumption.
I mean, what makes you think that 1.2 TM/timers is less robust than SER's 2.0 modifications ?
The data structures appear to me to still allow conditions where load can get out of control. Similarly, the previous error message was an indication of a race condition (I haven't checked now, though, whether it remains in 1.2).
-jiri
Jiri Kuthan wrote:
The data structures appear to me to still allow conditions where load can get out of control. Similarly, the previous error message was an indication of a race condition (I haven't checked now, though, whether it remains in 1.2).
Then please check it, before making any claims.
I guess everybody here appreciates discussions about any weaknesses of OpenSER, since they allow us to target those. But these arguments must be based on up-to-date facts, otherwise they are just assertions without any substance and only lead to more bad blood (and I hope this isn't what you want to achieve).
Cheers, Andreas
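For readers trying to picture the race behind the "detached" timer message discussed in this thread, here is a minimal single-threaded sketch. It assumes only what was stated earlier in the thread: the expiry event races with a re-insert on a timer list, and list 1 is the final response list while list 3 is the wait list. The struct and the `sketch_*` functions are hypothetical and do not reproduce the tm module's real data structures or locking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* List ids as named in the thread: 1 = fr_timer list, 3 = wt_timer list. */
enum { FR_LIST = 1, WT_LIST = 3 };

/* Hypothetical, simplified timer link; not the tm module's real struct. */
struct timer_link {
    int list_id;            /* 0 when the timer is on no list */
    int detached;           /* set by the expiry path once the timer fired */
    unsigned long expires;  /* absolute expiry, in ticks */
};

/* Expiry path: the timer process takes the link off its list and marks it. */
void sketch_expire_timer(struct timer_link *tl)
{
    assert(tl != NULL);
    tl->list_id = 0;
    tl->detached = 1;
}

/* Re-arm path: a worker tries to put the link on a list again.
 * Returns 0 on success, -1 if the expiry path got there first. */
int sketch_set_timer(struct timer_link *tl, int list_id, unsigned long expires)
{
    assert(tl != NULL);
    if (tl->detached) {
        fprintf(stderr,
                "set_timer for %d list called on a \"detached\" timer -- ignoring\n",
                list_id);
        return -1;
    }
    tl->list_id = list_id;
    tl->expires = expires;
    return 0;
}
```

In a real multi-process server both paths would have to run under the same list lock for the check to be meaningful; the sketch only shows the ordering (expire first, re-arm second) that would produce the message, not why it would coincide with any stall.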