[omniORB] 4.2.0 tcpSocket->setTimeout bug suspicion
Andry
andry at inbox.ru
Mon Nov 10 16:25:21 GMT 2014
Recently I caught some bad behaviour in the Windows Sockets "select"
function: it hung for over 35 seconds before returning, even though
all timeouts were set to 200 ms
(clientCallTimeOutPeriod+clientConnectTimeOutPeriod+serverCallTimeOutPeriod+poaHoldRequestTimeout).
It seems the select function waits for an internal default timeout (35
seconds in my case) before returning. It is only reproducible when the
CORBA servant-side application closes unexpectedly because of an
exception, an exit, or some other kind of termination.
After digging through the omniORB code for a while, I found a strange
line of code:
omniORB-4.2.0\include\omniORB4\internal\tcpSocket.h, line 384
--------------------------------
if (deadline < now) {
  t.tv_sec = t.tv_usec = 0;
  return 1;
}
--------------------------------
It seems it should be:
--------------------------------
if (deadline <= now) {
  t.tv_sec = t.tv_usec = 0;
  return 1;
}
--------------------------------
Otherwise, when the deadline is hit exactly (deadline == now), the code
will call select one more time with the last parameter = 0.
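The boundary case can be sketched in isolation. This is only an
illustration, not the omniORB implementation: nanos_t, remaining_time and
boundary_check are hypothetical names, and "0 means no deadline" is an
assumption made for the sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/time.h>   // struct timeval (POSIX); Winsock gets it from <winsock2.h>

// Hypothetical stand-in for the ORB's time type: nanoseconds since some epoch.
// Assumption for this sketch: 0 means "no deadline".
using nanos_t = std::uint64_t;

// Returns true if the deadline has been reached, so the caller should give up
// instead of waiting again. Otherwise returns false and, if a deadline exists,
// stores the remaining time in t (truncated to whole microseconds).
inline bool remaining_time(nanos_t deadline, nanos_t now, timeval& t)
{
    if (deadline == 0) return false;   // no deadline: t is left untouched
    if (deadline <= now) {             // "<=": deadline == now counts as expired
        t.tv_sec = 0;
        t.tv_usec = 0;
        return true;
    }
    const nanos_t diff = deadline - now;
    t.tv_sec  = static_cast<decltype(t.tv_sec)>(diff / 1000000000ULL);
    t.tv_usec = static_cast<decltype(t.tv_usec)>((diff % 1000000000ULL) / 1000ULL);
    return false;
}

// Small self-check of the boundary discussed above.
inline void boundary_check()
{
    timeval t{};
    assert(remaining_time(1000, 1000, t));    // exactly at the deadline: expired
    assert(!remaining_time(2000, 1000, t));   // 1000 ns left: not expired yet
}
```

With "<" instead of "<=", the first assertion in boundary_check would fail
and the caller would go on to wait once more with a zero remaining time.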
MSDN says:
--------------------------------
timeout [in]
The maximum time for select to wait, provided in the form of a TIMEVAL
structure. Set the timeout parameter to null for blocking operations.
--------------------------------
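For reference, the same MSDN page distinguishes a null timeout pointer
from a TIMEVAL initialised to {0, 0}: the former blocks, the latter makes
select return immediately and is used to poll. A minimal POSIX sketch of
that distinction follows. poll_readable is a hypothetical helper, not
omniORB code. On Windows the headers differ (<winsock2.h>) and the first
argument of select is ignored.

```cpp
#include <cassert>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

// Checks whether fd is readable without blocking.
// Returns select()'s result: 1 if readable, 0 if nothing is pending, -1 on error.
inline int poll_readable(int fd)
{
    assert(fd >= 0 && fd < FD_SETSIZE);   // FD_SET is only valid in this range

    fd_set rfds;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    timeval zero{};                        // tv_sec = tv_usec = 0: do not wait
    return select(fd + 1, &rfds, nullptr, nullptr, &zero);
    // Passing nullptr instead of &zero would block until fd becomes readable.
}
```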
So in my case select blocks for 35 seconds, which blocks the client for
at least 35 seconds (per call to the "waitWrite" function) even though
all timeouts are explicitly set to 200 ms.
A second thought: I did not find any call to the setsockopt function that
overrides the SO_RCVTIMEO/SO_SNDTIMEO values. As I understand it, if you
don't set them explicitly, the underlying socket API will use
unpredictable default timeouts, which is not good at all because each
platform will have its own defaults and therefore its own behaviour. In
my case it was 35 seconds.
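A minimal sketch of setting these two options explicitly, in case it
helps the discussion. set_io_timeouts and socket_t are hypothetical names,
not part of omniORB. The option value differs per platform: Winsock takes
a DWORD in milliseconds, POSIX takes a struct timeval. These options
apply to blocking send/recv calls on the socket.

```cpp
#include <cassert>
#ifdef _WIN32
#  include <winsock2.h>
using socket_t = SOCKET;
#else
#  include <sys/socket.h>
#  include <sys/time.h>
using socket_t = int;
#endif

// Sets both the receive and the send timeout of s to ms milliseconds.
// Returns true if both setsockopt calls succeed.
inline bool set_io_timeouts(socket_t s, unsigned ms)
{
    assert(ms > 0);   // a value of 0 means "no timeout" for these options

#ifdef _WIN32
    DWORD tv = ms;                                   // milliseconds
    const char* p = reinterpret_cast<const char*>(&tv);
#else
    timeval tv;
    tv.tv_sec  = ms / 1000;                          // whole seconds
    tv.tv_usec = (ms % 1000) * 1000;                 // remainder in microseconds
    const void* p = &tv;
#endif
    const int len = static_cast<int>(sizeof(tv));

    return setsockopt(s, SOL_SOCKET, SO_RCVTIMEO, p, len) == 0
        && setsockopt(s, SOL_SOCKET, SO_SNDTIMEO, p, len) == 0;
}
```

For the 200 ms used above this would be called as set_io_timeouts(sock, 200)
right after the socket is created, where sock is the socket in question.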