[omniORB] timeout detection (client-side)
Duncan Grisby
duncan at grisby.org
Mon Nov 26 15:40:44 GMT 2007
On Wednesday 21 November, "Michael Kilburn" wrote:
> Hmm... From my tests I have a feeling that COMM_FAILURE gets generated
> if underlying ORB had non-closed socket to the server (which died) and
> I attempt to send a request there. Which is kind of unpredictable,
> since you never know if ORB has connection or not...
> Question: in such cases, will omniORB (after detecting that socket
> connection is dead) attempt to reestablish connection in the same call
> or do I need to catch these cases and call again, expecting TRANSIENT?
If a connection has previously been used successfully, and a subsequent
call fails when marshalling the request, omniORB will transparently try
to reconnect; if that also fails, you will see a TRANSIENT. If the connection breaks in
the middle of sending a request (e.g. because the server crashes in
response to the call), you see COMM_FAILURE.
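
As a rough illustration of what that distinction could mean for client
code, here is a minimal sketch of a retry wrapper. It is not omniORB
code: the two exception classes are stand-ins so the example runs on its
own (with omniORBpy you would catch CORBA.TRANSIENT and
CORBA.COMM_FAILURE instead), and call_with_retry and max_attempts are
hypothetical names. It assumes a TRANSIENT is worth retrying and that a
COMM_FAILURE is not, because the request may already have reached the
server.

```python
class TRANSIENT(Exception):
    """Stand-in for CORBA.TRANSIENT (hypothetical, for illustration only)."""


class COMM_FAILURE(Exception):
    """Stand-in for CORBA.COMM_FAILURE (hypothetical, for illustration only)."""


def call_with_retry(invoke, max_attempts=3):
    """Call invoke(), retrying only when TRANSIENT is raised.

    COMM_FAILURE is deliberately not caught: the connection broke in the
    middle of a request, so the server may already have acted on it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return invoke()
        except TRANSIENT:
            # Give up and re-raise once the last attempt has failed.
            if attempt == max_attempts:
                raise
```

Whether retrying is safe also depends on the operation: a call that
timed out may still have been processed, so blind retries only make
sense for idempotent operations.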
> My original intention was to find reliably whether:
> - server is 100% dead (i.e. box is reachable, but related port is not
> listening)
In that case, you will definitely get TRANSIENT with minor code
TRANSIENT_ConnectFailed.
> - server is potentially alive (i.e. socket can't connect due to
> timeout (due to network problems, or server overload))
If the server is too busy to accept the new connection, you will still
see TRANSIENT_ConnectFailed, since the client can't tell the
difference. If it accepts the connection but is too busy to start
processing it, you'll get TRANSIENT_CallTimedOut.
> - server is 100% alive, but has problems in business logic (i.e.
> request was sent and received successfully, but request processing
> timed out, e.g. due to deadlock)
In that case, you'll get TRANSIENT with minor code
TRANSIENT_CallTimedOut. There is no way a client can tell the difference
between this situation and the case that the server is just too busy to
respond.
> - server is 100% alive, but there are problems in CORBA layer (no
> resources, protocol incompatibilities and so on)
You'll most likely get some other system exception like MARSHAL,
BAD_PARAM, BAD_OPERATION, etc. If the server gets confused in a way that
causes it to drop the connection, you'll get a COMM_FAILURE.
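
To summarise the cases above, here is a small sketch that maps an
exception name and minor code name to a rough diagnosis. The function
diagnose and its string arguments are hypothetical; real code would
inspect the caught exception and its minor code rather than pass names
around. The wording of each diagnosis is only my paraphrase of the
answers in this message.

```python
def diagnose(exc_name, minor_name=None):
    """Map a system exception name and minor code name to a rough diagnosis."""
    if exc_name == "TRANSIENT":
        if minor_name == "TRANSIENT_ConnectFailed":
            # Port not listening and "too busy to accept" look the same.
            return "cannot connect"
        if minor_name == "TRANSIENT_CallTimedOut":
            # Overload and a stuck request (e.g. deadlock) look the same.
            return "connected, no reply in time"
        return "other transient failure"
    if exc_name == "COMM_FAILURE":
        return "connection dropped mid-request"
    if exc_name in ("MARSHAL", "BAD_PARAM", "BAD_OPERATION"):
        return "problem in the CORBA layer"
    return "unclassified"
```

For example, diagnose("TRANSIENT", "TRANSIENT_CallTimedOut") gives
"connected, no reply in time", which covers both the overloaded server
and the deadlocked one.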
--
-- Duncan Grisby --
-- duncan at grisby.org --
-- http://www.grisby.org --