[omniORB] Latency problem with redundant networks
Jochen Behrens
jochen.behrens at barco.com
Thu Aug 21 10:27:27 BST 2003
Hi folks,
On Linux (SuSE/Red Hat 7.2) I have a problem with latency in
network failure detection.
For example, the server creates object references for multiple network
interfaces as follows:
serverApp -ORBendPoint giop:tcp:<eth0>: -ORBendPoint giop:tcp:<eth1>:
The client establishes a connection via the first endPoint. When
that network fails, the client should fail over to a connection via
the second network. Unfortunately, failure detection and
connection re-establishment take too much time.
With client call timeouts disabled (ORBclientCallTimeOutPeriod=0),
failure detection can take several minutes.
Since we are required to detect and handle network failures
within a few seconds, we set a timeout on the client side (e.g.
ORBclientCallTimeOutPeriod=2000). In this case the ORB raises
TRANSIENT_ConnectFailed to the application, as expected.
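For reference, the call site looks roughly like this (sketch only;
"Echo" stands in for our real IDL interface, and I am assuming
omniORB 4 headers):

    #include <iostream>
    #include <omniORB4/CORBA.h>

    // "Echo" is a placeholder for our generated stub class.
    void invoke(Echo_ptr obj)
    {
      try {
        obj->echoString("ping");   // now returns or fails within ~2 s
      }
      catch (CORBA::TRANSIENT& ex) {
        // With the timeout set, a dead network shows up here quickly
        // instead of after several minutes.
        std::cerr << "TRANSIENT, minor = " << ex.minor() << std::endl;
      }
    }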
But subsequent calls on the remote object reference seem to use only
the first endpoint (profile).
Is it possible to encourage the client ORB to try the next network
address in the object reference?
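To illustrate, our retry logic is roughly the following (same
hypothetical Echo interface as above); even after catching
TRANSIENT, the retried call still seems to reconnect to the first
address:

    #include <omniORB4/CORBA.h>

    // Returns true once a call succeeds, false after all attempts.
    bool callWithRetry(Echo_ptr obj, int attempts)
    {
      for (int i = 0; i < attempts; ++i) {
        try {
          obj->echoString("ping");
          return true;                  // call went through
        }
        catch (CORBA::TRANSIENT&) {
          // Hoped the ORB would now pick the next IIOP profile, but
          // the reconnect apparently targets the first endpoint again.
        }
      }
      return false;
    }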
Perhaps an alternative (better?) approach would be to tune the TCP
layer for faster timeouts. The tcp manual page lists the socket
option IP_RECVERR for quicker error notification, but I cannot
estimate the possible side effects.
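From my reading of the man pages, enabling it would look something
like this on a plain socket (sketch only; how to hook this into the
ORB's sockets, and the side effects, are exactly what I am unsure
about):

    #include <sys/socket.h>
    #include <netinet/in.h>

    // Ask the kernel to queue extended error reports for this socket;
    // tcp(7)/ip(7) describe this as giving earlier notification of
    // network errors. Returns true on success.
    bool enableRecvErr(int fd)
    {
      int on = 1;
      return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on)) == 0;
    }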
Does anybody have experience with this?
Thanks in advance,
Jochen Behrens