<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<META content="MSHTML 6.00.2800.1106" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><FONT face="Trebuchet MS" size=2><SPAN
class=695422605-01102004>Hello All,</SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN
class=695422605-01102004></SPAN></FONT> </DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004>We are
using OmniORB 4.0.1 for our server application, which consists of various
components distributed across the globe. The problem is that our server
hangs after logging the following error:</SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">omniORB:
Unrecoverable error for this endpoint: giop:tcp:10.91.201.202:2222, it will no
longer be serviced.</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">We have no
reliable steps to reproduce the above error, but it recurs within a few hours
of operation. Upon investigation we have found that one place in the OmniORB
code that can lead to this error is the
following:</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">CORBA::Boolean<BR>tcpEndpoint::notifyReadable(SocketHandle_t
fd) {<BR> if (fd == pd_socket) {<BR> SocketHandle_t
sock;<BR> sock =
::accept(pd_socket,0,0);<BR> if (sock == RC_SOCKET_ERROR)
{<BR> return 0;<BR>
}<BR>....</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">...</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">}</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">As the code
above shows, whenever the accept system call fails (in our case it fails with
ECONNABORTED, which means "Software caused connection abort"), this routine
returns 0 and OmniORB eventually shuts down the endpoint,
giop:tcp:10.91.201.202:2222 in our case.</SPAN></FONT></SPAN></FONT></DIV>
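<DIV><FONT face=Arial size=2><SPAN class=695422605-01102004>To make the
distinction we have in mind concrete, here is a minimal sketch of how accept
failures might be split into per-connection errors and errors of the listening
socket itself. This is not OmniORB code: the helper name isTransientAcceptError
is hypothetical, and the errno list is our assumption based on the POSIX and
Linux accept documentation, so it may need adjusting per
platform.</SPAN></FONT></DIV>

```cpp
#include <cerrno>

// Hypothetical helper, not part of omniORB. Returns true when an errno value
// reported by accept() describes a problem with a single incoming connection
// (or an interrupted call) rather than with the listening socket itself.
// The list below is an assumption and may need tuning per platform.
bool isTransientAcceptError(int err) {
  return err == ECONNABORTED    // peer gave up before accept() completed
      || err == EINTR           // call interrupted by a signal
      || err == EAGAIN          // nothing to accept right now
      || err == EWOULDBLOCK     // same condition on some platforms
      || err == EPROTO          // protocol error on the pending connection
      || err == ENETDOWN
      || err == ENETUNREACH
      || err == EHOSTUNREACH;   // network errors pending on the new socket
}
```

<DIV><FONT face=Arial size=2><SPAN class=695422605-01102004>With such a check,
ECONNABORTED would be treated as "skip this connection and keep listening",
while an error such as EBADF would still be treated as fatal for the
endpoint.</SPAN></FONT></DIV>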
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"></SPAN></FONT></SPAN></FONT> </DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">Question 1:
Is it intended that OmniORB stops servicing the endpoint whenever such a
failure occurs? In practice, accept can fail simply because of a network
problem on the side of a client that is connecting to the server.
</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"></SPAN></FONT></SPAN></FONT> </DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">To work
around this we have changed giopRendezvouser::execute() in giopRendezvouser.cc
so that it does NOT break out of the loop when AcceptAndMonitor returns a NULL
pointer, i.e. when accept fails internally. Please see the following code
snippet from the changed giopRendezvouser::execute()
method:</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">void<BR>giopRendezvouser::execute()<BR>{</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">....</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">....</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">
CORBA::Boolean exit_on_error;</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2></FONT> </DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"> do
{<BR> exit_on_error = 0;<BR> giopConnection*
newconn = 0;<BR> try {<BR>
newconn =
pd_endpoint->AcceptAndMonitor(notifyReadable,this);<BR>
if (newconn)
{<BR> pd_server->notifyRzNewConnection(this,newconn);<BR>
}<BR> else {</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"> /********
COMMENTED OUT THE FOLLOWING TWO
LINES *********<BR>
exit_on_error = 1;<BR>
break;</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"> *******************************************************************************/<BR>
}<BR> }</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">....</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004><FONT
face=Arial size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial">....</SPAN></FONT></SPAN></FONT></DIV>
<DIV><FONT size=2><SPAN class=695422605-01102004><FONT face="Trebuchet MS"
size=2><SPAN
style="FONT-SIZE: 10pt; FONT-FAMILY: Arial"></SPAN></FONT></SPAN></FONT><FONT
size=2><SPAN class=695422605-01102004><FONT face="Trebuchet MS">} // end
function</FONT></SPAN></FONT></DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN
class=695422605-01102004></SPAN></FONT> </DIV>
<DIV><FONT face="Trebuchet MS" size=2><SPAN class=695422605-01102004>After
making the above change, our server still logs the SAME error message, but it
resumes and keeps listening on the SAME endpoint, <FONT
face=Arial>giop:tcp:10.91.201.202:2222 in our case.</FONT></SPAN></FONT></DIV>
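<DIV><FONT face=Arial size=2><SPAN class=695422605-01102004>A more cautious
variant of the same change would tolerate isolated accept failures but still
give up after several failures in a row, so that a genuinely broken listening
socket does not make the loop spin forever. The sketch below only illustrates
that idea: the class AcceptFailurePolicy and its members are hypothetical names
of ours, not part of OmniORB, and the limit is an arbitrary
parameter.</SPAN></FONT></DIV>

```cpp
// Hypothetical policy object, not part of omniORB. It tolerates isolated
// accept failures but reports that the loop should stop once the number of
// consecutive failures reaches the configured limit.
class AcceptFailurePolicy {
public:
  explicit AcceptFailurePolicy(unsigned maxConsecutive)
    : pd_max(maxConsecutive), pd_count(0) {}

  // Call after a successful accept: the failure streak is over.
  void onSuccess() { pd_count = 0; }

  // Call after a failed accept. Returns true if the loop should keep
  // listening, false if the limit of consecutive failures has been reached.
  bool onFailure() { return ++pd_count < pd_max; }

private:
  unsigned pd_max;    // consecutive failures at which we give up
  unsigned pd_count;  // current streak of consecutive failures
};
```

<DIV><FONT face=Arial size=2><SPAN class=695422605-01102004>In the loop above,
onSuccess() would be called where a new connection is returned, and the
original exit_on_error and break lines would run only when onFailure() returns
false.</SPAN></FONT></DIV>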
<DIV><FONT face=Arial size=2><SPAN
class=695422605-01102004></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=695422605-01102004>Question 2: Is the
above fix correct, or does it violate the CORBA specification in any
way?</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=695422605-01102004></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN class=695422605-01102004>Also, within our
current scope we cannot use multiple endpoints to keep the server application
available, as that does not solve our problem.</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=695422605-01102004></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN
class=695422605-01102004>Regards,</SPAN></FONT></DIV>
<DIV><FONT face=Arial size=2><SPAN
class=695422605-01102004></SPAN></FONT> </DIV>
<DIV><FONT face=Arial size=2><SPAN
class=695422605-01102004>--Kamal</SPAN></FONT></DIV>
<DIV><FONT size=2><SPAN class=695422605-01102004><FONT
face="Trebuchet MS"></FONT> </SPAN></FONT></DIV></BODY></HTML>