[omniORB] problem with bidirectional feature - BUG OR FEATURE
Fernando A. de Araujo Filho
maverick@elogica.com.br
Fri Dec 6 17:40:02 2002
Hi again,
I am still stuck on this problem. Sorry for the noise and the bad English.
I have run traces to find out whether this is a bug or the expected
behaviour.
I carefully read the paper CORBAControls2002.pdf written by Duncan Grisby.
As I do not know the core of the omniORB implementation well enough to
understand it in depth,
I am sending the trace below.
Basically, in a bidirectional "conversation", when the server is dying, it
sends the client
a GIOP::CloseConnection message. This message always raises a CommFailure
exception.
If g->pd_strand->biDir is TRUE, nothing special is done.
When the server restarts and the client tries to call it, the client always
gets a valid RopeLink
whose giopStrand is in the giopStrand::DYING state. In this case the timed
wait returns 0 and
a TRANSIENT_CallTimedout exception is raised.
From that point on, the RopeLink always yields the same giopStrand in the
DYING state,
a TRANSIENT_CallTimedout exception is raised each time, and
we never call the server again.
What I cannot understand is this:
if I do not use the bidirectional feature at all, the problem does not
occur. The server can die and restart
without any trouble, and the client always reaches the server again.
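
To make the two cases concrete, here is a minimal toy model of the strand selection, not omniORB code. ToyRope, ToyStrand, max_strands and reap_dying are hypothetical names. The connection limit of one and the reap_dying switch are assumptions chosen only to reproduce the symptom; the trace does not establish what the non-bidirectional path really does with a DYING strand.

```cpp
#include <cassert>
#include <list>
#include <stdexcept>

// Toy model only: these types are NOT omniORB classes.
enum class StrandState { Active, Dying };

struct ToyStrand {
    StrandState state = StrandState::Active;
};

struct ToyRope {
    std::list<ToyStrand> strands;
    unsigned max_strands = 1;  // assumed per-rope connection limit
    bool reap_dying = false;   // hypothetical: drop DYING strands before searching

    // Returns a usable strand, creates one if allowed, or throws.
    ToyStrand& acquire() {
        if (reap_dying) {
            strands.remove_if([](const ToyStrand& s) {
                return s.state == StrandState::Dying;
            });
        }
        unsigned ndying = 0;
        for (auto& s : strands) {
            if (s.state == StrandState::Dying) { ++ndying; continue; }
            return s;  // first non-dying strand wins
        }
        if (ndying < max_strands) {
            strands.emplace_back();  // room for a new strand
            return strands.back();
        }
        // Every slot is held by a DYING strand: nothing can be acquired.
        throw std::runtime_error("TRANSIENT_CallTimedout (modelled)");
    }
};
```

With reap_dying left false, acquire() keeps throwing once the only strand is Dying, which is the behaviour seen with bidir enabled. Setting it to true lets a fresh strand be created on the next call.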
The question is:
is this a bug or a feature of BIDIR mode?
The trace is below:
************************************
FIRST TIME CALLING THE SERVER
WE ARE IN
IOP_C_Holder::IOP_C_Holder(const omniIOR* ior,
                           const CORBA::Octet* key,
                           CORBA::ULong keysize,
                           Rope* rope,
                           omniCallDescriptor* calldesc) : pd_rope(rope) {
  OMNIORB_ASSERT(calldesc);
  pd_iop_c = rope->acquireClient(ior,key,keysize,calldesc);
}
WE HAVE CALLED
IOP_C*
BiDirClientRope::acquireClient(const omniIOR* ior,
const CORBA::Octet* key,
CORBA::ULong keysize,
omniCallDescriptor* calldesc) {
GIOP_C* giop_c = (GIOP_C*) giopRope::acquireClient(ior,key,keysize,calldesc);
...
WE TRY TO ACQUIRE THE CLIENT
IOP_C*
giopRope::acquireClient(const omniIOR* ior,
const CORBA::Octet* key,
CORBA::ULong keysize,
omniCallDescriptor* calldesc) {
...
// NO RopeLink EXISTS YET ON THE FIRST CALL
RopeLink* p = pd_strands.next;
...
// Reach here if we haven't got a strand to grab a GIOP_C.
if ((nbusy + ndying) < max) {
// Create a new strand.
...
giopStrand* s = new giopStrand(pd_addresses[pd_addresses_order[pd_address_in_use]]);
s->state(giopStrand::ACTIVE);
s->RopeLink::insert(pd_strands);
s->StrandList::insert(giopStrand::active);
s->version = v;
s->giopImpl = impl;
}
goto again;
...
// NOW WE HAVE THE FIRST RopeLink
// GET THE giopStrand REFERENCE
giopStrand* s = (giopStrand*)p;
STATE is ACTIVE
if (!giopStreamList::is_empty(s->clients)) {
...
else {
THERE ARE NO CLIENTS ON giopStreamList
CREATE A new GIOP_C with GIOP 1.2 impl
g = new GIOP_C(this,s);
...
}
...
OK WE HAVE CONTACTED THE SERVER
EVERYTHING IS OK WHILE THE SERVER IS ALIVE, NO PROBLEM
*******************************************************
NOW OUR SERVER DIES
THE CLIENT, IN giopImpl12::unmarshalWildCardRequestHeader,
RECEIVES A GIOP::CloseConnection:
void
giopImpl12::unmarshalWildCardRequestHeader(giopStream* g) {
  ...
  case GIOP::CloseConnection:
    if (g->pd_strand->biDir) {
      // g->pd_strand->biDir IS TRUE, BUT NOTHING IS DONE
      // proper shutdown of a connection.
      // XXX what to do?
    }
    // CALL inputRaiseCommFailure
    inputRaiseCommFailure(g);
}
void
giopImpl12::inputRaiseCommFailure(giopStream* g) {
  CORBA::ULong minor;
  CORBA::Boolean retry;
  g->notifyCommFailure(0,minor,retry);
  g->pd_strand->state(giopStrand::DYING);
  giopStream::CommFailure::_raise(minor,
                                  (CORBA::CompletionStatus)g->completion(),
                                  0,__FILE__,__LINE__);
}
OK A CommFailure EXCEPTION IS RAISED
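
The order of operations in the two functions above can be sketched as a small toy model. This is not omniORB code: ToyConn, ToyMsg, ToyCommFailure and toy_handle_message are hypothetical names, and the notifyCommFailure step is left out. It only shows that the strand is marked DYING before the exception propagates, so the DYING state outlives the failed call.

```cpp
#include <cassert>
#include <stdexcept>

// Toy model only: none of these types are omniORB classes.
enum class ToyState { Active, Dying };
enum class ToyMsg { Reply, CloseConnection };

struct ToyConn {
    bool biDir = false;
    ToyState state = ToyState::Active;
};

struct ToyCommFailure : std::runtime_error {
    ToyCommFailure() : std::runtime_error("CommFailure (modelled)") {}
};

// Mirrors the traced order: the bidirectional branch does nothing,
// then the connection is marked Dying and the exception is thrown.
void toy_handle_message(ToyConn& c, ToyMsg m) {
    if (m != ToyMsg::CloseConnection) return;
    if (c.biDir) {
        // No special handling here, as in the traced code.
    }
    c.state = ToyState::Dying;  // state change happens before the throw
    throw ToyCommFailure();
}
```

After the exception is caught by the caller, the ToyConn object is still in the Dying state; nothing in this path resets or removes it.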
***********************************************************
OUR SERVER IS RESTARTED
WE TRY TO CALL IT AGAIN
CALL STACK
omni::IOP_C_Holder::IOP_C_Holder(const omniIOR * 0x016a1b30, const unsigned char * 0x016a1b88, unsigned long 21, omni::Rope * 0x0169e648, omniCallDescriptor * 0x047efaa0) line 69
omniRemoteIdentity::locateRequest(omniCallDescriptor & {...}) line 257 + 44 bytes
omniObjRef::_locateRequest() line 1049
omniObjRef::_assertExistsAndTypeVerified() line 395
omniObjRef::_invoke(omniCallDescriptor & {...}, unsigned char 1) line 732
DVRSafenetIdls::_objref_DVRSafenetEstacao::centralChecaEstacaoAtiva() line 2738
WE ARE IN
////////////////////////////////////////////////////////////////////////////
IOP_C_Holder::IOP_C_Holder(const omniIOR* ior,
                           const CORBA::Octet* key,
                           CORBA::ULong keysize,
                           Rope* rope,
                           omniCallDescriptor* calldesc) : pd_rope(rope) {
  OMNIORB_ASSERT(calldesc);
  pd_iop_c = rope->acquireClient(ior,key,keysize,calldesc);
}
CALL STACK
omni::BiDirClientRope::acquireClient(const omniIOR * 0x016a1b30, const unsigned char * 0x016a1b88, unsigned long 21, omniCallDescriptor * 0x047efaa0) line 483
omni::IOP_C_Holder::IOP_C_Holder(const omniIOR * 0x016a1b30, const unsigned char * 0x016a1b88, unsigned long 21, omni::Rope * 0x0169e648, omniCallDescriptor * 0x047efaa0) line 69 + 27 bytes
omniRemoteIdentity::locateRequest(omniCallDescriptor & {...}) line 257 + 44 bytes
omniObjRef::_locateRequest() line 1049
omniObjRef::_assertExistsAndTypeVerified() line 395
omniObjRef::_invoke(omniCallDescriptor & {...}, unsigned char 1) line 732
NOW WE WILL CALL giopRope::acquireClient(ior,key,keysize,calldesc);
IOP_C*
BiDirClientRope::acquireClient(const omniIOR* ior,
const CORBA::Octet* key,
CORBA::ULong keysize,
omniCallDescriptor* calldesc) {
GIOP_C* giop_c = (GIOP_C*) giopRope::acquireClient(ior,key,keysize,calldesc);
...
}
IN giopRope::acquireClient
IOP_C*
giopRope::acquireClient(const omniIOR* ior,
const CORBA::Octet* key,
CORBA::ULong keysize,
omniCallDescriptor* calldesc) {
...
RopeLink* p = pd_strands.next;
for (; p != &pd_strands; p = p->next) {
giopStrand* s = (giopStrand*)p;
switch (s->state()) {
case giopStrand::DYING:
{
WE GET A ROPELINK WITH A "giopStrand* s" STILL IN DYING STATE
ndying++;
break;
}
...
AS THIS CONDITION IS TRUE, WE ENTER THE WAIT:
if (pd_oneCallPerConnection || ndying >= max) {
// Wait for a strand to be unused.
pd_nwaiting++;
unsigned long deadline_secs,deadline_nanosecs;
calldesc->getDeadline(deadline_secs,deadline_nanosecs);
if (deadline_secs || deadline_nanosecs) {
THE pd_cond.timedwait CALL RETURNS 0
if (pd_cond.timedwait(deadline_secs,deadline_nanosecs) == 0) {
pd_nwaiting--;
THROW A TRANSIENT_CallTimedout EXCEPTION
OMNIORB_THROW(TRANSIENT,TRANSIENT_CallTimedout,CORBA::COMPLETED_NO);
...
}
FROM THAT POINT ON, THE ROPELINK ALWAYS YIELDS THE SAME STRAND IN DYING STATE
AND WE NEVER CALL THE SERVER AGAIN ...
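
The wait step can be illustrated with a small sketch built on the standard C++ condition variable rather than omniORB's own one. ToyWaiter, wait_for_strand and release_strand are hypothetical names. The point it shows: if nothing ever signals that a strand has become free, the timed wait can only end by reaching the deadline, and the call fails.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <stdexcept>

// Toy model only: names are hypothetical and not part of omniORB.
struct ToyWaiter {
    std::mutex mtx;
    std::condition_variable cond;
    bool strand_available = false;  // set by whoever frees a strand
    int nwaiting = 0;

    // Block until a strand is available or the deadline passes.
    void wait_for_strand(std::chrono::steady_clock::time_point deadline) {
        std::unique_lock<std::mutex> lock(mtx);
        ++nwaiting;
        // wait_until with a predicate returns the predicate's final value.
        bool ok = cond.wait_until(lock, deadline,
                                  [this] { return strand_available; });
        --nwaiting;
        if (!ok) {
            throw std::runtime_error("TRANSIENT_CallTimedout (modelled)");
        }
    }

    // Mark a strand as free and wake one waiter.
    void release_strand() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            strand_available = true;
        }
        cond.notify_one();
    }
};
```

In this model, a caller that waits while release_strand() is never invoked always ends with the modelled timeout, which matches the repeated TRANSIENT_CallTimedout in the trace.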
THE PROBLEM IS:
IF WE DO NOT USE THE BIDIR FEATURE, EVERYTHING WORKS FINE,
BUT IF pd_strand->biDir IS SET, THE PROBLEM ALWAYS OCCURS WHEN WE TRY TO
CALL THE SERVER AFTER IT HAS RESTARTED.
Any help?
Fernando A. de Araujo Filho
maverick@elogica.com.br