[omniORB] Race between deactivate and outstanding invocations
Chris Newbold
chris.newbold@laurelnetworks.com
Thu, 12 Oct 2000 11:49:07 -0400
I believe I've found a race condition between object deactivation (via
the POA) and outstanding method invocations; I'm not 100% confident
in my analysis, but...
We're currently running 3.0.0, but I made my diagnosis looking at the
3.0.2 source code. First, the symptom:
----------------------------------------------------------
Oct 11 01:38:28 npbd[5314]: 971242708.441742 15375 omniORB D Assertion failed. This indicates a bug in omniORB.
Oct 11 01:38:28 npbd[5314]: file: ../objectAdapter.cc
Oct 11 01:38:28 npbd[5314]: line: 311
Oct 11 01:38:28 npbd[5314]: info: pd_nDetachedObjects > 0
Oct 11 01:38:28 npbd[5314]:
Oct 11 01:38:28 npbd[5314]: Aborted
Oct 11 01:38:28 npbd[5314]: PID = 5947
Backtrace:
#0 0x40456c68 __restore
#1 0x40456d41 __kill+17
#2 0x404580d8 abort+200
#3 0x402953c3 omniORB::fatalException::fatalException(char const *, int, char const *)+55
#4 0x4026887b omni::assertFail(char const *, int, char const *)+247
#5 0x4024a8fb omniObjAdapter::met_detached_object(void)+79
#6 0x40254389 omniOrbPOA::lastInvocationHasCompleted(omniLocalIdentity *)+569
#7 0x402abe13 omniLocalIdentity_RefHolder::~omniLocalIdentity_RefHolder(void)+159
#8 0x402482e3 omniLocalIdentity::dispatch(GIOP_S &)+155
#9 0x4027a5e7 GIOP_S::HandleRequest(bool)+963
#10 0x40279dd5 GIOP_S::dispatcher(Strand *)+449
#11 0x4029bd70 tcpSocketWorker::_realRun(void *)+116
#12 0x402b7cfb omniORB::giopServerThreadWrapper::run(void (*)(void *), void *)+35
#13 0x4029bce8 tcpSocketWorker::run(void *)+64
#14 0x402feab1 omni_thread_wrapper+273
--------------------------------------------------------------
The asserting thread is completing a method invocation on an object.
While this invocation was in progress, another thread called
deactivate_object() on that object's POA, passing the object's OID.
At the time of the assertion failure, deactivate_object() had not
yet returned.
So, I started looking at what happens in deactivate_object() and
found that, while holding the internal lock, it calls deactivate()
on the omniLocalIdentity for the object (poa.cc:832). Further
along, the internal lock is dropped (poa.cc:857) and
detached_object() is called.
The race condition is that once deactivate_object() has called
deactivate() on the omniLocalIdentity and dropped the internal lock,
the thread handling the invocation can see, in the
omniLocalIdentity_RefHolder destructor (localIdentity.cc:78), that
the omniLocalIdentity has been deactivated. However,
deactivate_object() has not yet called detached_object(), so
pd_nDetachedObjects in omniObjAdapter has not been updated, and the
invocation thread trips the assertion in met_detached_object().
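To make the interleaving concrete, here is a minimal single-threaded C++ model of the sequence described above. It is only a sketch and not omniORB code: ModelAdapter, markDeactivated(), countDetached(), invocationCompletes() and the two interleaving functions are hypothetical names, and the locking inside each step is an assumption of the model. It replays the two orderings by hand rather than with real threads, so the result is deterministic.

```cpp
#include <cassert>  // for the assert() calls in the usage note
#include <mutex>

// Hypothetical model of the state involved; these are NOT omniORB's types.
struct ModelAdapter {
    std::mutex internalLock;   // stands in for the POA's internal lock
    bool deactivated = false;  // stands in for the identity's deactivated state
    int nDetachedObjects = 0;  // stands in for pd_nDetachedObjects
};

// First half of deactivation: mark the identity deactivated under the
// lock, then drop the lock (the guard releases it on return).
void markDeactivated(ModelAdapter& a) {
    std::lock_guard<std::mutex> guard(a.internalLock);
    a.deactivated = true;
}

// Second half of deactivation: runs only after the lock was dropped.
// Taking the lock again here is an assumption of the model.
void countDetached(ModelAdapter& a) {
    std::lock_guard<std::mutex> guard(a.internalLock);
    ++a.nDetachedObjects;
}

// What the invocation thread does when its call completes. Returns false
// where the real code would fail the "pd_nDetachedObjects > 0" assertion.
bool invocationCompletes(ModelAdapter& a) {
    std::lock_guard<std::mutex> guard(a.internalLock);
    if (!a.deactivated) {
        return true;  // object still active: nothing to check
    }
    return a.nDetachedObjects > 0;
}

// The invocation completes inside the window between the two halves.
bool racyInterleaving() {
    ModelAdapter a;
    markDeactivated(a);
    bool ok = invocationCompletes(a);  // sees deactivated, count still 0
    countDetached(a);
    return ok;
}

// The invocation completes after both halves have run.
bool benignInterleaving() {
    ModelAdapter a;
    markDeactivated(a);
    countDetached(a);
    return invocationCompletes(a);
}
```

In this model, assert(!racyInterleaving()) and assert(benignInterleaving()) both hold: the check fails only when the invocation finishes between the two halves of deactivation.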
Sorry for the long-winded narration; hopefully it makes some sense...
-Chris Newbold
Laurel Networks, Inc.