[omniORB] Race condition between Scavenger thread startup and ORB shutdown
Peter Klotz
Peter.Klotz at ith-icoserve.com
Wed Feb 4 09:12:02 GMT 2015
We have a client application that contacts omniNames to obtain a server IOR, and then calls a ping() method on the server. After that, orb->shutdown(true) and orb->destroy() are called and the application exits.
On one production system, we observe rare but reproducible crashes (Red Hat Enterprise Linux 6 x86_64, omniORB 4.2.0). Here is the output at trace level 10:
(0) 2015-01-28 12:20:47.465834: Preparing to shutdown ORB.
(0) 2015-01-28 12:20:47.465970: Shutting-down all incoming endpoints.
(0) 2015-01-28 12:20:47.466024: ORB shutdown is complete.
(0) 2015-01-28 12:20:47.466070: Destroy ORB...
(0) 2015-01-28 12:20:47.466203: Deinitialising omniDynamic library.
(0) 2015-01-28 12:20:47.466621: AsyncInvoker: deleted.
(1) 2015-01-28 12:20:47.466693: AsyncInvoker: thread id 1 has started. Total threads = 1.
(0) 2015-01-28 12:20:47.466698: ORB destroyed.
The trace shows that AsyncInvoker has already been deleted by the time thread 1 starts.
Finally thread 1 crashes in omniAsyncWorker::mid_run() in file invoker.cc, line 513:
pd_pool->workerRun(this);
It seems that AsyncInvoker does not yet see the Scavenger thread when it performs its shutdown. Most probably this happens because pd_lock is held during AsyncInvoker destruction, which prevents Scavenger from starting. As soon as the AsyncInvoker destructor releases the mutex, Scavenger can continue. Since pd_pool is a member of AsyncInvoker, Scavenger's access to it after the destruction results in a crash.
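To make the suspected window concrete, here is a minimal sketch in standard C++ of the general pattern that avoids it: the owning object records each worker thread at creation time and waits for it in its destructor, so a thread that has been created but not yet scheduled cannot outlive the owner. The names Pool, startWorker, workerRun and runPoolOnce are hypothetical and chosen only for illustration. This is not omniORB's code and not a proposed patch, just an illustration under the assumption that the analysis above is correct.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for an object that owns worker threads.
// This is NOT omniORB's AsyncInvoker, only an illustration of the pattern.
class Pool {
public:
    explicit Pool(std::atomic<int>& runs) : runs_(runs) {}

    // The std::thread handle is stored as soon as the thread is created,
    // before the new thread has necessarily executed any of its body.
    void startWorker() {
        workers_.emplace_back([this] { workerRun(); });
    }

    // join() blocks until each worker has returned from workerRun(), so no
    // worker can touch this object after its members have been destroyed.
    ~Pool() {
        for (std::thread& t : workers_) {
            assert(t.joinable());
            t.join();
        }
    }

private:
    // Touches a member; this would be a use-after-free if the Pool
    // were destroyed before the worker got to run.
    void workerRun() { ++runs_; }

    std::atomic<int>& runs_;
    std::vector<std::thread> workers_;
};

// Starts one worker and destroys the pool immediately afterwards,
// mimicking "thread startup races with shutdown".
int runPoolOnce() {
    std::atomic<int> runs{0};
    {
        Pool pool(runs);
        pool.startWorker();
    }  // ~Pool() waits here for the worker
    return runs.load();
}
```

Calling runPoolOnce() returns only after the worker has run, no matter how late the new thread is scheduled. A design based on detached threads would need an equivalent step, for example counting a worker as live from the moment it is requested rather than from the moment it first runs.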
The application no longer crashes if Scavenger is disabled using "-ORBscanGranularity 0".
Since running without Scavenger is not an option, any help is appreciated.
Regards, Peter.