[omniORB] Race condition between Scavenger thread startup and ORB shutdown

Wed Feb 4 09:12:02 GMT 2015

We have a client application that contacts omniNames to obtain a server IOR, and then calls a ping() method on the server. After that, orb->shutdown(true) and orb->destroy() are called and the application exits.

On one production system we see rare but reproducible crashes (Red Hat Enterprise Linux 6 x86_64, omniORB 4.2.0). Here is the output at trace level 10:

 (0) 2015-01-28 12:20:47.465834: Preparing to shutdown ORB.
 (0) 2015-01-28 12:20:47.465970: Shutting-down all incoming endpoints.
 (0) 2015-01-28 12:20:47.466024: ORB shutdown is complete.
 (0) 2015-01-28 12:20:47.466070: Destroy ORB...
 (0) 2015-01-28 12:20:47.466203: Deinitialising omniDynamic library.
 (0) 2015-01-28 12:20:47.466621: AsyncInvoker: deleted.
 (1) 2015-01-28 12:20:47.466693: AsyncInvoker: thread id 1 has started. Total threads = 1.
 (0) 2015-01-28 12:20:47.466698: ORB destroyed.

The trace shows that the AsyncInvoker has already been deleted (12:20:47.466621) by the time thread 1 starts (12:20:47.466693).

Thread 1 then crashes in omniAsyncWorker::mid_run() (invoker.cc, line 513):

  pd_pool->workerRun(this);

It seems that the AsyncInvoker is not yet aware of the Scavenger thread when it performs its shutdown. Most probably pd_lock is held during AsyncInvoker destruction, which blocks the Scavenger thread from starting. As soon as the AsyncInvoker destructor releases the mutex, the Scavenger thread continues. Since pd_pool refers to the AsyncInvoker, which no longer exists at that point, the Scavenger thread's call through it crashes.
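
To make the suspected ordering problem concrete, here is a minimal standalone sketch in plain C++11 of one way an owner object can avoid it: the worker is counted under the lock before its thread is launched, and the destructor waits until the count drops to zero. This is not omniORB code and not a proposed patch. The names Invoker, startWorker() and run_demo() are invented for illustration, and workerRun() only borrows its name from the line above.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for an invoker that owns detached worker threads.
class Invoker {
public:
    explicit Invoker(std::atomic<int>& completed) : completed_(completed) {}

    ~Invoker() {
        std::unique_lock<std::mutex> lock(lock_);
        // Block until every registered worker has finished, including
        // workers whose threads have not been scheduled yet.
        idle_.wait(lock, [this] { return workers_ == 0; });
    }

    void startWorker() {
        {
            std::lock_guard<std::mutex> lock(lock_);
            ++workers_;  // register BEFORE the thread exists
        }
        std::thread([this] { workerRun(); }).detach();
    }

private:
    void workerRun() {
        std::lock_guard<std::mutex> lock(lock_);
        ++completed_;        // the "work" done by this worker
        --workers_;
        idle_.notify_all();  // last access to *this, still under the lock
    }

    std::mutex lock_;
    std::condition_variable idle_;
    int workers_ = 0;
    std::atomic<int>& completed_;
};

// Starts one worker and destroys the invoker immediately afterwards.
int run_demo() {
    std::atomic<int> completed{0};
    {
        Invoker invoker(completed);
        invoker.startWorker();
    }  // the destructor waits here until the worker has run
    return completed.load();
}
```

Calling run_demo() from main() returns 1: the destructor cannot finish before the late-starting worker has run. If the increment of workers_ were moved into the new thread instead, the destructor could see a count of zero and return while the thread is still starting, which is the ordering the trace above suggests.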

The application no longer crashes if Scavenger is disabled using "-ORBscanGranularity 0".

Since running without Scavenger is not an option, any help is appreciated.

Regards, Peter.