[omniORB] giopServer::deactivate hangs

Thu Jun 26 13:05:40 BST 2003

On Wednesday 25 June, baileyk at schneider.com wrote:

> I didn't see that it was resolved, so I'll post the details I have.  I have
> processes communicating via omniORB like so
> 
>   A ----> B <----> C
> 
> B spawns C (fork/exec) and passes an IOR on the command line to it.  C
> sends its IOR to B via a pipe set up by B prior to the fork.  I'm having
> trouble getting C to shut down consistently.  If I have B sequentially
> spawn and signal 50 instances of C, perhaps 2 of them will hang.  I'm using
> what I hope is the most robust means of shutting down a CORBA server
> process:

Forking from multi-threaded programs often causes odd things to
happen. It's safer to pre-fork before B becomes multi-threaded, then
have a single-threaded forker, F, which talks to B through pipes. That
way, C would be a child of F, rather than directly of B.

I don't know if that will help here, but it rules one area of problems
out.
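
A rough sketch of the pre-fork arrangement, in case it helps. All the
names here (Forker, start_forker, spawn_via_forker, stop_forker) and the
newline-delimited request format are made up for illustration; none of
it is omniORB API. start_forker() would have to run before the ORB is
initialised or any other thread is started. The sketch does not cover
getting C's IOR back to B, which would need its own channel.

```cpp
#include <cassert>
#include <signal.h>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical handle that B keeps for the single-threaded forker F.
struct Forker {
    pid_t pid;     // pid of F, or -1 on failure
    int   to_f;    // B writes spawn requests here
    int   from_f;  // B reads the pid of each new child here
};

// Read one '\n'-terminated line from fd; false on EOF or error.
static bool read_line(int fd, std::string& out) {
    out.clear();
    char ch;
    for (;;) {
        if (read(fd, &ch, 1) <= 0) return false;
        if (ch == '\n') return true;
        out.push_back(ch);
    }
}

// Runs inside F. F never creates threads, so fork() here is well defined.
[[noreturn]] static void forker_loop(int req_fd, int rep_fd) {
    signal(SIGCHLD, SIG_IGN);  // F does not wait for C; let the kernel reap
    std::string path, arg;
    while (read_line(req_fd, path) && read_line(req_fd, arg)) {
        pid_t child = fork();
        if (child == 0) {
            signal(SIGCHLD, SIG_DFL);  // do not leak SIG_IGN across exec
            close(req_fd);
            close(rep_fd);
            execl(path.c_str(), path.c_str(), arg.c_str(), (char*)nullptr);
            _exit(127);                // exec failed
        }
        // Report the child's pid (or -1 if fork failed) back to B.
        if (write(rep_fd, &child, sizeof child) != (ssize_t)sizeof child) break;
    }
    _exit(0);  // B closed its end of the request pipe, or a write failed
}

// Call this while B is still single-threaded.
Forker start_forker() {
    int req[2], rep[2];
    if (pipe(req) != 0) return Forker{-1, -1, -1};
    if (pipe(rep) != 0) { close(req[0]); close(req[1]); return Forker{-1, -1, -1}; }
    pid_t pid = fork();
    if (pid == 0) {            // this process becomes F
        close(req[1]);
        close(rep[0]);
        forker_loop(req[0], rep[1]);
    }
    close(req[0]);
    close(rep[1]);
    if (pid < 0) { close(req[1]); close(rep[0]); return Forker{-1, -1, -1}; }
    return Forker{pid, req[1], rep[0]};
}

// Ask F to fork/exec `path` with one argument (e.g. B's IOR).
// Returns the new child's pid, or -1 on failure.
pid_t spawn_via_forker(const Forker& f, const std::string& path,
                       const std::string& arg) {
    // The request format is line based, so neither field may contain '\n'.
    assert(path.find('\n') == std::string::npos);
    assert(arg.find('\n') == std::string::npos);
    std::string msg = path + "\n" + arg + "\n";
    if (write(f.to_f, msg.data(), msg.size()) != (ssize_t)msg.size()) return -1;
    pid_t child = -1;
    if (read(f.from_f, &child, sizeof child) != (ssize_t)sizeof child) return -1;
    return child;
}

// Close the pipes so that F sees EOF and exits, then reap F.
void stop_forker(Forker& f) {
    close(f.to_f);
    close(f.from_f);
    waitpid(f.pid, nullptr, 0);
}
```

B would call start_forker() first thing in main(), then use
spawn_via_forker() wherever it currently does its own fork/exec. Since C
is then a child of F rather than of B, B can still signal it by pid but
cannot waitpid() for it.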

>       - Each CORBA server has a signal handler for SIGTERM.  The handler
> simply sets a global flag and signals a condition variable.

It's not guaranteed to be safe to signal a condition variable from
inside a signal handler. The only synchronisation primitive that POSIX
guarantees is safe to use there is the POSIX semaphore, via sem_post().
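
A minimal sketch of the semaphore approach, using a POSIX semaphore,
since sem_post() is on POSIX's list of async-signal-safe functions. The
function names are invented for the example, and it assumes a platform
with unnamed POSIX semaphores (Linux, for instance). What the server
does once it wakes up is only indicated in a comment.

```cpp
#include <cassert>
#include <errno.h>
#include <semaphore.h>
#include <signal.h>

// Posted by the signal handler, waited on by an ordinary thread.
static sem_t shutdown_sem;

// SIGTERM handler: does nothing except post the semaphore.
extern "C" void on_sigterm(int) {
    int saved_errno = errno;   // sem_post may change errno
    sem_post(&shutdown_sem);   // async-signal-safe
    errno = saved_errno;
}

// Initialise the semaphore and install the handler.
// Do this before the process can receive SIGTERM.
void install_sigterm_handler() {
    int rc = sem_init(&shutdown_sem, 0, 0);  // not shared, initial value 0
    assert(rc == 0);

    struct sigaction sa{};
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    rc = sigaction(SIGTERM, &sa, nullptr);
    assert(rc == 0);
    (void)rc;  // silence unused-variable warnings when NDEBUG is defined
}

// Block the calling (non-handler) thread until SIGTERM has arrived.
void wait_for_shutdown_request() {
    // sem_wait can be interrupted by the signal itself, so retry on EINTR.
    while (sem_wait(&shutdown_sem) == -1 && errno == EINTR) {
    }
    // Assumption: at this point a real server would start its normal
    // shutdown from this ordinary thread, for example by calling
    // CORBA::ORB::shutdown() on its ORB.
}
```

The thread that currently waits on the condition variable would call
wait_for_shutdown_request() instead; the global flag is no longer
needed because the semaphore count records that the signal arrived.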

[...]
> For every other server this method of signalling shutdown has been
> infallible.  For this fork/exec'd server it hangs now and then and if I have
> to kill it then B will not shut down nicely either.  I'm pretty sure B has
> released all object references to C prior to signalling it, but I would
> hope that wouldn't matter.  Shouldn't an ORB unconditionally return from
> run() if there are no active invocations at the time?

It should, but connection shutdown is a relatively complex and
concurrent thing, so it wouldn't be that surprising if there were bugs.

[...]
> then it hangs a while and eventually I see these two additional lines

That shows that the ORB shutdown thread was never told that the
connections had all been shut down. The question is why...

[...]
> then it hangs again.  After many more minutes I get these lines (btw, why
> is the Total threads not dropping if threads are exiting?)

That is very odd. It implies that the threads are all blocking as they
attempt to lock the mutex that protects the count of threads. If that
is happening, it would explain why the shutdown thread isn't hearing
that the threads have completed.

I'd suggest you update to the CVS version of omniORB, which has a few
small changes to the shutdown code, and try again. Next, try the
pre-forking idea.

Cheers,

Duncan.

-- 
 -- Duncan Grisby         --
  -- duncan at grisby.org     --
   -- http://www.grisby.org --