[omniORB] Omnithreads suspension
Tom Haggie
THaggie@img.seagatesoftware.com
Wed, 3 Jun 1998 06:34:15 -0700
We're having problems with a thread pool in our application, running on
Windows NT (version 4.00.1381).
Originally, it seemed that signals were getting lost between threads - a
request for work was being put on the queue and a signal sent to
announce that data was available, but no threads were waking, despite
the fact that they appeared to be waiting for the signal. We changed the
signal to a broadcast so that all of the pool threads would wake when a
request was placed on the queue, and this seemed to improve things and
give us more information, but it hasn't solved the problem.
Most of the time, things work as expected - a request is placed on the
queue; broadcast() is called on the "request present" condition; the
free pool threads wake up one at a time, with the first thread to wake
picking up the request and the others seeing that the queue is now empty
and going back to sleep. The request is completed normally and the
thread doing the work goes back to sleep to wait for more. However,
occasionally one of the threads fails to wake up and is never seen alive
again. Eventually, the pool is reduced to one live thread and our
application deadlocks because we need two active threads - one worker
thread can place a request for another worker to carry out. (It also
deadlocks on shutdown because the pool has to wait for its threads to
exit before destructing, and the inactive threads never wake up or
exit.) At the moment, we're running with only one client so there's
never more than a single request on the queue (obviously this will
change later).
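
The pattern described above can be sketched roughly as follows. This is
only a minimal illustration in standard C++, with std::mutex and
std::condition_variable standing in for the omnithreads mutex and
condition classes; it is not the application's actual code, and the
names WorkerPool, submit, run and run_demo are all hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical worker pool. std::mutex / std::condition_variable are
// stand-ins (an assumption) for the omnithreads equivalents.
class WorkerPool {
public:
    explicit WorkerPool(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            workers_.emplace_back([this] { run(); });
    }

    // Shutdown has to wait for every worker to exit, so a worker that
    // never wakes up again would block here forever.
    ~WorkerPool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        request_present_.notify_all();
        for (auto& w : workers_) w.join();
    }

    // Place a request on the queue, then broadcast "request present".
    void submit(std::function<void()> request) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            assert(!stopping_);
            queue_.push(std::move(request));
        }
        request_present_.notify_all();
    }

private:
    void run() {
        for (;;) {
            std::function<void()> request;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                // Re-check the queue after every wake-up: threads that
                // find it empty simply go back to sleep.
                while (queue_.empty() && !stopping_)
                    request_present_.wait(lock);
                if (queue_.empty()) return;  // stopping and drained
                request = std::move(queue_.front());
                queue_.pop();
            }
            request();  // do the work outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable request_present_;
    std::queue<std::function<void()>> queue_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;
};

// Hypothetical driver: submits `requests` trivial jobs to a pool of
// `threads` workers and returns how many were completed.
int run_demo(std::size_t threads, int requests) {
    std::atomic<int> completed{0};
    {
        WorkerPool pool(threads);
        for (int i = 0; i < requests; ++i)
            pool.submit([&completed] { ++completed; });
    }  // destructor drains the queue and joins the workers
    return completed.load();
}
```

Because each worker waits in a loop on the queue state, a wake-up that
finds the queue empty is harmless, which is why switching from a signal
to a broadcast should be safe in a pool of this shape.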
Looking in a debugger, the inactive threads are still present but
suspended. Nothing is actually calling the Windows SuspendThread()
function directly, so we are a bit puzzled as to what is going on.
The broadcast which occasionally fails to wake all the free workers is
called by another worker thread. Requests can also be made from our main
thread (not created by the omnithreads), but these don't seem to cause
problems.
Any ideas? We are using omniORB 2.5.1 and the associated omnithreads
library.
Thanks,
Richard Wilkinson