[omniORB] deadlock with distributed callback application
Lars Immisch
lars@ibp.de
Tue, 5 Jun 2001 16:42:09 +0200
Dear Christof,
thank you.
> omniORB only creates one thread on the server side for each incoming
> connection - this is fine as long as each client creates a new connection
> for concurrent requests to the same server. But with oneway requests the
> client side doesn't know when the server has finished processing the request
> (as there is no reply) and AFAIK omniORB doesn't create a new connection for
> oneways (or mark the connection as used) -- it just assumes that processing
> of oneway requests doesn't take long on the server.
That is what I thought, but I am not yet convinced that it is what I am seeing :-)
> The problem starts when the server is still busy with processing a oneway
> request while the client wants to send another two-way request (over the
> same connection). The client then has to wait for the reply from the server,
> but the server doesn't even see the request as it is still busy with
> processing the oneway request.
I thought of that, but I deliberately don't invoke twoway requests from oneway
requests back to the client. This is why I suspected the LocateRequests were
causing my problem: if switched on, they turn a oneway request into a twoway
request, and I couldn't see whether they created a new connection. I need to
look harder into this.
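The situation described in the quoted paragraph above (one server thread per connection, busy with a oneway while a twoway queues up behind it) can be modelled without omniORB at all. The following is only a minimal sketch in plain Python, not omniORB code: the names Connection, oneway, twoway and demo are made up for illustration, and a queue stands in for the TCP connection. It shows the stall only; it would turn into a real deadlock if the oneway handler were itself waiting on something the blocked caller has to provide.

```python
import queue
import threading


class Connection:
    """Toy model (hypothetical): one server thread serves one connection."""

    def __init__(self):
        self.requests = queue.Queue()
        # A single worker handles every request on this connection in order.
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            handler, reply = self.requests.get()
            result = handler()
            if reply is not None:      # twoway: send the reply back
                reply.put(result)

    def oneway(self, handler):
        # The caller returns at once and never learns when the handler ends.
        self.requests.put((handler, None))

    def twoway(self, handler, timeout):
        # The caller blocks until the reply arrives or the timeout expires.
        reply = queue.Queue(maxsize=1)
        self.requests.put((handler, reply))
        try:
            return reply.get(timeout=timeout)
        except queue.Empty:
            return None                # no reply yet: the call is stalled


def demo():
    conn = Connection()
    release = threading.Event()
    conn.oneway(release.wait)          # long-running oneway keeps the worker busy
    stalled = conn.twoway(lambda: "pong", timeout=0.2)
    release.set()                      # the oneway finishes
    served = conn.twoway(lambda: "pong", timeout=2.0)
    return stalled, served


if __name__ == "__main__":
    print(demo())
```

The first twoway call gets no reply while the oneway handler is still running, even though the server side is otherwise idle; once the oneway completes, the same kind of call goes through.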
> > Our 'real' system deadlocks immediately when verifyObjectExistsAndType is
> > enabled. When I look where it is hanging, both processes are blocked on the
> > select in tcpSocketStrand::ll_recv called from the _locateRequest inside
> > the omniObjRef::_invoke.
>
> And I guess they are processing oneway requests from each other...
Not as far as I can see - when the deadlock occurs, all other threads are
idle, i.e. blocking where they should.
> > My suspicion was that in this case, the LocateRequest is sent over a reused
> > connection, and the other oneway invocation gets into the way. But I
> > haven't been able to verify that - mainly because my attempts to recreate
> > the problem
>
> Hmm, I have hacked together some simple Python scripts that show a possible
> deadlock situation with oneways:
Thanks. It deadlocks nicely, indeed, but it's not the problem I have here. I
am looking into the problem, albeit rather indirectly at the moment.
I will post a summary once I know what is going on.
Thanks,
Lars