[omniORB] deadlock

Tue Oct 11 12:43:15 BST 2016

Hi,

On 27.09.16 at 18:55, Duncan Grisby wrote:
> On Fri, 2016-09-16 at 09:48 +0200, Michael Teske via omniORB-list wrote:
> 
>> we recently experienced a deadlock in omniORB 4.2.1. I list the stack traces of the
>> relevant threads here:
> 
> [...]
> 
>> #4  omni::omniOrbPOA::synchronise_request (this=this@entry=0x14c7140, lid=lid@entry=0x14d8080) at poa.cc:2906
> 
>> This leads to Thread 14 having pd_lock and wanting *omni::internalLock
>>           and Thread 96 having *omni::internalLock and wanting pd_lock.   
>>
>> Mutexes should always be locked in the same order.
> 
> Indeed they should. POA::pd_lock comes before omni::internalLock in the
> partial lock order, so the code in synchronise_request is wrong to try
> to acquire them in the opposite order. It's quite rare for
> synchronise_request to be called at all, because it's only invoked when
> a POA is in holding state, so this bug slipped through the net.
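
To make the lock-order point above concrete, here is a minimal sketch of
the rule. It is not omniORB code: poa_lock, internal_lock, worker and
run_consistent_order are made-up names standing in for POA::pd_lock and
omni::internalLock. Both threads take the two mutexes in the same order,
so neither can end up holding one lock while waiting for the other.

```cpp
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two locks discussed above; not omniORB code.
std::mutex poa_lock;       // plays the role of POA::pd_lock (first in the order)
std::mutex internal_lock;  // plays the role of omni::internalLock (second)

int shared_counter = 0;    // protected by internal_lock

// Every caller takes the locks in the same order: poa_lock, then internal_lock.
void worker(int iterations) {
  for (int i = 0; i < iterations; ++i) {
    std::lock_guard<std::mutex> outer(poa_lock);
    std::lock_guard<std::mutex> inner(internal_lock);
    ++shared_counter;
  }
}

// Runs two threads over the same pair of locks and returns the final count.
int run_consistent_order(int iterations) {
  shared_counter = 0;
  std::thread a(worker, iterations);
  std::thread b(worker, iterations);
  a.join();
  b.join();
  return shared_counter;
}
```

If one of the two threads took internal_lock first and poa_lock second,
each could grab its first lock and then block forever on the other, which
is the pattern seen in threads 14 and 96.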

Does that mean a call has been made while the POA was not activated? I
wonder why that happened. Maybe there was some logic error which
triggered this mutex bug?

> 
> The right fix is actually to not acquire pd_lock at all. The
> omniLocalIdentity object it is checking is protected by
> omni::internalLock, so it doesn't need pd_lock. I've checked a fix in to
> the 4_2 branch.
> 
> Thanks for the bug report.

Thanks for the quick fix!

Greetings,
  Michael