[omniORB] Re: [omniORB 2.8.0] Connection handling bug?

Fri Jun 3 14:53:00 BST 2005

Well, I've done some digging in the archives and found this 
posting:
http://www.omniorb-support.com/pipermail/omniorb-list/2000-January/014669.html

Consider this situation:
- The maximum number of TCP connections per server is 5.
- The client makes 5 concurrent CORBA calls to a server, so
  all 5 connections (strands) are in use.
- The client makes a 6th CORBA call to the same server.
- The fix in that posting randomizes the strand that this 6th
  call waits on; say it waits on strand 2.
- The server completes the calls on strands 1, 3, 4 and 5.
- The 6th CORBA call is still waiting, even though 4 strands
  are now available.
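
To make the scenario concrete, here is a small stand-alone sketch
of it.  This is not omniORB code: StrandPool and all of its member
names are hypothetical, and it only models which strands are busy,
comparing a waiter pinned to one strand with a waiter that may take
any idle strand.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a per-server connection pool (not omniORB code).
class StrandPool {
public:
    explicit StrandPool(std::size_t maxConnections)
        : busy_(maxConnections, false) {}

    // Mark strand i as carrying a call in progress.
    void acquire(std::size_t i) {
        assert(i < busy_.size() && !busy_[i]);
        busy_[i] = true;
    }

    // Mark strand i as idle again (its call has completed).
    void release(std::size_t i) {
        assert(i < busy_.size() && busy_[i]);
        busy_[i] = false;
    }

    // Policy A: the waiting call is pinned to one pre-selected strand
    // and can only proceed once that particular strand is idle.
    bool pinnedWaiterCanRun(std::size_t pinned) const {
        assert(pinned < busy_.size());
        return !busy_[pinned];
    }

    // Policy B: the waiting call may take whichever strand is idle.
    bool anyWaiterCanRun() const {
        return idleCount() > 0;
    }

    std::size_t idleCount() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < busy_.size(); ++i)
            if (!busy_[i]) ++n;
        return n;
    }

private:
    std::vector<bool> busy_;
};

// Replays the scenario: 5 calls occupy all 5 strands, then the calls on
// strands 1, 3, 4 and 5 (indices 0, 2, 3, 4) complete while strand 2
// (index 1) stays busy.
StrandPool replayScenario() {
    StrandPool pool(5);
    for (std::size_t i = 0; i < 5; ++i) pool.acquire(i);
    const std::size_t done[] = {0, 2, 3, 4};
    for (std::size_t i : done) pool.release(i);
    return pool;
}
```

After replayScenario(), a 6th call pinned to strand 2 (index 1)
still cannot run although four strands are idle, whereas a call
allowed to take any idle strand could proceed immediately.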

I guess what I need is the change that was applied to the
omniORB3pre stream, rather than the 2.8 stream.  How can I go 
about identifying the changeset applied to the 3pre stream?  
Some CVS help, please...

Regards,
Rak.

PS: Please ignore the blather about Rope locking in my original
email.

"Simha, Rakshit" wrote:
> 
> Hi folks,
> 
> I am seeing some interesting behaviour with omniORB 2.8.0
> in the following situation:
> 
> - Solaris 8; omniORB 2.8.0; client compiled using SunPRO;
>   server compiled using gcc; client and server located on
>   same host; both client and server have set max. TCP
>   connections per server to (say) 5 before initializing
>   the ORB.
> 
> - multi-threaded client makes CORBA call to server; while
>   server is processing the call, the client thread blocks:
>   -----------------  lwp# 34 / thread# 100  --------------------
>    fee1bfb8 recv     (f, 9aba18, 2008, 0)
>    ff1d53fc unsigned tcpSocketStrand::ll_recv(void*,unsigned) (9289a0, 9aba18, 2008, 9aba30, 9aba18, 9aba18) + c4
>    ff1d2048 void reliableStreamStrand::fetch(unsigned long) (9289a0, 0, fffffff8, 9aba1f, ff1d1f78, 9aba18) + 58
>    ff1d1c90 Strand::sbuf reliableStreamStrand::receive(unsigned,unsigned char,int, unsigned char) (9289a0, 2000, 0, 8, 1, 1) + 84
>    ff1bea88 void NetBufferedStream::receive(unsigned,unsigned char) (fe080f08, 8, 1, 8, ff1d1f78, 0) + 11c
>    ff1bf218 void*NetBufferedStream::align_and_get_bytes(omni::alignment_t,unsigned,unsigned char) (8, 1, 8, 1, 0, 9a9a40) + 54
>    ff1be3c8 void NetBufferedStream::get_char_array(unsigned char*,int,omni::alignment_t,unsigned char) (fe080f08, fe080e40, 8, 1, 1, 0) + 134
>    ff1b3bc8 GIOP::ReplyStatusType GIOP_C::ReceiveReply() (fe080f08, fe080f08, 3, 7437d9, 7d6dc8, 3f) + f4
>    ff1d0448 void OmniProxyCallWrapper::invoke(omniObject*,OmniProxyCallDesc&) (83b13c, fe080fd4, 7437d9, 0, ff1d11a8, ff20cb74) + 128
>    004525e0 void Common::_proxy_Transaction::commit() (83b198, 271078, 927118, 87f9b8, fef6a07c, 20) + 60
>    00267398 void ProxyAgentDevice::transactionEnd() (8988b8, fe0812f4, 1, 0, 72e2ac, 61d836e4) + 2d8
>    0026537c void ProxyAgentDevice::transaction() (8988b8, 271078, 1, 0, fe0818cb, 0) + 16c
>    00263d00 void ProxyAgentDevice::configure() (8988b8, fe081874, 1, 925478, fe081943, 0) + 760
>    002958ac void ProxyAgentDevicePoller::doConfig() (827c70, 827dd0, 0, fe0819eb,0, 0) + 324
>    002a0398 void DeferredCall<ProxyAgentDevicePoller>::timerCallback(const Scheduler::Timer*) (827dc0, 92a680, fe081a93, 0, 0, 33d634e1) + 80
>    00335b68 void Scheduler::ScheduledTimer::triggerTimer(Scheduler::Timer*)const (97abc8, 92a680, 0, fffffff8, 0, 81fa39) + 28
>    00334a38 void*Scheduler::run_undetached(void*) (8fe0f0, 0, ff26c000, 0, 8fe0f0, 11c0c) + 630
>    ff35269c omni_thread_wrapper (8fe0f0, fecf5d38, 0, 5, 1, fe401000) + e4
>    ff25b01c _thread_start (8fe0f0, 0, 0, 0, 0, 0) + 40
> 
> - while this thread is still blocked, another client
>   thread makes another CORBA call to the server, to a
>   different remote object. Most of the time, this call
>   goes through.  But once in a while, the call "freezes":
>   --------------------------  thread# 36  --------------------
>    ff2481ac cond_wait (fe181d98, 0, 0, ff26c000, 0, 0) + 11c
>    ff248070 pthread_cond_wait (9289b0, 7fc2e0, 1, 0, 72e2ac, 0) + 8
>    ff352094 void omni_condition::wait() (9289a8, 44d00, 1, 0, 3039, ffff8000) + 18
>    ff1c835c void Strand::Sync::RdLock(unsigned char) (fe180f68, 9289a0, ff1c89cc, fef555ec, 9289a0, ffffffff) + 50
>    ff1c8170 Strand::Sync::Sync #Nvariant 1(Rope*,unsigned char,unsigned char) (fe180f68, 7fc2d8, 1, 1, 4ea70, fe180fbc) + 60
>    ff1be128 NetBufferedStream::NetBufferedStream(Rope*,unsigned char,unsigned char,unsigned) (fe180f68, 7fc2d8, 1, 1, 0, bc) + 28
>    ff1b3768 GIOP_C::GIOP_C #Nvariant 1(Rope*) (fe180f68, 7fc2d8, 6, 1, fef6a07c, bc) + 14
>    ff1d039c void OmniProxyCallWrapper::invoke(omniObject*,OmniProxyCallDesc&) (83b00c, fe181034, 6, 0, ff1d11a8, ff20cb74) + 7c
>    00451ba8 void Common::_proxy_Transaction::begin() (83b068, 271078, 866900, 8fff50, fef6a07c, 20) + 60
>    00266834 void ProxyAgentDevice::transactionStart() (883808, fe1812f4, 1, 0, 72e2ac, 61d836e4) + 1ac
>    0026533c void ProxyAgentDevice::transaction() (883808, 271078, 1, 0, fe1818cb, 0) + 12c
>    00263d00 void ProxyAgentDevice::configure() (883808, fe181874, 1, 967ed0, fe181943, 0) + 760
>    002958ac void ProxyAgentDevicePoller::doConfig() (85f168, 85f2c8, 0, fe1819eb, 0, 0) + 324
>    002a0398 void DeferredCall<ProxyAgentDevicePoller>::timerCallback(const Scheduler::Timer*) (85f2b8, 8773f8, fe181a93, 0, 0, 2d3f9822) + 80
>    00335b68 void Scheduler::ScheduledTimer::triggerTimer(Scheduler::Timer*)const (9603f0, 8773f8, 0, fffffff8, 0, 888191) + 28
>    00334a38 void*Scheduler::run_undetached(void*) (84e0d0, 0, ff26c000, 0, 84e0d0, 11c0c) + 630
>    ff35269c omni_thread_wrapper (84e0d0, fec25d38, 0, 5, 1, fe401000) + e4
>    ff25b01c _thread_start (84e0d0, 0, 0, 0, 0, 0) + 40
> 
> - This sometimes happens when there is one connection
>   already between client and server, and sometimes when
>   there are more (provided the total number of connections
>   remains <= 5, in this instance).  It happens more often
>   on a multi-CPU host.
> 
> - When this happens, the only way to get the second CORBA
>   call to go through is to wait for the first call to
>   complete.  This severely impacts the software's
>   predictability and throughput.
> 
> From what I can see, the creation of a GIOP_C object
> results in an attempt to lock the Rope given to this
> object (by the grandparent class Strand::Sync). This
> Rope is unlocked in GIOP_C dtor.  If there is an
> existing GIOP_C object for thread 100, it will have
> the lock on the Rope. If the same Rope is passed to
> the GIOP_C object in thread 36, then this ctor will
> block trying to acquire the read-lock on the Rope
> (since the object in thread 100 already has the
> write-lock on this Rope).
> 
> But what I don't understand is: why does this not
> happen all the time?  Is a different Rope handed to
> each GIOP_C under normal circumstances?
> 
> I do realize the version of omniORB is very old - but
> there are constraints that prevent an upgrade in the
> near term.  I'll settle for an explanation of how
> this is supposed to work and any pointers on where I
> can start digging.  If this is an issue recognized and
> solved in a subsequent release, that would be great &
> I'll appreciate any info on the fix location in CVS.
> 
> Regards,
> Rak.