[omniORB] Re: [omniORB 2.8.0] Connection handling bug?
Simha, Rakshit
rsimha at metasolv.com
Fri Jun 3 14:53:00 BST 2005
Well, I've done some digging in the archives and found this
posting:
http://www.omniorb-support.com/pipermail/omniorb-list/2000-January/014669.html
Consider this situation:
- max TCP connections per server is 5
- client makes 5 CORBA calls to a server; all 5 are open
- client makes 6th CORBA call to the server
- The fix in that posting randomizes the strand that this 6th
call waits on; say it waits on strand 2.
- Server completes calls on strands 1, 3, 4 and 5.
- The 6th CORBA call is still waiting even though 4 strands
are now available.
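To make the scenario concrete, here is a minimal sketch of the two
waiting policies. This is not omniORB code: StrandPool,
pinnedWaiterCanProceed and anyIdleWaiterCanProceed are hypothetical
names made up for illustration, and the model ignores real locking
and threads. It only shows why a waiter pinned to one strand stays
blocked while other strands are idle.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model (not the omniORB API): each strand is one TCP
// connection that is either busy with a call or idle.
struct StrandPool {
    std::vector<bool> busy;

    // All strands start out busy, as in the scenario above.
    explicit StrandPool(std::size_t n) : busy(n, true) {}

    // The server finished the call on this strand (0-based index).
    void complete(std::size_t strand) { busy[strand] = false; }

    // Number of strands that are currently idle.
    std::size_t idleCount() const {
        std::size_t n = 0;
        for (bool b : busy) {
            if (!b) ++n;
        }
        return n;
    }
};

// Policy A (assumed behaviour of the randomizing fix): the waiter is
// tied to one chosen strand and can proceed only when that particular
// strand becomes idle.
bool pinnedWaiterCanProceed(const StrandPool& pool, std::size_t pinned) {
    return !pool.busy[pinned];
}

// Policy B (what I am hoping for): the waiter can proceed as soon as
// any strand in the pool becomes idle.
bool anyIdleWaiterCanProceed(const StrandPool& pool) {
    return pool.idleCount() > 0;
}
```

With 5 strands, the waiter pinned to strand 2 (index 1), and calls
completed on strands 1, 3, 4 and 5 (indices 0, 2, 3, 4), policy A
still reports "blocked" while policy B lets the 6th call through.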
I guess what I need is the change that was applied to the
omniORB3pre stream, rather than the 2.8 stream. How can I go
about identifying the changeset applied to the 3pre stream?
Some CVS help, please...
Regards,
Rak.
PS: Please ignore the blather about Rope locking in my original
email.
"Simha, Rakshit" wrote:
>
> Hi folks,
>
> I am seeing some interesting behaviour with omniORB 2.8.0
> in the following situation:
>
> - Solaris 8; omniORB 2.8.0; client compiled using SunPRO;
> server compiled using gcc; client and server located on
> same host; both client and server have set max. TCP
> connections per server to (say) 5 before initializing
> the ORB.
>
> - multi-threaded client makes CORBA call to server; while
> server is processing the call, the client thread blocks:
> ----------------- lwp# 34 / thread# 100 --------------------
> fee1bfb8 recv (f, 9aba18, 2008, 0)
> ff1d53fc unsigned tcpSocketStrand::ll_recv(void*,unsigned) (9289a0, 9aba18, 2008, 9aba30, 9aba18, 9aba18) + c4
> ff1d2048 void reliableStreamStrand::fetch(unsigned long) (9289a0, 0, fffffff8, 9aba1f, ff1d1f78, 9aba18) + 58
> ff1d1c90 Strand::sbuf reliableStreamStrand::receive(unsigned,unsigned char,int, unsigned char) (9289a0, 2000, 0, 8, 1, 1) + 84
> ff1bea88 void NetBufferedStream::receive(unsigned,unsigned char) (fe080f08, 8, 1, 8, ff1d1f78, 0) + 11c
> ff1bf218 void*NetBufferedStream::align_and_get_bytes(omni::alignment_t,unsigned,unsigned char) (8, 1, 8, 1, 0, 9a9a40) + 54
> ff1be3c8 void NetBufferedStream::get_char_array(unsigned char*,int,omni::alignment_t,unsigned char) (fe080f08, fe080e40, 8, 1, 1, 0) + 134
> ff1b3bc8 GIOP::ReplyStatusType GIOP_C::ReceiveReply() (fe080f08, fe080f08, 3, 7437d9, 7d6dc8, 3f) + f4
> ff1d0448 void OmniProxyCallWrapper::invoke(omniObject*,OmniProxyCallDesc&) (83b13c, fe080fd4, 7437d9, 0, ff1d11a8, ff20cb74) + 128
> 004525e0 void Common::_proxy_Transaction::commit() (83b198, 271078, 927118, 87f9b8, fef6a07c, 20) + 60
> 00267398 void ProxyAgentDevice::transactionEnd() (8988b8, fe0812f4, 1, 0, 72e2ac, 61d836e4) + 2d8
> 0026537c void ProxyAgentDevice::transaction() (8988b8, 271078, 1, 0, fe0818cb, 0) + 16c
> 00263d00 void ProxyAgentDevice::configure() (8988b8, fe081874, 1, 925478, fe081943, 0) + 760
> 002958ac void ProxyAgentDevicePoller::doConfig() (827c70, 827dd0, 0, fe0819eb,0, 0) + 324
> 002a0398 void DeferredCall<ProxyAgentDevicePoller>::timerCallback(const Scheduler::Timer*) (827dc0, 92a680, fe081a93, 0, 0, 33d634e1) + 80
> 00335b68 void Scheduler::ScheduledTimer::triggerTimer(Scheduler::Timer*)const (97abc8, 92a680, 0, fffffff8, 0, 81fa39) + 28
> 00334a38 void*Scheduler::run_undetached(void*) (8fe0f0, 0, ff26c000, 0, 8fe0f0, 11c0c) + 630
> ff35269c omni_thread_wrapper (8fe0f0, fecf5d38, 0, 5, 1, fe401000) + e4
> ff25b01c _thread_start (8fe0f0, 0, 0, 0, 0, 0) + 40
>
> - while this thread is still blocked, another client
> thread makes another CORBA call to the server, to a
> different remote object. Most of the time, this call
> goes through. But once in a while, the call "freezes":
> -------------------------- thread# 36 --------------------
> ff2481ac cond_wait (fe181d98, 0, 0, ff26c000, 0, 0) + 11c
> ff248070 pthread_cond_wait (9289b0, 7fc2e0, 1, 0, 72e2ac, 0) + 8
> ff352094 void omni_condition::wait() (9289a8, 44d00, 1, 0, 3039, ffff8000) + 18
> ff1c835c void Strand::Sync::RdLock(unsigned char) (fe180f68, 9289a0, ff1c89cc, fef555ec, 9289a0, ffffffff) + 50
> ff1c8170 Strand::Sync::Sync #Nvariant 1(Rope*,unsigned char,unsigned char) (fe180f68, 7fc2d8, 1, 1, 4ea70, fe180fbc) + 60
> ff1be128 NetBufferedStream::NetBufferedStream(Rope*,unsigned char,unsigned char,unsigned) (fe180f68, 7fc2d8, 1, 1, 0, bc) + 28
> ff1b3768 GIOP_C::GIOP_C #Nvariant 1(Rope*) (fe180f68, 7fc2d8, 6, 1, fef6a07c, bc) + 14
> ff1d039c void OmniProxyCallWrapper::invoke(omniObject*,OmniProxyCallDesc&) (83b00c, fe181034, 6, 0, ff1d11a8, ff20cb74) + 7c
> 00451ba8 void Common::_proxy_Transaction::begin() (83b068, 271078, 866900, 8fff50, fef6a07c, 20) + 60
> 00266834 void ProxyAgentDevice::transactionStart() (883808, fe1812f4, 1, 0, 72e2ac, 61d836e4) + 1ac
> 0026533c void ProxyAgentDevice::transaction() (883808, 271078, 1, 0, fe1818cb, 0) + 12c
> 00263d00 void ProxyAgentDevice::configure() (883808, fe181874, 1, 967ed0, fe181943, 0) + 760
> 002958ac void ProxyAgentDevicePoller::doConfig() (85f168, 85f2c8, 0, fe1819eb, 0, 0) + 324
> 002a0398 void DeferredCall<ProxyAgentDevicePoller>::timerCallback(const Scheduler::Timer*) (85f2b8, 8773f8, fe181a93, 0, 0, 2d3f9822) + 80
> 00335b68 void Scheduler::ScheduledTimer::triggerTimer(Scheduler::Timer*)const (9603f0, 8773f8, 0, fffffff8, 0, 888191) + 28
> 00334a38 void*Scheduler::run_undetached(void*) (84e0d0, 0, ff26c000, 0, 84e0d0, 11c0c) + 630
> ff35269c omni_thread_wrapper (84e0d0, fec25d38, 0, 5, 1, fe401000) + e4
> ff25b01c _thread_start (84e0d0, 0, 0, 0, 0, 0) + 40
>
> - This sometimes happens when there is one connection
> already between client and server, and sometimes when
> there are more (provided the total connections remains
> <= 5, in this instance). It happens more often on a
> multi-CPU host.
>
> - When this happens, the only way to get the second CORBA
> call to go through is to wait for the first call to
> complete. This severely impacts the software's
> predictability and throughput.
>
> From what I can see, the creation of a GIOP_C object
> results in an attempt to lock the Rope given to this
> object (by the grandparent class Strand::Sync). This
> Rope is unlocked in the GIOP_C dtor. If there is an
> existing GIOP_C object for thread 100, it will have
> the lock on the Rope. If the same Rope is passed to
> the GIOP_C object in thread 36, then this ctor will
> block trying to acquire the read-lock on the Rope
> (since the object in thread 100 already has the
> write-lock on this Rope).
>
> But what I don't understand is: why does this not
> happen all the time? Is a different Rope handed to
> each GIOP_C under normal circumstances?
>
> I do realize the version of omniORB is very old - but
> there are constraints that prevent an upgrade in the
> near timeframe. I'll settle for an explanation of how
> this is supposed to work and any pointers on where I
> can start digging. If this is an issue recognized and
> solved in a subsequent release, that would be great &
> I'll appreciate any info on the fix location in CVS.
>
> Regards,
> Rak.