[omniORB] Timed out waiting for rendezvousers to terminate
Mike Richmond
mike.richmond at globalgraphics.com
Fri Sep 21 18:18:56 BST 2007
I am using omniORB 4.1.0 and my app takes a long time (about 10
seconds) to quit. I've tracked this down to a timeout while waiting
for the rendezvouser to terminate (giopServer.cc:683). I've tried
extending the wait by changing the value of timeout in gdb to 30
seconds, but the wait still times out. Before and after the wait the
rendezvouser thread is in this state:
Thread 3 (process 993 thread 0x160b):
#0 0x9001a1cc in select ()
#1 0x0207fe42 in omni::do_select (maxfd=17, r=0xb0101cd4, w=0x0,
e=0x0, t=0x0) at SocketCollection.cc:1161
#2 0x02080136 in omni::SocketCollection::Select (this=0x493c258) at
SocketCollection.cc:1239
#3 0x020a2c82 in omni::tcpEndpoint::AcceptAndMonitor
(this=0x493c250, func=0x206a334
<omni::giopRendezvouser::notifyReadable(void*,
omni::giopConnection*)>, cookie=0x493c700) at ./tcp/tcpEndpoint.cc:613
#4 0x0206a425 in omni::giopRendezvouser::execute (this=0x493c700) at
giopRendezvouser.cc:97
#5 0x020bcd95 in omniAsyncWorker::real_run (this=0x493c730) at invoker.cc:234
#6 0x02023811 in omniAsyncWorkerInfo::run (this=0xb0101ef4) at invoker.cc:282
#7 0x020bd033 in omniAsyncWorker::run (this=0x493c730) at invoker.cc:161
#8 0x015d3862 in omni_thread_wrapper (ptr=0x493c730) at posix.cc:451
#9 0x90024227 in _pthread_body ()
Adding a breakpoint shows that omni::SocketCollection::Select() does
not return.
However, if I single-step through the rendezvouser's terminate()
method, and in particular through tcpAddress->Poke(), then
omni::SocketCollection::Select() does return. In tcpAddress->Poke(),
::connect() reports EINPROGRESS and CLOSESOCKET() returns 0.
My theory is that it is possible to close the socket in
tcpAddress->Poke() before it has "done enough to poke the endpoint".
In support of this theory I observe that
omni::SocketCollection::Select() returns if I sleep for a short time
before closing the socket in tcpAddress->Poke(), or if I undefine
USE_NONBLOCKING_CONNECT.
My machine is pretty quick - a 2 x 2.66 GHz Dual-Core Intel Xeon Mac
Pro running Mac OS X 10.4.10. Unfortunately I don't know enough
about sockets to know if this is a problem with tcpAddress->Poke(),
or with the socket implementation on Mac OS X. After some googling I
tried adding a loop that calls getsockopt( sock, SOL_SOCKET, SO_ERROR,
&err, &len ) between ::connect() and CLOSESOCKET(), but getsockopt()
returned 0 with err = 0 on the first call, and it didn't fix my problem.
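
For illustration, here is a minimal sketch of one way to test the
theory above: after a non-blocking ::connect() reports EINPROGRESS,
wait for the socket to become writable before closing it. This is not
omniORB's actual tcpAddress::Poke(); the names make_listener,
poke_endpoint and listener_woken are hypothetical helpers, and the
timeouts are arbitrary. Note that SO_ERROR only reports a pending
error, so reading it straight after ::connect() can yield 0 while the
connect is still in progress; the sketch therefore reads it only once
select() has reported the socket writable.

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cerrno>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

// Hypothetical helper (not omniORB code): listen on an ephemeral loopback
// port and report the bound address through *addr. Returns the fd or -1.
int make_listener(sockaddr_in* addr) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    *addr = sockaddr_in();
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;  // let the kernel choose a free port
    socklen_t len = sizeof(*addr);
    if (bind(fd, reinterpret_cast<sockaddr*>(addr), len) < 0 ||
        listen(fd, 5) < 0 ||
        getsockname(fd, reinterpret_cast<sockaddr*>(addr), &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

// Wait up to wait_ms for fd to become readable (want_write == false) or
// writable (want_write == true). Returns true if select() reported it ready.
static bool wait_ready(int fd, bool want_write, int wait_ms) {
    fd_set fds;
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    timeval tv;
    tv.tv_sec = wait_ms / 1000;
    tv.tv_usec = (wait_ms % 1000) * 1000;
    return select(fd + 1, want_write ? nullptr : &fds,
                  want_write ? &fds : nullptr, nullptr, &tv) == 1;
}

// Hypothetical poke: non-blocking connect, then wait for the handshake to
// finish (socket writable, SO_ERROR == 0) before closing the socket.
// Returns true if the connection was established before the close.
bool poke_endpoint(const sockaddr_in& addr, int wait_ms) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return false;
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fd);
        return false;
    }
    bool connected = false;
    int rc = connect(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr));
    if (rc == 0) {
        connected = true;  // completed immediately
    } else if (errno == EINPROGRESS && wait_ready(fd, true, wait_ms)) {
        int err = 0;
        socklen_t len = sizeof(err);
        // Only meaningful now that the socket is writable.
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) == 0 && err == 0)
            connected = true;
    }
    close(fd);
    return connected;
}

// True if the listening socket becomes readable (a connection is pending)
// within wait_ms, i.e. a thread blocked in select() on it would wake up.
bool listener_woken(int listen_fd, int wait_ms) {
    return wait_ready(listen_fd, false, wait_ms);
}
```

A caller would create the listener, call poke_endpoint(addr, 1000),
and then check listener_woken(listen_fd, 1000). Whether this changes
the behaviour seen on Mac OS X 10.4.10 is untested here.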
Any suggestions?
Mike Richmond
Global Graphics Software Ltd