[omniORB] Idle connections
Sai-Lai Lo
S.Lo@orl.co.uk
01 Sep 1998 16:39:53 +0100
It looks to me as if your compiler is not generating thread-safe exception
handling code. When two server threads throw COMM_FAILURE exceptions at the
same time, the unwinding code gets completely confused.
This is a common problem when compiling omniORB with gcc or egcs. It seems to
work with simple tests but core dumps under concurrent activity. If you
can, I suggest you use Sun's CC on Solaris. Alternatively, you can give the
latest egcs snapshots a try. They are in the pre-1.1 release phase.
By the way, attached is a test program which, if it core dumps, shows that
your compiler is generating non-thread-safe exception handling code.
Sai-Lai
P.S. Jens: Could you try this test program with your gcc-2.8.1 compiler?
--------------------- Cut here ------------------------------------------
// This test case demonstrates the bug in multithreaded exception handling in:
//   egcs-19980803 for Alpha Linux Redhat 5.1
//   egcs-19980803 for x86 Linux Redhat 5.1
//
// Compile:
// g++ -o bug1 -D_REENTRANT bug1.cc -lpthread
//
// On Alpha Linux: core dump
//
// $ ./bug1
// [1025] C
// [1025] B
// [2050] C
// [2050] B
// [1025] A
// [1025] ~A
// [2050] A
// [2050] ~A
// [1025] ~B
// [1025] ~C
// [1025] ~B
// [1025] ~C
// [1025] ~A
// zsh: illegal hardware instruction ./bug1
//
// On x86 Linux:
//
// The dtor of A was called twice. Once before the throw was caught
// and once after.
// % ./bug1
// [1025] C
// [1025] B
// [2050] C
// [2050] B
// [1025] A
// [1025] ~A
// [2050] A
// [2050] ~A
// [1025] ~B
// [1025] ~C
// [1025] ~A
// [2050] ~B
// [2050] ~C
// [2050] ~A
// [2050] C
// [2050] B
// [1025] C
// [1025] B
// [1025] A
// [1025] ~A
// [1025] ~B
// [1025] ~C
// [1025] ~A
// [2050] A
// [2050] ~A
// [2050] ~B
// [2050] ~C
// [2050] ~A
// [2050] C
// [2050] B
// [1025] C
// [1025] B
// [1025] A
// [1025] ~A
// [1025] ~B
// [1025] ~C
// [1025] ~A
// contact now block for a while
// [2050] A
// [2050] ~A
// [2050] ~B
// [2050] ~C
// [2050] ~A
// contact now block for a while
// Main thread about to exit
// %
#include <iostream.h>
#include <unistd.h>
#include <pthread.h>
class A {
public:
  A() {
    cerr << "[" << (long) pthread_self() << "] A" << endl;
  }
  ~A() {
    cerr << "[" << (long) pthread_self() << "] ~A" << endl;
  }
  A(const A& x) {
    cerr << "[" << (long) pthread_self() << "] A(const A)" << endl;
  }
  A& operator=(const A& x) {
    cerr << "[" << (long) pthread_self() << "] A::operator=" << endl;
    return *this;
  }
};
class B {
public:
  B() {
    cerr << "[" << (long) pthread_self() << "] B" << endl;
  }
  ~B() {
    cerr << "[" << (long) pthread_self() << "] ~B" << endl;
  }
};
class C {
public:
  C() {
    cerr << "[" << (long) pthread_self() << "] C" << endl;
  }
  ~C() {
    cerr << "[" << (long) pthread_self() << "] ~C" << endl;
  }
};
void
ff()
{
  B b;
  sleep(1);
  throw A();
}
void f() {
  try {
    C d;
    ff();
  }
  catch (...) {
  }
}
extern "C"
void*
contact(void* ptr)
{
  int loopcount = 3;
  while (loopcount--) {
    try {
      sleep(1);
      f();
    }
    catch (...) {
      cerr << "Caught system exception. Abort" << endl;
      return 0;
    }
  }
  cerr << "contact now block for a while" << endl;
  return 0;
}
int
main (int argc, char **argv) {
  pthread_t worker1;
  pthread_t worker2;
  pthread_attr_t attr;
  pthread_attr_init(&attr);
  // pthread_create returns 0 on success and a positive error number on
  // failure, so test for non-zero rather than for a negative value.
  if (pthread_create(&worker1,&attr,contact,0) != 0) {
    cerr << "Error: cannot create thread" << endl;
    return 1;
  }
  if (pthread_create(&worker2,&attr,contact,0) != 0) {
    cerr << "Error: cannot create thread" << endl;
    return 1;
  }
  pthread_join(worker1,0);
  pthread_join(worker2,0);
  cerr << "Main thread about to exit" << endl;
  return 0;
}
--------------------------------------------------------------------
>>>>> Dominic Chorafakis XE41 ext 9049 writes:
> I create an instance of such an object in one application. I then have
> two client applications which call the Ping method on the server object
> only once, then both client apps just sit in a loop and sleep.
> On the server side, the inScavenger runs after a while, and after it
> shuts down the two idle connections, the application crashes. I have
> tried to track down why and where but I've had no luck. This problem
> only happens if the two clients are started immediately one after the
> other, so that the scavenger closes both idle connections within one
> idle scan loop. Also, this problem does not happen if I only start ONE
> of the clients.
> The problem is occurring with omniORB 2.5.0 on Solaris 2.6 using the
> Cygnus GNU compiler.
> Has anyone else had such problems? Any suggestions?
--
Dr. Sai-Lai Lo | Research Scientist
|
E-mail: S.Lo@orl.co.uk | Olivetti & Oracle Research Lab
| 24a Trumpington Street
Tel: +44 223 343000 | Cambridge CB2 1QA
Fax: +44 223 313542 | ENGLAND