[omniORB] omniORB exception and application crash in giopServer::removeConnectionAndWorker
Markus Czernek
Markus.Czernek at web.de
Mon Jan 22 08:50:00 GMT 2018
Hi,
we are using omniORB 4.2.2 on Windows Server 2012 R2/2016 (x86 builds) for inter-server process communication.
Since we switched from ORB 4.1.6 to ORB 4.2.2 we have seen several process crashes with the same call stack inside the ORB implementation:
This is the exception analysis from WinDBG:
*******************************************************************************
* *
* Exception Analysis *
* *
*******************************************************************************
FAULTING_IP:
omniORB422_vc9_rt!omni::giopServer::removeConnectionAndWorker+70 [orb422src\omniorb\dist\src\lib\omniorb\orbcore\giopserver.cc @ 1053]
01c7ce10 55 push ebp
EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff)
ExceptionAddress: 01c7ce10 (omniORB422_vc9_rt!omni::giopServer::removeConnectionAndWorker+0x00000070)
ExceptionCode: 40000015
ExceptionFlags: 00000000
NumberParameters: 0
DEFAULT_BUCKET_ID: STATUS_FATAL_APP_EXIT
PROCESS_NAME: Statistic_srv.exe
ERROR_CODE: (NTSTATUS) 0x40000015 - {Anwendungsbeendung} %hs
EXCEPTION_CODE: (Win32) 0x40000015 (1073741845) - <Unable to get error code text>
NTGLOBALFLAG: 0
APPLICATION_VERIFIER_FLAGS: 0
FAULTING_THREAD: 0000623c
PRIMARY_PROBLEM_CLASS: STATUS_FATAL_APP_EXIT
BUGCHECK_STR: APPLICATION_FAULT_STATUS_FATAL_APP_EXIT
LAST_CONTROL_TRANSFER: from 01c7cfd8 to 01c7ce10
STACK_TEXT:
03fbfc68 01c7cfd8 04e98790 594eb9c3 ffffffff omniORB422_vc9_rt!omni::giopServer::removeConnectionAndWorker+0x70 [orb422src\omniorb\dist\src\lib\omniorb\orbcore\giopserver.cc @ 1053]
03fbfc94 01c7e6bf 04e98790 367a8801 594eb913 omniORB422_vc9_rt!omni::giopServer::notifyWkDone+0x38 [orb422src\omniorb\dist\src\lib\omniorb\orbcore\giopserver.cc @ 1103]
03fbfcd0 01c2eb60 594eb88b 01cd129c 012a5870 omniORB422_vc9_rt!omni::giopWorker::execute+0xff [orb422src\omniorb\dist\src\lib\omniorb\orbcore\giopworker.cc @ 85]
03fbfd30 01c2f89c 01cd15dc 012a5870 00000000 omniORB422_vc9_rt!omniAsyncWorker::real_run+0x160 [orb422src\omniorb\dist\src\lib\omniorb\orbcore\invoker.cc @ 16707566]
03fbfd40 01c2e866 012a5870 594eb8eb 00000000 omniORB422_vc9_rt!omniAsyncPoolServer::workerRun+0x3c [orb422src\omniorb\dist\src\lib\omniorb\orbcore\invoker.cc @ 329]
03fbfd9c 01c4052b 03fbfe20 01c2f67e 03fbfdb8 omniORB422_vc9_rt!omniAsyncWorker::mid_run+0x1c6 [orb422src\omniorb\dist\src\lib\omniorb\orbcore\invoker.cc @ 514]
03fbfda4 01c2f67e 03fbfdb8 594eb80f 012a5870 omniORB422_vc9_rt!abortOnNativeExceptionInterceptor+0xb [orb422src\omniorb\dist\src\lib\omniorb\orbcore\omniinternal.cc @ 1455]
03fbfddc 10002f5f 00000000 00000000 750f3433 omniORB422_vc9_rt!omniAsyncWorker::run+0xbe [orb422src\omniorb\dist\src\lib\omniorb\orbcore\invoker.cc @ 126]
03fbfde8 750f3433 012a5870 2d8e3917 00000000 omnithread40_vc9_rt!omni_thread_wrapper+0x6f [orb422src\omniorb\dist\src\lib\omnithread\nt.cc @ 500]
03fbfe20 750f34c7 00000000 03fbfe38 770a919f msvcr90!_callthreadstartex+0x1b [f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c @ 348]
03fbfe2c 770a919f 012ad648 03fbfe7c 77aea8cb msvcr90!_threadstartex+0x69 [f:\dd\vctools\crt_bld\self_x86\crt\src\threadex.c @ 326]
03fbfe38 77aea8cb 012ad648 2f3143c2 00000000 kernel32!BaseThreadInitThunk+0xe
03fbfe7c 77aea8a1 ffffffff 77adf668 00000000 ntdll!__RtlUserThreadStart+0x20
03fbfe8c 00000000 750f345e 012ad648 00000000 ntdll!_RtlUserThreadStart+0x1b
FOLLOWUP_IP:
omniORB422_vc9_rt!omni::giopServer::removeConnectionAndWorker+70 [orb422src\omniorb\dist\src\lib\omniorb\orbcore\giopserver.cc @ 1053]
01c7ce10 55 push ebp
FAULTING_SOURCE_CODE:
1049:
1050: // Once we reach here, it is certain that the rendezvouser thread
1051: // would not take any interest in this connection anymore. It
1052: // is therefore safe to delete this record.
> 1053: pd_lock.lock();
1054:
1055: int workers;
1056: CORBA::Boolean singleshot = w->singleshot();
1057:
1058: if (singleshot)
SYMBOL_STACK_INDEX: 0
SYMBOL_NAME: omniORB422_vc9_rt!omni::giopServer::removeConnectionAndWorker+70
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: omniORB422_vc9_rt
IMAGE_NAME: omniORB422_vc9_rt.dll
DEBUG_FLR_IMAGE_TIMESTAMP: 5a53cf68
STACK_COMMAND: ~20s; .ecxr ; kb
FAILURE_BUCKET_ID: STATUS_FATAL_APP_EXIT_40000015_omniORB422_vc9_rt.dll!omni::giopServer::removeConnectionAndWorker
BUCKET_ID: APPLICATION_FAULT_STATUS_FATAL_APP_EXIT_omniORB422_vc9_rt!omni::giopServer::removeConnectionAndWorker+70
WATSON_STAGEONE_URL: http://watson.microsoft.com/StageOne/myprocess/5a540623/omniORB422_vc9_rt_dll/4_2_2_241/5a53cf68/40000015/0007ce10.htm?Retriage=1
Followup: MachineOwner
---------
From the source (src\lib\omniorb\orbcore\giopserver.cc, line 1048),
conn->clearSelectable();
seems to be the line where the crash actually occurs.
Could it be that the giopConnection* conn object is already being destroyed, so that calling
conn->clearSelectable();
results in a pure virtual function call?
giopServer::removeConnectionAndWorker(giopWorker* w)
{
  ASSERT_OMNI_TRACEDMUTEX_HELD(pd_lock, 0);

  connectionState* cs;
  CORBA::Boolean   cs_removed = 0;

  {
    omni_tracedmutex_lock sync(pd_lock);

    giopConnection* conn = w->strand()->connection;
    conn->pd_dying = 1; // From now on, the giopServer will not create
                        // any more workers to serve this connection.
    cs = csLocate(conn);

    // We remove the lock on pd_lock before calling the connection's
    // clearSelectable(). This is necessary so that a simultaneous
    // callback from the Rendezvouser thread will have a chance to
    // look at the connectionState table.

    pd_lock.unlock();

    conn->clearSelectable();

    // Once we reach here, it is certain that the rendezvouser thread
    // would not take any interest in this connection anymore. It
    // is therefore safe to delete this record.

    pd_lock.lock();
    .....
The crashes appear randomly under traffic.
Has this behavior been seen anywhere else?
Any suggestions on how to solve the issue?
Thanks
Markus