[omniORB] omni::giopRope::match() crash
Michael Teske
subscribe at teskor.de
Wed Apr 8 07:13:08 BST 2015
Hi,
I got another crash:
(gdb) where
#0 0xf7edfd47 in omni::giopRope::match (this=0x86a7738, addrlist=..., info=0xf0d2b6e0) at giopRope.cc:647
#1 0xf7ee0ef4 in omni::giopRope::selectRope (addrlist=..., info=0xf0d2b6e0, rope=@0xf35f2c4c, is_local=@0xf35f2c47) at giopRope.cc:588
#2 0xf7ea1e4a in omni::createIdentity (ior=0xf0d0dfe8, target=0xf7f33465 "IDL:omg.org/CORBA/Object:1.0", locked=false) at omniInternal.cc:734
#3 0xf7ea3b09 in omni::createObjRef (targetRepoId=0xf7f33465 "IDL:omg.org/CORBA/Object:1.0", ior=0xf0d0dfe8, locked=false, id=0x0) at omniInternal.cc:810
#4 0xf7ed2933 in omni::corbalocURIHandler::locToObject (c=@0xf35f301c, cycles=0, def_key=0x0) at uri.cc:889
#5 0xf7ed3c86 in omni::corbalocURIHandler::toObject (this=0xf7f8bd34, uri=0xf4e0460c "corbaloc::localhost:5571/ExchangeAgent", cycles=0) at uri.cc:488
#6 0xf7ed12f4 in omni::omniURI::stringToObject (uri=0xf4e0460c "corbaloc::localhost:5571/ExchangeAgent", cycles=0) at uri.cc:277
...
This time I can see i and j in omni::giopRope::match():
(gdb) p *i
$8 = (omni::tcpAddress *) 0xf0d3ab30
(gdb) p *(*i)
$9 = (omni::tcpAddress) {
<omni::giopAddress> = {
_vptr.giopAddress = 0xf7f87788
},
members of omni::tcpAddress:
pd_address = {
host = {
_ptr = 0xf0d0b808 "localhost"
},
port = 5571
},
pd_address_string = {
_data = 0xf0d22f18 "giop:tcp:localhost:5571"
}
}
(gdb) p *(*j)
Cannot access memory at address 0x1
(gdb) p (*j)
$10 = (omni::giopAddress * const) 0x1
(gdb) p j
$11 = (omni::giopAddress * const *) 0x8758a20
(gdb) p pd_addresses
$12 = {
start = 0x8793f78,
finish = 0x8793f80,
end_of_storage = 0x8793f80
}
This can only mean that some other thread changed pd_addresses in between. I notice there's another thread in this giopRope (0x86a7738):
#0 0xffffe410 in __kernel_vsyscall ()
#1 0x00375839 in __lll_lock_wait () from /lib/libpthread.so.0
#2 0x00370e9f in _L_lock_885 () from /lib/libpthread.so.0
#3 0x00370d66 in pthread_mutex_lock () from /lib/libpthread.so.0
#4 0xf7edf2ac in lock (this=0x86a7738, ior=0x8750370, key=0x87be56c "ExchangeAgentingl\345{\b\r", keysize=13, calldesc=0xf5bdafb0) at ../../../../include/omnithread.h:257
#5 ~omni_tracedmutex_unlock (this=0x86a7738, ior=0x8750370, key=0x87be56c "ExchangeAgentingl\345{\b\r", keysize=13, calldesc=0xf5bdafb0) at ../../../../include/omniORB4/tracedthread.h:172
#6 omni::giopRope::acquireClient (this=0x86a7738, ior=0x8750370, key=0x87be56c "ExchangeAgentingl\345{\b\r", keysize=13, calldesc=0xf5bdafb0) at giopRope.cc:164
#7 0xf7ed77c8 in omni::IOP_C_Holder::IOP_C_Holder (this=0xf5bdaef8, ior=0x8750370, key=0x87be56c "ExchangeAgentingl\345{\b\r", keysize=13, rope=0x86a7738, calldesc=0xf5bdafb0) at omniTransport.cc:74
#8 0xf7eca85c in omniRemoteIdentity::dispatch (this=0x87be568, call_desc=...) at remoteIdentity.cc:93
#9 0xf7eab7f0 in omniObjRef::_invoke (this=0x86b0414, call_desc=..., do_assert=false) at omniObjRef.cc:674
#10 0xf7eac6a9 in omniObjRef::_remote_is_a (this=0x86b0414, a_repoId=0x81ce888 "IDL:exchangeAgent/eaServer:1.0") at omniObjRef.cc:338
#11 0xf7eac79d in omniObjRef::_real_is_a (this=0x86b0414, repoId=0x81ce888 "IDL:exchangeAgent/eaServer:1.0") at omniObjRef.cc:109
#12 0xf7eac954 in omniObjRef::_realNarrow (this=0x86b0414, repoId=0x81ce888 "IDL:exchangeAgent/eaServer:1.0") at omniObjRef.cc:172
But it's correctly waiting for the lock, which we hold:
(gdb) p omniTransportLock->posix_mutex.__data.__owner
$19 = 15174
(15174 is the right thread id), so something else must have happened.
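One thing the gdb values do show: j (0x8758a20) lies outside the current pd_addresses storage (0x8793f78 to 0x8793f80), which is what a stale iterator looks like after the list's storage has been reallocated. Below is a minimal sketch of that range check. It uses std::vector as a stand-in for giopAddressList (an assumption; omniORB has its own vector type), and in_storage(), storage_start(), storage_finish() and stale_after_growth() are hypothetical names, not omniORB functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an omniORB API): true if the address p lies inside
// the half-open storage range [start, finish).
bool in_storage(std::uintptr_t start, std::uintptr_t finish, std::uintptr_t p) {
    return p >= start && p < finish;
}

// Address of the first element slot of v, as an integer.
template <typename T>
std::uintptr_t storage_start(const std::vector<T>& v) {
    return reinterpret_cast<std::uintptr_t>(v.data());
}

// Address one past the last element of v, as an integer.
template <typename T>
std::uintptr_t storage_finish(const std::vector<T>& v) {
    return reinterpret_cast<std::uintptr_t>(v.data() + v.size());
}

// Reproduces the pattern single-threaded: remember an element address, let the
// vector outgrow its capacity (forcing a reallocation), then test the saved
// address against the new storage. Returns true if the saved address is stale.
bool stale_after_growth() {
    std::vector<int> addrs(1, 5571);
    const std::uintptr_t j = storage_start(addrs);  // saved before the growth
    assert(in_storage(storage_start(addrs), storage_finish(addrs), j));

    const std::size_t old_capacity = addrs.capacity();
    while (addrs.size() <= old_capacity) {
        addrs.push_back(5572);  // the final push_back has to reallocate
    }
    return !in_storage(storage_start(addrs), storage_finish(addrs), j);
}
```

With the values from the session above, in_storage(0x8793f78, 0x8793f80, 0x8758a20) is false. The sketch only illustrates the symptom; it does not show which code path in omniORB changed the list.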
When running with tracing enabled, I often see this (it tries to connect to some non-existent objects):
omniORB: (19) 2015-04-08 08:09:44.998692: Creating ref to remote: key<ExchangeAgent>
target id : IDL:omg.org/CORBA/Object:1.0
most derived id:
omniORB: (19) 2015-04-08 08:09:44.998719: Resolve name 'sbz1'...
omniORB: (21) 2015-04-08 08:09:44.998731: Name 'sbz1' resolved to 192.168.68.194
omniORB: (21) 2015-04-08 08:09:44.998769: Client attempt to connect to giop:tcp:192.168.68.194:5571
omniORB: (21) 2015-04-08 08:09:44.998922: Failed to connect (no peer name): 192.168.68.194:5571
omniORB: (21) 2015-04-08 08:09:44.998958: Switch rope to use address giop:tcp:192.168.68.194:5571
omniORB: (21) 2015-04-08 08:09:44.998971: Unable to open new connection: giop:tcp:192.168.68.194:5571
omniORB: (21) 2015-04-08 08:09:44.998982: throw giopStream::CommFailure from giopStream.cc:1235(0,NO,TRANSIENT_ConnectFailed)
omniORB: (21) 2015-04-08 08:09:44.999036: throw TRANSIENT from omniObjRef.cc:723 (NO,TRANSIENT_ConnectFailed)
omniORB: (21) 2015-04-08 08:09:44.999094: omniRemoteIdentity deleted.
omniORB: (21) 2015-04-08 08:09:44.999109: ObjRef() -- deleted.
omniORB: (22) 2015-04-08 08:09:44.999648: Name 'sbz1' resolved to 192.168.68.194
omniORB: (22) 2015-04-08 08:09:44.999680: Client attempt to connect to giop:tcp:192.168.68.194:46136
omniORB: (19) 2015-04-08 08:09:44.999765: Name 'sbz1' resolved to 192.168.68.194
omniORB: (19) 2015-04-08 08:09:44.999799: Client attempt to connect to giop:tcp:192.168.68.194:5572
omniORB: (22) 2015-04-08 08:09:44.999832: Failed to connect (no peer name): 192.168.68.194:46136
omniORB: (22) 2015-04-08 08:09:44.999853: Switch rope to use address giop:tcp:192.168.68.194:46136
omniORB: (22) 2015-04-08 08:09:44.999866: Unable to open new connection: giop:tcp:192.168.68.194:46136
omniORB: (22) 2015-04-08 08:09:44.999885: throw giopStream::CommFailure from giopStream.cc:1235(0,NO,TRANSIENT_ConnectFailed)
omniORB: (22) 2015-04-08 08:09:44.999938: throw TRANSIENT from omniObjRef.cc:723 (NO,TRANSIENT_ConnectFailed)
omniORB: (22) 2015-04-08 08:09:44.999996: omniRemoteIdentity deleted.
omniORB: (22) 2015-04-08 08:09:45.000011: ObjRef() -- deleted.
omniORB: (19) 2015-04-08 08:09:45.000062: Failed to connect (no peer name): 192.168.68.194:5572
omniORB: (19) 2015-04-08 08:09:45.000083: Switch rope to use address giop:tcp:192.168.68.194:5572
omniORB: (19) 2015-04-08 08:09:45.000095: Unable to open new connection: giop:tcp:192.168.68.194:5572
omniORB: (19) 2015-04-08 08:09:45.000106: throw giopStream::CommFailure from giopStream.cc:1235(0,NO,TRANSIENT_ConnectFailed)
omniORB: (19) 2015-04-08 08:09:45.000154: throw TRANSIENT from omniObjRef.cc:723 (NO,TRANSIENT_ConnectFailed)
omniORB: (19) 2015-04-08 08:09:45.000211: omniRemoteIdentity deleted.
omniORB: (19) 2015-04-08 08:09:45.000226: ObjRef() -- deleted.
omniORB: (38) 2015-04-08 08:09:45.844566: inputMessage: from giop:tcp:[::ffff:192.168.68.54]:57101 64 bytes
omniORB: (38) 2015-04-08 08:09:45.844637: sendChunk: to giop:tcp:[::ffff:192.168.68.54]:57101 25 bytes
omniORB: (1) 2015-04-08 08:09:45.894643: SocketCollection idle. Sleeping.
Can I do anything else to debug this?
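For comparison, here is a rough sketch of what I would expect to be safe if the address list really is changed while another thread scans it: every append and every scan takes the same lock. This is not omniORB code. GuardedAddressList and stress() are hypothetical names, and std::mutex stands in for whatever lock omniORB would actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a rope's address list; not omniORB code.
class GuardedAddressList {
public:
    // Appends an address; may reallocate the underlying storage.
    void add(const std::string& addr) {
        std::lock_guard<std::mutex> guard(lock_);
        addrs_.push_back(addr);
    }

    // Scans the list under the same lock, so no append can reallocate the
    // storage while the loop is running.
    bool contains(const std::string& addr) const {
        std::lock_guard<std::mutex> guard(lock_);
        for (const std::string& a : addrs_) {
            if (a == addr) return true;
        }
        return false;
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(lock_);
        return addrs_.size();
    }

private:
    mutable std::mutex lock_;
    std::vector<std::string> addrs_;
};

// Runs `writers` threads that each append `per_thread` addresses while one
// reader thread keeps scanning. Returns the final list size.
std::size_t stress(std::size_t writers, std::size_t per_thread) {
    GuardedAddressList list;
    list.add("giop:tcp:localhost:5571");

    std::vector<std::thread> threads;
    for (std::size_t w = 0; w < writers; ++w) {
        threads.emplace_back([&list, w, per_thread] {
            for (std::size_t n = 0; n < per_thread; ++n) {
                list.add("giop:tcp:host" + std::to_string(w) + ":" +
                         std::to_string(n));
            }
        });
    }
    threads.emplace_back([&list, per_thread] {
        for (std::size_t n = 0; n < per_thread; ++n) {
            const bool found = list.contains("giop:tcp:localhost:5571");
            assert(found);
            (void)found;
        }
    });
    for (std::thread& t : threads) t.join();
    return list.size();
}
```

Calling stress(4, 200) should leave 1 + 4 * 200 entries in the list. Removing the lock from contains() gives the kind of scan that can end up dereferencing a pointer into freed storage.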
Greetings,
Michael
On 07.04.15 at 18:09, Michael Teske wrote:
> Hi,
>
> Did you ever resolve this problem? I just had the same crash with omniORB 4.2:
>
> (gdb) where
> #0 0x00000005 in ?? ()
> #1 0xf7f22d4f in omni::giopRope::match (this=0xf214b728, addrlist=..., info=0x8af0088) at giopRope.cc:647
> #2 0xf7f23ef4 in omni::giopRope::selectRope (addrlist=..., info=0x8af0088, rope=@0xf4883c4c, is_local=@0xf4883c47) at giopRope.cc:588
> #3 0xf7ee4e4a in omni::createIdentity (ior=0xe7dba388, target=0xf7f76465 "IDL:omg.org/CORBA/Object:1.0", locked=false) at omniInternal.cc:734
> #4 0xf7ee6b09 in omni::createObjRef (targetRepoId=0xf7f76465 "IDL:omg.org/CORBA/Object:1.0", ior=0xe7dba388, locked=false, id=0x0) at omniInternal.cc:810
> #5 0xf7f15933 in omni::corbalocURIHandler::locToObject (c=@0xf488401c, cycles=0, def_key=0x0) at uri.cc:889
> #6 0xf7f16c86 in omni::corbalocURIHandler::toObject (this=0xf7fced34, uri=0x8a525ac "corbaloc::sbz1:5572/ExchangeAgent", cycles=0) at uri.cc:488
> #7 0xf7f142f4 in omni::omniURI::stringToObject (uri=0x8a525ac "corbaloc::sbz1:5572/ExchangeAgent", cycles=0) at uri.cc:277
> #8 0xf7eb5be1 in omniOrbORB::string_to_object (this=0x8a38238, uri=0x8a525ac "corbaloc::sbz1:5572/ExchangeAgent") at corbaOrb.cc:426
> #9 0x0807b4da in UsedEA::run (this=0x8a525d8) at /home/build/Builds/Trader-build857/Trader/src/LoginServer/EAInfo.cpp:500
> #10 0x0808215e in tools::P0Hook<UsedEA>::run (this=0x8a526e8) at /home/build/Builds/Trader-build857/Trader/include/tools/ThreadHook.hpp:41
> #11 0x081c54d8 in omniJTCThread::entrance_hook (this=0x8a526e8) at /home/build/Builds/Trader-build857/Trader/src/libomniJTC/Thread.cpp:772
> #12 0x081c5612 in lsf_thread_adapter (arg=0x8a526e8) at /home/build/Builds/Trader-build857/Trader/src/libomniJTC/Thread.cpp:85
> #13 0xf7fd3edc in omni_thread_wrapper (ptr=0x8a52990) at posix.cc:447
> #14 0x0036e912 in start_thread () from /lib/libpthread.so.0
> #15 0x002ad7ce in clone () from /lib/libc.so.6
>
> (gdb) p i
> $1 = <value optimized out>
>
> (unfortunately we used -O2). But:
>
> (gdb) p addrlist
> $2 = (const omni::giopAddressList &) @0x8af008c: {
> start = 0x8d922d0,
> finish = 0x8d922d4,
> end_of_storage = 0x8d922d4
> }
>
> (gdb) p *(*addrlist->start)
> $6 = (omni::tcpAddress) {
> <omni::giopAddress> = {
> _vptr.giopAddress = 0xf7fca788
> },
> members of omni::tcpAddress:
> pd_address = {
> host = {
> _ptr = 0x8d3feb8 "sbz1"
> },
> port = 5572
> },
> pd_address_string = {
> _data = 0x8ca90f0 "giop:tcp:sbz1:5572"
> }
> }
>
> looks ok to me.
>
> Same here:
>
> (gdb) p pd_addresses
> $11 = {
> start = 0xf21a0b80,
> finish = 0xf21a0b88,
> end_of_storage = 0xf21a0b88
> }
>
> (gdb) p *(*pd_addresses.start)
> $8 = (omni::tcpAddress) {
> <omni::giopAddress> = {
> _vptr.giopAddress = 0xf7fca788
> },
> members of omni::tcpAddress:
> pd_address = {
> host = {
> _ptr = 0xf21fa5c0 "sbz1"
> },
> port = 5571
> },
> pd_address_string = {
> _data = 0xe7dba018 "giop:tcp:sbz1:5571"
> }
> }
>
> (gdb) p *(*(pd_addresses.start+1))
> $10 = (omni::tcpAddress) {
> <omni::giopAddress> = {
> _vptr.giopAddress = 0xf7fca788
> },
> members of omni::tcpAddress:
> pd_address = {
> host = {
> _ptr = 0xf218a848 "192.168.68.194"
> },
> port = 5571
> },
> pd_address_string = {
> _data = 0xe7dba400 "giop:tcp:192.168.68.194:5571"
> }
> }
>
>
> I can't see anything obviously wrong here. Maybe other threads?
>
> At the moment it seems reproducible, maybe I can test some more?
> If I add
> -ORBtraceLevel 25 -ORBtraceThreadId 1 -ORBtraceTime 1
> to the start parameters, it doesn't crash. Missing mutex somewhere?
>
> Thanks for any help,
> Michael
>
>
>
> On 16.09.14 at 19:42, Duncan Grisby wrote:
>> On Fri, 2014-09-12 at 19:20 -0500, Michael Lim wrote:
>>> Another update on the omni::giopRope::match() crash...
>>
>> [...]
>>
>>> pd_addresses
>>> start 0x10062170 (contains 0x20202020)
>>> finish 0x10062180 (contains 0xE9000003)
>>>
>>> 00000000 20 20 20 20 54 20 20 20 10 93 86 c0 10 06 21 88 |    T   ......!.|
>>> 00000010 e9 00 00 03 00 00 00 19 0f 6c 30 e0 10 05 cb 38 |.........l0....8|
>>>
>>> So that lead to the crash. The question right now is how/when/where
>>> pd_addresses is getting initialized?
>>>
>> Are you able to reproduce the problem with a debug build of omniORB?
>> Edit src/lib/omniORB/orbcore/dir.mk and uncomment the line that sets
>>
>> CXXDEBUGFLAGS = -g
>>
>> That will make it easier to see what is going on. It might also reveal
>> something useful to run with lots of omniORB tracing:
>>
>> -ORBtraceLevel 25 -ORBtraceThreadId 1 -ORBtraceTime 1
>>
>>
>> Duncan.
>>
>
>
> _______________________________________________
> omniORB-list mailing list
> omniORB-list at omniorb-support.com
> http://www.omniorb-support.com/mailman/listinfo/omniorb-list
>