[omniORB] omni::giopRope::match() crash

Michael Teske subscribe at teskor.de
Wed Apr 8 07:13:08 BST 2015


Hi,

I got another crash:
(gdb) where
#0  0xf7edfd47 in omni::giopRope::match (this=0x86a7738, addrlist=..., info=0xf0d2b6e0) at giopRope.cc:647
#1  0xf7ee0ef4 in omni::giopRope::selectRope (addrlist=..., info=0xf0d2b6e0, rope=@0xf35f2c4c, is_local=@0xf35f2c47) at giopRope.cc:588
#2  0xf7ea1e4a in omni::createIdentity (ior=0xf0d0dfe8, target=0xf7f33465 "IDL:omg.org/CORBA/Object:1.0", locked=false) at omniInternal.cc:734
#3  0xf7ea3b09 in omni::createObjRef (targetRepoId=0xf7f33465 "IDL:omg.org/CORBA/Object:1.0", ior=0xf0d0dfe8, locked=false, id=0x0) at omniInternal.cc:810
#4  0xf7ed2933 in omni::corbalocURIHandler::locToObject (c=@0xf35f301c, cycles=0, def_key=0x0) at uri.cc:889
#5  0xf7ed3c86 in omni::corbalocURIHandler::toObject (this=0xf7f8bd34, uri=0xf4e0460c "corbaloc::localhost:5571/ExchangeAgent", cycles=0) at uri.cc:488
#6  0xf7ed12f4 in omni::omniURI::stringToObject (uri=0xf4e0460c "corbaloc::localhost:5571/ExchangeAgent", cycles=0) at uri.cc:277
...

This time I can see i and j in omni::giopRope::match():

(gdb) p *i
$8 = (omni::tcpAddress *) 0xf0d3ab30
(gdb) p *(*i)
$9 = (omni::tcpAddress) {
  <omni::giopAddress> = {
    _vptr.giopAddress = 0xf7f87788
  },
  members of omni::tcpAddress:
  pd_address = {
    host = {
      _ptr = 0xf0d0b808 "localhost"
    },
    port = 5571
  },
  pd_address_string = {
    _data = 0xf0d22f18 "giop:tcp:localhost:5571"
  }
}
(gdb) p *(*j)
Cannot access memory at address 0x1
(gdb) p (*j)
$10 = (omni::giopAddress * const) 0x1
(gdb) p j
$11 = (omni::giopAddress * const *) 0x8758a20
(gdb) p pd_addresses
$12 = {
  start = 0x8793f78,
  finish = 0x8793f80,
  end_of_storage = 0x8793f80
}

This can only mean that some other thread changed pd_addresses in between. I notice there is another thread in this giopRope (0x86a7738):

#0  0xffffe410 in __kernel_vsyscall ()
#1  0x00375839 in __lll_lock_wait () from /lib/libpthread.so.0
#2  0x00370e9f in _L_lock_885 () from /lib/libpthread.so.0
#3  0x00370d66 in pthread_mutex_lock () from /lib/libpthread.so.0
#4  0xf7edf2ac in lock (this=0x86a7738, ior=0x8750370, key=0x87be56c "ExchangeAgentingl\345{\b\r", keysize=13, calldesc=0xf5bdafb0) at ../../../../include/omnithread.h:257
#5  ~omni_tracedmutex_unlock (this=0x86a7738, ior=0x8750370, key=0x87be56c "ExchangeAgentingl\345{\b\r", keysize=13, calldesc=0xf5bdafb0) at ../../../../include/omniORB4/tracedthread.h:172
#6  omni::giopRope::acquireClient (this=0x86a7738, ior=0x8750370, key=0x87be56c "ExchangeAgentingl\345{\b\r", keysize=13, calldesc=0xf5bdafb0) at giopRope.cc:164
#7  0xf7ed77c8 in omni::IOP_C_Holder::IOP_C_Holder (this=0xf5bdaef8, ior=0x8750370, key=0x87be56c "ExchangeAgentingl\345{\b\r", keysize=13, rope=0x86a7738, calldesc=0xf5bdafb0) at omniTransport.cc:74
#8  0xf7eca85c in omniRemoteIdentity::dispatch (this=0x87be568, call_desc=...) at remoteIdentity.cc:93
#9  0xf7eab7f0 in omniObjRef::_invoke (this=0x86b0414, call_desc=..., do_assert=false) at omniObjRef.cc:674
#10 0xf7eac6a9 in omniObjRef::_remote_is_a (this=0x86b0414, a_repoId=0x81ce888 "IDL:exchangeAgent/eaServer:1.0") at omniObjRef.cc:338
#11 0xf7eac79d in omniObjRef::_real_is_a (this=0x86b0414, repoId=0x81ce888 "IDL:exchangeAgent/eaServer:1.0") at omniObjRef.cc:109
#12 0xf7eac954 in omniObjRef::_realNarrow (this=0x86b0414, repoId=0x81ce888 "IDL:exchangeAgent/eaServer:1.0") at omniObjRef.cc:172


But that thread is correctly waiting for the lock, which the crashing thread holds:
(gdb) p omniTransportLock->posix_mutex.__data.__owner
$19 = 15174
(15174 is the right thread id), so something else must have happened.

When running with tracing, I often see this (it tries to connect to some non-existent objects):
omniORB: (19) 2015-04-08 08:09:44.998692: Creating ref to remote: key<ExchangeAgent>
 target id      : IDL:omg.org/CORBA/Object:1.0
 most derived id:
omniORB: (19) 2015-04-08 08:09:44.998719: Resolve name 'sbz1'...
omniORB: (21) 2015-04-08 08:09:44.998731: Name 'sbz1' resolved to 192.168.68.194
omniORB: (21) 2015-04-08 08:09:44.998769: Client attempt to connect to giop:tcp:192.168.68.194:5571
omniORB: (21) 2015-04-08 08:09:44.998922: Failed to connect (no peer name): 192.168.68.194:5571
omniORB: (21) 2015-04-08 08:09:44.998958: Switch rope to use address giop:tcp:192.168.68.194:5571
omniORB: (21) 2015-04-08 08:09:44.998971: Unable to open new connection: giop:tcp:192.168.68.194:5571
omniORB: (21) 2015-04-08 08:09:44.998982: throw giopStream::CommFailure from giopStream.cc:1235(0,NO,TRANSIENT_ConnectFailed)
omniORB: (21) 2015-04-08 08:09:44.999036: throw TRANSIENT from omniObjRef.cc:723 (NO,TRANSIENT_ConnectFailed)
omniORB: (21) 2015-04-08 08:09:44.999094: omniRemoteIdentity deleted.
omniORB: (21) 2015-04-08 08:09:44.999109: ObjRef() -- deleted.
omniORB: (22) 2015-04-08 08:09:44.999648: Name 'sbz1' resolved to 192.168.68.194
omniORB: (22) 2015-04-08 08:09:44.999680: Client attempt to connect to giop:tcp:192.168.68.194:46136
omniORB: (19) 2015-04-08 08:09:44.999765: Name 'sbz1' resolved to 192.168.68.194
omniORB: (19) 2015-04-08 08:09:44.999799: Client attempt to connect to giop:tcp:192.168.68.194:5572
omniORB: (22) 2015-04-08 08:09:44.999832: Failed to connect (no peer name): 192.168.68.194:46136
omniORB: (22) 2015-04-08 08:09:44.999853: Switch rope to use address giop:tcp:192.168.68.194:46136
omniORB: (22) 2015-04-08 08:09:44.999866: Unable to open new connection: giop:tcp:192.168.68.194:46136
omniORB: (22) 2015-04-08 08:09:44.999885: throw giopStream::CommFailure from giopStream.cc:1235(0,NO,TRANSIENT_ConnectFailed)
omniORB: (22) 2015-04-08 08:09:44.999938: throw TRANSIENT from omniObjRef.cc:723 (NO,TRANSIENT_ConnectFailed)
omniORB: (22) 2015-04-08 08:09:44.999996: omniRemoteIdentity deleted.
omniORB: (22) 2015-04-08 08:09:45.000011: ObjRef() -- deleted.
omniORB: (19) 2015-04-08 08:09:45.000062: Failed to connect (no peer name): 192.168.68.194:5572
omniORB: (19) 2015-04-08 08:09:45.000083: Switch rope to use address giop:tcp:192.168.68.194:5572
omniORB: (19) 2015-04-08 08:09:45.000095: Unable to open new connection: giop:tcp:192.168.68.194:5572
omniORB: (19) 2015-04-08 08:09:45.000106: throw giopStream::CommFailure from giopStream.cc:1235(0,NO,TRANSIENT_ConnectFailed)
omniORB: (19) 2015-04-08 08:09:45.000154: throw TRANSIENT from omniObjRef.cc:723 (NO,TRANSIENT_ConnectFailed)
omniORB: (19) 2015-04-08 08:09:45.000211: omniRemoteIdentity deleted.
omniORB: (19) 2015-04-08 08:09:45.000226: ObjRef() -- deleted.
omniORB: (38) 2015-04-08 08:09:45.844566: inputMessage: from giop:tcp:[::ffff:192.168.68.54]:57101 64 bytes
omniORB: (38) 2015-04-08 08:09:45.844637: sendChunk: to giop:tcp:[::ffff:192.168.68.54]:57101 25 bytes
omniORB: (1) 2015-04-08 08:09:45.894643: SocketCollection idle. Sleeping. 


Can I do anything else to debug this?

Greetings,
  Michael

On 07.04.15 at 18:09, Michael Teske wrote:
> Hi,
> 
> Did you ever resolve this problem? I just ran into the same crash with omniORB 4.2:
> 
> (gdb) where
> #0  0x00000005 in ?? () 
> #1  0xf7f22d4f in omni::giopRope::match (this=0xf214b728, addrlist=..., info=0x8af0088) at giopRope.cc:647
> #2  0xf7f23ef4 in omni::giopRope::selectRope (addrlist=..., info=0x8af0088, rope=@0xf4883c4c, is_local=@0xf4883c47) at giopRope.cc:588
> #3  0xf7ee4e4a in omni::createIdentity (ior=0xe7dba388, target=0xf7f76465 "IDL:omg.org/CORBA/Object:1.0", locked=false) at omniInternal.cc:734
> #4  0xf7ee6b09 in omni::createObjRef (targetRepoId=0xf7f76465 "IDL:omg.org/CORBA/Object:1.0", ior=0xe7dba388, locked=false, id=0x0) at omniInternal.cc:810
> #5  0xf7f15933 in omni::corbalocURIHandler::locToObject (c=@0xf488401c, cycles=0, def_key=0x0) at uri.cc:889
> #6  0xf7f16c86 in omni::corbalocURIHandler::toObject (this=0xf7fced34, uri=0x8a525ac "corbaloc::sbz1:5572/ExchangeAgent", cycles=0) at uri.cc:488
> #7  0xf7f142f4 in omni::omniURI::stringToObject (uri=0x8a525ac "corbaloc::sbz1:5572/ExchangeAgent", cycles=0) at uri.cc:277
> #8  0xf7eb5be1 in omniOrbORB::string_to_object (this=0x8a38238, uri=0x8a525ac "corbaloc::sbz1:5572/ExchangeAgent") at corbaOrb.cc:426
> #9  0x0807b4da in UsedEA::run (this=0x8a525d8) at /home/build/Builds/Trader-build857/Trader/src/LoginServer/EAInfo.cpp:500
> #10 0x0808215e in tools::P0Hook<UsedEA>::run (this=0x8a526e8) at /home/build/Builds/Trader-build857/Trader/include/tools/ThreadHook.hpp:41
> #11 0x081c54d8 in omniJTCThread::entrance_hook (this=0x8a526e8) at /home/build/Builds/Trader-build857/Trader/src/libomniJTC/Thread.cpp:772
> #12 0x081c5612 in lsf_thread_adapter (arg=0x8a526e8) at /home/build/Builds/Trader-build857/Trader/src/libomniJTC/Thread.cpp:85
> #13 0xf7fd3edc in omni_thread_wrapper (ptr=0x8a52990) at posix.cc:447
> #14 0x0036e912 in start_thread () from /lib/libpthread.so.0
> #15 0x002ad7ce in clone () from /lib/libc.so.6
> 
> (gdb) p i
> $1 = <value optimized out>
> 
> (unfortunately we used -O2). But:
> 
> (gdb) p addrlist
> $2 = (const omni::giopAddressList &) @0x8af008c: {
>   start = 0x8d922d0,
>   finish = 0x8d922d4,
>   end_of_storage = 0x8d922d4
> }
> 
> (gdb) p *(*addrlist->start)
> $6 = (omni::tcpAddress) {
>   <omni::giopAddress> = {
>     _vptr.giopAddress = 0xf7fca788
>   },
>   members of omni::tcpAddress:
>   pd_address = {
>     host = {
>       _ptr = 0x8d3feb8 "sbz1"
>     },
>     port = 5572
>   },
>   pd_address_string = {
>     _data = 0x8ca90f0 "giop:tcp:sbz1:5572"
>   }
> }
> 
> looks ok to me.
> 
> Same here:
> 
> (gdb) p pd_addresses
> $11 = {
>   start = 0xf21a0b80,
>   finish = 0xf21a0b88,
>   end_of_storage = 0xf21a0b88
> }
> 
> (gdb) p *(*pd_addresses.start)
> $8 = (omni::tcpAddress) {
>   <omni::giopAddress> = {
>     _vptr.giopAddress = 0xf7fca788
>   },
>   members of omni::tcpAddress:
>   pd_address = {
>     host = {
>       _ptr = 0xf21fa5c0 "sbz1"
>     },
>     port = 5571
>   },
>   pd_address_string = {
>     _data = 0xe7dba018 "giop:tcp:sbz1:5571"
>   }
> }
> 
> (gdb) p *(*(pd_addresses.start+1))
> $10 = (omni::tcpAddress) {
>   <omni::giopAddress> = {
>     _vptr.giopAddress = 0xf7fca788
>   },
>   members of omni::tcpAddress:
>   pd_address = {
>     host = {
>       _ptr = 0xf218a848 "192.168.68.194"
>     },
>     port = 5571
>   },
>   pd_address_string = {
>     _data = 0xe7dba400 "giop:tcp:192.168.68.194:5571"
>   }
> }
> 
> 
> I can't see anything obviously wrong here. Maybe other threads?
> 
> At the moment it seems reproducible, maybe I can test some more?
> If I add
> -ORBtraceLevel 25 -ORBtraceThreadId 1 -ORBtraceTime 1
> to the start parameters, it doesn't crash. Missing mutex somewhere?
> 
> Thanks for any help,
>   Michael
> 
> 
> 
> On 16.09.14 at 19:42, Duncan Grisby wrote:
>> On Fri, 2014-09-12 at 19:20 -0500, Michael Lim wrote:
>>> More update to the omni:giopRope::match() crash... 
>>
>> [...]
>>
>>> pd_addresses
>>>  start  0x10062170 (contains 0x20202020)
>>>  finish 0x10062180 (contains 0xE9000003)
>>>
>>> 00000000  20 20 20 20 54 20 20 20  10 93 86 c0 10 06 21 88  |    T   ......!.|
>>> 00000010  e9 00 00 03 00 00 00 19  0f 6c 30 e0 10 05 cb 38  |.........l0....8|
>>>
>>> So that led to the crash.  The question right now is how/when/where
>>> pd_addresses is getting initialized? 
>>>
>> Are you able to reproduce the problem with a debug build of omniORB?
>> Edit src/lib/omniORB/orbcore/dir.mk and uncomment the line that sets
>>
>> CXXDEBUGFLAGS = -g
>>
>> That will make it easier to see what is going on. It might also reveal
>> something useful to run with lots of omniORB tracing:
>>
>> -ORBtraceLevel 25 -ORBtraceThreadId 1 -ORBtraceTime 1
>>
>>
>> Duncan.
>>
> 
> 
> _______________________________________________
> omniORB-list mailing list
> omniORB-list at omniorb-support.com
> http://www.omniorb-support.com/mailman/listinfo/omniorb-list
> 



