[omniORB] FW: [EXTERNAL]: Re: Segfault from libomniOrb

Tue Mar 25 11:21:46 UTC 2025

On Tue, 2025-03-25 at 03:12 +0000, Krishnamohan, Srivathsan via
omniORB-list wrote:

> Thanks for the reply. We already have the full stack trace of where
> libomniORB crashes. Every time it crashes in the same place with same
> stack trace. We do not know how to trigger it, so cannot debug using
> gdb. How will recompiling with -g help? It might prevent the crash
> from happening.

If you compile with -g then first you will get much better details in
the stack traces, and second, you can get rid of your signal handler
and contrive to get a core file. You can look at the core file in gdb
and properly see the state.
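
As a rough illustration of the second point, here is a minimal sketch of
what a process could do at startup in place of installing a SIGSEGV
handler. The helper name enable_core_dumps is hypothetical; signal(),
getrlimit() and setrlimit() are the standard POSIX calls. It assumes the
hard core-size limit is non-zero and that the system's core_pattern
writes cores somewhere you can find them.

```cpp
#include <csignal>
#include <sys/resource.h>

// Hypothetical helper: call early in main() instead of installing a
// SIGSEGV handler, so the kernel produces a core file on a crash.
bool enable_core_dumps() {
    // Restore the default action for SIGSEGV (terminate and dump core).
    if (std::signal(SIGSEGV, SIG_DFL) == SIG_ERR) {
        return false;
    }

    // Raise the soft core-size limit as far as the hard limit allows.
    rlimit lim{};
    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        return false;
    }
    lim.rlim_cur = lim.rlim_max;
    return setrlimit(RLIMIT_CORE, &lim) == 0;
}
```

With omniORB and the application both built with -g, the resulting core
can be opened in gdb together with the executable to inspect the state
at the time of the crash.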

> There are multiple c++ applications which dynamically link to
> libomniOrb . When segfault happens in libomniOrb, I can see as many
> as 10 processes terminating. In the /var/log/messages , sometimes, I
> see messages such as:

That strongly suggests that something is broken about your operating
system or the VM. There are no circumstances in which a problem in one
process should be able to cause another process to crash.

From your original traceback:

[... your signal handler...]

#4  <signal handler called>
#5  0x00007fe124f1a907 in omni::giopStream::inputMessage() () from /home/user/libs_gcc92/libomniORB4.so.2
#6  0x00007fe124f2fe53 in omni::giopImpl12::inputNewServerMessage(omni::giopStream*) () from /home/user/libs_gcc92/libomniORB4.so.2

That is what you expect to see when there is a connection that
currently has no activity, or that has just received some data. I
suspect it is an innocent victim of the actual problem at an OS level.

> When multiple applications crash, they all show the same instruction
> pointer. Sometimes, I see this message too:
> 
> [136523.109657] traps: ServiceApp_6.2.[6406] trap stack segment
> ip:7f473e8758c4 sp:7ffcc54424d0 error:0 in
> libomniORB4.so.2[7f473e787000+1b5000]traps:

That suggests that something in the OS or VM is corrupting the code
loaded from the omniORB library. The operating system maps the code in
shared libraries into read-only pages, so code running in the processes
cannot normally damage it that way.
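
To compare crashes across processes, it can help to turn the ip in a
traps: message into an offset within the library, since the load address
differs from process to process. A minimal sketch follows; the function
name offset_in_mapping is hypothetical, and it assumes the bracketed
address in the kernel message is the start of a mapping that begins at
file offset 0, which may not hold for every library layout.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: offset of a faulting instruction pointer inside
// the mapping reported by the kernel, e.g. "ip:... in lib[base+size]".
// Both arguments are hexadecimal strings without a 0x prefix.
std::uint64_t offset_in_mapping(const std::string& ip_hex,
                                const std::string& base_hex) {
    const std::uint64_t ip = std::stoull(ip_hex, nullptr, 16);
    const std::uint64_t base = std::stoull(base_hex, nullptr, 16);
    if (ip < base) {
        throw std::out_of_range("ip lies below the mapping base");
    }
    return ip - base;
}
```

For the message quoted above, offset_in_mapping("7f473e8758c4",
"7f473e787000") gives 0xee8c4, which is inside the 0x1b5000 byte
mapping. The same offset showing up in every crashing process would
point at one specific location in the library's code.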

Could something be changing the .so file on disk?  If the OS memory
maps the file, changing the file contents could change the code that is
being executed.
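
One way to check for that on Linux is to look at the library's entry in
/proc/<pid>/maps for a running process and compare it with the file
currently on disk. The sketch below is only an illustration: the struct
and function names are hypothetical, it compares inode numbers only (not
device numbers), and it will not notice a file that was overwritten in
place, since that keeps the same inode. For that case, comparing the
file's modification time with the process start time is a better check.

```cpp
#include <sstream>
#include <string>
#include <sys/stat.h>

// Hypothetical record for one file-backed line of /proc/<pid>/maps.
struct MappedFile {
    unsigned long long inode = 0;
    std::string path;
    bool deleted = false;  // kernel appends " (deleted)" if the file was unlinked
};

// Parse "range perms offset dev inode [path]" from a maps line.
bool parse_maps_line(const std::string& line, MappedFile& out) {
    std::istringstream in(line);
    std::string range, perms, offset, dev, path;
    unsigned long long inode = 0;
    if (!(in >> range >> perms >> offset >> dev >> inode)) {
        return false;
    }
    std::getline(in >> std::ws, path);  // path may be empty (anonymous mapping)

    static const std::string marker = " (deleted)";
    const bool deleted =
        path.size() >= marker.size() &&
        path.compare(path.size() - marker.size(), marker.size(), marker) == 0;
    if (deleted) {
        path.erase(path.size() - marker.size());
    }
    out.inode = inode;
    out.path = path;
    out.deleted = deleted;
    return true;
}

// True if the path on disk still refers to the inode that is mapped.
bool same_file_on_disk(const MappedFile& m) {
    struct stat st{};
    return !m.deleted && stat(m.path.c_str(), &st) == 0 &&
           static_cast<unsigned long long>(st.st_ino) == m.inode;
}
```

A mapping marked "(deleted)", or one whose inode no longer matches the
path, would mean the library file was replaced while the process was
running.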

Duncan.

-- 
 -- Duncan Grisby
  -- duncan at grisby.org
   -- https://www.grisby.org/