[omniORB] LocateRequest gets dropped on RHEL 6.1 system
Brad Fawcett
bfawcett at us.ibm.com
Fri Sep 23 17:07:41 BST 2011
Hi,
> Somehow, when omniORB tries to receive a message, it gets the correct
> length for the message, but has the old data. Now, omniORB gets the data
> by calling recv() on the socket, giving it a buffer. The buffer is
> reused between messages, so it contains the old data at the time recv()
> is called. recv() returns the correct message length, but either it
> fills the buffer with the old data, or it doesn't fill the buffer at
> all, leaving the old data. Either way, it looks to me like an OS /
> platform bug rather than an omniORB bug.
> It might be illuminating to edit src/lib/omniORB/orbcore/giopStream.cc
> in inputMessage so it uses memset to clear the data just before calling
> Recv. That will show if recv is returning old data or not returning data
> at all.
I inserted a memset to zero right before the tcp::recv call. For the bad
call, the buffer remains all zeros, so it looks like no data is being
transferred.
Yes, it does look like an OS/platform bug in most ways. I am currently
going through the TCP recv code to understand what conditions would cause
it to transfer 0 bytes but report a different length.
Also, IOATDMA (DMA transfer support for TCP) seems to be involved in this
picture. When it is disabled, the data is transferred correctly. When it
is enabled, the data is not transferred.
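To record in each test run whether the driver was loaded, something like
the following could be called at startup. It is a sketch under the
assumption that the driver shows up as a module named "ioatdma" in
/proc/modules; module_listed and ioatdma_loaded are hypothetical helper
names, not part of omniORB or of any library.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Returns true if some line of the stream has `name` as its first
// whitespace-separated field. In /proc/modules the first field of each
// line is the module name, so a module that merely lists `name` among
// its users does not count as a match.
static bool module_listed(std::istream& in, const std::string& name)
{
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string first;
        if ((fields >> first) && first == name) {
            return true;
        }
    }
    return false;
}

// Assumption: the driver appears under the module name "ioatdma".
static bool ioatdma_loaded()
{
    std::ifstream modules("/proc/modules");
    return modules.is_open() && module_listed(modules, "ioatdma");
}
```

Logging the result of ioatdma_loaded() next to each failing or passing
run makes it easier to confirm that the failure really tracks the driver.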
But the curious thing is how consistent it is in relation to the CORBA
calls. It is always on the 23rd CORBA call of this testcase, and it does
not depend on the timing between CORBA requests. And if it is a pure
TCP/OS bug, wouldn't it be appearing more often in other cases?
My current theory is that there is something special in the way CORBA
manages the TCP connections that exposes a bug in the tcp::recv code when
the ioatdma driver is installed.
Can you think of anything on the CORBA side that might interrupt a
tcp::recv before it completes successfully? Does it do many peeks, or
selects, or something like that? Are there any trace points you would
suggest inserting to see what is happening during this time?
Thanks,
Brad.