We're running an application whose client and server processes are co-resident on a virtual server. It has been running fine for years.<div><br></div><div>We recently made a slight change to one of our (many) applications that changed when the client first attempts to contact the server (that is, narrow an object serviced by it). The narrow call was moved earlier in the client's lifetime, but it still happens long after the server has been activated.</div>
<div><br></div><div>Every so often, a narrow on an object throws a COMM_FAILURE_MarshalArguments (1096024067) exception. After reviewing the exception trace (which I've unfortunately deleted and am trying to reproduce), I poked through the omniORB source (4.1.2). The first obvious cause is a timeout, except that all of our timeouts are set to "0" (forever). Looking further, the next likely culprit seems to be send(2) hitting some sort of (possibly transient) error. Since these processes are on the same machine, I can't imagine any congestion in the TCP stack between them. There is no obvious processor or resource overload (per sar), and other clients (of a different application) running simultaneously on the same machine continue to execute perfectly, which would seem to rule out an obvious resource issue.</div>
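<div><br></div><div>For reference, here is a quick sketch of how the decimal minor code above breaks down. It assumes the usual OMG convention that the high 20 bits of a minor code are the vendor ID (VMCID) and the low 12 bits are the vendor-specific code; the helper name split_minor_code is just something I made up for illustration. As far as I can tell, the 0x41540000 prefix is the range omniORB uses for its own minor codes.</div>

```python
# Split a CORBA system-exception minor code into vendor ID and local code.
# Assumption: OMG convention of a 20-bit VMCID in the high bits and a
# 12-bit vendor-specific minor code in the low bits.

def split_minor_code(code):
    """Return (vmcid, local_code) for a 32-bit minor code."""
    vmcid = code & 0xFFFFF000   # high 20 bits: vendor minor codeset ID
    local = code & 0x00000FFF   # low 12 bits: vendor-specific code
    return vmcid, local

# The value reported with the COMM_FAILURE exception in this post.
vmcid, local = split_minor_code(1096024067)
print(hex(1096024067), hex(vmcid), local)
```

<div>So 1096024067 is 0x41540003, that is, local code 3 within the 0x41540000 vendor range.</div>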
<div><br></div><div>Are there other less likely sources for this exception?</div><div><br></div><div>Thanks,</div><div>Jeff</div>