[omniORB] omniorb and omniorbpy speed
Timothy Docker
timd@macquarie.com.au
Wed, 29 Nov 2000 10:07:31 +1100 (EST)
Duncan Grisby writes:
> As Sai-Lai says, you should compile omniORB and omniORBpy with Sun CC
> to make it fair. I'm mainly surprised that omniORB C++ is so much
> slower than ORBacus. Every comparison I've ever seen shows omniORB to
> be faster. I assume that you are using an ORBacus server. Are the
> tests performed on a single machine? If so, ORBacus might be using a
> shared memory or Unix domain socket transport, so that would explain
> the difference.
No, the server is running remotely. The ORBacus client is running in
an unthreaded, blocking mode - this might account for some of the
difference. As you say, I should try with Sun CC 4.2, but I have had
trouble getting C++ programs compiled with CC 4.2 to work as Python
extensions, which is why I used gcc in the first place. If I can find
some time, I will recompile ORBacus with gcc and make the comparison
that way.
> Anyway, I suspect that there are two reasons for omniORBpy being
> slower than omniORB C++. First, Python is just slow. Building Python
> objects takes a lot of time.
Is there significant Python code being executed as part of the
request, or do you mean just that constructing Python objects via the
C API is slow?
The Python objects that end up getting created in this case are pretty
straightforward (although big!).
> Second, and probably more significantly, omniORBpy fully unmarshals
> the contents of an Any when it is received, whereas omniORB C++ (and I
> assume ORBacus) keeps the marshalled form in memory, and unmarshals it
> when you extract the value from the Any. So omniORBpy is doing an
> awful lot more marshalling work than omniORB C++. I suspect that if
> you modify the C++ client to actually extract the Anys' contents, the
> times will be much more similar.
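[As an aside, the eager-versus-lazy difference described above can be
sketched in a few lines of Python. This is purely illustrative, not
omniORB code: LazyAny is a hypothetical holder that keeps the
marshalled bytes and only decodes them when value() is first called,
which is the scheme attributed to omniORB C++/ORBacus; omniORBpy's
eager approach would do the decode up front instead.]

```python
# Sketch of lazy unmarshalling (hypothetical, not omniORB's actual code):
# keep the raw marshalled bytes and decode only on first access.
import struct

class LazyAny:
    def __init__(self, raw):
        self._raw = raw        # marshalled form kept as received
        self._value = None
        self._decoded = False

    def value(self):
        if not self._decoded:  # unmarshal on first extraction only
            self._value = struct.unpack('>d', self._raw)[0]
            self._decoded = True
        return self._value

# An eager Any would instead run the struct.unpack in __init__,
# paying the decode cost for every Any in the reply, extracted or not.
```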
I tried this, and added a simple decoding function that I call on
every any I get back...
| void
| extract( const CORBA::Any &any )
| {
|     CORBA::Long   l;
|     CORBA::Double d;
|     const char   *s;
|
|     // Per the C++ mapping, string extraction yields a const char*
|     // that remains owned by the Any.
|     if( any >>= l )
|         return;
|     if( any >>= d )
|         return;
|     if( any >>= s )
|         return;
|     if( any.type()->equal( CORBA::_tc_null ) )
|         return;
|     cout << "Unconverted type" << endl;
| }
The difference between the Python code and the C++
is still largely unchanged...
| tcc2:any_test $ time ./omnitest
| 6211 rows retrieved
|
| real 0m5.281s
| user 0m2.640s
| sys 0m0.180s
|
| tcc2:any_test $ time ./omnitest_extract_anys
| 6211 rows retrieved
|
| real 0m6.098s
| user 0m3.350s
| sys 0m0.180s
|
| tcc2:any_test $ time ./omnipy_test
| 6103 rows retrieved
| (with 12 cols)
|
| real 0m15.092s
| user 0m12.150s
| sys 0m0.400s
This seems to suggest that all the extra time is spent just creating
the Python representation of the objects :-( The data in question is
really quite simple, just large. By far the bulk of it is a structure
containing 2 sequences of sequences of Anys, where the Anys can only
be strings/longs/doubles/null....
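[For a rough sense of the pure object-construction cost, here is a
stand-alone timing sketch with the same shape as the reply described
above - roughly 6103 rows of 12 wrapped values each. The Any class
here is a hypothetical stand-in for omniORBpy's wrapper, so this only
measures Python allocation overhead, not marshalling.]

```python
# Hypothetical sketch: cost of just building ~6103 x 12 wrapped values,
# ignoring all CORBA marshalling work.
import time

class Any:                     # stand-in for omniORBpy's Any wrapper
    def __init__(self, v):
        self._v = v
    def value(self):
        return self._v

rows, cols = 6103, 12
t0 = time.time()
values = [[Any(float(c)) for c in range(cols)] for _ in range(rows)]
elapsed = time.time() - t0
print('%d rows built in %.3fs' % (len(values), elapsed))
```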
| tcc2:any_test $ ./omnipy_test -i
| 6103 rows retrieved
| (with 12 cols)
| >>> results
| <hoops.oidl.CHoopsRealTime.Results instance at 184edbc>
| >>> len(results.categories)
| 6103
| >>> [ any.value() for any in results.categories[0]]
| [276.64934992283952, 'LND_152742']
| >>> len(results.values)
| 6103
| >>> [ any.value() for any in results.values[0]]
| [2, 76161462.299999997, 70816842.626244411, None, None, None, None, None, None, None]
I'm starting to like list comprehensions in Python 2.0!
If I can find the time, I might take a closer look at where the time
is going, but I'm not sure where to look without a profiling tool.
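[Python does ship a profiler in its standard library, which would show
where the client time goes without any extra tooling. A minimal
sketch, where call_server() is a hypothetical stand-in for the real
omniORBpy request plus result traversal:]

```python
# Minimal profiling sketch using the stdlib profiler.
# call_server() is a placeholder for the real omniORBpy test.
import cProfile
import io
import pstats

def call_server():
    # stand-in workload for the CORBA call and Any traversal
    return sum(i * i for i in range(100000))

pr = cProfile.Profile()
pr.enable()
call_server()
pr.disable()

s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats('cumulative').print_stats(5)
print(s.getvalue())
```

Sorting by cumulative time makes it easy to see whether the hot spot
is in the extension module or in the Python-level stub code.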
Thanks,
Tim