[omniORB] Measuring omniorb's footprint. New footprint benchmark.
Duncan Grisby
dgrisby@uk.research.att.com
Mon, 25 Feb 2002 12:25:30 +0000

On Friday 22 February, Dan Kegel wrote:
> I'm doing both.  The send_deferred() part should be easy compared to
> the other part of the benchmark, which is forking N copies of the
> echo server.
I'm not so sure. The send_deferred() part creates n threads on each
test iteration. There are far more calls to Linux's underlying clone()
system call due to thread creation than due to forking.
I've done some experiments with omniORB 4 that show some interesting
things. The only tweak to omniORB 4 is to turn off the mutex tracing
by commenting out the relevant line in
include/omniORB4/tracedthread.h; the tracing is only turned on while
we're debugging.
These tests are on a Dual Pentium II 450MHz machine, with 512MB RAM,
running RedHat 7.1, Linux 2.4.12. It never gets close to running out
of memory, so these tests are not affected by that.
First, before I changed it, omniORB 4 was creating a new thread for
each deferred call, just like omniORB 3 does. There, I got 145 servers
before invocation time went over 100ms. By selecting the Unix domain
socket transport (by setting the ORBendPoint environment variable to
giop:unix:), that figure went up to 157.
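As a rough illustration of the transport selection described above, a
benchmark driver could set ORBendPoint in the environment of each
server it launches. This is only a sketch: the helper names
unix_transport_env and launch are made up for the example, and the
child process is a stand-in that prints the variable, not a real
omniORB server.

```python
import os
import subprocess
import sys


def unix_transport_env(base=None):
    """Return a copy of the environment with ORBendPoint set to giop:unix:."""
    env = dict(os.environ if base is None else base)
    env["ORBendPoint"] = "giop:unix:"  # value taken from the text above
    return env


def launch(argv):
    """Start a process (hypothetically, an echo server) with that environment."""
    return subprocess.Popen(argv, env=unix_transport_env())


if __name__ == "__main__":
    # Stand-in for a server binary: just show what the child would see.
    child = [sys.executable, "-c", "import os; print(os.environ['ORBendPoint'])"]
    launch(child).wait()
```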
Then I modified the deferred request to use omniORB 4's thread pool,
instead of creating a new thread each time. That greatly reduces the
number of thread creations. It put the figure up to 181 for TCP and
213 for Unix sockets. The thread pool limits the total number of
threads in the pool; the default limit is 100. Setting it to 5
increases the number of servers to 230 with Unix sockets.
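The difference between the two strategies can be sketched in a few
lines. This is not omniORB's implementation, just an analogy using
Python's standard library: call_server is a hypothetical stand-in for
one deferred invocation, and the pool limit of 5 mirrors the setting
mentioned above.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def call_server(i):
    """Hypothetical stand-in for one deferred invocation on server i."""
    return i * 2


def thread_per_call(n):
    """Create one new thread per call: n thread creations per iteration."""
    results = [None] * n

    def worker(i):
        results[i] = call_server(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def pooled(pool, n):
    """Hand the same n calls to an existing pool; its threads are reused."""
    return list(pool.map(call_server, range(n)))


if __name__ == "__main__":
    n_servers = 200
    # The pool is created once and shared by every test iteration.
    with ThreadPoolExecutor(max_workers=5) as pool:
        for _ in range(3):
            assert pooled(pool, n_servers) == thread_per_call(n_servers)
```

With the pool, the number of threads ever created is bounded by the
pool limit, however many iterations run; with a thread per call it
grows by n on every iteration.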
As a final comparison, removing the deferred calls and just doing
normal synchronous calls to each server in turn gives a figure of 364
servers over Unix sockets.
In summary (number of servers reached before invocation time went over
100ms):
 New thread per call, TCP:                    145
 New thread per call, Unix:                   157
 Thread pool, 100 threads, TCP:               181
 Thread pool, 100 threads, Unix:              213
 Thread pool, 5 threads, Unix:                230
 Synchronous calls, Unix:                     364
I'll try everything on a laptop with 64MB RAM next, to see if I can
manage to run out of memory.
One thought about the memory use is that it's possible the fork() from
the client is copying unnecessary stuff that isn't reclaimed during
the exec(). Some platforms have difficulties forking from
multi-threaded programs, or ones with lots of file descriptors open.
It's not especially easy to test if this is happening.
The normal way to avoid potential problems is to create a simple
single-threaded forking proxy. The basic idea is that the main program
creates two pipes and forks before it does anything else. The parent
becomes the CORBA client, and the child listens for commands on its
pipe. When the parent wants to create a new server, it gives a command
to its child which, in turn, forks to create the new server. I've no
idea whether it will make any difference in this case, but it might be
something to try.
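A minimal sketch of such a forking proxy, assuming a Unix platform.
The function names and the one-command-per-line protocol (a
space-separated argv on each line, with the new pid sent back on the
second pipe) are inventions for this example, and reaping of exited
servers is left out to keep it short.

```python
import os
import sys


def start_fork_proxy():
    """Fork the proxy before any threads or extra descriptors exist.

    Returns (proxy_pid, command_writer, reply_reader) in the parent.
    """
    cmd_r, cmd_w = os.pipe()      # parent -> proxy: commands
    reply_r, reply_w = os.pipe()  # proxy -> parent: pids of new servers
    pid = os.fork()
    if pid == 0:
        # Child: becomes the single-threaded proxy.
        os.close(cmd_w)
        os.close(reply_r)
        try:
            _proxy_loop(cmd_r, reply_w)
            os._exit(0)
        finally:
            os._exit(1)  # only reached if the loop raised
    # Parent: goes on to become the (multi-threaded) CORBA client.
    os.close(cmd_r)
    os.close(reply_w)
    return pid, os.fdopen(cmd_w, "w"), os.fdopen(reply_r, "r")


def _proxy_loop(cmd_r, reply_w):
    """Read one command per line and fork/exec a server for each."""
    with os.fdopen(cmd_r, "r") as commands, os.fdopen(reply_w, "w") as replies:
        for line in commands:  # ends when the parent closes its pipe
            argv = line.split()
            if not argv:
                continue
            child = os.fork()
            if child == 0:
                try:
                    os.execvp(argv[0], argv)  # replace the child with the server
                finally:
                    os._exit(127)  # exec failed
            replies.write(f"{child}\n")
            replies.flush()


if __name__ == "__main__":
    proxy_pid, to_proxy, from_proxy = start_fork_proxy()
    # Stand-in for an echo server: a Python process that exits at once.
    to_proxy.write(f"{sys.executable} -c pass\n")
    to_proxy.flush()
    print("server pid:", int(from_proxy.readline()))
    to_proxy.close()  # EOF tells the proxy to exit
    os.waitpid(proxy_pid, 0)
```

Because the proxy is forked before the ORB starts any threads, every
server is forked from a small single-threaded process rather than from
the client itself.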
I'd be interested to see the results for your machines based on the
current CVS version of omniORB 4 (or wait for tomorrow's FTP
snapshot).
Cheers,
Duncan.
-- 
 -- Duncan Grisby  \  Research Engineer  --
  -- AT&T Laboratories Cambridge          --
   -- http://www.uk.research.att.com/~dpg1 --