[omniORB] Canceling a blocking function
Han Kiliccote
kiliccote@cmu.edu
Tue, 25 Apr 2000 14:11:08 -0400
Follow-up:
First of all, thanks for all the replies. We tried some of the suggestions
but decided to use a different approach. I would like to explain how we
decided to attack this problem so that other people may also use it if they
run into a similar problem.
We decided to limit who can send a message to whom. Using the same
interconnection network idea I explained in my previous message, every
client in the system is assigned a set of servers as neighbors.
We decided to create a thread per neighbor and add a message queue to each
thread. When the client needs to send a message to a server, it generates a
route to the server (locally; no network communication) such that the route
passes through one of the neighbors. The client appends the message to that
neighbor's queue and wakes up its thread. The thread sends the message to the
neighbor and blocks until the call returns.
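
To make the thread-per-neighbor idea concrete, here is a minimal sketch in
Python. It is not our actual code: the class name NeighborSender and the
send callable are hypothetical, and send stands in for whatever blocking
remote call (for example an omniORB invocation) delivers the message.

```python
import queue
import threading


class NeighborSender:
    """One worker thread plus one message queue for a single neighbor."""

    def __init__(self, name, send):
        self.name = name
        self._send = send            # hypothetical blocking call to the neighbor
        self.errors = []             # exceptions raised by failed sends
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def post(self, message):
        """Append a message to this neighbor's queue and return at once."""
        self._queue.put(message)     # putting an item wakes the sleeping worker

    def wait_idle(self):
        """Block until every queued message has been handled."""
        self._queue.join()

    def _run(self):
        while True:
            message = self._queue.get()           # sleeps while the queue is empty
            try:
                self._send(self.name, message)    # only this thread blocks here
            except Exception as exc:              # keep the worker alive on failure
                self.errors.append(exc)
            finally:
                self._queue.task_done()
```

The caller only ever touches post(), so if a neighbor is down the thread
that hangs is that neighbor's worker, not the caller.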
The neighbor, which may not be the final destination, generates another
route, which also passes through one of its own neighbors (care must be taken
not to enter a cycle).
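
The hop-by-hop forwarding can be sketched as follows. This is only an
illustration under assumptions: the topology dictionary stands in for the
interconnection network from my previous message (which is not reproduced
here), the function names are hypothetical, and carrying the list of visited
nodes along with the message is just one possible way to avoid cycles.

```python
from collections import deque


def next_hop(topology, current, destination, visited):
    """Pick the neighbor of `current` that starts a shortest route to
    `destination` without touching any node in `visited`.
    Returns None if no such route exists."""
    blocked = set(visited) | {current}
    frontier = deque()
    for neighbor in topology[current]:
        if neighbor not in blocked:
            blocked.add(neighbor)
            frontier.append((neighbor, neighbor))   # (node, first hop used)
    while frontier:
        node, first = frontier.popleft()
        if node == destination:
            return first
        for nxt in topology[node]:
            if nxt not in blocked:
                blocked.add(nxt)
                frontier.append((nxt, first))
    return None


def forward(topology, source, destination):
    """Simulate the message travelling hop by hop; each node computes its
    own next hop locally and the visited list travels with the message."""
    path = [source]
    while path[-1] != destination:
        hop = next_hop(topology, path[-1], destination, path)
        if hop is None:
            raise LookupError("no cycle-free route to destination")
        path.append(hop)
    return path
```

Because every node excludes the nodes the message has already passed
through, the path can never revisit a node, so the message cannot loop.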
When one of the servers is down, at most two threads are affected per
machine: (a) the thread created by omniORB for each request and (b) the
thread for the neighbor. We can assume that the maximum number of threads
affected will be less than 8 in a reasonably large system (10^9 servers).
The downside is increased message traffic, but the approach is now
independent of the implementation of the ORB.
The paper that explains this idea can be accessed from
http://pasis.ices.cmu.edu/network_embedded_databases.htm
A simplified version was submitted to IEEE SRDS 2000.
We welcome suggestions or questions.