Monday, January 21, 2013

Implementing Remote Procedure Calls


This paper deals with the implementation details of RPC on a test infrastructure at Xerox PARC. It incorporates the various design principles defined as a part of RPC's theoretical description by B. J. Nelson and gives relevant explanations of why the parameters are defined in the way they are and how they benefit distributed computing.

Primarily aimed at easing the development of distributed applications, RPC is based on simplicity, efficiency and generality. Simplicity is mainly observed at the programmer's end: executing remote functionality is equivalent to making a local function call. Efficiency follows from the fact that procedure calls are simple enough to allow rapid over-the-wire communication. Generality is exhibited in two aspects: first, RPC is essentially a function call, which is ubiquitous in every computing environment; second, the design is not tightly bound to the procedure-call mechanism and can be implemented over an underlying message-passing system if need be. Another important idea maintained throughout the design of RPC is closeness to local procedure calls. This deliberate insistence sets correct expectations for programmers implementing distributed applications at both the caller and the callee ends.
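To make the "remote call looks like a local call" idea concrete, here is a minimal sketch in Python. It is not the mechanism described in the paper: the names `ClientStub`, `server_dispatch` and `PROCEDURES` are hypothetical, JSON stands in for the real marshalling format, and the "network" is just a direct function call so that the example stays self-contained.

```python
import json

# Hypothetical remote procedure; chosen only for illustration.
def add(a, b):
    return a + b

# Hypothetical table of procedures the server exports.
PROCEDURES = {"add": add}

def server_dispatch(request_bytes):
    """Server stub: unmarshal the request, call the procedure, marshal the result."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")

class ClientStub:
    """Client stub: makes a remote procedure look like a local method."""

    def __init__(self, transport):
        # transport is any callable taking request bytes and returning reply bytes.
        # Here it is an in-process call; a real system would send packets.
        self._transport = transport

    def __getattr__(self, name):
        # Invoked only for attributes not found normally, i.e. remote procedure names.
        def call(*args):
            request = json.dumps({"proc": name, "args": list(args)}).encode("utf-8")
            reply = self._transport(request)
            return json.loads(reply.decode("utf-8"))["result"]
        return call

stub = ClientStub(server_dispatch)
total = stub.add(2, 3)  # reads like a local call; arguments travel as bytes
```

The caller only ever sees `stub.add(2, 3)`; the marshalling and dispatch are hidden in the two stubs, which is the property the paragraph above attributes to RPC.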

For the sake of simplicity of design and implementation, bulk data transfers use nearly twice the number of packets (an acknowledgement for every packet but the last), which seems an overhead in today's scheme of things. As the authors pointed out, typically only 10% of the network bandwidth was in use at that time, which easily accommodated the additional traffic. Although bandwidth today is many times what it was then, the sheer volume of data exchanged today means that such a scheme would probably spend significant bandwidth on otherwise avoidable acknowledgements.
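The packet overhead can be worked out with a little arithmetic. The sketch below assumes only what is stated above, namely one acknowledgement for every data packet except the last; the function name `packets_on_wire` is hypothetical.

```python
def packets_on_wire(data_packets):
    """Total packets for a bulk transfer under the per-packet acknowledgement scheme.

    Assumption: every data packet except the last is individually acknowledged.
    """
    if data_packets < 1:
        raise ValueError("a transfer needs at least one data packet")
    acknowledgements = data_packets - 1
    return data_packets + acknowledgements

single = packets_on_wire(1)    # one data packet, no acknowledgement
bulk = packets_on_wire(100)    # 100 data packets plus 99 acknowledgements
```

For a transfer of n data packets the total is 2n - 1, so the overhead approaches a factor of two as transfers grow.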

Today, message passing and RPC are the main methods of communication. They can be loosely compared to UDP and TCP respectively. RPC is synchronous and hence inherently reliable, whereas message passing is asynchronous and hence unreliable. On the other hand, concurrency is easy to achieve with message passing, while it can be tedious to implement with RPC, as the authors themselves mention, for example in multicast- and broadcast-like scenarios.

1 comment:

  1. Implementing Remote Procedure Calls presents RPC, which is simple and easy to implement and is therefore still widely used. RPC is a form of inter-process communication based on the client/server model, and it has an object-oriented version called remote method invocation (RMI). The stub plays an important role in making RPC work correctly. It passes parameters from one node to another, translates memory-allocation information for the receiving node, and represents data using External Data Representation (XDR). However, RPC has security problems and strong time, synchronization and space coupling, as The Many Faces of Publish/Subscribe points out. A couple of different approaches have achieved synchronization decoupling in RPC, but space and time decoupling remain unresolved.

    Publish/Subscribe solves this problem. Whether it is topic-based, content-based or type-based, Publish/Subscribe decouples participants in space, time and synchronization. The role of such a system is to permit the exchange of events between producers and consumers in an asynchronous manner. Events are published by producers, and consumers subscribe to them through an event service. In a centralized architecture, events are delivered through a single store, so they may not persist if that store encounters a severe problem. Publish/Subscribe systems also have scalability issues that conflict with other desirable properties.

    Resolving the problems of Publish/Subscribe systems is still challenging. We can see that the Hadoop Distributed File System (HDFS) uses a modified RPC instead of a Publish/Subscribe system, as does the Google File System (GFS).
