Sunday, January 27, 2013

Summary of "SCRIBE: A large-scale and decentralized application-level multicast infrastructure"


1. Title: SCRIBE: A large-scale and decentralized application-level multicast infrastructure

Authors: Miguel Castro, Peter Druschel, Anne-Marie Kermarrec and Antony Rowstron

2. This paper discusses Scribe, an application-level multicast infrastructure, along with its supporting framework Pastry, in terms of implementation, advantages and disadvantages, and performance.

3. Scribe and Pastry, sitting at the application level, avoid the limitations that network-level multicast has faced in the past: namely, the lack of wide-scale deployment (and of any motivation for such deployment), and the inability to track group membership. The system works in a very similar way to previously discussed distributed hashing schemes, including the focus of our project, Kademlia. One clever design idea the authors note is the flexibility of the reliability constraints that can be built on top of Scribe. Scribe itself offers only best-effort delivery, even though it uses TCP between nodes; an application can layer stronger reliability guarantees on top if necessary.

4. The experiment they run seems infeasible without some sort of cap or bottleneck elimination routine, which they apply later. Keeping thousands of children tables, or thousands of children table entries, would not be feasible on an end-user machine, and bringing down such a well-connected node would not be an option either. Thus, bottleneck elimination seems necessary to their design. The authors also mention that the software is a poor fit for small networks, but it's unlikely that this software would be adopted en masse, and smaller networks (for instance, a private chat client) would be extremely inefficient or even unusable due to the stress and overhead that Scribe imposes on them. Finally, although the paper lists a variety of related projects, the authors make no attempt to discuss the shortcomings of Scribe or the potential advantages of their competitors.

5. A system of IP multicast is still relevant today. While certain applications have made attempts at solving this problem, there are still considerable reliability and performance concerns for application-layer multicast. The age of this paper (about a decade), coupled with ISPs, might stifle development of a Scribe-like system for multicast. However, similar issues of reliability and large-scale deployment, as presented in the paper, are very real and relevant to this day.
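
To make the comparison with distributed hashing schemes in point 3 concrete, here is a minimal sketch of how a group could be mapped to the node that roots its multicast tree (the rendezvous point): the node whose ID is closest to the group's ID. The names `make_id`, `ring_distance`, and `rendezvous_point` are hypothetical, and truncating SHA-1 to 128 bits and measuring distance around a circular ID space are assumptions made for illustration, not details taken from the paper's implementation.

```python
import hashlib

ID_BITS = 128  # assumed ID width; SHA-1 output is truncated to fit


def make_id(text):
    """Hash arbitrary text to an integer ID (hypothetical helper)."""
    digest = hashlib.sha1(text.encode("utf-8")).digest()
    return int.from_bytes(digest[: ID_BITS // 8], "big")


def ring_distance(a, b):
    """Distance between two IDs, assuming the ID space wraps around."""
    d = abs(a - b)
    return min(d, (1 << ID_BITS) - d)


def rendezvous_point(group_id, node_ids):
    """Return the node ID closest to the group ID."""
    return min(node_ids, key=lambda n: ring_distance(n, group_id))


# Eight made-up nodes and one made-up group, purely for illustration.
nodes = [make_id(f"node-{i}") for i in range(8)]
group = make_id("example-group" + "example-creator")
root = rendezvous_point(group, nodes)
print(f"group {group:032x} is rooted at node {root:032x}")
```

Because the mapping depends only on the IDs, no node needs a global membership list to find the root, which is the property Scribe shares with schemes like Kademlia.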

3 comments:

  1. This comment has been removed by the author.

  2. A distributed hash table (DHT) passes responsibility for maintaining the mapping from keys to values to the nodes themselves, so a DHT can scale to extremely large numbers of nodes and easily handle continuous node addition, deletion, and failure. DHTs can therefore serve as the basis for infrastructure such as Pastry and Scribe. Pastry forms a robust, scalable, and reliable routing substrate, and Scribe, an efficient multicast infrastructure built on it, inherits many of the advantages and nice features of DHTs and Pastry. Even though Scribe was proposed about a decade ago, it is still quite useful and is applied in various ways, though reliability and scalability remain open issues.

    One thing I've wondered about is how much the location of the rendezvous point matters among very widely distributed Scribe nodes. There is no explicit way to designate or change the rendezvous point, so even if the overall performance of the multicast tree is affected by its location, there is no choice in the matter. However, the authors mention an alternative: choosing a good creator and making it the rendezvous point. That could possibly solve this problem.

  3. One problem that I have with the multicast system is that the multicast trees that are created do not take advantage of locality in the network. One goal of a multicast protocol is that each node should only have to forward messages to a few close-by neighbors, but in the Scribe protocol, messages are forwarded to nodes that are close in the ID space. If, as they suggest, the ID is based on a hash of the IP address, then closeness in the ID space bears no relation to closeness in the underlying network. Thus, while the number of messages may be reduced, the messages might have to traverse large parts of the network to reach nodes that are close to the root.

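
The locality concern raised in the last comment can be illustrated with a short sketch. It assumes, hypothetically, that a node ID is the first 128 bits of a SHA-1 hash over the textual IP address; the helper names and the two addresses (taken from a range reserved for documentation) are made up for the example. A cryptographic hash is designed to scatter similar inputs, so two hosts that sit next to each other on the network will typically receive IDs with little or no common prefix.

```python
import hashlib


def node_id_hex(ip):
    """Hypothetical ID scheme: first 128 bits of SHA-1 over the address text."""
    return hashlib.sha1(ip.encode("ascii")).hexdigest()[:32]


def shared_prefix_len(a, b):
    """Count how many leading characters two strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Two adjacent addresses, i.e. hosts that are close in the network.
neighbours = ["192.0.2.10", "192.0.2.11"]
ids = [node_id_hex(ip) for ip in neighbours]

for ip, nid in zip(neighbours, ids):
    print(ip, "->", nid)
print("shared leading hex digits:", shared_prefix_len(ids[0], ids[1]))
```

This only shows that ID-space closeness and network closeness are unrelated under such a hash; it says nothing about how routing might compensate for that.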