Sunday, February 17, 2013

P. Hunt et al., Zookeeper: Wait-Free Coordination in Internet-scale Systems, USENIX ATC, 2010

In this paper, the author introduce ZooKeeper, a wait-free solution for coordination of large scale distributed system.

One of the most important requirement of distributed system is coordination. We've been in touch with this since the first day we were introduced to multi-thread programming: we need to implement lock to prevent multiple thread from accessing the same piece of memory at the same time. This is just the simplest example and there are millions of other instances that requirement different degree and different forms of coordination. As mentioned in the paper, Amazon Simple Queue Service focus on queing, and Chubby guarantee synchronization. All of these services are developed for specific coordination needs individually and they are all implemented separately on server side, which takes large amount of unnecessary time and work. ZooKeeper, however, is heading another direction. It enable developers to implement their own primitives by an API backed up by a coordination kernel which support new primitives without requiring changes to service core. Another very nice property of ZooKeeper is that it moves away from blocking primitives, which may slow down the whole system with one of two exceptionally slow process. This property, called wait-free, is the most important feature of ZooKeeper and unsurprisingly it appeared on the topic of this paper. Furthermore, with hierarchy of Herlihy, ZooKeeper implements a universal object which provides order guarantee for operation. Caching data on client side, and pipe-lining also help enhancing the performance of ZooKeeper. 

7 comments:

  1. The idea of an API of basic operations is useful, and I think a move in the right direction for distributed systems. Having a low-level, accessible API allows systems designers to create their own primitives and build up their own structures on top of the service, rather than being locked into specific constructs. This idea is in the spirit of the original idea for the Internet.

    The problem I see with Zookeeper is performance. Performance severely degrades across read and write operations as the number of connected servers increases. As pervasive and large as distributed systems used on today's Internet have become, this kind of performance degradation is unacceptable, and counter balances the usefulness of the open API.

    ReplyDelete
  2. This comment has been removed by the author.

    ReplyDelete
  3. Concisely, Zookeeper is a centralized service that creates a platform for maintaining configuration, synchronization, grouping and such purposes.

    It has a very simple API and structure, which encourages developers willing to build on top of Zookeeper. The system itself is inspired from an OS kernel, in my opinion. Tree structure is very much like a file system and notification system looks like "inotfy". However, being applicable to multiple servers and achieving high throughput for read intensive applications can put this system into much use, like for a news site.

    Couple of points that this architecture lacks in my opinion are;
    -The lack of replication sophistication; there is no efficiency in replication such as erasure coding. Even though this brings extra overhead for the znodes, and adds latency.
    -Low capacity for storage.

    ReplyDelete
  4. Zookeeper has servers maintain client session state. So, as the number of client grows exponentially, so the distributed state which can seriously affect the performance.

    Also, clients need to send heart beat messages in absence of activity. In many systems, systems go through long amounts of inactivity which makes the system inefficient by making it client dependent.

    ReplyDelete
  5. I think one problem with the paper is that two of their main goals seem to be in conflict. They claim that they are producing a general tool for building distributed systems, but then they make performance trade-offs in service of improving its utility for the type of systems that they use at Yahoo, i.e., read-heavy with admittedly excessive logging overhead for writes.

    ReplyDelete
  6. Zookeeper is one of the NoSQL distributed system and as other NoSQL distributed system, it partially gives up its consistency (for read operation in Zookeeper) in CAP. But in my view, leader election system seems like having some problem. Centralized distributed system inevitably have possibilities on significant fault of leader and their evaluation shows this. They said re-election of new leader takes less than 200ms if Zookeeper encounters failure of leader. But we can see it takes much more time to get operations per second back to normal level. I guess authors need to choose this system to maintain Zookeeper's main features, say, FIFO and linearizability, but it could be improved more.

    ReplyDelete
  7. Honestly, while an API is interesting, I would rather implement my own API so long as I could increase optimization. Adding the Zookeeper layer would negatively affect performance without the aforementioned performance issues. APIs are nice so long as they don't affect your performance (or if they do, at least marginally).

    ReplyDelete