Sunday, March 10, 2013

The Costs and Limits of Availability for Replicated Services


From Georgios T., who can't seem to post:

The Costs and Limits of Availability for Replicated Services. Haifeng Yu & Amin Vahdat

The motivation for this paper comes from the realization that the utility of networked services is limited by availability rather than performance. The approaches contemporary to this paper for improving availability relied on caching and replication, which introduce consistency problems. The authors identify the two extremes for dealing with inconsistency as 'strong consistency' and 'optimistic consistency', and set out to explore the costs and limits of dynamically trading consistency for availability using a continuous consistency model. In a continuous consistency model, applications specify a maximum distance from strong consistency, thus bounding optimistic consistency.
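To make the model concrete, here is a minimal sketch of a TACT-style bound check. The three metrics (numerical error, order error, staleness) come from the paper's continuous consistency model, but every class, field, and function name here is my own hypothetical illustration, not the TACT API:

```python
from dataclasses import dataclass
import time

@dataclass
class ConitBound:
    max_numerical_error: float  # max total weight of remote writes not yet seen
    max_order_error: int        # max number of tentatively ordered local writes
    max_staleness: float        # max age (seconds) of the oldest unseen remote write

@dataclass
class ReplicaState:
    unseen_weight: float = 0.0          # weight of remote writes not yet applied here
    tentative_writes: int = 0           # local writes not yet committed in final order
    oldest_unseen: float | None = None  # arrival time of oldest unapplied remote write

def within_bounds(state: ReplicaState, bound: ConitBound) -> bool:
    """True if this replica may serve a request without synchronizing first."""
    if state.unseen_weight > bound.max_numerical_error:
        return False
    if state.tentative_writes > bound.max_order_error:
        return False
    if (state.oldest_unseen is not None
            and time.time() - state.oldest_unseen > bound.max_staleness):
        return False
    return True
```

Setting all three bounds to zero corresponds to strong consistency, while unbounded values correspond to optimistic consistency; everything in between is the continuum the paper explores.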
    In order to evaluate different implementations, the authors describe a methodology for computing upper bounds on service availability as a function of faultload, workload, consistency level, and degree of replication. The methodology is based on Q(offline) and Q(online), sets of questions that a protocol must answer, resolved optimally offline and online respectively. The answers to these questions form a dominant algorithm; the Q(online) answers are supplied as inputs to this algorithm, so fewer questions in the online set improve the chances of a tractable solution.
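Under my own simplified framing of that methodology (all function names here are hypothetical), the upper-bound computation amounts to a search over the dominant algorithm's inputs:

```python
from itertools import product

def dominant_upper_bound(trace, online_answer_spaces, answer_offline, simulate):
    """Enumerate every combination of Q(online) answers (the dominant
    algorithm's inputs); Q(offline) questions are answered optimally
    with full knowledge of the trace. Returns the best availability."""
    best = 0.0
    for online in product(*online_answer_spaces):
        offline = answer_offline(trace, online)   # optimal offline answers
        best = max(best, simulate(trace, online, offline))
    return best
```

The enumeration is exponential in the number of online questions, which is why a small Q(online) set keeps the bound computation tractable.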
    Another contribution of this paper is a comparison, using a prototype based on TACT, of the service availability achievable with different consistency protocols (voting, primary copy, and Golding's algorithm), faultloads, and consistency levels. The experiments show that small optimizations to the existing protocols (e.g., aggressive write propagation) can bring them closer to the availability upper bounds. An interesting finding of the evaluation section is that maximizing availability by maintaining the strongest possible consistency level creates a trade-off between availability and communication overhead. Finally, the authors explore the effect of the degree of replication and of network reliability on availability, showing that more replicas are not always beneficial: they can improve read availability but degrade write availability due to consistency constraints.
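That last trade-off can be seen in a toy model: assume n replicas failing independently, each available with probability p, under a read-any/write-all discipline for strong consistency. This is only a qualitative illustration of my own, not the paper's actual model:

```python
def read_availability(n: int, p: float) -> float:
    """A read succeeds if at least one replica is up."""
    return 1 - (1 - p) ** n

def write_availability(n: int, p: float) -> float:
    """A strongly consistent write needs every replica up."""
    return p ** n

for n in (1, 3, 5):
    print(n, round(read_availability(n, 0.9), 5), round(write_availability(n, 0.9), 5))
# n=1: read 0.9      write 0.9
# n=3: read 0.999    write 0.729
# n=5: read 0.99999  write 0.59049
```

Adding replicas drives read availability toward one while write availability collapses, matching the paper's observation that more replicas are not always good.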

1 comment:

  1. Replication and caching are, as we know, very important in distributed systems, and this paper studies how to replicate properly.

    One of the interesting evaluation results of this paper is that voting with aggressive order-error bounding achieves the highest availability compared with primary copy and Golding's algorithm, even though it has the lowest availability under baseline order-error bounding. The authors attribute this difference to the inherent properties of the voting algorithm, which makes sense.
    However, as one might consequently worry, voting incurs a large communication overhead, whereas primary copy causes significantly less overhead with only slightly lower availability. This was impressive: I had not recognized this merit of primary copy when I read the previous paper, but I learned it from this more-than-a-decade-old one.
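    As a back-of-the-envelope illustration of that overhead gap (my own simplified assumption: the synchronous cost of a write is the number of replicas contacted on its critical path):

```python
def voting_messages(n: int) -> int:
    """Majority voting: the writer must contact a majority quorum."""
    return n // 2 + 1

def primary_copy_messages(n: int) -> int:
    """Primary copy: the writer contacts only the primary."""
    return 1

for n in (3, 5, 9):
    print(n, voting_messages(n), primary_copy_messages(n))
# 3 -> 2 vs 1, 5 -> 3 vs 1, 9 -> 5 vs 1: voting's per-write cost grows
# with the number of replicas while primary copy's stays constant.
```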
