IRSDP 读书笔记 - 2. Basic Abstractions - 7

《Introduction to Reliable and Secure Distributed Programming》

2.7 Distributed-System Models

组合 (1) a process abstraction, (2)a link abstraction 和 (3) a failure-detector abstraction 三者定义了 a distributed-system model

2.7.1 Combining Abstractions

本书只会讨论几种不同的组合,通过这些组合可以发现不同的假设对算法设计的影响。

  • Fail-stop. crash-stop process abstraction,link 是 perfect 的(Module 2.3),存在 perfect failure detector(Module 2.6)。这些假设将会简化分布式算法的设计。
  • Fail-noisy. crash-stop process abstraction,link 是 perfect 的(Module 2.3),存在 eventually perfect failure detector (Module 2.8)或者 eventual leader detector (Module 2.9)。
  • Fail-silent. crash-stop process abstraction,link 是 perfect 的(Module 2.3),但是不假设存在 failure detector 或 leader election abstraction。也就是 process 无法获取其它 process crash 的信息。
  • Fail-recovery. crash-recovery process abstraction,link 是 stubborn 的(Module 2.2),存在 eventual leader detector(Module 2.9)。
  • Fail-arbitrary. fail-arbitrary (Byzantine) process abstraction,link 是 authenticated perfect link abstraction(Module 2.5),也被称为 fail-silent-arbitrary model。 如果再假设存在 Byzantine eventual leader-detector abstraction(Module 2.10),那么就是 fail-noisy-arbitrary model。
  • Randomized. The randomized model is of a different nature than the other distributed-system models, and can be thought of being orthogonal to all of them. Randomization is sometimes the only way to solve a problem or to circumvent inherent inefficiencies of deterministic algorithms.

2.7.2 Setup

所有 process 的 identity 是全局已知的,可以启动前静态配置,也可以通过 membership service 自动配置。

cryptographic abstractions 的相关配置也是预先定义好的。

2.7.3 Quorums

A quorum in a system with N crash-fault process abstractions (according to the fail-stop, fail-noisy, fail-silent, or fail-recovery system model) is any majority of processes, i.e., any set of more than N/2 processes.

即使系统中 f < N/2 个 process 发生故障,也会至少存在一个没有故障的 quorum。

在 arbitrary-fault process abstractions 中并不是这样,A Byzantine quorum tolerating f faults is a set of more than (N +f)/2 processes. algorithms tolerating Byzantine faults usually require that only f < N/3 processes may fail.

2.7.4 Measuring Performance

通常使用两个 metric 分析算法的成本:

  1. the number of messages required to terminate an operation of the abstraction
  2. the number of communication steps required to terminate such an operation

对于某些算法也需要评估 total communication size,对于 crash-recovery model 还需要考虑 the number of accesses to stable storage。

通常来说需要统计 message 数量、communication steps 次数和 disk accesses 次数。

Performance measurements are often stated in Big-O Notation。

精确的性能研究并不属于本书的内容。

    原文作者:袁世超
    原文地址: https://www.jianshu.com/p/b427d34ce921
    本文转自网络文章,转载此文章仅为分享知识,如有侵权,请联系博主进行删除。
点赞