Technical blog
"We must view with profound respect the infinite capacity of the human mind to resist the introduction of useful knowledge." - Thomas R. Lounsbury
Distributed peer-to-peer applications require weakly-consistent knowledge of process group membership information at all participating processes.
SWIM separates the failure detection and membership update dissemination functionalities of the membership protocol.
SWIM focuses on a weaker variant of group membership, in which the membership lists at different members need not be consistent across the group at the same (causal) point in time.
The design of a distributed membership algorithm has traditionally been approached through the technique of heart-beating.
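To get a feel for why heart-beating becomes expensive, here is a rough back-of-the-envelope sketch. It assumes the all-to-all variant, where every member sends a heartbeat to every other member once per period, and compares it with a protocol whose load per member is constant. The function names and the constant `per_member=4` are made up for illustration and are not taken from the paper.

```python
def all_to_all_messages(n: int) -> int:
    """Heartbeats per period when each of n members heartbeats the other n - 1."""
    return n * (n - 1)


def constant_load_messages(n: int, per_member: int) -> int:
    """Messages per period when each member sends a fixed number of messages."""
    return n * per_member


# per_member=4 is an arbitrary illustrative constant, not a measured value.
for n in (10, 100, 1000):
    print(n, all_to_all_messages(n), constant_load_messages(n, per_member=4))
# 10 90 40
# 100 9900 400
# 1000 999000 4000
```

The total load of the all-to-all scheme grows quadratically with group size, while the constant-load scheme grows linearly.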
The paper attributes the poor scalability of the popular class of all-to-all heart-beating protocols to their implicit decision to fuse the two principal functions of the membership problem specification: failure detection and membership update dissemination.
SWIM provides a membership substrate that:
(1) imposes a constant message load per group member;
(2) detects a process failure in an (expected) constant time at some non-faulty process in the group;
(3) provides a deterministic bound (as a function of group size) on the local time that a non-faulty process takes to detect failure of another process;
(4) propagates membership updates, including information about failures, in infection-style (also gossip-style or epidemic-style); the dissemination latency in the group grows slowly (logarithmically) with the number of members;
(5) provides a mechanism to reduce the rate of false positives by “suspecting” a process before “declaring” it as failed within the group.
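The "suspect before declaring failed" idea in (5) can be sketched in a few lines of Python. This is a simplified illustration, not the paper's implementation: each protocol period a member pings one target directly, falls back to asking a few helper members to ping it indirectly, and only marks the target as suspected when no acknowledgement arrives. The `Member` class, the `reachable` callback, and the `suspicion_periods` parameter are hypothetical names, and the grace period is counted in missed probes rather than wall-clock time.

```python
from enum import Enum


class State(Enum):
    ALIVE = "alive"
    SUSPECT = "suspect"
    FAILED = "failed"


class Member:
    """One member's local view of its peers (hypothetical, simplified)."""

    def __init__(self, name, peers, suspicion_periods=2):
        self.name = name
        self.states = {peer: State.ALIVE for peer in peers}
        self.missed = {}  # peer -> consecutive probes without any ack
        self.suspicion_periods = suspicion_periods

    def run_period(self, target, helpers, reachable):
        """Probe `target` once and update its local state.

        `reachable(src, dst)` stands in for the network: True if a message
        from src to dst and its ack both arrive within the timeout.
        """
        if self.states[target] is State.FAILED:
            return State.FAILED  # failure declarations are final in this sketch

        # Direct ping first; if it fails, ask each helper to ping on our behalf.
        acked = reachable(self.name, target) or any(
            reachable(self.name, h) and reachable(h, target) for h in helpers
        )
        if acked:
            self.states[target] = State.ALIVE
            self.missed.pop(target, None)
        else:
            self.missed[target] = self.missed.get(target, 0) + 1
            # Suspect first; declare failed only after the grace period is used up.
            if self.missed[target] > self.suspicion_periods:
                self.states[target] = State.FAILED
            else:
                self.states[target] = State.SUSPECT
        return self.states[target]


# Usage: "c" has crashed; the direct link between a and d is broken but d is alive.
crashed = {"c"}
broken_links = {("a", "d")}


def reachable(src, dst):
    if src in crashed or dst in crashed:
        return False
    return (src, dst) not in broken_links and (dst, src) not in broken_links


a = Member("a", peers=["b", "c", "d"], suspicion_periods=2)
print(a.run_period("d", ["b"], reachable))  # State.ALIVE, via the indirect probe through b
for _ in range(3):
    print(a.run_period("c", ["b", "d"], reachable))
# State.SUSPECT, State.SUSPECT, State.FAILED
```

In the usage example, the broken link to "d" does not cause a false positive because the indirect probe through "b" succeeds, whereas the crashed member "c" is first suspected and only later declared failed.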