author | Jeff Darcy <jdarcy@redhat.com> | 2016-11-17 10:42:02 -0500 |
---|---|---|
committer | Pranith Kumar Karampuri <pkarampu@redhat.com> | 2016-11-28 22:41:24 -0800 |
commit | 77f03db0131c88d607886bb02dd2a4276ab584d4 (patch) | |
tree | 8a3c464e02f2c9b391c94281821a2c8605c8c612 /xlators/cluster/afr/src/afr-transaction.c | |
parent | 1876454d2e7950f25d1e5bb8e2c07ab27d521498 (diff) |
afr: fix auto-quorum
(1) afr_have_quorum is dead code. It was copied to afr_has_quorum,
and everything else uses that, but the original was never deleted
(until now).
(2) Auto-quorum should be default for any N>2. Leaving quorum
disabled is BAD, but apparently deemed acceptable for N=2 because
there's no real quorum in that case. For any larger number (including
arbiter configurations) there is such a thing as real quorum and we
should use it by default. Note that for N=3 the answers we get from
"N % 2" (the old check) and "N > 2" (the new one) are the same.
(3) The special case for even N in afr_has_quorum has been simplified and
explained more thoroughly in a comment.
Change-Id: I48b33c15093512fecf516b26dcf09afecb7ae33b
Signed-off-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-on: http://review.gluster.org/15873
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Diffstat (limited to 'xlators/cluster/afr/src/afr-transaction.c')
-rw-r--r-- | xlators/cluster/afr/src/afr-transaction.c | 39 |
1 files changed, 29 insertions, 10 deletions
diff --git a/xlators/cluster/afr/src/afr-transaction.c b/xlators/cluster/afr/src/afr-transaction.c
index eb7571db5f1..d23654d8354 100644
--- a/xlators/cluster/afr/src/afr-transaction.c
+++ b/xlators/cluster/afr/src/afr-transaction.c
@@ -681,17 +681,36 @@ afr_has_quorum (unsigned char *subvols, xlator_t *this)
         up_children_count = AFR_COUNT (subvols, priv->child_count);
 
         if (priv->quorum_count == AFR_QUORUM_AUTO) {
-                /*
-                 * Special case for even numbers of nodes in auto-quorum:
-                 * if we have exactly half children up
-                 * and that includes the first ("senior-most") node, then that counts
-                 * as quorum even if it wouldn't otherwise. This supports e.g. N=2
-                 * while preserving the critical property that there can only be one
-                 * such group.
-                 */
-                if ((priv->child_count % 2 == 0) &&
-                    (up_children_count == (priv->child_count/2)))
+                /*
+                 * Special case for auto-quorum with an even number of nodes.
+                 *
+                 * A replica set with even count N can only handle the same
+                 * number of failures as odd N-1 before losing "vanilla"
+                 * quorum, and the probability of more simultaneous failures is
+                 * actually higher. For example, with a 1% chance of failure
+                 * we'd have a 0.03% chance of two simultaneous failures with
+                 * N=3 but a 0.06% chance with N=4. However, the special case
+                 * is necessary for N=2 because there's no real quorum in that
+                 * case (i.e. can't normally survive *any* failures). In that
+                 * case, we treat the first node as a tie-breaker, allowing
+                 * quorum to be retained in some cases while still honoring the
+                 * all-important constraint that there can not simultaneously
+                 * be two partitioned sets of nodes each believing they have
+                 * quorum. Of two equally sized sets, the one without that
+                 * first node will lose.
+                 *
+                 * It turns out that the special case is beneficial for higher
+                 * values of N as well. Continuing the example above, the
+                 * probability of losing quorum with N=4 and this type of
+                 * quorum is (very) slightly lower than with N=3 and vanilla
+                 * quorum. The difference becomes even more pronounced with
+                 * higher N. Therefore, even though such replica counts are
+                 * unlikely to be seen in practice, we might as well use the
+                 * "special" quorum then as well.
+                 */
+                if ((up_children_count * 2) == priv->child_count) {
                         return subvols[0];
+                }
         }
 
         if (priv->quorum_count == AFR_QUORUM_AUTO) {