| field | value | date |
|---|---|---|
| author | Richard Wareing <rwareing@fb.com> | 2016-04-01 20:42:50 -0700 |
| committer | Kevin Vigor <kvigor@fb.com> | 2016-12-29 11:15:39 -0800 |
| commit | 02f8b7300bc635dea9ae1fee6ef14c0d4725591a | |
| tree | 3d1b55756f09cbcd6763d87ad1c26ec3b7a5ddf9 | |
| parent | 141e879d83da3e6fa2f3891d8aa1c7895017c401 | |
cluster/afr: Hybrid mounts must honor _marked_ up/down states
Summary:
- The qsort of children by latency must also account for a host that is
marked down yet still has a (slightly) better latency. This can
happen if max-replicas is reached and multiple bricks
meet the halo threshold. In such a case we must prefer the brick
which is marked up, albeit with a perhaps slightly higher latency.
It's not worth doing any sort of swap in this case, as it would be
jarring to the cluster and introduce flapping behavior.
Test Plan: - Halo prove tests @ https://phabricator.fb.com/P56251020
Reviewers: sshreyas, kvigor
Reviewed By: kvigor
Blame Revision: Change-Id: I858bbf44638a4978093dc56d4a98c96be4f8b45e
Change-Id: Ie3b79700ec63398bc30e7bcf75fb05931dd5d0d0
FB-commit-id: deb2f35db6f8b979857b4e3779b2a41c59a2e416
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: http://review.gluster.org/16307
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
| -rw-r--r-- | xlators/cluster/afr/src/afr-common.c | 23 |
1 files changed, 23 insertions, 0 deletions
diff --git a/xlators/cluster/afr/src/afr-common.c b/xlators/cluster/afr/src/afr-common.c
index 0c621271405..d5002a2070b 100644
--- a/xlators/cluster/afr/src/afr-common.c
+++ b/xlators/cluster/afr/src/afr-common.c
@@ -4293,6 +4293,17 @@ __get_heard_from_all_status (xlator_t *this)
  *
  * Passed to the qsort function to order a list of children by the latency
  * and/or up/down states.
+ *
+ * Note: This isn't as simple as taking the latencies and calling it a
+ * a day. Children can be marked down, which overrides their latency
+ * signal. Having a lower-latency child available doesn't guarentee this
+ * child shall be marked up: we don't want to constantly be swapping
+ * slightly better bricks for others...this is jarring to clients and
+ * could cause all sorts of issues. Plus, the fail-over, max-replicas
+ * flags must all be honored which manage the up/down state of children.
+ *
+ * In short, the (as marked) up/down down state of the brick shall always
+ * take precedence when sorting by latency.
  */
 static int
 _afr_cmp_child (const void *child1, const void *child2)
@@ -4300,6 +4311,18 @@ _afr_cmp_child (const void *child1, const void *child2)
         struct afr_child *child11 = (struct afr_child *)child1;
         struct afr_child *child22 = (struct afr_child *)child2;
 
+        /* If both children are _marked_ down they are equal */
+        if (!child11->child_up && !child22->child_up)
+                return 0;
+
+        /* Prefer child 2, child 1 is _marked_ down, child 2 is not */
+        if (!child11->child_up && child22->child_up)
+                return 1;
+
+        /* Prefer child 1, child 2 is _marked_ down, child 1 is not */
+        if (child11->child_up && !child22->child_up)
+                return -1;
+
         if (child11->latency > child22->latency) {
                 return 1;
         }
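
To see the ordering this comparator produces, here is a minimal standalone sketch. It is not the AFR code itself: `struct child_info`, `cmp_child` and `sort_children` are hypothetical stand-ins that model only the `child_up` and `latency` fields read in the patch. The tail of the comparison (lower latency returns -1, a tie returns 0) is cut off in the diff and is assumed here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct afr_child: only the two fields the
 * comparator reads are modelled. */
struct child_info {
        int     child_up;   /* non-zero when the child is _marked_ up */
        int64_t latency;    /* measured latency, lower is better */
};

/* qsort comparator: the marked up/down state takes precedence, and
 * latency only breaks ties between two children that are both up. */
static int
cmp_child (const void *a, const void *b)
{
        const struct child_info *c1 = a;
        const struct child_info *c2 = b;

        /* Both marked down: treat as equal, latency is ignored */
        if (!c1->child_up && !c2->child_up)
                return 0;

        /* Only c1 is marked down: c2 sorts first */
        if (!c1->child_up)
                return 1;

        /* Only c2 is marked down: c1 sorts first */
        if (!c2->child_up)
                return -1;

        /* Both up: order by latency (this tail is an assumption, the
         * diff above is truncated after the first comparison) */
        if (c1->latency > c2->latency)
                return 1;
        if (c1->latency < c2->latency)
                return -1;
        return 0;
}

/* Hypothetical helper: sort an array of children in place. */
static void
sort_children (struct child_info *children, size_t count)
{
        qsort (children, count, sizeof (*children), cmp_child);
}
```

With this sketch, sorting the three children (down, latency 1), (up, latency 9), (up, latency 5) yields (up, 5), (up, 9), (down, 1): the marked-down child goes last even though it has the lowest latency.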
