author | Richard Wareing <rwareing@fb.com> | 2015-08-21 21:44:44 -0700
---|---|---
committer | Kevin Vigor <kvigor@fb.com> | 2017-03-06 19:53:31 -0500
commit | f6cc23fb1d8f157ec598e0bbb63081c881388380 (patch)
tree | bdb0a579a0a548e3e2113d5641ffa951bb3fbaa9 /tests
parent | 259d65ffb7296415cb9110ba1877d0378265bf52 (diff)
cluster/afr: AFR2 discovery should always do entry heal flow
Summary:
- Fixes a case where, when a brick is completely wiped, the AFR2 discovery
mechanism could (with 1/R probability, where R is the replication
factor) pin an NFSd or client to the wiped brick. This would in turn
prevent the client from seeing the contents of the (degraded)
subvolume.
- The fix proposed in this patch is to force the entry self-heal
code path when the discovery process happens, and furthermore to
force a conservative merge in the case where no brick is found to be
degraded.
- This also restores the property of our 3.4.x builds whereby bricks
automagically rebuild via the SHDs without having to run any sort of
"full heal". This patch gives the SHDs enough signal to figure
out what they need to heal.
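The 1/R pinning odds described above can be illustrated with a toy sketch. This is plain Python, not GlusterFS code: the modulo selection is a hypothetical stand-in for AFR's deterministic per-file read-child choice, used only to show why a client lands on the wiped brick roughly once per R files.

```python
def pick_read_child(gfid_hash, replica_count):
    # Toy stand-in for AFR read-child selection: deterministic per file,
    # uniformly spread across the replica set.
    return gfid_hash % replica_count

R = 3            # replication factor
wiped = 2        # index of the brick whose contents were wiped
trials = 100000  # uniform spread of hypothetical gfid hashes

pinned = sum(1 for h in range(trials) if pick_read_child(h, R) == wiped)
print(f"{pinned / trials:.3f}")  # prints 0.333, i.e. ~1/R of reads hit the wiped brick
```

With reads pinned to a deterministic child, roughly one in R lookups resolves against the empty brick, which is why forcing the entry-heal flow during discovery (rather than trusting any single child) is needed.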
Test Plan:
Run "prove -v tests/bugs/fb8149516.t"
Output: https://phabricator.fb.com/P19989638
Prove test showing failed run on v3.6.3-fb_10 without the patch -> https://phabricator.fb.com/P19989643
Reviewers: dph, moox, sshreyas
Reviewed By: sshreyas
FB-commit-id: 3d6f171
Change-Id: I7e0dec82c160a2981837d3f07e3aa6f6a701703f
Signed-off-by: Kevin Vigor <kvigor@fb.com>
Reviewed-on: https://review.gluster.org/16862
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shreyas Siravara <sshreyas@fb.com>
Diffstat (limited to 'tests')
-rw-r--r-- | tests/bugs/fb8149516.t | 40
1 file changed, 40 insertions, 0 deletions
diff --git a/tests/bugs/fb8149516.t b/tests/bugs/fb8149516.t
new file mode 100644
index 00000000000..54372794c6f
--- /dev/null
+++ b/tests/bugs/fb8149516.t
@@ -0,0 +1,40 @@
+#!/bin/bash
+
+. $(dirname $0)/../include.rc
+. $(dirname $0)/../volume.rc
+
+cleanup;
+
+TEST glusterd
+TEST pidof glusterd
+TEST $CLI volume create $V0 replica 3 $H0:$B0/${V0}{0,1,2}
+TEST $CLI volume set $V0 cluster.read-subvolume-index 2
+TEST $CLI volume set $V0 cluster.background-self-heal-count 0
+TEST $CLI volume set $V0 cluster.heal-timeout 30
+TEST $CLI volume set $V0 cluster.choose-local off
+TEST $CLI volume set $V0 cluster.entry-self-heal off
+TEST $CLI volume set $V0 cluster.data-self-heal off
+TEST $CLI volume set $V0 cluster.metadata-self-heal off
+TEST $CLI volume set $V0 nfs.disable off
+TEST $CLI volume start $V0
+TEST glusterfs --volfile-id=/$V0 --volfile-server=$H0 $M0 --attribute-timeout=0 --entry-timeout=0
+cd $M0
+for i in {1..10}
+do
+    dd if=/dev/urandom of=testfile$i bs=1M count=1 2>/dev/null
+done
+cd ~
+TEST kill_brick $V0 $H0 $B0/${V0}2
+TEST rm -rf $B0/${V0}2/testfile*
+TEST rm -rf $B0/${V0}2/.glusterfs
+
+TEST $CLI volume start $V0 force
+EXPECT_WITHIN 20 "1" afr_child_up_status_in_shd $V0 2
+
+# Verify we see all ten files when ls'ing, without the patch this should
+# return no files and fail.
+FILE_LIST=($(\ls $M0))
+TEST "((${#FILE_LIST[@]} == 10))"
+EXPECT_WITHIN 30 "0" get_pending_heal_count $V0
+
+cleanup