author | Pranith Kumar K <pkarampu@redhat.com> | 2017-01-25 15:31:44 +0530 |
---|---|---|
committer | Pranith Kumar Karampuri <pkarampu@redhat.com> | 2017-02-26 22:06:55 -0500 |
commit | c1fc1fc9cb5a13e6ddf8c9270deb0c7609333540 | |
tree | a3876aa8a0c1b087429ba916c9380b90bcda6b72 /tests/bitrot | |
parent | 4638dfc1fee80f9338f2941f3cccb17bec63989a |
cluster/ec: Don't trigger data/metadata heal on Lookups
Problem-1:
A Lookup does not take any locks, so a version mismatch that it observes
can't be trusted. Launching a heal based on this information leads to
self-heals that affect I/O performance in the cases where the Lookup is
wrong. Since the self-heal-daemon and client operations on the inode that
take locks can still trigger a heal, we can choose not to attempt a heal on
Lookup.
Problem-2:
Fixed spurious failure of
tests/bitrot/bug-1373520.t
For the issues above, what was happening was that ec_heal_inspect()
was preventing 'name' heal from happening.
Problem-3:
tests/basic/ec/ec-background-heals.t
To be honest, I don't know what the problem was. While fixing
the 2 problems above, I made some changes to ec_heal_inspect() and
ec_need_heal(), after which I could no longer recreate the spurious
failure, even after trying for a long time.
BUG: 1414287
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Change-Id: Ife2535e1d0b267712973673f6d474e288f3c6834
Reviewed-on: https://review.gluster.org/16468
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Diffstat (limited to 'tests/bitrot')
-rw-r--r-- | tests/bitrot/bug-1373520.t | 37 |
1 files changed, 4 insertions, 33 deletions
diff --git a/tests/bitrot/bug-1373520.t b/tests/bitrot/bug-1373520.t
index 271bb3de287..225d3b1a9bc 100644
--- a/tests/bitrot/bug-1373520.t
+++ b/tests/bitrot/bug-1373520.t
@@ -49,39 +49,10 @@ EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" get_bitd_count
 #Delete file and all links from backend
 TEST rm -rf $(find $B0/${V0}5 -inum $(stat -c %i $B0/${V0}5/FILE1))
 
-# The test for each file below used to look like this:
-#
-# TEST stat $M0/FILE1
-# EXPECT_WITHIN $HEAL_TIMEOUT "$SIZE" stat $B0/${V0}5/FILE1
-#
-# That didn't really work, because EXPECT_WITHIN would bail immediately if
-# 'stat' returned an error - which it would if the file wasn't there yet.
-# Since changing this, I usually see at least a few retries, and sometimes more
-# than twenty, before the check for HL_FILE1 succeeds. The 'ls' is also
-# necessary, to force a name heal as well as data. With both that and the
-# 'stat' on $M0 being done here for every retry, there's no longer any need to
-# have them elsewhere.
-#
-# If we had EW_RETRIES support (https://review.gluster.org/#/c/16451/) we could
-# use it here to see how many retries are typical on the machines we use for
-# regression, and set an appropriate upper bound. As of right now, though,
-# that support does not exist yet.
-ugly_stat () {
-        local client_dir=$1
-        local brick_dir=$2
-        local bare_file=$3
-
-        ls $client_dir
-        stat -c %s $client_dir/$bare_file
-        stat -c %s $brick_dir/$bare_file 2> /dev/null || echo "UNKNOWN"
-}
-
 #Access files
-EXPECT_WITHIN $HEAL_TIMEOUT "$SIZE" ugly_stat $M0 $B0/${V0}5 FILE1
-EXPECT_WITHIN $HEAL_TIMEOUT "$SIZE" ugly_stat $M0 $B0/${V0}5 HL_FILE1
+TEST cat $M0/FILE1
+EXPECT_WITHIN $HEAL_TIMEOUT "$SIZE" path_size $B0/${V0}5/FILE1
+TEST cat $M0/HL_FILE1
+EXPECT_WITHIN $HEAL_TIMEOUT "$SIZE" path_size $B0/${V0}5/HL_FILE1
 
 cleanup;
-#G_TESTDEF_TEST_STATUS_NETBSD7=BAD_TEST,BUG=1417540
-#G_TESTDEF_TEST_STATUS_CENTOS6=BAD_TEST,BUG=1417540
-
-
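
The new test lines poll the brick with a size helper after a single access from the mount. The sketch below illustrates that polling pattern only. It assumes that the helper prints an empty value (rather than failing) while the file is still missing, and that the retry loop compares the command's output to the expected string. The names path_size_sketch and expect_within_sketch are hypothetical stand-ins, not the framework's EXPECT_WITHIN or path_size implementations, which may behave differently.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for illustration; NOT the Gluster test-framework code.

# Print the size of a regular file in bytes, or an empty line if it is absent.
# Always succeeding lets a polling loop keep retrying instead of bailing out.
path_size_sketch() {
    if [ -f "$1" ]; then
        wc -c < "$1" | tr -d ' '
    else
        echo ""
    fi
}

# Re-run a command once per second until its output equals the expected
# value (return 0) or the timeout in seconds has passed (return 1).
expect_within_sketch() {
    local timeout=$1 expected=$2
    shift 2
    local deadline=$(( $(date +%s) + timeout ))
    while [ "$(date +%s)" -le "$deadline" ]; do
        if [ "$("$@")" = "$expected" ]; then
            return 0
        fi
        sleep 1
    done
    return 1
}
```

A call then has the same shape as the lines added by the patch, for example `expect_within_sketch 10 "$SIZE" path_size_sketch "$brick/FILE1"`, where `$brick` is a placeholder for a brick directory.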