author | Brian Foster <bfoster@redhat.com> | 2012-11-01 09:46:12 -0400 |
---|---|---|
committer | Vijay Bellur <vbellur@redhat.com> | 2012-11-29 09:00:28 -0800 |
commit | 0314f16ec59d8c22597c8c14b53a473b736b8b1f (patch) | |
tree | 1e0e1dc470be5d04ff41c470c3812bc10bd0afb3 /tests/bugs/bug-853690.t | |
parent | c85a3eee54b4028573c905829d5b46c0b6512c56 (diff) |
afr: handle short writes in afr_writev_wind and self-heal to avoid corruption

The current failure to handle short writes on writev fops leaves
us open to file corruption. A short write on a user request is
ignored and leaves replicas in an inconsistent state. A short write
during a self-heal is ignored and incorrectly marks the files as
consistent if the heal completes.

Modify user writev handling to return the best case return value
from each of the replicas. Short writes that occur relative to this
value are marked as failed and will require a heal. Modify
self-heal to set an error on a short write and abort the heal.

BUG: 853690
Change-Id: I18b30f58702326249230eeebb361b29e40b535f5
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4150
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
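
The user writev handling described in the commit message can be modeled roughly as follows. This is only a sketch of the stated rule (return the best-case value, mark any replica that wrote less as failed); the function name `best_case_writev` and its output format are hypothetical and are not taken from the AFR source. A failed write is assumed to be reported as `-1`.

```shell
# Hypothetical model of the reply handling described above; not AFR code.
# best_case_writev RET...
#   RET  bytes written as reported by each replica (-1 assumed for an error)
# Prints the value returned to the caller, then one "failed=<index>" line for
# every replica that wrote less than the best case and so needs a heal.
best_case_writev() {
    best=-1

    # Best case: the largest byte count reported by any replica.
    for ret in "$@"; do
        if [ "$ret" -gt "$best" ]; then best=$ret; fi
    done
    echo "return=$best"

    # Any replica that wrote less than the best case is short relative to it.
    idx=0
    for ret in "$@"; do
        if [ "$ret" -lt "$best" ]; then echo "failed=$idx"; fi
        idx=$((idx + 1))
    done
}

# Replica 0 writes only half of a 128k request; replica 1 writes all of it.
result=$(best_case_writev 65536 131072)
echo "$result"
```

In this model the caller sees the full 128k write succeed, while replica 0 is flagged so that a later heal brings it back in sync.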
Diffstat (limited to 'tests/bugs/bug-853690.t')
-rwxr-xr-x | tests/bugs/bug-853690.t | 94 |
1 files changed, 94 insertions, 0 deletions
diff --git a/tests/bugs/bug-853690.t b/tests/bugs/bug-853690.t
new file mode 100755
index 00000000000..77a581f5444
--- /dev/null
+++ b/tests/bugs/bug-853690.t
@@ -0,0 +1,94 @@
+#!/bin/bash
+#
+# Bug 853690 - Test that short writes do not lead to corruption.
+#
+# Mismanagement of short writes in AFR leads to corruption and immediately
+# detectable split-brain. Write a file to a replica volume using error-gen
+# to cause short writes on one replica.
+#
+# Short writes are also possible during heal. If ignored, the files are marked
+# consistent and silently differ. After reading the file, cause a lookup, wait
+# for self-heal and verify that the afr xattrs do not match.
+#
+########
+
+. $(dirname $0)/../include.rc
+
+cleanup;
+
+TEST mkdir -p $B0/test{1,2}
+
+# Our graph is a two brick replica with 100% frequency of short writes on one
+# side of the replica. This guarantees a single write fop leads to an out-of-sync
+# situation.
+cat > $B0/test.vol <<EOF
+volume test-posix-0
+    type storage/posix
+    option directory $B0/test1
+end-volume
+
+volume test-error-0
+    type debug/error-gen
+    option failure 100
+    option enable writev
+    option error-no GF_ERROR_SHORT_WRITE
+    subvolumes test-posix-0
+end-volume
+
+volume test-locks-0
+    type features/locks
+    subvolumes test-error-0
+end-volume
+
+volume test-posix-1
+    type storage/posix
+    option directory $B0/test2
+end-volume
+
+volume test-locks-1
+    type features/locks
+    subvolumes test-posix-1
+end-volume
+
+volume test-replicate-0
+    type cluster/replicate
+    option background-self-heal-count 0
+    subvolumes test-locks-0 test-locks-1
+end-volume
+EOF
+
+TEST glusterd
+
+TEST glusterfs --volfile=$B0/test.vol --attribute-timeout=0 --entry-timeout=0 $M0
+
+# Send a single write, guaranteed to be short on one replica, and attempt to
+# read the data back. Failure to detect the short write results in different
+# file sizes and immediate split-brain (EIO).
+TEST dd if=/dev/zero of=$M0/file bs=128k count=1
+TEST dd if=$M0/file of=/dev/null bs=128k count=1
+
+########
+#
+# Test self-heal with short writes...
+#
+########
+
+# Cause a lookup and wait a few seconds for posterity. This self-heal also fails
+# due to a short write.
+TEST ls $M0/file
+
+# Verify the attributes on the healthy replica do not reflect consistency with
+# the other replica.
+TEST "getfattr -n trusted.afr.test-locks-0 $B0/test2/file --only-values > $B0/out1 2> /dev/null"
+TEST "getfattr -n trusted.afr.test-locks-1 $B0/test2/file --only-values > $B0/out2 2> /dev/null"
+TEST ! cmp $B0/out1 $B0/out2
+
+TEST rm -f $B0/out1 $B0/out2
+TEST rm -f $M0/file
+TEST umount $M0
+
+rm -f $B0/test.vol
+rm -rf $B0/test1 $B0/test2
+
+cleanup;
+
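
The self-heal half of the test depends on the heal stopping when a write comes back short, so that the pending xattrs on the healthy replica are never cleared. A minimal sketch of that behavior, assuming a simulated sink that accepts at most a fixed number of bytes per write, might look like this. The function `heal_copy` and its parameters are hypothetical and do not correspond to any function in the self-heal code.

```shell
# Hypothetical model of a heal loop that aborts on a short write; not AFR code.
# heal_copy TOTAL CHUNK SINK_LIMIT
#   TOTAL       bytes to copy from the source replica
#   CHUNK       bytes requested per write
#   SINK_LIMIT  most bytes the simulated sink accepts per write
heal_copy() {
    total=$1
    chunk=$2
    limit=$3
    offset=0

    while [ "$offset" -lt "$total" ]; do
        # Request a full chunk, or whatever remains at the end of the file.
        want=$((total - offset))
        if [ "$want" -gt "$chunk" ]; then want=$chunk; fi

        # Simulated write: the sink accepts at most $limit bytes.
        wrote=$want
        if [ "$wrote" -gt "$limit" ]; then wrote=$limit; fi

        if [ "$wrote" -lt "$want" ]; then
            # Short write: report an error and stop, so the caller never
            # marks the replicas as consistent.
            echo "abort offset=$offset wrote=$wrote want=$want"
            return 1
        fi
        offset=$((offset + wrote))
    done
    echo "healed bytes=$offset"
}

# A sink that accepts full chunks heals completely.
heal_copy 131072 65536 65536

# A sink that always writes short aborts the heal on the first chunk.
heal_copy 131072 65536 4096 || echo "heal failed, replicas stay marked out of sync"
```

The non-zero return on the short write plays the role of the error the commit message says self-heal should set; in the test above, that outcome is what makes the final `cmp` of the two xattr dumps differ.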