author     Pranith Kumar K <pkarampu@redhat.com>       2015-04-16 09:25:31 +0530
committer  Vijay Bellur <vbellur@redhat.com>           2015-05-08 05:06:00 -0700
commit     33fdc310700da74a4142dab48d00c4753100904b
tree       6cf1a6e415a18b1641ebff092106b4371b5ce509
parent     df919238505a74a9d89b308bbc6d01b797140439
cluster/ec: metadata/name/entry heal implementation for ec
Metadata self-heal (a sketch follows the steps):
1) Take an inode lock in domain 'this->name' on the 0-0 range (full file)
2) Perform lookup and get the xattrs from all the bricks
3) Choose the brick with the highest version as the source
4) Setattr uid/gid/permissions
5) Removexattr stale xattrs
6) Setxattr existing/new xattrs
7) Xattrop with negative values of 'dirty' and, for the version xattr, the
   difference between the highest version and the brick's own version
8) Unlock the lock acquired in 1)
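
To make steps 3) and 7) concrete, here is a minimal self-contained sketch of
the source selection and of the xattrop deltas. It is not the ec xlator code;
the struct and function names are hypothetical and the brick values are made
up.

    /* Hypothetical sketch: pick the source brick by highest version and
     * compute the xattrop deltas of step 7): negate 'dirty' and add the
     * difference between the highest version and each brick's own. */
    #include <stdio.h>
    #include <stdint.h>

    struct brick_state {
        uint64_t version;   /* value of the version xattr on this brick */
        uint64_t dirty;     /* value of the 'dirty' xattr on this brick */
    };

    static int pick_source(const struct brick_state *bricks, int count)
    {
        int source = 0;
        for (int i = 1; i < count; i++)
            if (bricks[i].version > bricks[source].version)
                source = i;
        return source;
    }

    int main(void)
    {
        struct brick_state bricks[] = {
            { .version = 5, .dirty = 1 },   /* made-up example values */
            { .version = 7, .dirty = 2 },
            { .version = 7, .dirty = 0 },
        };
        int count = 3;
        int source = pick_source(bricks, count);
        uint64_t highest = bricks[source].version;

        printf("source brick: %d (version %llu)\n",
               source, (unsigned long long)highest);
        for (int i = 0; i < count; i++) {
            /* these are the deltas the xattrop of step 7) would apply */
            long long dirty_delta = -(long long)bricks[i].dirty;
            long long version_delta = (long long)(highest - bricks[i].version);
            printf("brick %d: dirty += %lld, version += %lld\n",
                   i, dirty_delta, version_delta);
        }
        return 0;
    }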
Entry self-heal (a sketch follows the steps):
1) Take a directory lock in domain 'this->name:self-heal' on 'NULL' to prevent
   more than one self-heal from running
2) Take a directory lock in domain 'this->name' on 'NULL'
3) Perform lookup and remember the 'version' and 'dirty' values
4) Unlock the lock acquired in 2)
5) Readdir on all the bricks and trigger name heals
6) Xattrop with negative values of 'dirty' and, for the version xattr, the
   difference between the highest version and the brick's own version
7) Unlock the lock acquired in 1)
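
The two-domain locking order above can be summarised with the following
hypothetical skeleton; the helpers are printf stubs standing in for the real
ec xlator locking/readdir machinery.

    /* Hypothetical skeleton of the entry self-heal flow; the helpers are
     * stubs, not the real ec xlator functions. */
    #include <stdio.h>

    static void lock_dir(const char *domain)   { printf("lock   %s\n", domain); }
    static void unlock_dir(const char *domain) { printf("unlock %s\n", domain); }

    static void entry_self_heal(void)
    {
        lock_dir("this->name:self-heal");   /* 1) only one self-heal at a time   */

        lock_dir("this->name");             /* 2) block concurrent modifications */
        printf("lookup: remember 'version' and 'dirty'\n");        /* 3) */
        unlock_dir("this->name");           /* 4) */

        printf("readdir on all bricks, trigger name heals\n");     /* 5) */
        printf("xattrop: dirty -= dirty, version += (highest - own)\n"); /* 6) */

        unlock_dir("this->name:self-heal"); /* 7) */
    }

    int main(void)
    {
        entry_self_heal();
        return 0;
    }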
Name heal (a sketch follows the steps):
1) Take a lock on 'name' in domain 'this->name' on 'NULL'
2) Perform lookup on 'name' and get the stat and xattr structures
3) Build a gfid_db recording, for each gfid, which subvolumes/bricks have
   a file with 'name'
4) Delete all the stale files, i.e. those whose gfid does not exist on more
   than ec->redundancy bricks
5) On all the subvolumes/bricks with a missing entry, create 'name' with the
   same type, gfid, permissions, etc.
6) Unlock the lock acquired in 1)
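
As an illustration of steps 3) and 4), the sketch below groups bricks by the
gfid they report for 'name' and marks a gfid as stale when it is not present
on more than ec->redundancy bricks. The names and values are hypothetical,
not the actual gfid_db code.

    /* Hypothetical sketch: build a gfid_db for one 'name' and classify each
     * gfid as stale (present on <= redundancy bricks) or healable. */
    #include <stdio.h>
    #include <string.h>

    #define MAX_GFIDS 8

    int main(void)
    {
        /* gfid reported for 'name' by each brick; "" means entry missing */
        const char *per_brick_gfid[] = { "gfid-1", "gfid-2", "gfid-2" };
        int bricks = 3;
        int redundancy = 1;             /* a 3 = 2 + 1 volume */

        const char *gfids[MAX_GFIDS];
        int counts[MAX_GFIDS];
        int ngfids = 0;

        /* build the gfid_db: which gfids exist, and on how many bricks */
        for (int b = 0; b < bricks; b++) {
            if (per_brick_gfid[b][0] == '\0')
                continue;
            int g;
            for (g = 0; g < ngfids; g++)
                if (strcmp(gfids[g], per_brick_gfid[b]) == 0)
                    break;
            if (g == ngfids) {
                gfids[ngfids] = per_brick_gfid[b];
                counts[ngfids++] = 0;
            }
            counts[g]++;
        }

        for (int g = 0; g < ngfids; g++) {
            if (counts[g] > redundancy)
                printf("%s: keep, re-create on bricks missing it\n", gfids[g]);
            else
                printf("%s: stale, delete from the bricks that have it\n",
                       gfids[g]);
        }
        return 0;
    }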
Known limitation: With the present design, name heal conservatively preserves
'name' when it cannot decide whether to delete it. This can happen in the
following scenario:
1) We have a 3 = 2 + 1 (bricks: A, B, C) ec volume and one brick is down
   (let's say A)
2) rename d1/f1 -> d2/f2 is performed, but the rename succeeds on only one of
   the bricks (let's say B)
3) Now name self-heal on d1 and d2 re-creates the file in both directories,
   leaving both d1/f1 and d2/f2.
Because we wanted to prevent data loss in the case above, the following
scenario is not healable, i.e. it needs manual intervention:
1) We have a 3 = 2 + 1 (bricks: A, B, C) ec volume and one brick is down
   (let's say A)
2) We have two hard links d1/a and d2/b, and another file d3/c, all created
   before the brick went down
3) rename d3/c -> d2/b is performed
4) Now name self-heal on d2/b does not heal, because d2/b with the older gfid
   will not be deleted. One could ask why we don't delete the link when the
   file has more than one hard link, but that leads to a data-loss issue
   similar to the one described earlier:
Scenario:
1) We have a 3 = 2 + 1 (bricks: A, B, C) ec volume and one brick is down
   (let's say A)
2) We have two hard links: d1/a and d2/b
3) rename d1/a -> d3/c and rename d2/b -> d4/d are performed, and both
   operations succeed on only one of the bricks (let's say B)
4) Now the name self-heals on the names above, which can run in parallel, may
   each decide to delete the file because it appears to have two links; after
   all the self-heals perform their unlinks, we are left with data loss.
Change-Id: I3a68218a47bb726bd684604efea63cf11cfd11be
BUG: 1213358
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/10298
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>