cluster/afr: Fix missing name indices due to EEXIST error

PROBLEM:
Consider a volume with granular-entry-heal and sharding enabled. When a replica is down and a shard is created as part of a write, the name index is correctly created under indices/entry-changes/<dot-shard-gfid>. Now when a read on the same region triggers another MKNOD, the fop fails on the online bricks with EEXIST. Because this is a symmetric error, the failed_subvols[] array is reset to all zeroes. As a result, GF_XATTROP_ENTRY_OUT_KEY is set before post-op, causing the name index that was created by the previous MKNOD operation to be wrongly deleted by THIS MKNOD operation.

FIX:
The ideal fix would have been for a transaction to delete the name index ONLY if it knows it is the one that created the index in the first place. This would involve gathering, from the individual bricks, information on whether THIS xattrop created the index, aggregating their responses and, based on the various possible combinations of responses, deciding whether or not to delete the index. This is rather complex.

A simpler fix is for post-op to examine local->op_ret when there are no failed_subvols to figure out whether or not to delete the name index. This can occasionally lead to the creation of stale name indices, but they do not affect the IO path or interfere with pending changelogs in any way, and self-heal, in its crawl of the "entry-changes" directory, takes care of deleting such indices.

Change-Id: Ic1b5257f4dc9c20cb740a866b9598cf785a1affa
BUG: 1408712
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16286
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
author: Krutika Dhananjay <kdhananj@redhat.com> 2016-12-26 21:08:03 +0530
committer: Pranith Kumar Karampuri <pkarampu@redhat.com> 2016-12-27 03:53:04 -0800
commit: da5ece887c218a7c572a1c25925a178dbd08d464 (patch)
tree: 81a868bea2c30790255bdb4455bbde315dffa403 /tests/include.rc
parent: 5a7c86e578f5bbd793126a035c30e6b052177a9f (diff)
1 files changed, 1 insertions, 0 deletions
diff --git a/tests/include.rc b/tests/include.rc
index 74c279cb8e1..5b5804ea7ab 100644
--- a/tests/include.rc
+++ b/tests/include.rc
@@ -11,6 +11,7 @@ B0=${B0:=/d/backends};        # top level of brick directories
 WORKDIRS="$B0 $M0 $M1 $M2 $N0 $N1"
 
 ROOT_GFID="00000000-0000-0000-0000-000000000001"
+DOT_SHARD_GFID="be318638-e8a0-4c6d-977d-7a937aa84806"
 
 META_VOL=${META_VOL:=gluster_shared_storage}; # shared gluster storage volume used by snapshot scheduler, nfs ganesha and geo-rep.
 META_MNT=${META_MNT:=/var/run/gluster/shared_storage}; # Mount point of shared gluster volume.
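
For illustration only, here is a minimal sketch of how a regression test might use the DOT_SHARD_GFID value from the hunk above to check whether a shard's name index is present on a brick. The helper names name_index_path and name_index_exists are hypothetical and are not part of include.rc. The sketch also assumes that indices live under <brick>/.glusterfs/indices/entry-changes and that shards are named <base-file-gfid>.<block-number>.

```shell
# Value copied from the hunk above.
DOT_SHARD_GFID="be318638-e8a0-4c6d-977d-7a937aa84806"

# Hypothetical helper: print the expected path of a shard's name index.
#   $1 = brick root, $2 = GFID of the base file, $3 = shard block number
# Assumption: the index directory is <brick>/.glusterfs/indices/entry-changes.
name_index_path() {
    echo "$1/.glusterfs/indices/entry-changes/$DOT_SHARD_GFID/$2.$3"
}

# Hypothetical helper: succeed only if that name index exists on the brick.
name_index_exists() {
    [ -e "$(name_index_path "$1" "$2" "$3")" ]
}
```

A test could then check, after the second MKNOD fails with EEXIST, that name_index_exists still succeeds on each online brick, for example by wrapping the call in the TEST helper from include.rc.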