diff options
author | Pranith Kumar K <pkarampu@redhat.com> | 2017-03-30 14:58:38 +0530 |
---|---|---|
committer | Shyamsundar Ranganathan <srangana@redhat.com> | 2017-04-13 11:31:49 -0400 |
commit | 560c92862f72128ccd962192b9c4d728386c7bb1 (patch) | |
tree | 00fccce7fd7f37b0329c560244424164e4af46f8 | |
parent | 226c841bad19d0348acba22808cc65853c2b15ea (diff) |
cluster/dht: Modify local->loc.gfid in thread safe manner
Backport of https://review.gluster.org/16986
Problem:
local->loc.gfid in dht_lookup_directory() will be null-gfid for a fresh lookup.
dht_lookup_dir_cbk() updates local->loc.gfid while in other thread dht_lookup_directory()
is still winding lookup calls to subvolumes so there is a chance of partial gfid being
seen by EC.
We saw in 12x(4+2) volume, ec is receiving an loc where the gfid has last 10 bytes matching
with the gfid of the directory and the first 4 bytes are all-zeros. This is leading to EC
erroring out the lookup with EINVAL which leads to NFS failing lookup with EIO.
snip from gdb:
$37 = (dht_local_t *) 0x7fde5de5b3cc
(gdb) p /x $37->loc.gfid
$39 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5,
0x6c, 0x2c, 0xb8, 0x56}
(gdb) fr 7
state=<optimized out>) at ec-generic.c:837
837 ec_lookup_rebuild(fop->xl->private, fop, cbk);
(gdb) p /x fop->loc[0].gfid
$40 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c,
0x2c, 0xb8, 0x56}
snip from log:
[2017-01-29 03:22:30.132328] W [MSGID: 122019]
[ec-helpers.c:354:ec_loc_gfid_check] 0-butcher-disperse-4: Mismatching GFID's
in loc [2017-01-29 03:22:30.132709] W [MSGID: 112199]
[nfs3-helpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3:
/linux-4.9.5/Documentation => (XID: b27b9474, MKDIR: NFS: 5(I/O error), POSIX:
5(Input/output error)), FH: exportid 00000000-0000-0000-0000-000000000000, gfid
00000000-0000-0000-0000-000000000000, mountid
00000000-0000-0000-0000-000000000000 [Invalid argument]
Fix:
update local->loc.gfid in last-call to make sure there are no races.
>BUG: 1438411
>Change-Id: Ifcb7e911568c1f1f83123da6ff0cf742b91800a0
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
BUG: 1438423
Change-Id: I804822a1d50215301881ac18318282c1a6951cfb
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: https://review.gluster.org/16987
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
-rw-r--r-- | xlators/cluster/dht/src/dht-common.c | 5 |
1 files changed, 2 insertions, 3 deletions
diff --git a/xlators/cluster/dht/src/dht-common.c b/xlators/cluster/dht/src/dht-common.c index da7be9b1009..7d686eaf87f 100644 --- a/xlators/cluster/dht/src/dht-common.c +++ b/xlators/cluster/dht/src/dht-common.c @@ -810,8 +810,6 @@ dht_lookup_dir_cbk (call_frame_t *frame, void *cookie, xlator_t *this, if (!op_ret && gf_uuid_is_null (local->gfid)) memcpy (local->gfid, stbuf->ia_gfid, 16); - memcpy (local->loc.gfid, local->gfid, 16); - /* Check if the gfid is different for file from other node */ if (!op_ret && gf_uuid_compare (local->gfid, stbuf->ia_gfid)) { @@ -879,6 +877,8 @@ unlock: this_call_cnt = dht_frame_return (frame); if (is_last_call (this_call_cnt)) { + gf_uuid_copy (local->loc.gfid, local->gfid); + if (local->need_selfheal) { local->need_selfheal = 0; dht_lookup_everywhere (frame, this, &local->loc); @@ -919,7 +919,6 @@ unlock: selfheal: FRAME_SU_DO (frame, dht_local_t); - gf_uuid_copy (local->loc.gfid, local->gfid); ret = dht_selfheal_directory (frame, dht_lookup_selfheal_cbk, &local->loc, layout); out: |