diff options
author | Pranith Kumar K <pkarampu@redhat.com> | 2013-06-06 11:25:07 +0530 |
---|---|---|
committer | Vijay Bellur <vbellur@redhat.com> | 2013-06-18 22:49:01 -0700 |
commit | 3521020078e822d74eed235074dc978981fb56ed (patch) | |
tree | 5df4acaf4d904f020a14e0b33670d1ab1d2f0fa3 /doc/legacy/docbook | |
parent | 1a7e6053d3842761f946fbbdd693c72aa3945a97 (diff) |
cluster/afr: detect in-progress creation in lookup and return ENOENT
Port of http://review.gluster.org/4625
if any subvol returned ENOENT while parent entrylk lock was held,
yield and return ENOENT for the entire lookup.
This is how the issue happens:
Multiple clients A, B and C are attempting 'mkdir -p /mnt/a/b/c'
1 Client A is in the middle of mkdir(/a). It has acquired lock.
It has performed mkdir(/a) on one subvol, and second one is still
in progress
2 Client B performs a lookup, sees directory /a on one,
ENOENT on the other, succeeds lookup.
3 Client B performs lookup on /a/b on both subvols, both return ENOENT
(one subvol because /a/b does not exist, another because /a
itself does not exist)
4 Client B proceeds to mkdir /a/b. It obtains entrylk on inode=/a with
basename=b on one subvol, but fails on other subvol as /a is yet to
be created by Client A.
5 Client A finishes mkdir of /a on other subvol
6 Client C also attempts to create /a/b, lookup returns ENOENT on
both subvols.
7 Client C tries to obtain entrylk on on inode=/a with basename=b,
obtains on one subvol (where B had failed), and waits for B to unlock
on other subvol.
8 Client B finishes mkdir() on one subvol with GFID-1 and completes
transaction and unlocks
9 Client C gets the lock on the second subvol, At this stage second
subvol already has /a/b created from Client B, but Client C does not
check that in the middle of mkdir transaction
10 Client C attempts mkdir /a/b on both subvols. It succeeds on
ONLY ONE (where Client B could not get lock because of
missing parent /a dir) with GFID-2, and gets EEXIST from ONE subvol.
This way we have /a/b in GFID mismatch. One subvol got GFID-1 because
Client B performed transaction on only one subvol (because entrylk()
could not be obtained on second subvol because of missing parent dir --
caused by premature/speculative succeeding of lookup() on /a when locks
are detected). Other subvol gets GFID-2 from Client C because while
it was waiting for entrylk() on both subvols, Client B was in the
middle of creating mkdir() on only one subvol, and Client C does not
"expect" this when it is between lock() and pre-op()/op() phase of the
transaction.
Change-Id: I40107d4638ffdcb7b1ff4748c8e5ea92e62697e8
BUG: 860210
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5173
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Diffstat (limited to 'doc/legacy/docbook')
0 files changed, 0 insertions, 0 deletions