| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/#/c/10134/
1) Provided setfattr command to set timeout for split-brain
choice.
2) If split-brain inspection/resolution is being done
from the mount for a file, ref the inode when
split-brain-choice is set.
This inode will be unconditionally unref-ed after timeout
seconds set by the user/default otherwise.
3) Updated the doc and testcase to reflect the changes.
Change-Id: I15c9037dee28855f21e680e7e3632e1f48dba4e1
BUG: 1219388
Reviewed-on: http://review.gluster.org/10134
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/10679
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/10258/
Add logic in afr to work in conjunction with the arbiter xlator when a
replica 3 arbiter volume is created. More specifically, this patch:
* Enables full locks for afr data transaction for such volumes.
* Removes the upfront marking of pending xattrs at the time of pre-op
and defer it to post-op. (This is an arbiter independent change and is made for all afr transactions.)
* After pre-op stage, check if we can proceed with the fop stage without
ending up in split-brain by examining the changelog xattrs.
* Unwinds the fop with failure if only one source was available at the
time of pre-op and the fop happened to fail on particular source brick.
* Skips data self-heal if arbiter brick is the only source available.
* Adds the arbiter-count option to the shd graph.
This patch is a part of the arbiter logic implementation for 3 way AFR
details of which can be found at http://review.gluster.org/#/c/9656/
Change-Id: I9603db9d04de5626eb2f4d8d959ef5b46113561d
BUG: 1217689
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/10514
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/10257
Logic for adding the 'glusterd_brickinfo->group' member and using it to
find the brick positon has been taken from http://review.gluster.org/#/c/9919.
Thanks to Jeff Darcy for that.
This patch is a part of the arbiter logic implementation for 3 way AFR
details of which can be found at http://review.gluster.org/#/c/9656/
Change-Id: Idbfe4f29ee8e098e0102def8f38b32314316b188
BUG: 1217689
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/10479
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Changed the implementation of marker xattr handling to take just a
function which populates important data that is different from
default 'gauge' values and subvolumes where the call needs to be
wound.
- Removed duplicate code I found while reading the code and moved it to
cluster_marker_unwind. Removed unused structure members.
- Changed dht/afr/stripe implementations to follow the new implementation
- Implemented marker xattr handling for ec.
Change-Id: Ib0c3626fe31eb7c8aae841eabb694945bf23abd4
BUG: 1200372
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9892
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part 2/2 patch to enable users analyze and resolve
split-brain.
This patch enables :
1) Users to inspect the files in data and metadata split-brain.
2) Resolve the split-brain.
Both using a series of setfattr commands.
Consider a volume "test" with 2 bricks.
1) To inspect a file f1:
setfattr -n replica.split-brain-choice -v test-client-0 f1
After the execution of this command, if no read_subvol
is found, reads will be served from test-client-0 (corresponding
to brick-0).
2) To resolve split-brain :
setfattr -n replica.split-brain-heal-finalize -v test-client-0 f1
Execution of this command will lead to the resolution
of data and metadata split-brain with subvol mentioned in the
command (test-client-0 here) as the source and the rest as sink.
Change-Id: Ia20f3ee5abd3119e3d54fcc599f1e55ac65fd179
BUG: 1191396
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9743
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having this particular check which was introduced by
commit c78998c39f0857ea7aacba360632c148afc54a55 causes a drop in
performance in readdirp. So the behavior is made configurable with this
patch.
Change-Id: I2858fc18b3539df7aa6d3f489e0d5cfaeb8a9b3c
BUG: 1202669
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/9917
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Provide a way of disabling reads when quorum is not met.
Change-Id: Ic4f57c2b87a0b8514600759de3a7a47e217fe3b5
BUG: 1187885
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9543
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These will be used by both afr and ec. Moved syncop_dirfd, syncop_ftw,
syncop_dir_scan functions also into syncop-utils.c
Change-Id: I467253c74a346e1e292d36a8c1a035775c3aa670
BUG: 1177601
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9740
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is one part to enable users analyze and resolve
split-brain.
Problem : To know if a file is in data/metadata split-brain
Solution : Performing "getfattr -n afr.split-brain-status
<path-to-file>" from the mount provides this information.
Also provides the list of afr children to analyse to
get more information.
Change-Id: I4d9b429794759a906371416cb84c84a212e2c7b9
BUG: 1191396
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9633
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extend the AFR heal command to include automated split-brain resolution.
This patch [3/3] is the final patch for afr automated split-brain resolution
implementation.
"gluster volume heal <VOLNAME> [full | statistics [heal-count [replica
<HOSTNAME:BRICKNAME>]] |info [healed | heal-failed | split-brain]| split-brain
{bigger-file <FILE> |source-brick <HOSTNAME:BRICKNAME> [<FILE>]}]"
The new additions being:
1.gluster volume heal <VOLNAME> split-brain bigger-file <FILE>
Locates the replica containing the FILE, selects bigger-file as source and
completes heal.
2.gluster volume heal <VOLNAME> split-brain source-brick <HOSTNAME:BRICKNAME>
<FILE>
Selects <FILE> present in <HOSTNAME:BRICKNAME> as source and completes heal.
3.gluster volume heal <VOLNAME> split-brain <HOSTNAME:BRICKNAME>
Selects all split-brained files in <HOSTNAME:BRICKNAME> as source and completes
heal.
Note: <FILE> can be either the full file name as seen from the root of the
volume (or) the gfid-string representation of the file, which sometimes gets
displayed in the heal info command's output.
Entry/gfid split-brain resolution is not supported.
Example can be found in the test case.
Change-Id: I4649733922d406f14f28ee9033a5cb627b9538b3
BUG: 1136769
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/9377
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Afr winds inodelk calls without any order, so blocking inodelks
from two different mounts can lead to dead lock when mount1 gets
the lock on brick-1 and blocked on brick-2 where as mount2 gets
lock on brick-2 and blocked on brick-1
Fix:
Serialize the inodelks whether they are blocking inodelks or
non-blocking inodelks.
Non-blocking locks also need to be serialized.
Otherwise there is a chance that both the mounts which issued same
non-blocking inodelk may endup not acquiring the lock on any-brick.
Ex:
Mount1 and Mount2 request for full length lock on file f1. Mount1 afr may
acquire the partial lock on brick-1 and may not acquire the lock on brick-2
because Mount2 already got the lock on brick-2, vice versa. Since both the
mounts only got partial locks, afr treats them as failure in gaining the locks
and unwinds with EAGAIN errno.
Change-Id: Ie6cc3d564638ab3aad586f9a4064d81e42d52aef
BUG: 1176008
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9372
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The purpose of encoding d_off in AFR is to indicate the
selected subvolume for the first readdir, and continue all
further readdirs of the session on the same subvolume. This is
required because, unlike files, dir d_offs are specific to the
backend and cannot be re-used on another subvolume. The d_off
transformation encodes the subvolume id and prevents such
invalid use of d_offs on other servers.
However, this approach could be quite wasteful of precious d_off
bit-space. Unlike DHT, where server id can change from entry to
entry and thus encoding the server id in the transformed d_off
is necessary, we could take a slightly relaxed approach in AFR.
The approach is to save the subvolume where the last readdir
request was sent in the fd_ctx. This consumes constant space (i.e
no per-entry cache), and serves the purpose of avoiding d_off
"misuse" (i.e using d_off from one server on another).
The compromise here is NFS resuming readdir from a non-0 cookie
after an extended delay (either anonymous FD has been reclaimed,
or server has restarted). In such cases a subvolume is picked
freshly. To make this fresh picking more deterministic (i.e, to
pick the same subvolume whenever possible, even after reboots),
the function afr_hash_child (used by afr_read_subvol_select_by_policy)
is modified to skip all dynamic inputs (i.e PID) for the case
of directories.
Change-Id: I46ad95feaeb21fb811b7e8d772866a646330c9d8
BUG: 1163161
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/9332
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gluster volume heal <volname> info command
will now also display if the files listed (in the output
of the command) are in split-brain or possibly being
healed.
This patch also fixes build warning that occurs.
Change-Id: I1fc92e62137f23b2b9ddf6e05819cee6230741d1
BUG: 1163804
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/9119
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Afr should ignore quota-size-key as part of self-heal
but should heal quota-limit key.
Change-Id: Ic0b06bd20a563a00d6bfdc2dc5a76c661e533ecb
BUG: 1161106
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/9061
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Added logging for metadata and data self-heals which helped
in debugging this issue.
- Added checks to skip self-heals when no sinks are available to heal
Change-Id: I0d50dceb84cd9ad4fe00e0b749ddf7d4ff42348a
BUG: 1128721
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8709
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AFR_STACK_RESET previously didn't cleanup afr_local_t,
leading to memory leaks. With this patch, cleanup is
done.
All credit goes to Pranith Kumar Karampuri.
Change-Id: I3c727ff4bb323dccb81da4b3168ac69bb340d17d
BUG: 1145471
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/8821
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When one of the brick is taken down and brough back up in a replica pair, locks
on that brick will be allowed. Afr returns inodelk success even when one of the
bricks already has the lock taken.
Fix:
If any brick returns EAGAIN return failure to parent xlator.
Change-Id: I5b842d0fc094359cc4231494053d2bfeb606bbbe
BUG: 1141539
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8710
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Detect and heal mismatching user extended attributes during lookup.
'Forward' port of http://review.gluster.org/#/c/7444/
Change-Id: Id03c9746f083ffd3014711d0b3a2e5a71a45eed4
BUG: 1134691
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/8558
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on type of file, set appropriate pending changelogs
for new entries.
Change-Id: Ifd124bf9bc54b996ce83ab9f39d03b3ccca7eb3c
BUG: 1130892
Signed-off-by: Anuradha <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/8555
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dict_t objects that are ref'd in alloca'd "replies" in
afr_replies_copy() are not unref'd after "replies" go out of scope.
Change-Id: Id5a6ca3c17a8de72b94b3e0f92165609da5a36ea
BUG: 1134221
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/8553
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Counts may give wrong results when the number of bricks is > 2. If the
locks are acquired on one source and sink, but the source accuses even the
down sink then there will be 2 sinks and lock is acquired on 2 bricks so
even when there is a clear source and sink **_finalize_source functions think
the file/directory is in split-brain.
Fix:
Check that all the bricks which are locked are sinks.
Change-Id: Ia43790e8e1bfb5e72a3d0b56bcad94abd0dc58ab
BUG: 1128721
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8456
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Allowing lookup with 'gfid-req' will lead to assigning gfid at posix layer.
When two mounts perform lookup in parallel that can lead to both bricks getting
different gfids leading to gfid-mismatch/EIO for the lookup.
Fix:
Perform gfid heal inside lock.
BUG: 1129529
Change-Id: I20c6c5e25ee27eeb906bff2f4c8ad0da18d00090
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8512
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problems with fuse/server:
Fuse loc touch up sets loc->name even when pargfid
is not known. Server lookup does (pargfid, name) based
lookup when name is set ignoring the gfid. Because of this server
resolver finds that the lookup came on (null-pargfid, name) and
fails the lookup with EINVAL.
Fix:
Don't set loc->name in loc_touchup if the pargfid is not known.
Did the same even for server-resolver
Problem with afr:
Lets say there is a directory hierarchy a/b/c/d on the mount and the
user is cd'ed into the directory. Bring down one of the bricks of replica and
remove all directories/files to simulate disk replacement on that brick. Now
this brick is brought back up. Creates on the cd'ed directory fail with ESTALE.
Basically before sending a create of 'f' inside 'd', fuse sends a lookup to
make sure the file is not present. On one of the bricks 'd' is present and
'f' is not so it sends ENOENT as response. On the new brick 'd' itself is not
present. So it sends ESTALE. In afr ESTALE is considered to be special errno on
witnessing which lookup has to fail. And ESTALE is given more priority than
ENOENT. Due to these reasons lookup fails with ESTALE rather than ENOENT. Since
lookup didn't fail with ENOENT, 'create' can't be issued so the command is
failed with ESTALE.
Solution:
Afr needs to consider ESTALE errno normally and ENOENT needs to
be given more priority so that operations like create can proceed even when
only one of the brick is up and running. Whenever client xlator identifies
that gfid-changed, it sets that information in lookup xdata. Afr uses this
information to fail the lookup with ESTALE so that top xlator can send
fresh lookup.
Change-Id: Ica6ce01baef08620154050a635e6f97d51029ef6
BUG: 1106408
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/8015
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change important (from a diagnostics point of view) log messages to use
the gf_msg() framework.
Change-Id: I0a58184bbb78989db149e67f07c140a21c781bc2
BUG: 1075611
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/7784
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Have common place to perform quorum fop wind check
- Check if fop succeeded in a way that matches quorum
to avoid marking changelog in split-brain.
BUG: 1066996
Change-Id: Ibc5b80e01dc206b2abbea2d29e26f3c60ff4f204
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/7600
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
For write fops afr's transaction eager-lock init adds transactions
that can share eager-lock to fdctx list. But if eager-lock finodelk
fop fails the stub remains in the list. This could later lead to
corruption of the list and lead to infinite loop on the list
leading to a mount hang.
Fix:
Remove the stub when finodelk fails.
Change-Id: I0ed4bc6b62f26c5e891c1181a6871ee6e4f4f5fd
BUG: 1063190
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6944
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Remove client side self-healing completely (opendir, openfd, lookup)
- Re-work readdir-failover to work reliably in case of NFS
- Remove unused/dead lock recovery code
- Consistently use xdata in both calls and callbacks in all FOPs
- Per-inode event generation, used to force inode ctx refresh
- Implement dirty flag support (in place of pending counts)
- Eliminate inode ctx structure, use read subvol bits + event_generation
- Implement inode ctx refreshing based on event generation
- Provide backward compatibility in transactions
- remove unused variables and functions
- make code more consistent in style and pattern
- regularize and clean up inode-write transaction code
- regularize and clean up dir-write transaction code
- regularize and clean up common FOPs
- reorganize transaction framework code
- skip setting xattrs in pending dict if nothing is pending
- re-write self-healing code using syncops
- re-write simpler self-heal-daemon
Change-Id: I1e4080c9796c8a2815c2dab4be3073f389d614a8
BUG: 1021686
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/6010
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Under the entry self heal, readlink is done at the
source and sink. When readlink is done at the sink,
because link is not present at the sink, afr expects
ENOENT. AFR translator takes decisions for new link
creation based on ENOENT but server translator is modified
to return ESTALE because of which afr xlator is not able
to heal.
Fix:
The check for inode absence at server includes ESTALE as
well.
Change-Id: I319e4cb4156a243afee79365b7b7a5a7823e9a24
BUG: 1046624
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6599
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also renamed allow-sh-for-running-transaction -> attempt-self-heal
Change-Id: I134cc79e663b532e625ffc342c59e49e71644ab3
BUG: 1039544
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6463
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glfs_zerofill() can be potentially called to zero-out entire file and
hence allow for bigger value of length parameter.
Change-Id: I75f1d11af298915049a3f3a7cb3890a2d72fca63
BUG: 1028673
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-on: http://review.gluster.org/6266
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: M. Mohan Kumar <mohan@in.ibm.com>
Tested-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for a new ZEROFILL fop. Zerofill writes zeroes to a file in
the specified range. This fop will be useful when a whole file needs to
be initialized with zero (could be useful for zero filled VM disk image
provisioning or during scrubbing of VM disk images).
Client/application can issue this FOP for zeroing out. Gluster server
will zero out required range of bytes ie server offloaded zeroing. In
the absence of this fop, client/application has to repetitively issue
write (zero) fop to the server, which is very inefficient method because
of the overheads involved in RPC calls and acknowledgements.
WRITESAME is a SCSI T10 command that takes a block of data as input and
writes the same data to other blocks and this write is handled
completely within the storage and hence is known as offload . Linux ,now
has support for SCSI WRITESAME command which is exposed to the user in
the form of BLKZEROOUT ioctl. BD Xlator can exploit BLKZEROOUT ioctl to
implement this fop. Thus zeroing out operations can be completely
offloaded to the storage device , making it highly efficient.
The fop takes two arguments offset and size. It zeroes out 'size' number
of bytes in an opened file starting from 'offset' position.
This patch adds zerofill support to the following areas:
- libglusterfs
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
Client applications can exloit this fop by using glfs_zerofill introduced in
libgfapi.FUSE support to this fop has not been added as there is no system call
for this fop.
Changes from previous version 3:
* Removed redundant memory failure log messages
Changes from previous version 2:
* Rebased and fixed build error
Changes from previous version 1:
* Rebased for latest master
TODO :
* Add zerofill support to trace xlator
* Expose zerofill capability as part of gluster volume info
Here is a performance comparison of server offloaded zeofill vs zeroing
out using repeated writes.
[root@llmvm02 remote]# time ./offloaded aakash-test log 20
real 3m34.155s
user 0m0.018s
sys 0m0.040s
[root@llmvm02 remote]# time ./manually aakash-test log 20
real 4m23.043s
user 0m2.197s
sys 0m14.457s
[root@llmvm02 remote]# time ./offloaded aakash-test log 25;
real 4m28.363s
user 0m0.021s
sys 0m0.025s
[root@llmvm02 remote]# time ./manually aakash-test log 25
real 5m34.278s
user 0m2.957s
sys 0m18.808s
The argument log is a file which we want to set for logging purpose and
the third argument is size in GB .
As we can see there is a performance improvement of around 20% with this
fop.
Change-Id: I081159f5f7edde0ddb78169fb4c21c776ec91a18
BUG: 1028673
Signed-off-by: Aakash Lal Das <aakash@linux.vnet.ibm.com>
Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Reviewed-on: http://review.gluster.org/5327
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Bail out of fops if split brain has been detected during lookup
Change-Id: Id387dbb1a25eec4a121dedceadc6069bdea24b5d
BUG: 1010834
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/5988
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently to know the number of files to be healed, either user
has to go to backend and check the number of entries present in
indices/xattrop directory. But if a volume consists of large
number of bricks, going to each backend and counting the number
of entries is a time-taking task. Otherwise user can give
gluster volume heal vol-name info command but with this
approach if no. of entries are very hugh in the indices/
xattrop directory, it will comsume time.
So as a feature, new command is implemented.
Command 1: gluster volume heal vn statistics heal-count
This command will get the number of entries present in
every brick of a volume. The output displays only entries
count.
Command 2: gluster volume heal vn statistics heal-count
replica 192.168.122.1:/home/user/brickname
Here if we are concerned with just one replica.
So providing any one of the brick of a replica will get
the number of entries to be healed for that replica only.
Example:
Replicate volume with replica count 2.
Backend status:
--------------
[root@dhcp-0-17 xattrop]# ls -lia | wc -l
1918
NOTE: Out of 1918, 2 entries are <xattrop-gfid> dummy
entries so actual no. of entries to be healed are
1916.
[root@dhcp-0-17 xattrop]# pwd
/home/user/2ty/.glusterfs/indices/xattrop
Command output:
--------------
Gathering count of entries to be healed on volume volume3 has been successful
Brick 192.168.122.1:/home/user/22iu
Status: Brick is Not connected
Entries count is not available
Brick 192.168.122.1:/home/user/2ty
Number of entries: 1916
Change-Id: I72452f3de50502dc898076ec74d434d9e77fd290
BUG: 1015990
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/6044
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"gluster volume heal volumename statistics" command gives the summary
of the afr crawl done based on the entries present in the xattrop
directory. Whenever afr crawls are attempted, the beginning time of
crawl, end time of crawl, no of files healed, heal-failed count and
number of files in split brain are shown along with the type of the
crawl. If crawl is already in progress then it will give the number
of files healed, heal failed count and number of files in split-brain
from the beginning of the crawl and instead of telling the end time of
the crawl, "CRAWL IN PROGRESS" message will be shown.
Output format:
command: "gluster volume heal volume-name statistics"
Output:
Gathering afr crawl statistics crawl statistics on volume volume-name
has been successful
------------------------------------------------
Crawl statistics for brick no 0
Hostname of brick 192.168.122.248
Starting time of crawl: Wed Jul 10 15:52:38 2013
Ending time of crawl: Wed Jul 10 15:52:38 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
Starting time of crawl: Wed Jul 10 15:52:38 2013
Ending time of crawl: Wed Jul 10 15:52:38 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
------------------------------------------------
Crawl statistics for brick no 1
Hostname of brick 192.168.122.1
Starting time of crawl: Wed Jul 10 15:52:42 2013
Ending time of crawl: Wed Jul 10 15:52:42 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
Starting time of crawl: Wed Jul 10 15:52:42 2013
Ending time of crawl: Wed Jul 10 15:52:42 2013
Type of crawl: INDEX
No. of entries healed: 0
No. of entries in split-brain: 0
No. of heal failed entries: 0
--------------------------------------------------
Change-Id: I10bf9d10b005741db9973fb1352e0dd59ed99aa9
BUG: 949400
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/4790
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'-' can be present in a volume. This may lead to domain
collisions in future.
Tests:
Checked in gdb that domain comes with ':' separator:
Breakpoint 1, pl_common_inodelk (frame=0x7fdabcce51a4,
this=0x8bde20, volume=0x8b50d0 "r2-replicate-0:self-heal",
inode=0x7fdab822f0e8, cmd=6, flock=0x7fdabc76eee4,
loc=0x7fdabc76ede4, fd=0x0, xdata=0x7fdabc6e0ab0) at inodelk.c:597
Change-Id: I4456ae35ac8bf21e6361c34e9ad437f744a2e84b
BUG: 993981
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/6025
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: Ia7b324b86d6a7051d187106d7a060155e77defc5
BUG: 910217
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5238
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Additional information for source and sinks are added.
Change-Id: I1704956ff86ac3ae36744efe7499c1d1c43faeaf
BUG: 968301
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/5638
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Durability of appending writes is implicit in the file size. Therefore
performing an explicit fsync() is unnecessary in such cases as self-heal
can check for the size of file when pending changelog is not unambiguous.
Change-Id: I05446180a91d20e0dbee5de5a7085b87d57f178a
BUG: 927146
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/5501
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lets say mount1 has eager-lock(full-lock) and after the eager-lock
is taken mount2 opened the same file, it won't be able to
perform any data operations until mount1 releases eager-lock.
To avoid such scenario do not enable eager-lock for transaction
if open-fd-count is > 1. Delaying of changelog piggybacking is
avoided in this situation.
Change-Id: I51b45d6a7c216a78860aff0265a0b8dabc6423a5
BUG: 910217
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5432
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: venkatesh somyajulu <vsomyaju@redhat.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I95e47e589419dc6a032cbd8ba01964b6c176c2d5
BUG: 927146
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5408
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
At the moment data-self-heal acquires locks in following
pattern. It takes full file lock then gets xattrs on files on both
replicas. Decides sources/sinks based on the xattrs. Now it acquires
lock from 0-128k then unlocks the full file lock. Syncs 0-128k range
from source to sink now acquires lock 128k+1 till 256k then unlocks
0-128k, syncs 128k+1 till 256k block... so on finally it takes full file
lock again then unlocks the final small range block.
It decrements pending counts and then unlocks the full file lock.
This pattern of locks is chosen to avoid more than 1 self-heal
to be in progress. BUT if another self-heal tries to take a full
file lock while a self-heal is already in progress it will be put in
blocked queue, further inodelks from writes by the application will
also be put in blocked queue because of the way locks xlator grants
inodelks. So until the self-heal is complete writes are blocked.
Here is the code:
xlators/features/locks/src/inodelk.c - line 225
if (__blocked_lock_conflict (dom, lock) && !(__owner_has_lock (dom, lock))) {
ret = -EAGAIN;
if (can_block == 0)
goto out;
gettimeofday (&lock->blkd_time, NULL);
list_add_tail (&lock->blocked_locks, &dom->blocked_inodelks);
}
This leads to hangs in applications.
Fix:
Since we want to prevent two parallel self-heals. We let them compete
in a separate "domain". Lets call the domain on which the locks have
been taken on in previous approach as "data-domain".
In the new approach When a self-heal is triggered,
it acquires a full lock in the new domain "self-heal-domain".
After this it performs data-self-heal using the locks in
"data-domain" as before.
unlock the full file lock in "self-heal-domain"
With this approach, application's writevs don't have to wait
in pending queue when more than 1 self-heal is triggered.
Change-Id: Id79aef3dfa888945977fb9758374ac41c320d0d5
BUG: 967717
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5100
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- afr_local_copy should not be memduping locked nodes, that would
mean that lock is taken in self-heal on those nodes even before
it actually takes the lock. So removed memdup code. Even entry
lock related copying (lockee info) is also not necessary for
self-heal functionality, so removing that as well. Since it is
not local_copy anymore changed its name.
- My editor changed tabs to spaces.
Change-Id: I8dfb92cb8338e9a967c06907a8e29a8404782d61
BUG: 967717
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5099
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Change-Id: I40eec20ca6b3f857245a2438883822e251077ee9
BUG: 979365
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5269
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
At the moment afr-flush makes sure that a delayed post-op
is woken up but it does not wait for it to complete the
post-op before flush unwinds.
These are the steps that are happening:
1) flush fop comes on an fd which wakes up a delayed post-op
and continues with the flush fop.
2) post-op sends fsync on the wire.
3) flush completes and unwinds to fuse.
4) graph switch happens on the fuse mount disconnecting the
old graph's client connections to bricks.
5) xattrop after fsync fails with ENOTCONN because the
connections from old graph are taken down now.
Fix:
Wait for post-op to complete before starting to flush.
We could make flush act similar to fsync (i.e.) wind
flush as is but wait for post-op to complete before unwinding
flush, but it is better to send flush as the final fop. So
wind of flush will start after post-op is complete. Had to
change fsync to accommodate this change.
Change-Id: I93aa642647751969511718b0e137afbd067b388a
BUG: 980548
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/5274
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currently whenever there is metadata split-brain, a variable
sh->op_failed is set to 1 to denote that self heal got failed.
But if we proceed for data self heal, even code-path of data
self heal also relies on the sh->op_failed variable. So if will
check for sh->op_failed variable and will eventually fails to
do data self heal. So needed a mechanism to allow data self heal
even if metadata is in split brain.
Fix:
Some data structure revamp is done in
http://review.gluster.com/#/c/5106/ fix and this patch is
based on the above fix. Now we can store which particular self-heal
got failed i.e GFID_OR_MISSING_ENTRY_SELF_HEAL, METADATA, DATA,
ENTRY. And we can do two types of self heal failure check.
1. Individual type check: We can check which among all four
(Metadata, Data, Gfid or missing entry, entry self heal)
got failed.
2. In afr_self_heal_completion_cbk, we need to make check
based on the fact that if any specific self heal got failed treat
the complete self heal as failure so that it will populate
corresponding circular buffer of event history accordingly.
Change-Id: Icb91e513bcc752386fc8a78812405cfabe5cac2d
BUG: 977797
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/5253
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
As the end of the self heal, message logged by
"afr_self_heal_completion_cbk" is inadequate to determine what exactly failed
during the course of afr self heal. It is worth to have knowledge of what all
types of self heal got triggered for an entity and whether the status is success
or failure.
Fix:
At the end of self heal, it will log information about out of 4 types of self
heal (gfid or missing entry self heal, metadata, data and entry self heal),
who all got triggered and who all got failed or successful at the end.
Change-Id: I5360762fbd7d391ac4c6af6706b4835c5801835a
BUG: 968301
Signed-off-by: Venkatesh Somyajulu <vsomyaju@redhat.com>
Reviewed-on: http://review.gluster.org/5106
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for the DISCARD file operation. Discard punches a hole
in a file in the provided range. Block de-allocation is implemented
via fallocate() (as requested via fuse and passed on to the brick
fs) but a separate fop is created within gluster to emphasize the
fact that discard changes file data (the discarded region is
replaced with zeroes) and must invalidate caches where appropriate.
BUG: 963678
Change-Id: I34633a0bfff2187afeab4292a15f3cc9adf261af
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/5090
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement support for the fallocate file operation. fallocate
allocates blocks for a particular inode such that future writes
to the associated region of the file are guaranteed not to fail
with ENOSPC.
This patch adds fallocate support to the following areas:
- libglusterfs
- mount/fuse
- io-stats
- performance/md-cache,open-behind
- quota
- cluster/afr,dht,stripe
- rpc/xdr
- protocol/client,server
- io-threads
- marker
- storage/posix
- libgfapi
BUG: 949242
Change-Id: Ice8e61351f9d6115c5df68768bc844abbf0ce8bd
Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-on: http://review.gluster.org/4969
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Anand Avati <avati@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Today there is a non-obvious dependence of eager-locking on
write-behind. The reason is that eager-locking works as long
as the inheriting transaction has no overlaps with any of the
transactions already in progress. While write-behind provides
non-overlapping writes as a side-effect most of times (and only
guarantees it when strict-write-ordering option is enabled,
which is not on by default) eager-lock needs the behavior
as a guarantee. This is leading to complex and unwanted checks
for the presence of write-behind in the graph, for the simple
task of checking for overlaps.
This patch removes the interdependence between eager-locking
and write-behind by making eager-locking do its own overlap checks
with in-progress writes.
Change-Id: Iccba1185aeb5f1e7f060089c895a62840787133f
BUG: 912581
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4782
Tested-by: Gluster Build System <jenkins@build.gluster.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce AFR_CALL_RESUME macro which cleans up frame->local, like
how AFR_STACK_UNWIND etc. do.
Therefore fix leak in afr_fsync() path.
Change-Id: I3855d8e7e84dbc44e05f507563b7f722bf9621b8
BUG: 927146
Signed-off-by: Anand Avati <avati@redhat.com>
Reviewed-on: http://review.gluster.org/4745
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Gluster Build System <jenkins@build.gluster.com>
|