| Commit message (Collapse) | Author | Age | Files | Lines |
| |
When overwriting an existing file with O_TRUNC, the 'atime' was set to
0, meaning the Epoch (01-Jan-1970 UTC). However, the 'mtime' gets
updated correctly.
In case 'atime' or 'mtime' is not passed in the 'struct iatt', the time
values passed to the system call are taken from the current values
returned by lstat().
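A minimal sketch of the fallback described above, assuming a hypothetical helper fill_times() (it is not the actual posix xlator code): any timestamp the caller did not supply is taken from lstat(), so it is never passed on as 0. The resulting pair could then be handed to a call such as utimensat().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: fill ts[0] (atime) and ts[1] (mtime).
 * A value the caller did not pass is replaced by the file's
 * current value as reported by lstat(). */
static int
fill_times (const char *path,
            bool have_atime, struct timespec atime,
            bool have_mtime, struct timespec mtime,
            struct timespec ts[2])
{
        struct stat st;

        if (lstat (path, &st) != 0)
                return -1;

        ts[0] = have_atime ? atime : st.st_atim;
        ts[1] = have_mtime ? mtime : st.st_mtim;
        return 0;
}
```

With only 'mtime' supplied, ts[0] keeps the on-disk atime instead of falling back to the Epoch.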
Cherry picked from commit 9bed81ada6f91f998e9abd915b18e3f06557cdcb:
> Change-Id: I7021b7161dcd6c9a3e515d98f6d4847533c434b3
> BUG: 1401777
> Reported-by: Eivind Sarto <eivindsarto@gmail.com>
> Signed-off-by: Niels de Vos <ndevos@redhat.com>
> Reviewed-on: http://review.gluster.org/16034
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
Change-Id: I7021b7161dcd6c9a3e515d98f6d4847533c434b3
BUG: 1411010
Reported-by: Eivind Sarto <eivindsarto@gmail.com>
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/16355
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
| |
Problem:
In CentOS-7, the file was receiving an extra removexattr(security.ima)
FOP which changed its ctime, breaking the assumption that a particular brick
had the latest ctime based on the writevs done in the .t
Fix:
1. Compare the ctime of both files in the backend and pick the one with
the latest ctime for the fav-child policy. Also unmount the volume
before comparing, to avoid any further FOPS on the file that
can possibly modify the timestamps.
2. Added floating point handling in stat function. Thanks to Pranith for
helping debug the regex.
> Reviewed-on: http://review.gluster.org/16288
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 76fff8cb2a164b596ca67e65c99623f5b68361fd)
Change-Id: I06041a0f39a29d2593b867af8685d65c7cd99150
BUG: 1410072
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16323
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
During snapd graph generation we should check if SSL is
enabled on main volume or not. This is because clients
will communicate with snapd as if it is communicating to
a brick.
> Reviewed-on: http://review.gluster.org/15979
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Kaushal M <kaushal@redhat.com>
(cherry picked from commit 182f0d12040dab5081ca645a3f370f65cd68b528)
Change-Id: I0d7fe86c567b297a8528a48faf06161d4c3cb415
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
BUG: 1400460
Reviewed-on: http://review.gluster.org/15987
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
| |
Backport of: http://review.gluster.org/16286
PROBLEM:
Consider a volume with granular-entry-heal and sharding enabled. When
a replica is down and a shard is created as part of a write, the name
index is correctly created under indices/entry-changes/<dot-shard-gfid>.
Now when a read on the same region triggers another MKNOD, the fop
fails on the online bricks with EEXIST. By virtue of this being a
symmetric error, the failed_subvols[] array is reset to all zeroes.
Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be
set, causing the name index, which was created in the previous MKNOD
operation, to be wrongly deleted in THIS MKNOD operation.
FIX:
The ideal fix would have been for a transaction to delete the name
index ONLY if it knows it is the one that created the index in the first
place. This would involve gathering information as to whether THIS xattrop
created the index from individual bricks, aggregating their responses and
based on the various possible combinations of responses, deciding whether to
delete the index or not. This is rather complex. A simpler fix would be
for post-op to examine local->op_ret in the event of no failed_subvols
to figure out whether to delete the name index or not. This can occasionally
lead to creation of stale name indices but they won't be affecting the IO path
or mess with pending changelogs in any way and self-heal in its crawl of
"entry-changes" directory would take care to delete such indices.
Change-Id: I8c5c08b7a208e840b5970fe5699dabdaf751a150
BUG: 1408785
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16294
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Backport of: http://review.gluster.org/16193
Replace the EXPECT '00000001' with EXPECT_NOT '00000000'. This is
because occasionally a name-heal is performing new-entry marking on
'c' causing the pending entry changelog on it to become '00000002'.
Change-Id: I89c2129f6969d3ad32d665b25e9fc55d7f9b80a1
BUG: 1406739
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16223
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
After sending SIGTERM to gluster process we immediately
check if process exited. We should wait for some time
before checking process state.
> Reviewed-on: http://review.gluster.org/16162
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Avra Sengupta <asengupt@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit e9d8525a0d34130ba2a582109937b8e79eecf6ab)
BUG: 1405451
Change-Id: Iaba0067f6e880a7fe38e11b9fa0fe9bd103b19e2
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/16165
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
| |
Verify if the unlink, rename and other ops are reflected both on
the current mount and other mounts.
>Reviewed-on: http://review.gluster.org/15419
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Vijay Bellur <vbellur@redhat.com>
>(cherry picked from commit 0fd7d0e1c78fdbedfcdb085445c4b0be3c1a97a9)
Change-Id: I5a296cdd557194dcf487e65ee4a14bbeaf4be690
BUG: 1399450
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15960
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
| |
See thread
http://www.gluster.org/pipermail/gluster-devel/2016-December/051714.html
for more information.
Change-Id: I9abe4b0e40499e53c1276a10a6bc192fd0f2cef7
BUG: 1405305
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16160
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Backport of: http://review.gluster.org/16169
Check that shd is up before executing 'volume heal' command
Change-Id: If302c9f4e7a3636e0cd52859f229d2c0018aa180
BUG: 1405889
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16188
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
| |
Enable data, metadata and entry self-heal as xlator-options so that glfs-heal.c
can heal split-brain files even if they are disabled on the volume via volume
set commands.
> Reviewed-on: http://review.gluster.org/11333
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit 209c2d447be874047cb98d86492b03fa807d1832)
Change-Id: Ic191a1017131db1ded94d97c932079d7bfd79457
BUG: 1405126
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16143
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Backport of http://review.gluster.org/15923
Issue:
=====
In certain cases, there was no unwind of read
from read-ahead xlator, thus resulting in hang.
RCA:
====
In certain cases, ioc_readv() issues STACK_WIND_TAIL() instead
of STACK_WIND(). One such case is when inode_ctx for that file
is not present (can happen if readdirp was called, and populates
md-cache and serves all the lookups from cache).
Consider the following graph:
...
io-cache (parent)
|
readdir-ahead
|
read-ahead
...
Below is the code snippet of ioc_readv calling STACK_WIND_TAIL:
ioc_readv()
{
        ...
        if (!inode_ctx)
                STACK_WIND_TAIL (frame, FIRST_CHILD (frame->this),
                                 FIRST_CHILD (frame->this)->fops->readv, fd,
                                 size, offset, flags, xdata);
        /* Ideally, this stack_wind should wind to readdir-ahead:readv()
           but it winds to read-ahead:readv(). See below for
           explanation.
         */
        ...
}
STACK_WIND_TAIL (frame, obj, fn, ...)
{
        frame->this = obj;
        /* for the above mentioned graph, frame->this will be readdir-ahead
         * frame->this = FIRST_CHILD (frame->this) i.e. readdir-ahead, which
         * is as expected
         */
        ...
        THIS = obj;
        /* THIS will be read-ahead instead of readdir-ahead!, as obj expands
         * to "FIRST_CHILD (frame->this)" and frame->this was pointing
         * to readdir-ahead in the previous statement.
         */
        ...
        fn (frame, obj, params);
        /* fn will call read-ahead:readv() instead of readdir-ahead:readv()!
         * as fn expands to "FIRST_CHILD (frame->this)->fops->readv" and
         * frame->this was pointing to readdir-ahead in the first statement
         */
        ...
}
Thus, the readdir-ahead's readv() implementation will be skipped, and
ra_readv() will be called with frame->this = "readdir-ahead" and
this = "read-ahead". This can lead to corruption / hang / other problems.
But in this particular case, when 'frame->this' and 'this' passed
to ra_readv() don't match, it causes ra_readv() to call ra_readv()
again. Thus the logic of read-ahead readv() falls apart and leads to
a hang.
Solution:
=========
Modify STACK_WIND_TAIL() as:
STACK_WIND_TAIL (frame, obj, fn, ...)
{
        next_xl = obj;    /* resolve obj, as the variables passed in the obj
                             macro can be overwritten by later instructions */
        next_xl_fn = fn;  /* resolve fn and store it in a tmp variable, before
                             modifying any variables */
        frame->this = next_xl;
        ...
        THIS = next_xl;
        ...
        next_xl_fn (frame, next_xl, params);
        ...
}
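The macro pitfall above can be reproduced in isolation. The following is a minimal sketch, assuming simplified stand-ins (node_t, frame_t, WIND_TAIL_BUGGY and WIND_TAIL_FIXED are hypothetical names, not the real xlator types or macros): the buggy macro expands its 'obj' argument twice, so the second expansion sees the already updated frame->this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for xlator_t and call_frame_t. */
typedef struct node {
        const char  *name;
        struct node *child;
} node_t;

typedef struct frame {
        node_t *this;
} frame_t;

#define FIRST_CHILD(n) ((n)->child)

/* Buggy: 'obj' is expanded twice. The second expansion is evaluated
 * after frame->this has changed, so it resolves one level too deep. */
#define WIND_TAIL_BUGGY(frame, obj, out)                \
        do {                                            \
                (frame)->this = (obj);                  \
                (out) = (obj);                          \
        } while (0)

/* Fixed: resolve 'obj' once into a temporary before touching the frame. */
#define WIND_TAIL_FIXED(frame, obj, out)                \
        do {                                            \
                node_t *next_xl = (obj);                \
                (frame)->this = next_xl;                \
                (out) = next_xl;                        \
        } while (0)
```

With the graph io-cache -> readdir-ahead -> read-ahead and a frame starting at io-cache, the buggy variant ends up targeting read-ahead while frame->this points at readdir-ahead; the fixed variant targets readdir-ahead in both places.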
>Reviewed-on: http://review.gluster.org/15923
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
(Cherry picked from commit 8943c19a2ef51b6e4fa66cb57211d469fe558579)
BUG: 1399015
Change-Id: Ie662ac8f18fa16909376f1e59387bc5b886bd0f9
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15933
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Problem:
Currently, I/O on a split-brained file fails even when the
favorite-child-policy is set until the self-heal is complete.
Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate
locks), refresh the inode and then proceed with the read or write transaction.
The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.
> Reviewed-on: http://review.gluster.org/15673
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1403121
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com>
Reviewed-on: http://review.gluster.org/16088
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Backport of: http://review.gluster.org/16075
Incorrect initialisation of local->optimistic_change_log was leading
to skipped pre-op and post-op even when a brick didn't participate in
the txn because it was down.
The result - missing granular name index resulting in some entries
never getting healed.
FIX:
Initialise local->optimistic_change_log just before pre-op.
Also fixed granular entry heal to create the granular name index in
pre-op as opposed to post-op. This is to prevent loss of granular
information when during an entry txn, the good (src) brick goes
offline before the post-op is done. This would cause self-heal to
do conservative merge (since dirty xattr is the only information
available), which when granular-entry-heal is enabled, expects
granular indices, the lack of which can lead to loss of data in
the worst case.
Change-Id: I213d98ca9b3c4604b095478bf427fa69c04a7d64
BUG: 1403743
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16106
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
Problem:
The issue as seen by the user is detailed in the BZ but what is
happening is if the no. of items in the wait queue == max-qlen,
syncop_mt_dir_scan() does a pthread_cond_wait until the launched
synctask workers dequeue the queue. But if for some reason the worker
fails, the queue is never emptied due to which further invocations of
syncop_mt_dir_scan() are blocked forever.
Fix: Made some changes to _dir_scan_job_fn
- If a worker encounters an error while processing an entry, notify the
readdir loop in syncop_mt_dir_scan() of the error but continue to process
other entries in the queue, decrementing the qlen as and when we dequeue
elements, and ending only when the queue is empty.
- If the readdir loop in syncop_mt_dir_scan() gets an error from the
worker, stop the readdir+queueing of further entries.
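The worker-side behaviour in the first bullet can be sketched as follows. This is a single-threaded illustration only, assuming hypothetical names (scan_queue_t, process_entry, drain_queue) and leaving out the locking and condition variables that the real code needs: an error is recorded for the producer, but the queue is still drained so that qlen always reaches zero.

```c
#include <assert.h>

#define QMAX 8

/* Hypothetical, simplified wait queue. */
typedef struct {
        int items[QMAX];
        int head;   /* index of the next entry to dequeue */
        int qlen;   /* number of entries still queued */
        int err;    /* first error seen by the worker, 0 if none */
} scan_queue_t;

/* Hypothetical per-entry job: negative entries stand for failures. */
static int
process_entry (int entry)
{
        return (entry < 0) ? -1 : 0;
}

/* Drain the queue: remember an error, but keep dequeuing. */
static void
drain_queue (scan_queue_t *q)
{
        while (q->qlen > 0) {
                int entry = q->items[q->head++];

                q->qlen--;
                if (process_entry (entry) != 0 && q->err == 0)
                        q->err = -1;   /* reported to the readdir loop */
        }
}
```

The producer would check q->err before queueing further entries, which corresponds to the second bullet.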
> Reviewed-on: http://review.gluster.org/16073
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 2d012c4558046afd6adb3992ff88f937c5f835e4)
Change-Id: I39ce073e01a68c7ff18a0e9227389245a6f75b88
BUG: 1403187
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16095
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Problems:
1) Inodelk is not taking quorum into account
2) finodelk, [f]entrylk are not implemented correctly
3) By default afr doesn't go for non-blocking parallel locks.
Fix:
Implemented a common framework which can be used by
[f]inodelk/[f]entrylk. Used quorum for the same.
>Change-Id: I239f13875a065298630d266941df10cfa3addc85
>BUG: 1369077
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15802
>Tested-by: Krutika Dhananjay <kdhananj@redhat.com>
>Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
BUG: 1402482
Change-Id: I0c5fed6ca87c6432bb20d00f76cdf5c328a52a85
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/16056
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
| |
Problem:
Consider a replica setup, where one mount writes data to a
file and the other mount reads the file. In afr, read operations
are not transaction based: a brick (the read subvolume) is chosen as
a part of lookup or other operations, and reads are always wound only
to the read subvolume, even if there was a write from a different client
that failed on this brick. This stale read continues until there is
a lookup or any write operation from the mount point. Currently, this
is not a major issue, as a lookup is issued before every read and it will
switch the read subvolume to a correct one. But with the plan of
increasing md-cache timeout to 600s, the stale read problem will be
more pronounced, i.e. stale reads can continue for 600s (or more if cascaded
with readdirp), as there will be no lookups.
Solution:
Afr doesn't have any built-in solution for stale reads (without affecting
the performance). The solution that came up was to use upcall. When a file
on any brick is marked bad for the first time, upcall sends a notification
to all the clients that had recently accessed the file. The solution has
2 parts:
- Identifying when a file is marked bad, on any of the bricks,
for the first time
- Client side actions on receiving the notifications
Identifying when a file is marked bad on any of the bricks for the first time:
-----------------------------------------------------------------------------
The idea is to track xattrop in upcall. xattrop currently comes with 2 afr
xattrs - afr dirty bit and afr pending xattrs.
Dirty xattr is set to 1 before every write, and is unset if write succeeds.
In certain scenarios, dirty xattr can be 0 and still the file could be bad
copy. Hence do not track dirty xattr.
Pending xattr is set on the good copy, indicating the other bricks that have
a bad copy. It is still not as simple as notifying when any of the pending
xattrs changes. That could lead to a flood of notifications, in case the other
brick is completely down or consistently failing. Hence it is important to
notify only once, the first time a good copy is marked bad.
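The "notify only once" rule can be sketched as below. This is an illustration under stated assumptions, not the upcall implementation: pending_state_t and should_notify() are hypothetical names, and when the flag gets cleared again (for example after a heal) is deliberately left open because the text above does not specify it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-inode state kept by the notifier. */
typedef struct {
        bool notified_bad;   /* set once a notification has been sent */
} pending_state_t;

/* Returns true only for the first xattrop that carries a non-zero
 * pending value; later non-zero xattrops are suppressed. */
static bool
should_notify (pending_state_t *st, const uint32_t *pending, size_t count)
{
        bool nonzero = false;

        for (size_t i = 0; i < count; i++) {
                if (pending[i] != 0) {
                        nonzero = true;
                        break;
                }
        }

        if (!nonzero || st->notified_bad)
                return false;

        st->notified_bad = true;
        return true;
}
```

A brick that keeps failing produces many non-zero pending xattrops, but only the first one results in a notification.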
Client side actions on receiving the pending xattr change notification:
--------------------------------------------------------------------
md-cache will invalidate the cache of that file, so that further lookup is
passed down to afr and hence updates the read subvolume. Invalidating only in
md-cache is not enough; consider the following order of operations:
- pending xattr invalidation - invalidate md-cache
- readdirp on the bad read subvolume - fill md-cache
- lookup (served from md-cache)
- read - wound to the old read subvol.
Hence, along with invalidating md-cache, it is very important to reset the
read subvolume for that file, in afr.
Design Credit: Anuradha Talur, Ravishankar N
1. xattrop doesn't carry info saying post op/pre op.
2. Pre xattrop will have 0 value for all pending xattrs,
the cbk of pre xattrop carries the on-disk xattr value.
A non-zero value indicates healing is required.
3. Post xattrop will have non zero value for any of the
pending xattrs, if the fop failed on any of the bricks.
>Reviewed-on: http://review.gluster.org/15398
>Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
>Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Signed-off-by: Poornima G <pgurusid@redhat.com>
Change-Id: I469cbc111714c433984fe1c922be2ef113c25804
BUG: 1399450
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15958
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Backport of: http://review.gluster.org/15747
When there are already existing non-granular indices created that are
yet to be healed, if granular-entry-heal option is toggled from 'off' to
'on', AFR self-heal whenever it kicks in, will try to look for granular
indices in 'entry-changes'. Because of the absence of name indices,
granular entry healing logic will fail to heal these directories, and
worse yet unset pending extended attributes with the assumption that
there are no entries that need heal.
To get around this, a new CLI is introduced which will invoke the glfsheal
program to figure out whether, at the time an attempt is made to enable
granular entry heal, there are pending heals on the volume OR there
are one or more bricks that are down. If either of them is true, the
command will fail with the appropriate error.
New CLI: gluster volume heal <VOL> granular-entry-heal {enable,disable}
Change-Id: Ic79519468a087cd337df664b968188c4adcba43a
BUG: 1398500
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15941
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
during crawl
Backport of: http://review.gluster.org/15880
If granular name indices already exist for a volume and, before
they are healed, granular entry heal is disabled, a crawl on
indices/xattrop will clear the changelogs on these directories. When
their corresponding entry-changes indices are crawled subsequently,
if it is found that the directories don't need heal anymore, the
granular indices are not cleaned up.
This patch fixes that problem by ensuring that the zero-xattrop
also deletes the stale indices at the level of index translator.
Change-Id: If4a2f14e33a78f2217e9fea8733ebb552af56059
BUG: 1398500
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15926
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
| |
Backport of http://review.gluster.org/15826
On receiving a rename fop, marker_rename() stores the
oldloc and newloc in its 'local' struct. Once the rename
is done, the xtime marker (last updated time) is set on
the file by sending a setxattr fop. When upcall
receives the setxattr fop, the loc->inode is NULL and
it crashes. The loc->inode can be NULL only in one valid
case, i.e. in the rename case, where the inode of the new loc
can be NULL. Hence, marker should have filled the inode
of the new_loc before issuing a setxattr.
> Reviewed-on: http://review.gluster.org/15826
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Kotresh HR <khiremat@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
(cherry picked from commit 46e5466850311ee69e6ae9a11c2bba2aabadd5de)
Change-Id: Id638f678c3daaf4a5c29b970b58929d377ae8977
BUG: 1396414
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15877
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
If a remove brick operation is preceded by a fix-layout,
running remove-brick status on a node which does not
contain any of the bricks that were removed displays
fix-layout status.
The defrag_cmd variable was not updated in glusterd
for the nodes not hosting removed bricks causing the
status parsing to go wrong. This is now updated.
Also made minor modifications to the spacing in
the fix-layout status output.
> Change-Id: Ib735ce26be7434cd71b76e4c33d9b0648d0530db
> BUG: 1389697
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-on: http://review.gluster.org/15749
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit 35b085ba345cafb2b0ee978a4c4475ab0dcba5a6)
Change-Id: I3da89c61da07bc5e037527aafc84d184dcd1f764
BUG: 1396109
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15870
Tested-by: Atin Mukherjee <amukherj@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
| |
Change-Id: Id645918fa236f1fc00ab5fa427f394e853c44bf8
BUG: 1389675
Signed-off-by: Kaushal M <kaushal@redhat.com>
Reviewed-on: http://review.gluster.org/15750
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
| |
If fd is unref'd at the end of async call then the unref in cbks would
lead to double unref and possible crash. Removing duplicate unrefs.
Added unref only in failure cases.
A simple test case has been added to test async write case. Need to
extend the same for other async APIs too.
Details:
All glfd based calls in libgfapi, except for glfs_open and glfs_close,
behave in the same way. At the start of the operation, they take a ref
on glfd and fd. At the end of the operation, they unref it. Async calls
are a little different as they unref in the cbk function. A successful
open call does not unref either the glfd or fd, thereby functioning as a
reference for an OPEN file object. glfs_close makes a syncop_flush call
sandwiched between a fd ref and unref(this can be removed, more on this
below), followed by a call to glfs_mark_glfd_for_deletion which unrefs
glfd and also calls glfs_fd_destroy as a release function thereby doing
a unref on fd too.
Functionally, there is no problem with how everything works as
described above. However, it is a little non-intuitive that we need to
perform an fd_unref as a consequence of an implicit fd_ref that happens
within glfs_resolve_fd. As we perform a GF_REF_GET(glfd) at the start of
every operation, it would be worthwhile to remove the fd_ref that
glfs_resolve_fd takes and do away with explicit fd_unref()s at the end
of every operation. This is the same reason why we don't need the fd_ref
in glfs_close. This is however not in the scope of this patch.
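The ownership rule for async calls can be illustrated with a toy reference counter. This is a sketch under assumptions, not libgfapi code: fd_obj_t, obj_ref(), obj_unref(), async_write() and write_cbk() are hypothetical names. The submitting function takes a ref, the callback drops it, and the submitter itself unrefs only when submission fails and the callback will therefore never run.

```c
#include <assert.h>

/* Hypothetical refcounted object standing in for an fd. */
typedef struct {
        int refs;
} fd_obj_t;

static void obj_ref (fd_obj_t *fd)   { fd->refs++; }
static void obj_unref (fd_obj_t *fd) { fd->refs--; }

/* Callback of the async operation: it owns the unref on success. */
static void
write_cbk (fd_obj_t *fd)
{
        obj_unref (fd);
}

/* Submit an async write. 'submit_ok' simulates whether winding worked. */
static int
async_write (fd_obj_t *fd, int submit_ok)
{
        obj_ref (fd);

        if (!submit_ok) {
                /* Failure path: the callback will not be called,
                 * so the ref taken above is dropped here. */
                obj_unref (fd);
                return -1;
        }

        /* Success path: no unref here, write_cbk() does it later.
         * Unref'ing in both places would be the double unref. */
        return 0;
}
```

Starting from one ref held by the open call, a successful write followed by its callback, or a failed submission, both leave the count at one.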
Change-Id: I86b1d3b2ad846b16ea527d541dc82b5e90b0ba85
BUG: 1392286
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/15768
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Prasanna Kumar Kalever <pkalever@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
(cherry picked from commit e65738818dd22462ec00dda021566654d1c702b1)
Reviewed-on: http://review.gluster.org/15778
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.
>BUG: 1384297
>Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15728
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Change-Id: Ie7977caf7c98c91fca64752c56731c37ad27df4d
BUG: 1388912
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15734
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/#/c/15654/
1. Address of a local variable @args is copied into state->req
in server3_3_compound (). But even after the function has gone out of
scope, in server_compound_resume () this pointer is accessed and
dereferenced. This patch fixes that.
2. Compound fops, by virtue of NOT having a vector sizer (like the one
writev has), ends up having both the header and the data (in case one of
its member fops is WRITEV) in the same hdr_iobuf. This buffer was not
being preserved through the lifetime of the compound fop, causing it to
be overwritten by a parallel write fop, even when the writev associated
with the currently executing compound fop is yet to hit the disk, thereby
corrupting the file's data. This is fixed by associating the hdr_iobuf with
the iobref so its memory remains valid through the lifetime of the fop.
3. Also fixed a use-after-free bug in protocol/client in compound fops cbk,
missed by Linux but caught by NetBSD.
Finally, big thanks to Pranith Kumar K and Raghavendra Gowdappa for their
help in debugging this file corruption issue.
Change-Id: I58da39ae544ad81192849926399a971c4c01c986
BUG: 1387984
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15709
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, for all the update operations, metadata or data,
we set the dirty flag at the end of the operation only if
a brick is down. This delays healing and, in some cases, means that
healing does not happen at all.
In this patch we set (+1) the dirty flag
at the start of the metadata or data update operations and
after successful completion of the fop, we unset (-1) it again.
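A minimal sketch of the +1/-1 scheme, assuming that "successful" means
the fop succeeded on every brick. ToyBrick and update are hypothetical
names used only for illustration; the real counter is an xattr kept on
the bricks.

```python
class ToyBrick:
    def __init__(self, up=True):
        self.up = up
        self.dirty = 0
        self.data = None


def update(bricks, value):
    targets = [b for b in bricks if b.up]
    for b in targets:                    # pre-op: mark dirty (+1)
        b.dirty += 1
    for b in targets:                    # the update fop itself
        b.data = value
    if len(targets) == len(bricks):      # post-op: clear (-1) on full success
        for b in targets:
            b.dirty -= 1
        return True
    return False                         # dirty stays set, so heal is triggered


healthy = [ToyBrick(), ToyBrick(), ToyBrick()]
degraded = [ToyBrick(), ToyBrick(), ToyBrick(up=False)]
healthy_ok = update(healthy, "v1")
degraded_ok = update(degraded, "v1")
```

With all bricks up the flag returns to 0 everywhere; with one brick down
it stays at 1 on the bricks that took the write, which marks the file as
needing heal.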
>Change-Id: Ide5668bdec7b937a61c5c840cdc79a967598e1e9
>BUG: 1316873
>Signed-off-by: Ashish Pandey <aspandey@redhat.com>
>Reviewed-on: http://review.gluster.org/13733
>Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Change-Id: Ide5668bdec7b937a61c5c840cdc79a967598e1e9
BUG: 1377570
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15534
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: Currently heal info command prints all
the files/directories if the index for the
file/directory is present in .glusterfs/indices folder.
After implementing patch http://review.gluster.org/#/c/13733/
indices of the file which is going through update fop
will also be present in .glusterfs/indices even
if the fop is successful on all the bricks. At this time
if heal info command is being used, it will also display this
file which is actually healthy and does not require any heal.
Solution: Take lock on a file corresponding to the indices
and inspect xattrs to decide if the file needs heal or not.
>Reviewed-on: http://review.gluster.org/15543
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Change-Id: I6361e2813ece369be12d02e74816df4eddb81cfa
BUG: 1383913
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15627
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The glfs_callback_arg and glfs_callback_inode_arg were allocated by
gfapi, and expected to be free()'d by the application. However it is not
reasonable to expect that applications use the same memory allocator
as the compiled libgfapi.so. For instance, it is possible that gfapi
uses glibc malloc/free, and an application like NFS-Ganesha uses the
versions from jemalloc. Mismatching of the malloc() and free() functions
causes segmentation faults at best.
In order to prevent problems like this in the future, the API for
applications that consume upcalls has been remodeled. Any of the
structures that gfapi allocates, should be free'd with glfs_free(). The
members of the structures can not be accessed directly anymore, each
has its own function to access now.
Correcting the naming of the functions, structures and constants is a
continuation of commit 2775dc64101ed37c8d9809bf9852dbf0746ee2b6. These
new improvements not only have correct prefixes for the functions and
structures, the naming also reflects more to the upcall framework and
does not use "callback" anymore.
Cherry picked from commit 4721188a154acd9a0a4c096d8d73e97f3bf1b2a9:
> Change-Id: I2b8bd5a0a82036d2abea1a217f5e5975a1d4fe93
> BUG: 1344714
> Signed-off-by: Niels de Vos <ndevos@redhat.com>
> Reviewed-on: http://review.gluster.org/14701
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
> Reviewed-by: soumya k <skoduri@redhat.com>
> Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
Change-Id: I2b8bd5a0a82036d2abea1a217f5e5975a1d4fe93
BUG: 1378948
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/15597
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: jiffin tony Thottan <jthottan@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If connect fails with any other error than EINPROGRESS we cannot get
the error status using getsockopt (... SO_ERROR ... ). Hence we need
to remember the state of connect and take appropriate action in the
event_handler for the same.
As an added note, an event can come where poll_err is HUP and we have
poll_in as well (i.e some status was written to the socket), so for
such cases we need to finish the connect, process the data and then
the poll_err as is the case in the current code.
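The handling described above can be sketched as a small state model.
This is an assumption-laden illustration, not the socket transport code:
classify_connect, handle_event and the state strings are hypothetical.

```python
import errno


def classify_connect(err):
    """Map the errno of a non-blocking connect() to a remembered state."""
    if err == 0:
        return "connected"
    if err == errno.EINPROGRESS:
        return "in_progress"   # the outcome is available later via SO_ERROR
    return "failed"            # SO_ERROR will not report this, so remember it


def handle_event(state, poll_in, poll_err):
    """Return the ordered list of actions the event handler would take."""
    if state == "failed":
        return ["disconnect"]
    actions = []
    if state == "in_progress":
        actions.append("finish_connect")
    if poll_in:
        actions.append("read")          # process pending data first
    if poll_err:
        actions.append("disconnect")    # then act on the error/HUP
    return actions
```

For example, handle_event("in_progress", True, True) finishes the
connect, reads the pending data and only then handles the error.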
Special thanks to Kaushal M & Raghavendra G for figuring out the issue.
>Signed-off-by: Shyam <srangana@redhat.com>
>Reviewed-on: http://review.gluster.org/15440
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Change-Id: Ic45ad59ff8ab1d0a9d2cab2c924ad940b9d38528
BUG: 1377386
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/15533
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Wait for remove-brick to complete before attempting a commit.
>Reviewed-on: http://review.gluster.org/15457
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Vijay Bellur <vbellur@redhat.com>
Change-Id: I66ea6c48b6a69fe33d79f9d9080b6f2c1462578e
BUG: 1375042
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/15458
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When a file with hardlinks is corrupted in an ec volume,
the documented recovery steps were not working.
Only name and metadata were healed, but not the data.
Cause:
The bad file marker in the inode context is not removed.
Hence when self heal tries to open the file for data
healing, it fails with EIO.
Background:
Bitrot deletes the inode context during forget.
Briefly, the recovery involves the following steps.
1. Delete the entry marked with bad file xattr
from backend. Delete all the hardlinks including
the .glusterfs hardlink as well.
2. Access each hardlink of the file including
the original from the mount.
Step 2 will send a lookup to the brick where the files
were deleted from the backend, and it returns ENOENT. On
ENOENT, the server xlator forgets the inode if there are
no dentries associated with it. But in case of hardlinks,
forget won't be called as dentries (other hardlink
files) are associated with the inode. Hence bitrot-stub
won't delete its context, failing the data self heal.
Fix:
Bitrot-stub should delete the inode context on getting
ENOENT during lookup.
>Change-Id: Ice6adc18625799e7afd842ab33b3517c2be264c1
>BUG: 1373520
>Signed-off-by: Kotresh HR <khiremat@redhat.com>
>Reviewed-on: http://review.gluster.org/15408
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra Bhat <raghavendra@redhat.com>
(cherry picked from commit b86a7de9b5ea9dcd0a630dbe09fce6d9ad0d8944)
Change-Id: Ice6adc18625799e7afd842ab33b3517c2be264c1
BUG: 1374567
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/15434
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-by: Aravinda VK <avishwan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements functionalities for fast encoding/decoding
using hardware support. Currently optimized x86_64, SSE and AVX support
is added.
Additionally this patch implements a caching mechanism for inverse
matrices to reduce computation time, as well as a new method for
computing the inverse that takes quadratic time instead of cubic.
Finally some unnecessary memory copies have been eliminated to
further increase performance.
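The caching idea can be illustrated with a short sketch. It is not the
ec implementation: it works over the rationals rather than a Galois
field, uses plain Gauss-Jordan elimination (cubic time, not the new
quadratic method), and decode_matrix, invert and the Vandermonde layout
are assumptions made for the example.

```python
from fractions import Fraction
from functools import lru_cache

calls = {"inversions": 0}


def invert(matrix):
    """Gauss-Jordan inversion over the rationals."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return tuple(tuple(row[n:]) for row in aug)


@lru_cache(maxsize=128)
def decode_matrix(rows):
    """rows: sorted tuple of surviving fragment indices (the cache key)."""
    calls["inversions"] += 1
    k = len(rows)
    vandermonde = [[(r + 1) ** c for c in range(k)] for r in rows]
    return invert(vandermonde)


first = decode_matrix((0, 1))
second = decode_matrix((0, 1))   # served from the cache, no new inversion
```

The same set of surviving fragments always needs the same inverse, so
keying the cache on that set avoids repeating the inversion on every
read.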
>Change-Id: I26c75f26fb4201bd22b51335448ea4357235065a
>BUG: 1289922
>Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
>Reviewed-on: http://review.gluster.org/12837
>Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
BUG: 1374841
Change-Id: I83731663922ed11ca84536deab5737463416e1e0
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15455
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/15385/
Currently md-cache invalidation feature is enabled by setting
"performance.cache-invalidation", but this case was sent when
"features.cache-invalidation" was enabling md-cache invalidation.
Hence, fix the same.
Change-Id: If044f6208179748a120fbe1d63b676367e707f73
BUG: 1372586
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15386
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, there will be additional entries seen in
the profile info:
UPCALL : Total number of upcall events that were sent from
the brick(in brick profile), and number of upcall
notifications received by client(in client profile)
Cache invalidation events:
-------------------------
CI_IATT : Number of upcalls that were cache invalidation and
had one of the IATT_UPDATE_FLAGS set. This indicates
that one of the iatt value was changed.
CI_XATTR : Number of upcalls that were cache invalidation, and
had one of the UP_XATTR or UP_XATTR_RM set. This indicates
that an xattr was updated or deleted.
CI_RENAME : Number of upcalls that were cache invalidation,
resulted by the renaming of a file or directory
CI_UNLINK : Number of upcalls that were cache invalidation,
resulted by the unlink of a file.
CI_FORGET : Number of upcalls that were cache invalidation,
resulted by the forget of inode on the server side.
Lease events:
------------
LEASE_RECALL : Number of lease recalls sent by the brick (in
brick profile), and number of lease recalls received
by client(in client profile)
Note that the sum of CI_IATT, CI_XATTR, CI_RENAME, CI_UNLINK,
CI_FORGET, LEASE_RECALL may not be equal to UPCALL. This is
because, each cache invalidation can carry multiple flags.
Eg:
- Every CI_XATTR will have CI_IATT
- Every CI_UNLINK will also increment CI_IATT as link count is an
iatt attribute.
Also UP_PARENT_DENTRY_FLAGS is currently not accounted for,
as CI_RENAME and CI_UNLINK will always have the flag
UP_PARENT_DENTRY_FLAGS
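Why the per-type counters need not add up to UPCALL can be shown with a
small sketch. The TOY_* bit values and the account function are
hypothetical; the real flags are defined in the GlusterFS sources.

```python
from collections import Counter

# Hypothetical bit values, for illustration only.
TOY_IATT, TOY_XATTR, TOY_XATTR_RM, TOY_RENAME, TOY_UNLINK, TOY_FORGET = (
    1 << i for i in range(6)
)

COUNTERS = (
    ("CI_IATT", TOY_IATT),
    ("CI_XATTR", TOY_XATTR | TOY_XATTR_RM),
    ("CI_RENAME", TOY_RENAME),
    ("CI_UNLINK", TOY_UNLINK),
    ("CI_FORGET", TOY_FORGET),
)


def account(stats, flags):
    """Count one upcall, plus one hit for every flag group it carries."""
    stats["UPCALL"] += 1
    for name, mask in COUNTERS:
        if flags & mask:
            stats[name] += 1


stats = Counter()
account(stats, TOY_XATTR | TOY_IATT)     # xattr update also changes the iatt
account(stats, TOY_UNLINK | TOY_IATT)    # unlink changes the link count
per_type_total = sum(v for k, v in stats.items() if k != "UPCALL")
```

Two upcalls are counted here, but the per-type counters sum to four
because each event carries two flags.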
Change-Id: Ieb8cd21dde2c4c7618f12d025a5e5156f9cc0fe9
BUG: 1371543
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15193
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed the way shd up check is done to prevent self-heal daemon
not running error when heal full command is executed.
Change-Id: I93c4a0da12316373d62cd4ea74432cd9bf2b090c
BUG: 1370053
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15341
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when a client performs a readdirp it is not stored
in upcall, as one of the clients that have accessed the files.
Hence, when any other client modifies the file, the client that
had performed readdirp will not get any notifications.
Fix this by adding the clients to upcall database when they
perform readdirp.
Change-Id: I7767f1e26bf1bd1f67702a6d01f8aa64526ccc46
BUG: 1369430
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15313
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has been consistently causing hangs in NetBSD machines. I have not
been able to debug the issue and we have a merge deadline for 3.9. It
would be better to disable this for now.
Change-Id: I8c63940aa26f78dd9994bb63293a5757835ec52b
BUG: 1369401
Signed-off-by: Raghavendra Talur <rtalur@redhat.com>
Reviewed-on: http://review.gluster.org/15374
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, md-cache only processes IATT_UPDATE_FLAGS, UP_XATTR and
UP_XATTR_RM. We also need to process UP_RENAME_FLAGS, UP_FORGET,
UP_PARENT_DENTRY_FLAGS and UP_NLINK_FLAGS. Otherwise the files
unlinked or renamed will not be reflected on other mounts.
Change-Id: Icb8b03da51482c3fc2e2a7292d16d56e11a341d9
BUG: 1211863
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15324
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Change-Id: If88efe3db782a6156614af4c650d53b159ade57f
BUG: 1371541
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15354
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The command allows a replace-brick in which the source and
destination bricks are the same.
Usage:
gluster v reset-brick <volname> <hostname:brick-path> start
This command kills the brick to be reset. Once this command is run,
admin can do other manual operations that they need to do,
like configuring some options for the brick. Once this is done,
resetting the brick can be continued with the following options.
gluster v reset-brick <vname> <hostname:brick> <hostname:brick> commit {force}
Does the job of resetting the brick. 'force' option should be used
when the brick already contains volinfo id.
Problem: On doing a disk-replacement of a brick in a replicate volume
the following 2 scenarios may occur :
a) there is a chance that reads are served from this replaced-disk brick,
which leads to empty reads. b) potential data loss if next writes succeed
only on replaced brick, and heal is done to other bricks from this one.
Solution: After disk-replacement, make sure that reset-brick command is
run for that brick so that pending markers are set for the brick and it
is not chosen as source for reads and heal. But, as of now replace-brick
for the same brick-path is not allowed. In order to fix the above
mentioned problem, same brick-path replace-brick is needed.
With this patch reset-brick commit {force} will be allowed even when
source and destination <hostname:brickpath> are identical as long as
1) destination brick is not alive
2) source and destination brick have the same brick uuid and path.
Also, the destination brick after replace-brick will use the same port
as the source brick.
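The two conditions above amount to a simple check. The sketch below is
only a model of that rule; ToyBrick and same_brick_commit_allowed are
hypothetical names, not glusterd functions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyBrick:
    host: str
    path: str
    uuid: str
    alive: bool


def same_brick_commit_allowed(src, dst):
    """True when a commit onto the same brick-path may proceed."""
    return (not dst.alive) and src.uuid == dst.uuid and src.path == dst.path


stopped = ToyBrick("host1", "/bricks/b1", "uuid-1", alive=False)
running = ToyBrick("host1", "/bricks/b1", "uuid-1", alive=True)
allowed = same_brick_commit_allowed(stopped, stopped)
rejected = same_brick_commit_allowed(stopped, running)
```

The commit is accepted only while the brick process is down, which is
the state the start step leaves it in.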
Change-Id: I440b9e892ffb781ea4b8563688c3f85c7a7c89de
BUG: 1266876
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/12250
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: libgfapi does not enable SSL on mgmt connection.
Fix: Enable SSL on the mgmt connection when management encryption is
enabled, i.e. when the /var/lib/glusterd/secure-access file is present.
Change-Id: I1ce4935b04e6140aeab819e42076defd580b0727
BUG: 1362602
Signed-off-by: Rajesh Joseph <rjoseph@redhat.com>
Reviewed-on: http://review.gluster.org/15073
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Kaushal M <kaushal@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use defined HEAL and PROCESS_UP timeouts rather than
hard code them in self-heald.t.
Change-Id: I21586811904c8417b7208bb643f14dff20dc4832
BUG: 1370074
Signed-off-by: Anuradha Talur <atalur@redhat.com>
Reviewed-on: http://review.gluster.org/15316
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Tested-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the volume set option features.cache-invalidation enables upcall
feature on server side and md-cache cache-invalidation on client side.
There are multiple problems that can arise from this:
1. The scenario where the user wants to enable upcall for an nfs-ganesha
setup, but does not want to enable md-cache cache-invalidation, as the
nfs-clients have already cached the metadata and upcall is used to
invalidate the nfs-client cache. In this case, users should have
a way of disabling md-cache invalidation without disabling upcall.
2. Upcall requires an op-version of GD_OP_VERSION_3_7_0, whereas
md-cache invalidation requires an op-version of GD_OP_VERSION_3_9_0.
Consider a setup where the servers are in op-version GD_OP_VERSION_3_7_0,
and the clients are in op-version GD_OP_VERSION_3_9_0. If there is one
single volume set option, the user can enable this feature in this setup.
But it can lead to a stale xattr cache as the xattr invalidation was
introduced in upcall only in release 3.8. Hence, we should not be
able to enable md-cache invalidation unless all the servers and clients
are on op-version >= GD_OP_VERSION_3_9_0.
To solve the above mentioned issues, we have separate volume options
for enabling md-cache invalidation and upcall. But this can lead to
issues when the user enables md-cache invalidation and forgets to enable
upcall. Probably in the next release, these can be enabled by default.
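The effect of having two options with different op-version requirements
can be sketched as follows. The numeric values and the mapping of option
names to requirements are assumptions made for the example, and
can_enable is a hypothetical helper, not glusterd code.

```python
# Assumed numeric encodings of the op-versions named above.
GD_OP_VERSION_3_7_0 = 30700
GD_OP_VERSION_3_9_0 = 30900

# Assumed mapping: one option for upcall, one for md-cache invalidation.
REQUIRED = {
    "features.cache-invalidation": GD_OP_VERSION_3_7_0,
    "performance.cache-invalidation": GD_OP_VERSION_3_9_0,
}


def can_enable(option, op_versions):
    """An option may be set only if every server and client is new enough."""
    return min(op_versions) >= REQUIRED[option]


# Servers at 3.7.0, clients at 3.9.0, as in the setup described above.
mixed = [30700, 30700, 30900]
upcall_ok = can_enable("features.cache-invalidation", mixed)
mdcache_ok = can_enable("performance.cache-invalidation", mixed)
```

In the mixed setup upcall can be enabled while md-cache invalidation is
refused, which a single combined option could not express.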
Change-Id: Ie70eff97fe12fcb623eec8f4f5861ac065bf483e
BUG: 1211863
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15314
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there is no existing CLI that can be used to get the
local state representation of the cluster as maintained in glusterd
in a readable as well as parseable format.
The CLI added has the following usage:
# gluster get-state [daemon] [odir <path/to/output/dir>] [file <filename>]
This would dump data points that reflect the local state
representation of the cluster as maintained in glusterd (no other
daemons are supported as of now) to a file inside the specified
output directory. The default output directory and filename is
/var/run/gluster and glusterd_state_<timestamp> respectively. The
option for specifying the daemon name leaves room to add support for
other daemons in the future. Following are the data points captured
as of now to represent the state from the local glusterd pov:
* Peer:
- Primary hostname
- uuid
- state
- connection status
- List of hostnames
* Volumes:
- name, id, transport type, status
- counts: bricks, snap, subvol, stripe, arbiter, disperse,
redundancy
- snapd status
- quorum status
- tiering related information
- rebalance status
- replace bricks status
- snapshots
* Bricks:
- Path, hostname (for all bricks these info will be shown)
- port, rdma port, status, mount options, filesystem type and
signed in status for bricks running locally.
* Services:
- name, online status for initialised services
* Others:
- Base port, last allocated port
- op-version
- MYUUID
Change-Id: I4a45cc5407ab92d8afdbbd2098ece851f7e3d618
BUG: 1353156
Signed-off-by: Samikshan Bairagya <samikshan@gmail.com>
Reviewed-on: http://review.gluster.org/14873
Reviewed-by: Avra Sengupta <asengupt@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bitrot scrubber takes 'hourly/daily/biweekly/monthly'
as the values for 'scrub-frequency'. There is no way
to schedule the scrubbing when the admin wants it.
Ondemand scrubbing brings in the new option 'ondemand'
with which the admin can start scrubbing ondemand.
It starts the scrubbing immediately.
Ondemand scrubbing is successful only if the scrubber
is in 'Active (Idle)' (waiting for its next frequency
cycle to start scrubbing). It is not entertained when
the scrubber is in 'Paused' or already running.
Here is the command line syntax.
gluster volume bitrot <vol name> scrub ondemand
Change-Id: I84c28904367eed827a7dae8d6a535c14b28e9f4d
BUG: 1366195
Signed-off-by: Kotresh HR <khiremat@redhat.com>
Reviewed-on: http://review.gluster.org/15111
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Venky Shankar <vshankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the volume option 'features.cache-invalidation' is enabled, upcall
events are sent from the brick process to the client. Even if the client
is not interested in upcall events itself, md-cache or other xlators may
benefit from them.
By adding a new 'cache_upcalls' boolean in the 'struct glfs', we can
enable the caching of upcalls when the application called
glfs_h_poll_upcall(). NFS-Ganesha sets up a thread for handling upcalls
in the initialization phase, and calls glfs_h_poll_upcall() before any
NFS-client accesses the NFS-export.
In the future there will be a more flexible registration API for
enabling certain kind of upcall events. Until that is available, this
should work just fine.
Verification of this change is not trivial within our current regression
test framework. The bug report contains a description on how to reliably
reproduce the problem with the glusterfs-coreutils.
Change-Id: I818595c92db50e6e48f7bfe287ee05103a4a30a2
BUG: 1368842
Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-on: http://review.gluster.org/15191
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Poornima G <pgurusid@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: soumya k <skoduri@redhat.com>
Reviewed-by: Kaleb KEITHLEY <kkeithle@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
snap status --xml errors out if a brick is down and
doesn't have pid. It is handled in the cli of the snap
status where "N/A" is displayed in such a scenario.
Handled the same in xml
snap status <snapname> --xml fails as the writer is
not initialised for the same. Using GF_SNAP_STATUS_TYPE_ITER
instead of GF_SNAP_STATUS_TYPE_SNAP for all snap's
status to differentiate between the two scenarios.
Added testcase volume-snapshot-xml.t to check
all snapshot commands xml outputs
Change-Id: I99563e8f3e84f1aaeabd865326bb825c44f5c745
BUG: 1325831
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/14018
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Display number of snapshots in a volume in volume info
output. This number gets modified, with create, delete,
and restore operations.
Change-Id: Ic9b7c2b6950980f8ce75ca362998c097ea7c863d
BUG: 1360693
Signed-off-by: Avra Sengupta <asengupt@redhat.com>
Reviewed-on: http://review.gluster.org/15029
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Rajesh Joseph <rjoseph@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Unnecessary self probe is removed
2. After every probe a peer_count check is added to give the test time to
finish the handshake.
Change-Id: Iab52548f8b781e7968250cd98fdbeaf02472970d
BUG: 1368953
Signed-off-by: Atin Mukherjee <amukherj@redhat.com>
Reviewed-on: http://review.gluster.org/15231
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Vijay Bellur <vbellur@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This hopefully fixes the spurious failures with
error: glusterfs/api/glfs.h No such file or directory
build.gluster.org/job/rackspace-regression-2GB-triggered/22897/consoleFull
BUG: 1365489
Change-Id: Ic3660de810c0daee7284373bbfaed172aba86d69
Signed-off-by: Poornima G <pgurusid@redhat.com>
Reviewed-on: http://review.gluster.org/15194
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra Talur <rtalur@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cyclic order
When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.
This way, we still have one good copy left despite the split-brain situation,
which will be chosen as the source for the heal once it is back online.
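Rules i) to iii) can be modelled with sets of brick names. This is a
heavily simplified sketch, not AFR's transaction code: finish_write is a
hypothetical function, and EIO merely stands in for "the appropriate
errno", which the description above does not name.

```python
import errno


def finish_write(good, succeeded):
    """Return (new_good, keep_dirty, err) after a write attempt.

    good:      bricks holding a good copy before the write
    succeeded: bricks on which the write landed
    """
    survivors = good & succeeded
    if survivors:
        # Bricks that missed the write are accused; the rest stay good.
        return survivors, False, 0
    # Every good copy missed the write: do not accuse it, keep the
    # dirty marker and fail this write (EIO is an assumed errno).
    return set(good), True, errno.EIO


normal = finish_write({"b1", "b2"}, {"b2"})
in_flight = finish_write({"b1"}, {"b2"})
```

In the second call the lone good copy b1 went down mid-write, so it is
kept as the future heal source and the write is failed.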
Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
BUG: 1363721
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15080
Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|