| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Updating the layout in the dht inode_ctx in
rebalance_task_completion after the file is migrated
is erroneous in case of files with hardlinks.
This step can be skipped as the layout will be set
in the syncop_lookup call post the migration in
dht_migrate_file.
> Change-Id: I24ac798a919585d91a117d6a207e6a31b88486c6
> BUG: 1415761
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-on: https://review.gluster.org/16457
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
> Reviewed-by: Susant Palai <spalai@redhat.com>
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Change-Id: Ifccffd67b8bc12208efb23101366a1ac7a8c60f5
BUG: 1420184
Reviewed-on: https://review.gluster.org/16561
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: N Balachandran <nbalacha@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Susant Palai <spalai@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed the order of the migration phase checks
in dht_fsync_cbk. Phase1 should never be hit if op_ret
is non zero.
> Change-Id: I9222692e04868bffa93498059440f0aa553c83ec
> BUG: 1410777
> Reviewed-on: http://review.gluster.org/16350
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
> Tested-by: Raghavendra G <rgowdapp@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit 83117c71482c9f87a0b6812094cbb22497eb3faa)
Change-Id: I0455871aef230371db5ecb1b2e6b1468e8a6d079
BUG: 1412119
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: https://review.gluster.org/16374
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Rename linkfile cleanup is done as non-root which may not have priviliges to do
the rename so it fails with EACCESS. MKDIR on that name in future will start to
hole on this subvolume. It is not easy to hit on fuse mounts because vfs takes
care of the permission checks even before rename fop is wound. But with
nfs-ganesha mounts it happens.
Fix:
Do rename cleanup as root
>BUG: 1409727
>Change-Id: I414c1eb6dce76b4516a6c940557b249e6c3f22f4
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/16317
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
>Reviewed-by: N Balachandran <nbalacha@redhat.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
BUG: 1412913
Change-Id: I7f891034150d7a0e3210202fb0788040c91e1c09
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/16390
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Rename does two locks. There is a case where when it tries to unlock it sends
xattrop of the directory with new version, callback of these two xattrops can
be picked up by two separate epoll threads. Both of them will try to set the
lk-owner for unlock in parallel on the same frame so one of these unlocks will
fail because the lk-owner doesn't match.
Fix:
Specify the lk-owner which will be set on inodelk frame which will not be over
written by any other thread/operation.
>BUG: 1402710
>Change-Id: I666ffc931440dc5253d72df666efe0ef1d73f99a
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/16074
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
BUG: 1412922
Change-Id: I18c553adaa0cbc8df55876accc1e1dcd4ee9c116
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/16394
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
__afr_set_in_flight_sb_status(), which resets event_gen to zero, is
called if failed_subvols[i] is non-zero for any brick. But failed_subvols[i]
is true even if the brick was down *before* the transaction started.
Hence say if 1 brick is down in a replica-3, every writev that comes
will trigger an inode refresh because of this resetting, as seen from
the no. of FSTATs in the profile info in the BZ.
Fix:
Reset event gen only if the brick was previously a valid read child and
the FOP failed on it the first time.
Also `s/afr_inode_read_subvol_reset/afr_inode_event_gen_reset` because
the function only resets event gen and not the data/metadata readable.
> Reviewed-on: http://review.gluster.org/16309
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit 522640be476a3f97dac932f7046f0643ec0ec2f2)
Change-Id: I603ae646cbde96995c35db77916e2ed80b602a91
BUG: 1412888
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16386
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/16362
* Even on errors like ENOENT, AFR logs split-brain after
read-txn refresh, introduced by commit a07ddd8f.
This can be a cause of much panic and confusion and needs to be fixed.
* Also fixed this issue in write-txns.
* Fixed afr read txns to log about split-brain only after knowing that
there is no split-brain choice configured.
* Removed code duplication
* Fixed incorrect passing of error code in afr_write_txn_refresh_done()
(the function was passing -0 as errno to gf_msg().
Change-Id: I21ac7f6e31840fe3da2f9eecccc495056ab46ece
BUG: 1412915
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16393
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In link fop lookup is happening on the new fop which doesn't exist so the iatt
ec serves parent xlators has size as zero which leads to 'cat' giving empty output
Fix:
Change code so that lookup happens on the existing link instead.
>BUG: 1409730
>Change-Id: I70eb02fe0633e61d1d110575589cc2dbe5235d76
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/16320
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
>Tested-by: Xavier Hernandez <xhernandez@datalab.es>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
BUG: 1412916
Change-Id: I7079f1adbf5c402d1eb6eb9fe8badf8e17e475a4
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/16391
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/16205/
Before http://review.gluster.org/#/c/16091/, after inode refresh, we
failed read txns in case of EIO or event_generation being zero. For
write transactions, the check was only for EIO. 16091 re-factored the
code to fail both read and write when event_generation=0. This seems to
have caused a regression as explained in the BZ.
This patch restores that behaviour in afr_txn_refresh_done().
Change-Id: Id763ed2d420b6d045d4505893a18959d998c91a3
BUG: 1378547
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16322
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
> BUG: 1410355
> Change-Id: I867419ca36a81ef7209e6911a46c1c2c898b8eab
> Signed-off-by: Susant Palai <spalai@redhat.com>
> Reviewed-on: http://review.gluster.org/16328
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit 451ca272d12f2d49522c845d53585520f71525f8)
Change-Id: If05094346fc1fc48f4982db6a195740171ae2fad
BUG: 1410764
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/16348
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixed a memleak where dict was not being unrefed
in the dht_migration_complete_check_task and
dht_rebalance_inprogress_task functions.
BUG: 1410369
> Change-Id: I3d42e9a2e5c8596c985bf6431a68fd3905227383
> BUG: 1409186
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-on: http://review.gluster.org/16308
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: MOHIT AGRAWAL <moagrawa@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
> (cherry picked from commit 11b6a2c9fc5232b58774cab29873406c0fbfef19)
Change-Id: I4e92e4c87b52f6f5ffb859005aa3d525225b4bda
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/16331
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Currently, I/O on a split-brained file fails even when the
favorite-child-policy is set until the self-heal is complete.
Fix:
If a valid 'source' is found using the set favorite-child-policy, inspect
and reset the afr pending xattrs on the 'sinks' (inside appropriate locks),
refresh the inode and then proceed with the read or write transaction.
The resetting itself happens in the self-heal code and hence can also
happen in the client side background-heal or by the shd's index-heal in
addition to the txn code path explained above. When it happens in via
heal, we also add checks in undo-pending to not reset the sink xattrs
again.
> Reviewed-on: http://review.gluster.org/15673
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Change-Id: Ic8c1317720cb26bd114b6fe6af4e58c73b864626
BUG: 1378547
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Simon Turcotte-Langevin <simon.turcotte-langevin@ubisoft.com>
Reviewed-on: http://review.gluster.org/16091
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
private
If reconfigure is executed parallely (or concurrently with dht_init),
there are races that can corrupt memory. One such race is modification
of regexes stored in conf (conf->rsync_regex_valid and
conf->extra_regex_valid) through dht_init_regex. With change [1],
reconfigure codepath can get executed parallely (with itself or with
dht_init) and this fix is needed.
Also, a reconfigure can race with any thread doing dht_layout_search,
resulting in dht_layout_search accessing regex freed up by reconfigure
(like in bz 1399134).
[1] http://review.gluster.org/15046
>Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830
>BUG: 1399134
>Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
>Reviewed-on: http://review.gluster.org/15945
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: N Balachandran <nbalacha@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Change-Id: I039422a65374cf0ccbe0073441f0e8c442ebf830
BUG: 1399423
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit 64451d0f25e7cc7aafc1b6589122648281e4310a)
Reviewed-on: http://review.gluster.org/15793
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generally linkto file is created using root user. Consider following
case, a user is trying to rename a file which he is not permitted.
So the rename fails with EACESS and when rename tries to cleanup the
linkto file, it fails.
The above issue happens when rename/00.t test executed on nfs-ganesha
clients :
Steps executed in script
* create a file "abc" using root
* rename the file "abc" to "xyz" using a non root user, it fails with EACESS
* delete "abc"
* create directory "abc" using root
* again try ot rename "abc" to "xyz" using non root user, test hungs here
which slowly leds to OOM kill of ganesha process
RCA put forwarded by Du for OOM kill of ganesha
Note that when we hit this bug, we've a scenario of a dentry being
present as:
* a linkto file on one subvol
* a directory on rest of subvols
When a lookup happens on the dentry in such a scenario, the control flow
goes into an infinite loop of:
dht_lookup_everywhere
dht_lookup_everywhere_cbk
dht_lookup_unlink_cbk
dht_lookup_everywhere_done
dht_lookup_directory (as local->dir_count > 0)
dht_lookup_dir_cbk (sets to local->need_selfheal = 1 as the entry is a linkto file on one of the subvol)
dht_lookup_everywhere (as need_selfheal = 1).
This infinite loop can cause increased consumption of memory due to:
1) dht_lookup_directory assigns a new layout to local->layout unconditionally
2) Most of the functions in this loop do a stack_wind of various fops.
This results in growing of call stack (note that call-stack is destroyed only after lookup response is
received by fuse - which never happens in this case)
Thanks Du for root causing the oom kill and Sushant for suggesting the fix
Upstream reference :
>Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d
>BUG: 1397052
>Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
>Reviewed-on: http://review.gluster.org/15894
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
>(cherry picked from commit 57d59f4be205ae0c7888758366dc0049bdcfe449)
Change-Id: I1e16bc14aa685542afbd21188426ecb61fd2689d
BUG: 1401029
Signed-off-by: Jiffin Tony Thottan <jthottan@redhat.com>
Reviewed-on: http://review.gluster.org/16015
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check for NULL inode before attempting to
set dht inode ctx.
> Change-Id: I7693c18445f138221d8417df5e95b118cedb818a
> BUG: 1395261
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-on: http://review.gluster.org/15847
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Atin Mukherjee <amukherj@redhat.com>
(cherry picked from commit 8313d53accaa22feb14d284fb91245be0a32e16e)
Change-Id: I7607d32d38d707dd5d71b98efffd1a458ffe90d7
BUG: 1395510
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15850
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/16286
PROBLEM:
Consider a volume with granular-entry-heal and sharding enabled. When
a replica is down and a shard is created as part of a write, the name
index is correctly created under indices/entry-changes/<dot-shard-gfid>.
Now when a read on the same region triggers another MKNOD, the fop
fails on the online bricks with EEXIST. By virtue of this being a
symmetric error, the failed_subvols[] array is reset to all zeroes.
Because of this, before post-op, the GF_XATTROP_ENTRY_OUT_KEY will be
set, causing the name index, which was created in the previous MKNOD
operation, to be wrongly deleted in THIS MKNOD operation.
FIX:
The ideal fix would have been for a transaction to delete the name
index ONLY if it knows it is the one that created the index in the first
place. This would involve gathering information as to whether THIS xattrop
created the index from individual bricks, aggregating their responses and
based on the various posisble combinations of responses, decide whether to
delete the index or not. This is rather complex. Simpler fix would be
for post-op to examine local->op_ret in the event of no failed_subvols
to figure out whether to delete the name index or not. This can occasionally
lead to creation of stale name indices but they won't be affecting the IO path
or mess with pending changelogs in any way and self-heal in its crawl of
"entry-changes" directory would take care to delete such indices.
Change-Id: Icc642a987d1b6a5097562315aecf1263ed35ceb6
BUG: 1408786
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16293
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
afr_replies_interpret() used the 'readable' matrix to trigger client
side heals after inode refresh. But for arbiter, readable is always
zero. So when `dd` is run with a data brick down, spurious data heals
are are triggered. These heals open an fd, causing eager lock to be
disabled (open fd count >1) in afr transactions, leading to extra FXATTROPS
Fix:
Use the accused matrix (derived from interpreting the afr pending
xattrs) to decide whether we can start heal or not.
> Reviewed-on: http://review.gluster.org/16277
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 5a7c86e578f5bbd793126a035c30e6b052177a9f)
Change-Id: Ibbd56c9aed6026de6ec42422e60293702aaf55f9
BUG: 1408772
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/16291
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
race: readdirp has read one entry, and doing a lookup on
that entry, but user might have renamed/removed that entry just
after readdirp but before lookup.
Since remove-brick is a costly opertaion,will ingore any
ENOENT/ESTALE failures and move on.
> Change-Id: I62c7fa93c0b9b7e764065ad1574b97acf51b5996
> BUG: 1408115
> Signed-off-by: Susant Palai <spalai@redhat.com>
> Reviewed-on: http://review.gluster.org/15846
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Change-Id: I62c7fa93c0b9b7e764065ad1574b97acf51b5996
BUG: 1408414
Reviewed-on: http://review.gluster.org/15846
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
(cherry picked from commit c20febcb1ffaef3fa29563987e7a3b554aea27b3)
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/16278
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: A hard link is lost during rebalance + lookup.Rebalance skip
files if file has hardlink.In dht_migrate_file
__is_file_migratable () function checks if a file has hardlink,
if yes file is not migrated but if link is created after call
this function then link will lost.
Solution: Call __check_file_has_hardlink to check hardlink existence
after (S+T) bits in migration process ,if file has hardlink
then skip the file for migrate rebalance process.
> BUG: 1396048
> Change-Id: Ia53c07ef42f1128c2eedf959a757e8df517b9d12
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> Reviewed-on: http://review.gluster.org/15866
> Reviewed-by: Susant Palai <spalai@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
> (cherry picked from commit 71dd2e914d4a537bf74e1ec3a24512fc83bacb1d)
BUG: 1399432
Change-Id: I30e21efd5a054d8a3e640ab3ed8aa7955d083926
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: http://review.gluster.org/15954
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/16075
Incorrect initialisation of local->optimistic_change_log was leading
to skipped pre-op and post-op even when a brick didn't participate in
the txn because it was down.
The result - missing granular name index resulting in some entries
never getting healed.
FIX:
Initialise local->optimistic_change_log just before pre-op.
Also fixed granular entry heal to create the granular name index in
pre-op as opposed to post-op. This is to prevent loss of granular
information when during an entry txn, the good (src) brick goes
offline before the post-op is done. This would cause self-heal to
do conservative merge (since dirty xattr is the only information
available), which when granular-entry-heal is enabled, expects
granular indices, the lack of which can lead to loss of data in
the worst case.
Change-Id: Ibc0fbfb3fa21c578e28868d9e30b274e33c12064
BUG: 1403646
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/16105
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
> Reviewed-on: http://review.gluster.org/15968
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
(cherry picked from commit fb95eb4da6f4fc0b9c69e3b159a2214fe47e6d1d)
Change-Id: I2beaba829710565a3246f7449a5cd21755cf5f7d
BUG: 1400927
Signed-off-by: Mateusz Slupny <mateusz.slupny@appeartv.com>
Reviewed-on: http://review.gluster.org/16012
Tested-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/#/c/15800/
Change-Id: I2dd7ed69a456e8b9e54a4093f14dc16950bef081
BUG: 1393630
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15813
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/#/c/15788/
On a sharded volume when a brick is replaced while IO is going on, named
lookup on individual shards as part of read/write was failing with
ENOENT on the replaced brick, and as a result AFR initiated name heal in
lookup callback. But since pargfid was empty (which is what this patch
attempts to fix), the resolution of the shards by protocol/server used
to fail and the following pattern of logs was seen:
Brick-logs:
[2016-11-08 07:41:49.387127] W [MSGID: 115009]
[server-resolve.c:566:server_resolve] 0-rep-server: no resolution type
for (null) (LOOKUP)
[2016-11-08 07:41:49.387157] E [MSGID: 115050]
[server-rpc-fops.c:156:server_lookup_cbk] 0-rep-server: 91833: LOOKUP(null)
(00000000-0000-0000-0000-000000000000/16d47463-ece5-4b33-9c93-470be918c0f6.82)
==> (Invalid argument) [Invalid argument]
Client-logs:
[2016-11-08 07:41:27.497687] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-0: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.497755] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-1: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.498500] W [MSGID: 114031]
[client-rpc-fops.c:2930:client3_3_lookup_cbk] 2-rep-client-2: remote
operation failed. Path: (null) (00000000-0000-0000-0000-000000000000)
[Invalid argument]
[2016-11-08 07:41:27.499680] E [MSGID: 133010]
Also, this patch makes AFR by itself choose a non-NULL pargfid even if
its ancestors fail to initialize all pargfid placeholders.
Change-Id: Ica9e1b5b196ac37aafe6128e7aa0694a07245fdb
BUG: 1392846
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15796
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a misleading message from logs about
available disk space while rebalancing of bricks
while calculating free space.
Bug: 1390870
Backprt of http://review.gluster.org/#/c/15345/
Change-Id: Ie9df0b2cbf00faaf13a0a3f0dbd657377a082755
Signed-off-by: ankitraj <anraj@redhat.com>
Reviewed-on: http://review.gluster.org/15765
Tested-by: ankitraj
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks a lot to xiaoping.wu@nokia.com from Nokia for the bug and the
fix.
>BUG: 1384297
>Change-Id: Ie443237e85d34633b5dd30f85eaa2ac34e45754c
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15728
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Change-Id: I7646adc3771ff76cdf9c979b575bbcd0b3bc1b9a
BUG: 1388948
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15735
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In afr lookup when NULL dict is received in lookup, afr
is supposed to set all the xattrs it requires in a new dict
it creates, but for 'link-count' it is trying to set to the
dict that is passed in lookup which can be NULL sometimes.
This is leading to error logs. Fixed the same in this patch.
>BUG: 1385104
>Change-Id: I679af89cfc410cbc35557ae0691763a05eb5ed0e
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15646
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
>BUG: 1385236
>Change-Id: I802e74e7ad24e183b6653101ad7bf5ab0bf6e55b
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15650
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Ravishankar N <ravishankar@redhat.com>
BUG: 1385442
Change-Id: Ie93b25e8cf52b0d58ef335929dfa10a78a0dd734
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15651
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
Sharding exposed a bug in arbiter config. where `dd` throughput was
extremely slow. Shard xlator was sending a fxattrop to update the file
size immediately after a writev. Arbiter was incorrectly over-riding the
LLONGMAX-1 start offset (for metadata domain locks) for this fxattrop,
causing the inodelk to be taken on the data domain. And since the
preceeding writev hadn't released the lock (afr does a 'lazy'
unlock if write succeeds on all bricks), this degraded to a blocking
lock causing extra lock/unlock calls and delays.
Fix:
Modify flock.l_len and flock.l_start to take full locks only for data
transactions.
> Reviewed-on: http://review.gluster.org/15641
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit 3a97486d7f9d0db51abcb13dcd3bc9db935e3a60)
Change-Id: I906895da2f2d16813607e6c906cb4defb21d7c3b
BUG: 1375125
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reported-by: Max Raba <max.raba@comsysto.com>
Reviewed-on: http://review.gluster.org/15647
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In some cases we see that readdir keeps winding to the brick that doesn't have
any blocked locks i.e. first brick. This is leading to the client assuming that
there are no blocking locks on the inode so it won't give away the lock. Other
clients end up blocked on the lock as if the command hung.
Fix:
Proper way to fix this issue is to use infra present in
http://review.gluster.org/14736 This is a stop gap fix where we start taking
inodelks in opendir which goes to all the bricks, this will detect if there is
any contention.
cherry picked from commit f013335400d033a9677797377b90b968803135f4:
>BUG: 1346719
>Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/15309
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Ashish Pandey <aspandey@redhat.com>
>Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Change-Id: I91109107a26f6535b945ac476338e9f21dc31eb9
BUG: 1371397
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15405
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: For healing of uid/gid we check if local->stbuf.ia_ctime is
lesser than stbuf->ia_ctime (received from brick). If yes then uid/gid
is updated to local->prebuf(source of healing).
But we merge local->stbuf also form the newly added brick. So if we
receive response from the newly added brick first and update the
local->stbuf, then local->prebuf will remain empty since the newly added
brick will have the latest ctime among all servers. And this can result
in healing wrong uid/gids to the rest of servers.
Hence, we should update local->stbuf from servers with a layout which
will ignore merging stbufs from newly added bricks.
> Reviewed-on: http://review.gluster.org/15126
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit 36af81ac7cb2d459f9bfc0c436f0038a68f85235)
Change-Id: If4b64f75a0ea669abdbe9f5a3d1d18ff19374c2f
BUG: 1375096
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15464
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/15548/
Problem:
In arbiter configuration, posix-xlator in the arbiter brick always sets the
GF_CONTENT_KEY in the response dict with a value 0. If the file size on the
data bricks is more than quick-read's max-file-size (64kb default), those
bricks don't set the key. Because of this difference in the no. of dict
elements, afr triggers metadata heal in lookup code path, in turn leading to
extra lookups+inodelks.
Fix:
Changed afr dict comparison logic to ignore all virtual xattrs and the on-disk
ones that we should not be healing.
Change-Id: I05730bdd39d8fb0b9a49a5fc9c0bb01f0d3bb308
BUG: 1377193
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/15578
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: In a distributed-replicate volume attribute
"replica.split-brain-status" value does not display split-brain
condition though directory is in split-brain.
If directory is in split brain on mutiple replica-pairs
it does not show full list of replica pairs.
Solution: Update the dht_aggregate code to aggregate the xattr
value in this specific condition.
Fix: 1) function getChoices returns the choices from split-brain
status string.
2) function add_opt adding the choices to local buffer to
store in dictionary
3) For the key "replica.split-brain-status" function dht_aggregate
call dht_aggregate_split_brain_xattr to prepare the list.
Test: To verify the patch followed below steps
1) Create a distributed replica volume and create mount point
2) Stop heal daemon
3) Touch file and directories on mount point
mkdir test{1..5};touch tmp{1..5}
4) Down brick process on one of the replica set
pkill -9 glusterfsd
5) Change permission of dir on mount point
chmod 755 test{1..5}
6) Restart brick process on node with force option
7) kill brick process on other node in same replica set
8) Change permission of dir again on mount point
chmod 766 test{1..5}
9) Reexecute same step from 4-9 on other replica set also
10) After check heal status on server it will show dir's are
in split brain on all replica sets
11) After check the replica.split-brain-status attr on mount
point it will show wrong status of split brain.
12) After apply the patch the attribute shows correct value.
> Change-Id: Icdfd72005a4aa82337c342762775a3d1761bbe4a
> Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
> Reviewed-on: http://review.gluster.org/15201
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
> (cherry picked from commit c4e9ec653c946002ab6d4c71ee8e6df056438a04)
Change-Id: I85a5ae60189066d9e80799f00f1352c2f33ef4f8
Backport of commit c4e9ec653c946002ab6d4c71ee8e6df056438a04
BUG: 1375098
Signed-off-by: Mohit Agrawal <moagrawa@redhat.com>
Reviewed-on: http://review.gluster.org/15467
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Post add-brick event the new brick will have permission of 755
by default. If the root directory permission was other than 755,
that does not get healed to the new brick leading to permission
errors/inconsistencies.
For choosing source of attr heal we can trust the subvols which
have layouts with latest ctime(as part of missing directory heal,
we heal the proper attr). In case none of the subvols have layout,
return ESTALE to retrigger a fresh lookup.
Note: This patch heals the permission of the root directories only.
Since, permission healing of directory is not straight forward and
required intrusive fix, those are not addressed here.
> Reviewed-on: http://review.gluster.org/15195
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit 801cd07a4c6ec65ff930b2ae6bb5e405ccd03334)
Change-Id: If894e3895d070d46b62d2452e52c1eaafcf56c29
BUG: 1374573
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15465
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During a fix-layout, dht_selfheal_layout_maximize_overlap () does not
consider chunk sizes while calculating layout overlaps, causing smaller
bricks to sometimes get larger ranges than larger bricks. Temporarily
enabling this operation if only if weighted rebalance is disabled
or all bricks are the same size.
> Change-Id: I5ed16cdff2551b826a1759ca8338921640bfc7b3
> BUG: 1366494
> Signed-off-by: N Balachandran <nbalacha@redhat.com>
> Reviewed-on: http://review.gluster.org/15403
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
(cherry picked from commit b93692cce603006d9cb6750e08183bca742792ac)
Change-Id: Icf0dd83f36912e721982bcf818a06c4b339dc974
BUG: 1374135
Signed-off-by: N Balachandran <nbalacha@redhat.com>
Reviewed-on: http://review.gluster.org/15422
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
> Reviewed-on: http://review.gluster.org/15343
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Shyamsundar Ranganathan <srangana@redhat.com>
> Signed-off-by: Susant Palai <spalai@redhat.com>
(cherry picked from commit 15c790b502ba92caa17f2d1870c3d75d547e6bad)
Change-Id: Iad96256218be643b272762b5638a3f6837aff28d
BUG: 1366496
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15413
Reviewed-by: N Balachandran <nbalacha@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This converts sprintf to gf_asprintf in following components: * quotad.c
* dht
* afr
* protocol/client
* rpc/rpc-lib
* rpc/rpc-transport
This is a backport of http://review.gluster.org/#/c/14102/
> Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee
> BUG: 1332073
> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
> Reviewed-on: http://review.gluster.org/14102
> Tested-by: Manikandan Selvaganesh <mselvaga@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Manikandan Selvaganesh <mselvaga@redhat.com>
Change-Id: If8a267bab3d91003bdef3a92664077a0136745ee
BUG: 1366746
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/15325
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: Manikandan Selvaganesh <mselvaga@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
When io-threads is enabled on the client side, io-threads destroys the
call-stub in which the loc is stored as soon as the c-stack unwinds.
Because afr is creating a syncop with the address of loc passed in
setxattr by the time syncop tries to access it, io-threads would have
already freed the call-stub. This will lead to crash.
Fix:
Copy loc to frame->local and use it's address.
> Reviewed-on: http://review.gluster.org/15070
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
BUG: 1369042
Change-Id: I16987e491e24b0b4e3d868a6968e802e47c77f7a
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-on: http://review.gluster.org/15233
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cyclic order
Backport of: http://review.gluster.org/15080
When the bricks are brought offline and then online in cyclic
order while writes are in progress on a file, thanks to inode
refresh in write txns, AFR will mostly fail the write attempt
when the only good copy is offline. However, there is still a
remote possibility that the file will run into split-brain if
the brick that has the lone good copy goes offline *after* the
inode refresh but *before* the write txn completes (I call it
in-flight split-brain in the patch for ease of reference),
requiring intervention from admin to resolve the split-brain
before the IO can resume normally on the file. To get around this,
the patch does the following things:
i) retains the dirty xattrs on the file
ii) avoids marking the last of the good copies as bad (or accused)
in case it is the one to go down during the course of a write.
iii) fails that particular write with the appropriate errno.
This way, we still have one good copy left despite the split-brain situation
which when it is back online, will be chosen as source to do the heal.
> Change-Id: I9ca634b026ac830b172bac076437cc3bf1ae7d8a
> BUG: 1363721
> Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
> Reviewed-on: http://review.gluster.org/15080
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Ravishankar N <ravishankar@redhat.com>
> Reviewed-by: Oleksandr Natalenko <oleksandr@natalenko.name>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
(cherry picked from commit fcb5b70b1099d0379b40c81f35750df8bb9545a5)
Change-Id: I157f1025aebd6624fa3d412abc69a4ae6f2fe9e0
BUG: 1367272
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Signed-off-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reviewed-on: http://review.gluster.org/15221
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/15145
AFR sets transaction.pre_op[] array even before actually doing the
pre-op on-disk. Therefore, AFR must not only consider the pre_op[] array
but also the failed_subvols[] information before setting the pre_op_done[]
flag. This patch fixes that.
Change-Id: I726b2acd4025e2e75a87dea547ca6e088bc82c00
BUG: 1367272
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/15164
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Anuradha Talur <atalur@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise inode-link failures in selfheal codepath will result in a
crash.
> Change-Id: I9061629ae9d1eb1ac945af5f448d0d8b397a5022
> BUG: 1345748
> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
> Reviewed-on: http://review.gluster.org/14707
> Reviewed-by: N Balachandran <nbalacha@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Poornima G <pgurusid@redhat.com>
> Reviewed-by: Susant Palai <spalai@redhat.com>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
(cherry picked from commit a4d35ccb8afeefae4d9cdd36ac19b0e97d0d04d0)
Change-Id: I9061629ae9d1eb1ac945af5f448d00dba97a5022
BUG: 1366482
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/15157
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problems: The maximum number of migratior threads created was static set
to "40". And the number of these threads get created in rebalance depends
on the number of cores user has. If the number of cores exceeds 40, a
crash or memory corruption can be seen.
Fix: Make the migratior thread pool dynamic.
> Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
> BUG: 1362069
> Reviewed-on: http://review.gluster.org/15000
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
(cherry picked from commit b8e8bfc7e4d3eaf76bb637221bc6392ec10ca54b)
Change-Id: Ifbdac8a1a396363dd75e2f6bcb454070cfdbf839
BUG: 1362069
Signed-off-by: Susant Palai <spalai@redhat.com>
Reviewed-on: http://review.gluster.org/15061
Smoke: Gluster Build System <jenkins@build.gluster.org>
Tested-by: N Balachandran <nbalacha@redhat.com>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add test to fail promotion if estimated block consumption grows
beyond hi watermark.
Skip file migrations until next cycle if tier_get_fs_stat() fails
in tier_migrate_using_query_file()
> Reviewed-on: http://review.gluster.org/14780
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: mohammed rafi kc <rkavunga@redhat.com>
> Reviewed-by: Dan Lambright <dlambrig@redhat.com>
(cherry picked from commit 1f4e41e8c2f5f4af4564caba0a08996853f089f4)
Change-Id: Ice04572fa739c09109c4433e65965197482a7beb
BUG: 1362198
Signed-off-by: Milind Changire <mchangir@redhat.com>
Reviewed-on: http://review.gluster.org/15065
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Rafi for hinting a while back that this kind of
problem he saw once. I didn't think the theory was valid.
Could have caught it earlier if I had tested his theory.
>Change-Id: Iac6ffcdba2950aa6f8cf94f8994adeed6e6a9c9b
>BUG: 1344836
>Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
>Reviewed-on: http://review.gluster.org/14703
>Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>Tested-by: mohammed rafi kc <rkavunga@redhat.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
BUG: 1360576
Change-Id: If9ccf0b3db7159b87ddcdc7b20e81cde8c3c76f0
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/15025
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note: This is a backport of http://review.gluster.org/14895.
It contains:
i) fixes that prevent deadlocks (afr-common.c).
ii) fixes over-writing op-errno=ENOMEM with possible other values
(afr-inode-read.c).
iii) prevents doing further operations with a NULL dictionary if
allocation fails (afr-self-heal-data.c).
iv) prevents falsely marking a sink as healed if metadata heal fails
midway(afr-self-heal-metadata.c).
v) other minor fixes.
Considering the above are not trivial fixes, the patch is a good
candidate for merging in 3.8 branch.
Thanks to Krutika for a cleaner way to track inode refs in
afr_set_split_brain_choice().
Change-Id: I2d968d05b815ad764b7e3f8aa9ad95a792b3c1df
BUG: 1360556
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-on: http://review.gluster.org/15018
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: This issue arises when we do a rolling update
from 3.7.5 to 3.7.9.
For 4+2 volume running 3.7.5, if we update 2 nodes
and after heal completion kill 2 older nodes, this
problem can be seen. After update and killing of
bricks, 2 nodes will return inodelk count key in dict
while other 2 nodes will not have inodelk count in dict.
This is also true for get-link-count.
During dictionary match , ec_dict_compare, this will
lead to mismatch of answers and the file operation
on mount point will fail with IO error.
Solution:
Don't match inode, entry and link count keys while
comparing two dictionaries. However, while combining the
data in ec_dict_combine, go through all the dictionaries
and select the maximum values received in different dicts
for these keys.
master -
http://review.gluster.org/#/c/14761/
Change-Id: I33546e3619fe8f909286ee48fb0df2009cd3d22f
BUG: 1360174
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/14761
Reviewed-by: Xavier Hernandez <xhernandez@datalab.es>
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Signed-off-by: Ashish Pandey <aspandey@redhat.com>
Reviewed-on: http://review.gluster.org/15013
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/14832
Specifically when a directory tree is removed (rm -rf)
while a brick is down, both the directory index and the
name indices of the files and subdirs under it will remain.
Self-heal will need to pick up these and remove them.
Towards this, afr sh will now also crawl indices/entry-changes
and call an rmdir on the dir if the directory index is stale.
On the brick side, rmdir fop has been implemented for index xl,
which would delete the directory index and its contents if present
in a synctask.
Change-Id: I08f45201adca56737ec2be1aab5433aebaefefd0
BUG: 1355609
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Reviewed-on: http://review.gluster.org/14920
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: Ravishankar N <ravishankar@redhat.com>
Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A race in timer cancellation for delayed unlock could cause a crash
if the cancelling thread fails to cancel the timer because it has
already been fired but not executed, and the callback is scheduled
out of the CPU, delaying it until the thread has released important
resources needed by the callback.
This patch improves the handling of this case to make it robust.
Backport of:
> Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
> BUG: 1345855
> Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
> Reviewed-on: http://review.gluster.org/14712
> Smoke: Gluster Build System <jenkins@build.gluster.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
> Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
Change-Id: I5c8a8c6610c5136f71b938aa78b5878ba05238d4
BUG: 1346158
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/14723
Smoke: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of commit a74f8cf4e7edc2ce9f045317a18dacddf25adb8a:
> BUG: 1334164
> Change-Id: I4259d88f2b6e4f9d4ad689bc4e438f1db9cfd177
> Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
> Reviewed-on: http://review.gluster.org/14365
> Tested-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
> Smoke: Gluster Build System <jenkins@build.gluster.com>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
Change-Id: Ic559c220a1f0051e531314d13940604e2dead08c
BUG: 1348060
Signed-off-by: Raghavendra G <rgowdapp@redhat.com>
Reviewed-on: http://review.gluster.org/14351
Smoke: Gluster Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of: http://review.gluster.org/14191
If the application opens a file with O_DIRECT, the shards'
anon fds would also need to inherit the flag. Towards this,
shard xl would be passing the odirect flag in the @flags parameter
to the WRITEV fop. This will be used in anon fd resolution
and subsequent opening by posix xl.
>Change-Id: I3a0593fa46cc25e390a5762a0354b469c2a1532d
>BUG: 1342903
>Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
>Reviewed-on: http://review.gluster.org/14663
>Smoke: Gluster Build System <jenkins@build.gluster.org>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Jeff Darcy <jdarcy@redhat.com>
Change-Id: Ibfc164aa7f9eecd6993255f1c03557f2ec35ac8c
BUG: 1347553
Signed-off-by: Krutika Dhananjay <kdhananj@redhat.com>
Signed-off-by: Pranith Kumar K <pkarampu@redhat.com>
Reviewed-on: http://review.gluster.org/14754
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of http://review.gluster.org/#/c/13389/
Problem: For a read on a file in metadata split-brain:
1.lookup_done resets event_generation to zero.
2. readv is issued, goes to inode refresh due to mismatching event_gen.
3. After refresh is successful, we update event_generation, data and
metdata readable.
3. We then call afr_read_txn_refresh_done() which in turn calls
afr_inode_get_readable() but doesn't check for EIO. So afr_readv_wind
is called with local->readable (which is populated with data_readable),
thus winding the read to a brick.
4. Also, further parallel reads that come directly go to the wind path
because there is no inode_refresh needed.
Fix:
1.For any afr_read_txn(), readable must be an intersection of data and metadata
readable.
2.Check for EIO in afr_read_txn_refresh_done().
Change-Id: I22dd221fdfaf96d7aced2f474e28ed1337d69f0e
BUG: 1349879
Signed-off-by: Ravishankar N <ravishankar@redhat.com>
(cherry picked from commit 7a1c1e2904701496968ed14b6d7479fb706c3188)
Reviewed-on: http://review.gluster.org/14790
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Pranith Kumar Karampuri <pkarampu@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During locking we send lock request to cached subvol,
and normally we unlock to the cached subvol
But with parallel fresh lookup on a directory, there
is a race window where the cached subvol can change
and the unlock can go into a different subvol from
which we took lock.
This will result in a stale lock held on one of the
subvol.
So we will store the details of subvol which we took the lock
and will unlock from the same subvol
Back port of>
>Change-Id: I47df99491671b10624eb37d1d17e40bacf0b15eb
>BUG: 1311002
>Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
>Reviewed-on: http://review.gluster.org/13492
>Reviewed-by: N Balachandran <nbalacha@redhat.com>
>Smoke: Gluster Build System <jenkins@build.gluster.com>
>NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
>Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
>CentOS-regression: Gluster Build System <jenkins@build.gluster.com>
(cherry picked from commit ef0db52bc55a51fe5e3856235aed0230b6a188fe)
Change-Id: Ib821e7355b4937b86d2f9f11e2c8311b7301b6c7
BUG: 1347524
Signed-off-by: Mohammed Rafi KC <rkavunga@redhat.com>
Reviewed-on: http://review.gluster.org/14750
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
Reviewed-by: N Balachandran <nbalacha@redhat.com>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When DHT traverses the inode->fd_list, it does that in an unsafe
way that can generate races with fd_unref() called from other threads.
This patch fixes this problem taking the inode->lock and adding a
reference to the fd while it's being used outside of the mutex
protected region.
A minor change in storage/posix has been done to also access the
inode->fd_list in a safe way.
Backport of:
> Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93
> BUG: 1344340
> Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
> Reviewed-on: http://review.gluster.org/14682
> CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
> Smoke: Gluster Build System <jenkins@build.gluster.org>
> NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
> Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
Change-Id: I10d469ca6a8f76e950a8c9779ae9c8b70f88ef93
BUG: 1346750
Signed-off-by: Xavier Hernandez <xhernandez@datalab.es>
Reviewed-on: http://review.gluster.org/14733
Smoke: Gluster Build System <jenkins@build.gluster.org>
NetBSD-regression: NetBSD Build System <jenkins@build.gluster.org>
CentOS-regression: Gluster Build System <jenkins@build.gluster.org>
Reviewed-by: Raghavendra G <rgowdapp@redhat.com>
|