glusterfs.git/xlators/cluster, branch release-3.9

cluster/ec: Change log level of messages to DEBUG

2017-02-20T08:37:06+00:00

Heal success and failure messages should not be logged at INFO level.
Whether a heal is happening can already be observed from heal info. If we
need to debug a case where heal is not happening, we can set the log level
to DEBUG.

>Change-Id: I062668eadd145ef809b25e818e6bca1094f54cd6
>BUG: 1420619
>Signed-off-by: Sunil Kumar Acharya 
>Reviewed-on: https://review.gluster.org/16580
>Smoke: Gluster Build System 
>NetBSD-regression: NetBSD Build System 
>CentOS-regression: Gluster Build System 
>Reviewed-by: Ashish Pandey 

Change-Id: I91700a1960b5feb03ef186e4d0ddba1338152469
BUG: 1422783
Signed-off-by: Sunil Kumar Acharya 
Reviewed-on: https://review.gluster.org/16636
Smoke: Gluster Build System 
Reviewed-by: Xavier Hernandez 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

afr: all children of AFR must be up to resolve s-brain

2017-02-15T15:07:32+00:00

Problem:
The various split-brain resolution policies (favorite-child-policy based,
CLI based and mount (get/setfattr) based) attempt to resolve split-brain
even when not all bricks of replica are up. This can be a problem when
say in a replica 3, the only good copy is down and the other 2 bricks
are up and blame each other (i.e. split-brain). We end up healing the
file in such a case and allow I/O on it.

Fix:
A decision on whether the file is in split-brain or not must be taken
only if we are able to examine the afr xattrs of *all* bricks of a given
replica.
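
The rule above can be modelled roughly as follows. This is a minimal Python
sketch, not the AFR code: the blame matrix layout (pending[i][j] > 0 meaning
brick i blames brick j), the use of None for an unreachable brick, and the
function names are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

# Hypothetical model: one row of pending counters per brick, None if the
# brick is down and its afr xattrs could not be examined.
BlameMatrix = Sequence[Optional[Sequence[int]]]


def find_sources(pending: Sequence[Sequence[int]]) -> List[int]:
    """Return the bricks that no other brick blames."""
    n = len(pending)
    return [
        j for j in range(n)
        if all(pending[i][j] == 0 for i in range(n) if i != j)
    ]


def split_brain_status(pending: BlameMatrix) -> str:
    """Decide only when the xattrs of *all* bricks were examined."""
    if any(row is None for row in pending):
        return "undecided"
    return "healable" if find_sources(pending) else "split-brain"


# Replica 3: brick 0 is the only good copy, bricks 1 and 2 blame each other.
all_up = [[0, 1, 1], [0, 0, 1], [0, 1, 0]]
good_copy_down = [None, [0, 0, 1], [0, 1, 0]]
```

With all bricks up, brick 0 is found as the source; with brick 0 down, the
sketch refuses to decide instead of treating bricks 1 and 2 as a split-brain
to be resolved.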

> Reviewed-on: https://review.gluster.org/16476
> Smoke: Gluster Build System 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 
(cherry picked from commit 0e03336a9362e5717e561f76b0c543e5a197b31b)

Change-Id: Icddb1268b380005799990f5379ef957d84639ef9
BUG: 1420983
Signed-off-by: Ravishankar N 
Reviewed-on: https://review.gluster.org/16588
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Smoke: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

cluster/ec: Do not start heal on good file while IO is going on

2017-01-23T13:59:55+00:00

Problem:
Write on a file has been slowed down significantly after
http://review.gluster.org/#/c/13733/

RC: When an update fop starts on a file, it sets the dirty flag at
the start and removes it at the end, which creates an index entry
in indices/xattrop. During IO, SHD scans this, finds the index
entry and starts heal even if all the fragments are healthy
and up to date. This heal takes inodelk for the different types of
heal. If the IO runs for a long time, this happens every 60 seconds.
Due to this extra, unnecessary locking, IO gets slowed down.

Solution:
Before starting any type of heal, check whether the file needs heal.
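
A rough Python sketch of this check, under stated assumptions: the
FragmentState fields and the comparison of version and size across fragments
are a simplified model chosen for illustration, not the actual ec xattr
layout or heal logic. In particular, the sketch treats the dirty flag alone
as insufficient reason to heal, since it is set on every fragment during IO.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class FragmentState:
    """Hypothetical per-brick view of one file fragment."""
    version: int
    size: int
    dirty: bool


def needs_heal(fragments: Sequence[FragmentState], expected_count: int) -> bool:
    """True if a fragment is missing or the fragments disagree."""
    if len(fragments) < expected_count:
        return True
    first = fragments[0]
    return any(
        (f.version, f.size) != (first.version, first.size)
        for f in fragments[1:]
    )


def maybe_heal(fragments: Sequence[FragmentState], expected_count: int,
               heal: Callable[[], None]) -> bool:
    """Run heal (and thus take its locks) only when it is needed."""
    if not needs_heal(fragments, expected_count):
        return False
    heal()
    return True
```

The point of the design is that the cheap check runs first, so a healthy file
under IO never reaches the lock-taking heal path.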

>Change-Id: Ib9519a43e7e4b2565d3f3153f9ca0fb92174fe51
>BUG: 1409191
>Signed-off-by: Ashish Pandey 
>Reviewed-on: http://review.gluster.org/16377
>NetBSD-regression: NetBSD Build System 
>CentOS-regression: Gluster Build System 
>Smoke: Gluster Build System 
>Reviewed-by: Pranith Kumar Karampuri 
>Reviewed-by: Xavier Hernandez 
>Signed-off-by: Ashish Pandey 

Change-Id: Ib9519a43e7e4b2565d3f3153f9ca0fb92174fe51
BUG: 1415160
Signed-off-by: Ashish Pandey 
Reviewed-on: https://review.gluster.org/16444
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

cluster/disperse: Do not log fop failed for lockless fops

2017-01-23T13:57:14+00:00

Problem: Operation failed messages are getting logged
based on the callbacks of lockless fops. If a fop does
not take a lock, it is possible that it will get some
out-of-sync xattrs or iatts. We cannot depend on these
callbacks to say that the fop has failed.

Solution: Print failed messages only for locked fops.
However, heal would still be triggered.
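
The behaviour described in the solution could be sketched like this in
Python. The function name, its parameters and the log text are hypothetical;
the sketch only shows the decision, namely that logging depends on whether the
fop took a lock while the heal trigger does not.

```python
import logging
from typing import Callable

logger = logging.getLogger("ec-sketch")  # hypothetical logger name


def handle_fop_result(fop_name: str, took_lock: bool, bricks_disagree: bool,
                      trigger_heal: Callable[[], None]) -> bool:
    """Return True if a failure message was logged."""
    if not bricks_disagree:
        return False
    logged = False
    if took_lock:
        # Only a locked fop saw a consistent view, so only its mismatch
        # is reported as a failure.
        logger.warning("operation failed on some subvolumes (fop=%s)", fop_name)
        logged = True
    # Heal is triggered for locked and lockless fops alike.
    trigger_heal()
    return logged
```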

>Change-Id: I4427402c8c944c23f16073613caa03ea788bead3
>BUG: 1414287
>Signed-off-by: Ashish Pandey 
>Reviewed-on: http://review.gluster.org/16435
>Reviewed-by: Xavier Hernandez 
>Smoke: Gluster Build System 
>NetBSD-regression: NetBSD Build System 
>CentOS-regression: Gluster Build System 
>Signed-off-by: Ashish Pandey 

Change-Id: I4427402c8c944c23f16073613caa03ea788bead3
BUG: 1415082
Signed-off-by: Ashish Pandey 
Reviewed-on: https://review.gluster.org/16439
Reviewed-by: Xavier Hernandez 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

cluster/dht: Do rename cleanup as root

2017-01-20T07:18:58+00:00

Problem:
Rename linkfile cleanup is done as non-root, which may not have privileges to
do the rename, so it fails with EACCES. A future MKDIR on that name will then
leave a hole on this subvolume. It is not easy to hit on fuse mounts because
vfs takes care of the permission checks even before the rename fop is wound.
But with nfs-ganesha mounts it happens.

Fix:
Do rename cleanup as root

 >BUG: 1409727
 >Change-Id: I414c1eb6dce76b4516a6c940557b249e6c3f22f4
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/16317
 >Smoke: Gluster Build System 
 >CentOS-regression: Gluster Build System 
 >Reviewed-by: Raghavendra G 
 >Reviewed-by: N Balachandran 
 >NetBSD-regression: NetBSD Build System 

BUG: 1413061
Change-Id: If94121275b141c5f52084b8aafac86451e667d3d
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/16412
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: N Balachandran

cluster/ec: Do lookup on an existing file in link

2017-01-18T09:48:52+00:00

Problem:
In the link fop, lookup happens on the new link, which does not exist yet, so
the iatt that ec serves to parent xlators has size zero, which leads to 'cat'
giving empty output.

Fix:
Change code so that lookup happens on the existing link instead.

 >BUG: 1409730
 >Change-Id: I70eb02fe0633e61d1d110575589cc2dbe5235d76
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/16320
 >Smoke: Gluster Build System 
 >Reviewed-by: Xavier Hernandez 
 >Tested-by: Xavier Hernandez 
 >CentOS-regression: Gluster Build System 
 >NetBSD-regression: NetBSD Build System 

BUG: 1413057
Change-Id: Ia48f615fd41f0fa7f3ffe8eb613d90a05eb68c32
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/16411
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Smoke: Gluster Build System 
Reviewed-by: Xavier Hernandez

ec: Invalidations in disperse volume should not update the stat

2017-01-17T14:55:46+00:00

Backport of http://review.gluster.org/16329

Issue:
In a disperse volume, the file is spread across bricks, hence the stat
from one brick doesn't carry the valid size of the file. Therefore
an upcall from one brick that updates md-cache results in a wrong size
being cached.

Fix:
If the notification is a cache invalidation, indicate to md-cache that
the attributes are invalid.
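
As a rough illustration of the fix, here is a Python sketch. The event name,
the use of None to mean "attributes invalid", and the MdCacheSketch class are
assumptions for illustration only and do not reflect the real upcall or
md-cache structures.

```python
from typing import Dict, Optional

CACHE_INVALIDATION = "cache-invalidation"  # hypothetical event name


def ec_forward_upcall(event_type: str,
                      brick_stat: Optional[dict]) -> Optional[dict]:
    """Return the stat to pass upward; None means 'attributes are invalid'."""
    if event_type == CACHE_INVALIDATION:
        # A single brick's stat holds only a fragment's size, so drop it.
        return None
    return brick_stat


class MdCacheSketch:
    """Toy attribute cache keyed by gfid."""

    def __init__(self) -> None:
        self.attrs: Dict[str, dict] = {}

    def on_upcall(self, gfid: str, stat: Optional[dict]) -> None:
        if stat is None:
            # Forget cached attributes; the next stat must go to the bricks.
            self.attrs.pop(gfid, None)
        else:
            self.attrs[gfid] = stat
```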

>Reviewed-on: http://review.gluster.org/16329
>Smoke: Gluster Build System 
>NetBSD-regression: NetBSD Build System 
>Reviewed-by: Xavier Hernandez 
>CentOS-regression: Gluster Build System 
>Reviewed-by: Pranith Kumar Karampuri 
(cherry picked from commit 95d07a3d2d68805d93d36a447436e27c48777939)

BUG: 1410688
Change-Id: Id89d2283478e70b62b435a8891fffc86d2be8cb2
Signed-off-by: Poornima G 
Reviewed-on: http://review.gluster.org/16341
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

cluster/afr: Remove backward compatibility for locks with v1

2017-01-17T14:46:53+00:00

When we have cascading locks with the same lk-owner, there is a possibility
for a deadlock to happen. One example is as follows:

Self-heal takes a lock in data-domain for a big name with 256 chars of
"aaaa...a" and starts heal in a 3-way replication when brick-0 is offline and
healing from brick-1 to brick-2 is in progress. So this lock is active on
brick-1 and brick-2. Now brick-0 comes online and an operation wants to take a
full lock; the lock is granted at brick-0 and it is waiting for the lock on
brick-1. As part of entry healing, self-heal takes full locks on all the
available bricks and then proceeds with healing the entry. Now this lock will
start waiting on brick-0 because some other operation already has a granted
lock on it. This leads to a deadlock: the operation is waiting for heal to
unlock "aaaa...", whereas heal is waiting for the operation to unlock on
brick-0.

Initially I thought this was happening because healing tries to take a lock on
all the available bricks instead of just the bricks that are participating in
heal. But I later realized that the same kind of deadlock can happen if a brick
goes down after the heal starts but comes back before it completes. So the
essential problem is the cascading locks with the same lk-owner, which were
added for backward compatibility with afr-v1 and can be safely removed now
that versions with afr-v1 are already EOL. This patch removes the
compatibility with v1, which requires cascading locks with the same lk-owner.

In the next version we can make locking-scheme option a dummy and switch
completely to v2.
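
The deadlock in the example can be pictured as a cycle in a wait-for graph.
The Python sketch below is a simplified model under stated assumptions: each
lock owner holds locks on a set of bricks and waits on at most one brick, and
the names "heal" and "op" are placeholders. It does not model lock domains or
lk-owner handling in the locks translator.

```python
from typing import Dict, List, Optional, Set


def find_deadlock(holds: Dict[str, Set[str]],
                  waits: Dict[str, Optional[str]]) -> Optional[List[str]]:
    """Return a cycle of owners waiting on each other, or None."""

    def blockers(owner: str) -> List[str]:
        # Owners currently holding the brick that `owner` waits on.
        brick = waits.get(owner)
        return [o for o, held in holds.items() if o != owner and brick in held]

    def visit(owner: str, path: List[str]) -> Optional[List[str]]:
        if owner in path:
            return path[path.index(owner):]
        for nxt in blockers(owner):
            cycle = visit(nxt, path + [owner])
            if cycle:
                return cycle
        return None

    for owner in holds:
        cycle = visit(owner, [])
        if cycle:
            return cycle
    return None


# Heal holds the name lock on brick-1 and brick-2 and wants brick-0;
# the operation holds brick-0 and wants brick-1.
holds = {"heal": {"brick-1", "brick-2"}, "op": {"brick-0"}}
waits = {"heal": "brick-0", "op": "brick-1"}
```

Here find_deadlock(holds, waits) reports the cycle between "heal" and "op";
if heal did not request the additional lock on brick-0, no cycle would exist.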

 >BUG: 1401404
 >Change-Id: Ic9afab8260f5ff4dff5329eb0429811bcb879079
 >Signed-off-by: Pranith Kumar K 
 >Reviewed-on: http://review.gluster.org/16024
 >Smoke: Gluster Build System 
 >Reviewed-by: Ravishankar N 
 >NetBSD-regression: NetBSD Build System 
 >CentOS-regression: Gluster Build System 

BUG: 1413062
Change-Id: I4f5d485d9e0646ad3dc384e5ec36682b0933c9d3
Signed-off-by: Pranith Kumar K 
Reviewed-on: http://review.gluster.org/16413
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System

cluster/afr: Do not log of split-brain when there isn't one

2017-01-13T12:38:58+00:00

Backport of: http://review.gluster.org/16362

* Even on errors like ENOENT, AFR logs split-brain after
  read-txn refresh, introduced by commit a07ddd8f.
  This can be a cause of much panic and confusion and needs to be fixed.

* Also fixed this issue in write-txns.

* Fixed afr read txns to log about split-brain only after knowing that
  there is no split-brain choice configured.

* Removed code duplication

* Fixed incorrect passing of error code in afr_write_txn_refresh_done()
  (the function was passing -0 as errno to gf_msg()).

Change-Id: Ie40d2c498674a1fe8dc2c521b05e30c0bce85c02
BUG: 1412914
Signed-off-by: Krutika Dhananjay 
Reviewed-on: http://review.gluster.org/16388
Smoke: Gluster Build System 
Reviewed-by: Ravishankar N 
Reviewed-by: Pranith Kumar Karampuri 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System

afr: Avoid resetting event_gen when brick is always down

2017-01-13T08:31:12+00:00

Problem:
__afr_set_in_flight_sb_status(), which resets event_gen to zero, is
called if failed_subvols[i] is non-zero for any brick. But failed_subvols[i]
is true even if the brick was down *before* the transaction started.
Hence, say, if 1 brick is down in a replica-3, every writev that comes
will trigger an inode refresh because of this resetting, as seen from
the number of FSTATs in the profile info in the BZ.

Fix:
Reset event gen only if the brick was previously a valid read child and
the FOP failed on it the first time.

Also `s/afr_inode_read_subvol_reset/afr_inode_event_gen_reset` because
the function only resets event gen and not the data/metadata readable.
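
The condition change can be sketched in Python as below. The list-based
representation of readable and failed_subvols and both function names are
assumptions for illustration; the real code works on AFR's per-inode context
and transaction state.

```python
from typing import Sequence


def should_reset_old(failed_subvols: Sequence[bool]) -> bool:
    """Behaviour described in the problem: any failed brick forces a reset."""
    return any(failed_subvols)


def should_reset_event_gen(readable: Sequence[bool],
                           failed_subvols: Sequence[bool]) -> bool:
    """Reset only if a brick that was a valid read child has now failed."""
    return any(r and f for r, f in zip(readable, failed_subvols))


# Replica 3 with brick 2 down since before the transaction started:
# it is already not readable, so its failure is not new information.
readable = [True, True, False]
failed_subvols = [False, False, True]
```

In this state the old condition resets event_gen on every writev, while the
new condition leaves it alone; a reset still happens the first time a
readable brick fails.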

> Signed-off-by: Ravishankar N 
> Reviewed-on: http://review.gluster.org/16309
> Smoke: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 
> Tested-by: Pranith Kumar Karampuri 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
(cherry picked from commit 522640be476a3f97dac932f7046f0643ec0ec2f2)

Change-Id: I603ae646cbde96995c35db77916e2ed80b602a91
BUG: 1412886
Reviewed-on: http://review.gluster.org/16385
Tested-by: Ravishankar N 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Krutika Dhananjay 
Reviewed-by: Pranith Kumar Karampuri 
CentOS-regression: Gluster Build System