glusterfs.git/xlators/cluster, branch v3.8.13

afr: add errno to afr_inode_refresh_done()

2017-06-19T04:54:40+00:00

Backport of https://review.gluster.org/17413 and
https://review.gluster.org/17436

Problem:
When parallel `rm -rf`s were run from CIFS clients, opendir might
fail on some replicas with ENOENT. DHT ignores partial opendir failures
in dht_fd_cbk() and winds readdirs on those replicas. AFR inode refresh
(as part of the readdirp read_txn) sees in its fd context that the state
of the fds is *not* AFR_FD_OPENED and bails out to
afr_inode_refresh_done() without doing a refresh. When this happens, the
errno is set to EIO because there are no readable subvols, and
split-brain messages are logged.

Fix:
Introduce an errno argument to afr_inode_refresh_done() to bail out with
the right error value when inode refresh is not performed.
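
A minimal sketch of the idea, not the actual AFR code: the completion
step takes an explicit errno, so a skipped refresh reports its real cause
instead of falling through to EIO. The names fd_state_t, refresh_done()
and refresh_inode() are hypothetical stand-ins for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the per-fd state; not the real AFR enum. */
typedef enum { FD_NOT_OPENED, FD_OPENED } fd_state_t;

/* Hypothetical completion step. err != 0 means no refresh was performed
 * and err is the reason. err == 0 means the refresh ran, and the result
 * is modelled as EIO when no subvolume is readable. */
static int refresh_done(int err, int readable_count)
{
    if (err != 0)
        return err;
    return readable_count > 0 ? 0 : EIO;
}

/* Returns the errno the read transaction would be failed with. */
static int refresh_inode(fd_state_t state, int open_errno, int readable_count)
{
    assert(open_errno >= 0);
    if (state != FD_OPENED) {
        /* Refresh is skipped: hand the original failure to the caller. */
        return refresh_done(open_errno, readable_count);
    }
    return refresh_done(0, readable_count);
}
```

With this shape, an fd whose opendir failed with ENOENT yields ENOENT,
while EIO is reserved for a refresh that ran and found nothing readable.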

Change-Id: I8eed4d6e6c85332c1f5813c74cb54ae73693a369
BUG: 1460661
Signed-off-by: Ravishankar N 
Reviewed-on: https://review.gluster.org/17518
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

afr: send the correct iatt values in fsync cbk

2017-05-17T08:52:09+00:00

Problem:
afr unwinds the fsync fop with an iatt buffer from one of its children
on whom fsync was successful. But that child might not be a valid read
subvolume for that inode because of pending heals or because it happens
to be the arbiter brick etc. Thus we end up sending the wrong iatt to
mdcache which will in turn serve it to the application on a subsequent
stat call as reported in the BZ.

Fix:
Pick a child on whom the fsync was successful *and* that is readable as
indicated in the inode context.
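
The selection rule can be sketched as below. This is an illustration
under assumptions, not the AFR implementation: pick_fsync_reply() and
the two boolean arrays are hypothetical, standing in for the per-child
fsync replies and the readable set kept in the inode context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns the index of the first child on which fsync succeeded and that
 * is marked readable, or -1 if no child satisfies both conditions. */
static int pick_fsync_reply(const bool *fsync_ok, const bool *readable,
                            size_t n)
{
    assert(fsync_ok != NULL && readable != NULL);
    for (size_t i = 0; i < n; i++) {
        if (fsync_ok[i] && readable[i])
            return (int)i;
    }
    return -1;
}
```

A child that acknowledged the fsync but is not readable (for example an
arbiter brick) is skipped, so its iatt is never handed to upper layers.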

> Reviewed-on: https://review.gluster.org/17227
> CentOS-regression: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 
> NetBSD-regression: NetBSD Build System 
> Smoke: Gluster Build System 
(cherry picked from commit 1a8fa910ccba7aa941f673302c1ddbd7bd818e39)

Change-Id: Ie8647289219cebe02dde4727e19a729b3353ebcf
BUG: 1449941
RCA'ed-by: Miklós Fokin 
Signed-off-by: Ravishankar N 
Reviewed-on: https://review.gluster.org/17248
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Pranith Kumar Karampuri 
CentOS-regression: Gluster Build System 
Reviewed-by: jiffin tony Thottan

afr: propagate correct errno for fop failures in arbiter

2017-05-17T08:51:57+00:00

Problem:
If quorum is not met in fop cbk, arbiter sends an ENOTCONN error to the
upper xlators. In a VM workload with sharding enabled, this was leading
to the VM pausing when replace-brick was performed as described in the BZ.

Fix:
Move the fop cbk arbitration logic to afr_handle_quorum() because in
normal replica volumes, that is the function that has the quorum and
errno checks in the fop cbk path before doing a post-op.

Thanks to Pranith for suggesting this approach.

> Reviewed-on: https://review.gluster.org/17235
> Smoke: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
(cherry picked from commit 93c850dd2a513fab75408df9634ad3c970a0e859)

Change-Id: Ie6315db30c5e36326b71b90a01da824109e86796
BUG: 1450937
Signed-off-by: Ravishankar N 
Reviewed-on: https://review.gluster.org/17296
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Reviewed-by: jiffin tony Thottan

cluster/ec: fix incorrect answer check in seek fop

2017-05-11T08:36:04+00:00

A bad check in the answer of a seek request caused a segmentation
fault when seek reported an error.

> Change-Id: Ifb25ae8bf7cc4019d46171c431f7b09b376960e8
> BUG: 1439068
> Signed-off-by: Xavier Hernandez 
> Reviewed-on: https://review.gluster.org/16998
> Smoke: Gluster Build System 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Amar Tumballi 
> Reviewed-by: Pranith Kumar Karampuri 

Change-Id: Ifb25ae8bf7cc4019d46171c431f7b09b376960e8
BUG: 1442933
Signed-off-by: Xavier Hernandez 
Reviewed-on: https://review.gluster.org/17231
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri 
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Niels de Vos

cluster/dht: Pass the correct xdata in fremovexattr fop

2017-05-03T20:42:59+00:00

        Backport of: https://review.gluster.org/17126

Change-Id: Id84bc87e48f435573eba3b24d3fb3c411fd2445d
BUG: 1440635
Signed-off-by: Krutika Dhananjay 
Reviewed-on: https://review.gluster.org/17148
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Smoke: Gluster Build System 
Reviewed-by: Niels de Vos

cluster/dht: Pass the req dict instead of NULL in dht_attr2()

2017-04-29T11:27:28+00:00

        Backport of: https://review.gluster.org/17085

This bug was causing VMs to pause during rebalance. When qemu winds
down a STAT, shard fills the trusted.glusterfs.shard.file-size attribute
in the req dict, but DHT does not pass that dict along when it re-winds
its STAT fop upon detecting that the file has undergone migration. As a
result shard doesn't find the value for this key in the unwind path,
causing it to fail the STAT with EINVAL.

The same bug exists in other fops and is fixed in this patch as well.
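
A toy model of the failure, under stated assumptions: struct req_dict,
stat_unwind_check() and rewind_stat() are hypothetical and only mimic
the behaviour described above; they are not the DHT or shard APIs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SHARD_SIZE_KEY "trusted.glusterfs.shard.file-size"

/* Hypothetical one-entry stand-in for a request dict. */
struct req_dict {
    const char *key;
};

/* Models the upper layer's unwind check: fails with EINVAL when the key
 * it asked for did not travel with the request. */
static int stat_unwind_check(const struct req_dict *xdata)
{
    if (xdata == NULL || xdata->key == NULL ||
        strcmp(xdata->key, SHARD_SIZE_KEY) != 0)
        return EINVAL;
    return 0;
}

/* Models the re-wind after migration is detected. forward_req == 0 is
 * the old behaviour (NULL xdata), non-zero forwards the original dict. */
static int rewind_stat(const struct req_dict *orig_req, int forward_req)
{
    assert(orig_req != NULL);
    return stat_unwind_check(forward_req ? orig_req : NULL);
}
```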

Change-Id: I56273b1a65347dabd38bc6bdd12d618f68287a00
BUG: 1440635
Signed-off-by: Krutika Dhananjay 
Reviewed-on: https://review.gluster.org/17121
Smoke: Gluster Build System 
Reviewed-by: Raghavendra G 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System

afr: don't do a post-op on a brick if op failed

2017-04-29T11:26:24+00:00

Problem:
In afr-v2, self-blaming xattrs are not there by design. But if the FOP
failed on a brick due to an error other than ENOTCONN (or even due to
ENOTCONN, but we regained connection before postop was wound), we wind
the post-op also on the failed brick, leading to setting self-blaming
xattrs on that brick. This can lead to undesired results like healing of
files in split-brain etc.

Fix:
If a fop failed on a brick on which pre-op was successful, do not
perform post-op on it. This also produces the desired effect of not
resetting the dirty xattr on the brick, which is how it should be
because if the fop failed on a brick, there is no reason to clear the
dirty bit which actually serves as an indication of the failure.
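
As a rough sketch of the rule (not the AFR code; the function and array
names are hypothetical), a brick receives the post-op only if the pre-op
succeeded on it and the fop itself did not fail there:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sets post_op[i] to true when a post-op should be wound on brick i and
 * returns the number of bricks selected. */
static size_t select_post_op_bricks(const bool *pre_op_done,
                                    const bool *fop_failed,
                                    bool *post_op, size_t n)
{
    size_t selected = 0;

    assert(pre_op_done != NULL && fop_failed != NULL && post_op != NULL);
    for (size_t i = 0; i < n; i++) {
        /* A failed brick keeps its dirty marker: no post-op is sent. */
        post_op[i] = pre_op_done[i] && !fop_failed[i];
        if (post_op[i])
            selected++;
    }
    return selected;
}
```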

> Reviewed-on: https://review.gluster.org/16976
> Smoke: Gluster Build System 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Pranith Kumar Karampuri 

Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4
BUG: 1443319
Signed-off-by: Ravishankar N 
Reviewed-on: https://review.gluster.org/17082
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Pranith Kumar Karampuri

cluster/dht: Modify local->loc.gfid in thread safe manner

2017-04-07T12:08:00+00:00

	Backport of https://review.gluster.org/16986

Problem:
local->loc.gfid in dht_lookup_directory() is the null gfid for a fresh lookup.
dht_lookup_dir_cbk() updates local->loc.gfid while, in another thread,
dht_lookup_directory() is still winding lookup calls to subvolumes, so there is
a chance of a partially written gfid being seen by EC.

On a 12x(4+2) volume, we saw EC receiving a loc where the last 10 bytes of the
gfid match the gfid of the directory and the first 6 bytes are all zeros. This
leads to EC failing the lookup with EINVAL, which in turn leads to NFS failing
the lookup with EIO.

snip from gdb:
$37 = (dht_local_t *) 0x7fde5de5b3cc
(gdb) p /x $37->loc.gfid
$39 = {0x3b, 0x82, 0x10, 0x5e, 0x40, 0x65, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5,
0x6c, 0x2c, 0xb8, 0x56}
(gdb) fr 7
state=) at ec-generic.c:837
837	                ec_lookup_rebuild(fop->xl->private, fop, cbk);
(gdb) p /x fop->loc[0].gfid
$40 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x43, 0x14, 0xa0, 0xc6, 0x8, 0xf5, 0x6c,
0x2c, 0xb8, 0x56}

snip from log:
[2017-01-29 03:22:30.132328] W [MSGID: 122019]
[ec-helpers.c:354:ec_loc_gfid_check] 0-butcher-disperse-4: Mismatching GFID's
in loc
[2017-01-29 03:22:30.132709] W [MSGID: 112199]
[nfs3-helpers.c:3515:nfs3_log_newfh_res] 0-nfs-nfsv3:
/linux-4.9.5/Documentation => (XID: b27b9474, MKDIR: NFS: 5(I/O error), POSIX:
5(Input/output error)), FH: exportid 00000000-0000-0000-0000-000000000000, gfid
00000000-0000-0000-0000-000000000000, mountid
00000000-0000-0000-0000-000000000000 [Invalid argument]

Fix:
Update local->loc.gfid only in the last call, so that it is never modified
while lookups are still being wound.
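
A single-threaded sketch of the "publish in the last call" pattern, under
assumptions: struct lookup_local and lookup_reply() are hypothetical, and
the locking that would protect the counter in real code is omitted.

```c
#include <assert.h>
#include <string.h>

typedef unsigned char gfid_t[16];

/* Hypothetical stand-in for the per-lookup local state. */
struct lookup_local {
    int    call_cnt;    /* replies still outstanding */
    gfid_t loc_gfid;    /* read while winding, must not change mid-wind */
    gfid_t reply_gfid;  /* scratch copy filled in by callbacks */
};

/* Handles one subvolume reply. Returns 1 if this was the last call. */
static int lookup_reply(struct lookup_local *local, const gfid_t gfid)
{
    assert(local != NULL && local->call_cnt > 0);

    /* Stash the gfid without touching the field the winds read. */
    memcpy(local->reply_gfid, gfid, sizeof(gfid_t));

    if (--local->call_cnt > 0)
        return 0;

    /* Last call: every reply is in, so the gfid is published once. */
    memcpy(local->loc_gfid, local->reply_gfid, sizeof(gfid_t));
    return 1;
}
```

Until the final reply arrives, loc_gfid stays all zeros, so a reader sees
either the null gfid or the complete one, never a partial copy.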

 >BUG: 1438411
 >Change-Id: Ifcb7e911568c1f1f83123da6ff0cf742b91800a0
 >Signed-off-by: Pranith Kumar K 

BUG: 1438424
Change-Id: If039956205cfac5e798c2c90e92a9a47b404e804
Signed-off-by: Pranith Kumar K 
Reviewed-on: https://review.gluster.org/16988
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Raghavendra G

cluster/ec: Add/Modify description for eager-lock option

2017-04-07T11:59:33+00:00

This patch adds a description for the disperse.eager-lock
option for disperse volumes.

It also modifies the description of the cluster.eager-lock
option to indicate that it applies only to replica
volumes.

>Change-Id: Ie73298947fcaaa6aaf825978bc2d27ceaff386d2
>BUG: 1327171
>Signed-off-by: Ashish Pandey 
>Reviewed-on: http://review.gluster.org/13999
>NetBSD-regression: NetBSD Build System 
>Smoke: Gluster Build System 
>Reviewed-by: Ravishankar N 
>CentOS-regression: Gluster Build System 
>Reviewed-by: Pranith Kumar Karampuri 

BUG: 1435645
Change-Id: I48b091e002b5c3308d6fbf2feb024a7f2fe08969
Signed-off-by: Sunil Kumar Acharya 
Reviewed-on: https://review.gluster.org/16943
Smoke: Gluster Build System 
CentOS-regression: Gluster Build System 
NetBSD-regression: NetBSD Build System 
Reviewed-by: Xavier Hernandez

cluster/afr: Undo pending xattrs only on the up bricks

2017-04-07T11:56:55+00:00

Problem:
During a conservative merge, the pending xattr is reset even on a brick
that is down. When that brick comes back up, the heal treats it as the
source and removes the entries on the other bricks, which leads to data
loss.

Fix:
Undo pending only for the bricks which are up.
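
A small sketch of the guard, with hypothetical names: pending[i] models
the pending counter associated with brick i and child_up[i] its
connection state. The real AFR xattr layout is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Clears the pending counter only for bricks that are up and returns how
 * many counters were reset. Counters of down bricks are left untouched. */
static size_t undo_pending_on_up_bricks(int *pending, const bool *child_up,
                                        size_t n)
{
    size_t reset = 0;

    assert(pending != NULL && child_up != NULL);
    for (size_t i = 0; i < n; i++) {
        if (child_up[i] && pending[i] != 0) {
            pending[i] = 0;
            reset++;
        }
    }
    return reset;
}
```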

> Change-Id: I18436fa0bb1faa5f60531b357dea3f6b20446303
> BUG: 1433571
> Signed-off-by: karthik-us 
> Reviewed-on: https://review.gluster.org/16913
> Reviewed-by: Pranith Kumar Karampuri 
> Smoke: Gluster Build System 
> NetBSD-regression: NetBSD Build System 
> CentOS-regression: Gluster Build System 
> Reviewed-by: Ravishankar N 
(cherry picked from commit f91596e6566c605e70a31a60523d11f78a097c3c)

Change-Id: Id20c9ce53ee59f005d977494903247e2a8024ed1
BUG: 1436231
Signed-off-by: karthik-us 
Reviewed-on: https://review.gluster.org/16956
Smoke: Gluster Build System 
NetBSD-regression: NetBSD Build System 
CentOS-regression: Gluster Build System 
Reviewed-by: Ravishankar N 
Reviewed-by: Pranith Kumar Karampuri