<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/bugs, branch v3.11.0</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>features/shard: Handle offset in appending writes</title>
<updated>2017-05-29T14:12:20+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2017-05-24T17:00:29+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=1db7887771c748a63f3c46ce72918c98cb6dc208'/>
<id>1db7887771c748a63f3c46ce72918c98cb6dc208</id>
<content type='text'>
When a file is opened with append, all writes are appended at the end of the file,
irrespective of the offset given in the write syscall. This needs to be
considered in the shard size update function and also when choosing which
shard to write to.

At the moment, shard piggybacks on the queuing done by the write-behind
xlator for ordering of operations. So if write-behind is disabled and
two parallel appending writes arrive, both of which can increase the file size
beyond the shard size, the file will be corrupted.
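
For illustration, a rough reproduction sketch (the volume name, mount path and
shard block size below are assumptions, not taken from this change):

  gluster volume set vol1 features.shard on
  gluster volume set vol1 features.shard-block-size 4MB
  gluster volume set vol1 performance.write-behind off
  # from two shells, append concurrently so the file grows past the 4MB shard size
  dd if=/dev/zero of=/mnt/vol1/f1 bs=1M count=3 oflag=append conv=notrunc
  dd if=/dev/zero of=/mnt/vol1/f1 bs=1M count=3 oflag=append conv=notrunc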

 &gt;BUG: 1455301
 &gt;Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: https://review.gluster.org/17387
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

BUG: 1456225
Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17404
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When a file is opened with append, all writes are appended at the end of the file,
irrespective of the offset given in the write syscall. This needs to be
considered in the shard size update function and also when choosing which
shard to write to.

At the moment, shard piggybacks on the queuing done by the write-behind
xlator for ordering of operations. So if write-behind is disabled and
two parallel appending writes arrive, both of which can increase the file size
beyond the shard size, the file will be corrupted.
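
For illustration, a rough reproduction sketch (the volume name, mount path and
shard block size below are assumptions, not taken from this change):

  gluster volume set vol1 features.shard on
  gluster volume set vol1 features.shard-block-size 4MB
  gluster volume set vol1 performance.write-behind off
  # from two shells, append concurrently so the file grows past the 4MB shard size
  dd if=/dev/zero of=/mnt/vol1/f1 bs=1M count=3 oflag=append conv=notrunc
  dd if=/dev/zero of=/mnt/vol1/f1 bs=1M count=3 oflag=append conv=notrunc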

 &gt;BUG: 1455301
 &gt;Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
 &gt;Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
 &gt;Reviewed-on: https://review.gluster.org/17387
 &gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
 &gt;Reviewed-by: Krutika Dhananjay &lt;kdhananj@redhat.com&gt;
 &gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
 &gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

BUG: 1456225
Change-Id: I9007e6a39098ab0b5d5386367bd07eb5f89cb09e
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17404
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>nl-cache: In case of nameless operations do not cache</title>
<updated>2017-05-25T15:30:34+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2017-05-16T13:55:20+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=b8b398a5ee7a0e02582b2c441548bd758ebdb71c'/>
<id>b8b398a5ee7a0e02582b2c441548bd758ebdb71c</id>
<content type='text'>
Issue:
In nameless lookups and other nameless fops, the parent inode will be NULL;
when we try to add the cache to the NULL inode, it causes a crash.

Hence, handle the scenario of nameless fops and do not cache/serve
them.

&gt;Reviewed-on: https://review.gluster.org/17316
&gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt;Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
&gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;(cherry picked from commit 284cd8851bfe60984d2f11b5c52fe3204ff43b06)

Change-Id: I3b90f882ac89e6aaf3419db89e6f890797f37700
BUG: 1454569
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17361
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
In nameless lookups and other nameless fops, the parent inode will be NULL;
when we try to add the cache to the NULL inode, it causes a crash.

Hence, handle the scenario of nameless fops and do not cache/serve
them.

&gt;Reviewed-on: https://review.gluster.org/17316
&gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt;Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
&gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;(cherry picked from commit 284cd8851bfe60984d2f11b5c52fe3204ff43b06)

Change-Id: I3b90f882ac89e6aaf3419db89e6f890797f37700
BUG: 1454569
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17361
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>rda, glusterd: Change the max of rda-cache-limit to INFINITY</title>
<updated>2017-05-22T15:05:43+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2017-05-19T05:39:13+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=42fc1abdb41817b691cda87ddc7ea94129279475'/>
<id>42fc1abdb41817b691cda87ddc7ea94129279475</id>
<content type='text'>
Issue:
Before this patch, the max value of rda-cache-limit is 1GB.
When parallel-readdir is enabled, there will be many instances of
readdir-ahead, hence the effective rda-cache-limit depends on the number of
instances. E.g.: on a volume with distribute count 4, the rda-cache-limit
when parallel-readdir is enabled will be 4GB instead of 1GB.
Consider the following sequence of operations:
- Enable parallel-readdir
- Set rda-cache-limit to, let's say, 3GB
- Disable parallel-readdir; this results in one instance of readdir-ahead
  and the rda-cache-limit will be back to 1GB, but the current value is 3GB
  and hence the mount will stop working, as 3GB &gt; max 1GB.

Solution:
To fix this, we could limit the cache to 1GB even when parallel-readdir
is enabled. But there is no necessity to limit the cache to 1GB; it
can be increased if the system has enough resources. Hence getting rid
of the rda-cache-limit max value is more apt. If we just change the
rda-cache-limit max to INFINITY, we will render older (&lt;3.11) clients
broken when the rda-cache-limit is set to &gt; 1GB (as the older clients
still expect a value &lt; 1GB). To safely change the max value of
rda-cache-limit to INFINITY, add a check in glusterd to verify that all
the clients are &gt;= 3.11 if the value exceeds 1GB.
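
As a sketch, the problematic sequence in CLI terms (the volume name is an
assumption):

  gluster volume set vol1 performance.parallel-readdir on
  gluster volume set vol1 performance.rda-cache-limit 3GB
  gluster volume set vol1 performance.parallel-readdir off
  # the single remaining readdir-ahead instance is capped at 1GB again,
  # while the configured value is still 3GB, so the mount stops working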

&gt;Reviewed-on: https://review.gluster.org/17338
&gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
&gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;(cherry picked from commit e43b40296956d132c70ffa3aa07b0078733b39d4)

Change-Id: Id0cdda3b053287b659c7bf511b13db2e45b92032
BUG: 1453152
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17354
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
Before this patch, the max value of rda-cache-limit is 1GB.
When parallel-readdir is enabled, there will be many instances of
readdir-ahead, hence the effective rda-cache-limit depends on the number of
instances. E.g.: on a volume with distribute count 4, the rda-cache-limit
when parallel-readdir is enabled will be 4GB instead of 1GB.
Consider the following sequence of operations:
- Enable parallel-readdir
- Set rda-cache-limit to, let's say, 3GB
- Disable parallel-readdir; this results in one instance of readdir-ahead
  and the rda-cache-limit will be back to 1GB, but the current value is 3GB
  and hence the mount will stop working, as 3GB &gt; max 1GB.

Solution:
To fix this, we could limit the cache to 1GB even when parallel-readdir
is enabled. But there is no necessity to limit the cache to 1GB; it
can be increased if the system has enough resources. Hence getting rid
of the rda-cache-limit max value is more apt. If we just change the
rda-cache-limit max to INFINITY, we will render older (&lt;3.11) clients
broken when the rda-cache-limit is set to &gt; 1GB (as the older clients
still expect a value &lt; 1GB). To safely change the max value of
rda-cache-limit to INFINITY, add a check in glusterd to verify that all
the clients are &gt;= 3.11 if the value exceeds 1GB.
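
As a sketch, the problematic sequence in CLI terms (the volume name is an
assumption):

  gluster volume set vol1 performance.parallel-readdir on
  gluster volume set vol1 performance.rda-cache-limit 3GB
  gluster volume set vol1 performance.parallel-readdir off
  # the single remaining readdir-ahead instance is capped at 1GB again,
  # while the configured value is still 3GB, so the mount stops working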

&gt;Reviewed-on: https://review.gluster.org/17338
&gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
&gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;(cherry picked from commit e43b40296956d132c70ffa3aa07b0078733b39d4)

Change-Id: Id0cdda3b053287b659c7bf511b13db2e45b92032
BUG: 1453152
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17354
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: Don't spawn new glusterfsds on node reboot with brick-mux</title>
<updated>2017-05-22T14:17:23+00:00</updated>
<author>
<name>Samikshan Bairagya</name>
<email>samikshan@gmail.com</email>
</author>
<published>2017-05-16T09:37:21+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=671dfcd82f6a7c56fbcbfde33cba22c0b585a046'/>
<id>671dfcd82f6a7c56fbcbfde33cba22c0b585a046</id>
<content type='text'>
With brick multiplexing enabled, upon a node reboot new bricks were
not being attached to the first spawned brick process even though
there weren't any compatibility issues.

The reason for this is that upon glusterd restart after a node
reboot, since brick services aren't running, glusterd starts the
bricks in a "no-wait" mode. So after a brick process is spawned for
the first brick, there isn't enough time for the corresponding pid
file to get populated with a value before the compatibility check is
made for the next brick.

This commit solves this by iteratively waiting for the pidfile to be
populated in the brick compatibility comparison stage before checking
if the brick process is alive.

&gt; Reviewed-on: https://review.gluster.org/17307
&gt; Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

(cherry picked from commit 13e7b3b354a252ad4065f7b2f0f805c40a3c5d18)

Change-Id: Ibd1f8e54c63e4bb04162143c9d70f09918a44aa4
BUG: 1453086
Signed-off-by: Samikshan Bairagya &lt;samikshan@gmail.com&gt;
Reviewed-on: https://review.gluster.org/17351
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
With brick multiplexing enabled, upon a node reboot new bricks were
not being attached to the first spawned brick process even though
there weren't any compatibility issues.

The reason for this is that upon glusterd restart after a node
reboot, since brick services aren't running, glusterd starts the
bricks in a "no-wait" mode. So after a brick process is spawned for
the first brick, there isn't enough time for the corresponding pid
file to get populated with a value before the compatibility check is
made for the next brick.

This commit solves this by iteratively waiting for the pidfile to be
populated in the brick compatibility comparison stage before checking
if the brick process is alive.

&gt; Reviewed-on: https://review.gluster.org/17307
&gt; Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;

(cherry picked from commit 13e7b3b354a252ad4065f7b2f0f805c40a3c5d18)

Change-Id: Ibd1f8e54c63e4bb04162143c9d70f09918a44aa4
BUG: 1453086
Signed-off-by: Samikshan Bairagya &lt;samikshan@gmail.com&gt;
Reviewed-on: https://review.gluster.org/17351
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Fixes quota aux mount failure</title>
<updated>2017-05-17T23:26:20+00:00</updated>
<author>
<name>Sanoj Unnikrishnan</name>
<email>sunnikri@redhat.com</email>
</author>
<published>2017-03-22T09:32:12+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=683cb46bd90d0cda42e0dfd71f5a5afad818fbbd'/>
<id>683cb46bd90d0cda42e0dfd71f5a5afad818fbbd</id>
<content type='text'>
The aux mount is created on the first limit/remove_limit/list command
and it remains until the volume is stopped/deleted (or quota is disabled),
at which point we do a lazy unmount. If the process is uncleanly terminated,
the mount entry remains and we get a (Transport disconnected) error
on subsequent attempts to run quota list/limit-usage/remove commands.

Second issue: there is also a risk of an inadvertent rm -rf on
/var/run/gluster causing data loss for the user. Ideally, /var/run is
a temporary path for application use and should not cause any data loss to
persistent storage.

Solution:
1) unmount the aux mount after each use.
2) clean stale mount before mounting, if any.

One caveat with doing a mount/unmount on each command is that we cannot
use the same mount point for both list and limit commands.
The reason for this is that the list command needs the mount to be accessible
in the cli after the response from glusterd, so it could be unmounted by a
limit command if executed in parallel (had we used the same mount point).
Hence we use separate mount points for the list and limit commands.
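
For reference, a minimal sketch of the commands involved (the volume name and
directory are assumptions):

  gluster volume quota vol1 enable
  gluster volume quota vol1 limit-usage /dir1 10GB   # uses the 'limit' mount point
  gluster volume quota vol1 list                     # uses the 'list' mount point
  # with this change, each command unmounts its aux mount once it completes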

&gt;Reviewed-on: https://review.gluster.org/16938
&gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;Reviewed-by: Manikandan Selvaganesh &lt;manikandancs333@gmail.com&gt;
&gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
&gt;Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
&gt;(cherry picked from commit 2ae4b4058691b324535d802f4e6d24cce89a10e5)

Change-Id: I4f9e39da2ac2b65941399bffb6440db8a6ba59d0
BUG: 1449775
Signed-off-by: Sanoj Unnikrishnan &lt;sunnikri@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17240
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The aux mount is created on the first limit/remove_limit/list command
and it remains until the volume is stopped/deleted (or quota is disabled),
at which point we do a lazy unmount. If the process is uncleanly terminated,
the mount entry remains and we get a (Transport disconnected) error
on subsequent attempts to run quota list/limit-usage/remove commands.

Second issue: there is also a risk of an inadvertent rm -rf on
/var/run/gluster causing data loss for the user. Ideally, /var/run is
a temporary path for application use and should not cause any data loss to
persistent storage.

Solution:
1) unmount the aux mount after each use.
2) clean stale mount before mounting, if any.

One caveat with doing a mount/unmount on each command is that we cannot
use the same mount point for both list and limit commands.
The reason for this is that the list command needs the mount to be accessible
in the cli after the response from glusterd, so it could be unmounted by a
limit command if executed in parallel (had we used the same mount point).
Hence we use separate mount points for the list and limit commands.
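
For reference, a minimal sketch of the commands involved (the volume name and
directory are assumptions):

  gluster volume quota vol1 enable
  gluster volume quota vol1 limit-usage /dir1 10GB   # uses the 'limit' mount point
  gluster volume quota vol1 list                     # uses the 'list' mount point
  # with this change, each command unmounts its aux mount once it completes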

&gt;Reviewed-on: https://review.gluster.org/16938
&gt;NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt;Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;Reviewed-by: Manikandan Selvaganesh &lt;manikandancs333@gmail.com&gt;
&gt;CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt;Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
&gt;Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
&gt;(cherry picked from commit 2ae4b4058691b324535d802f4e6d24cce89a10e5)

Change-Id: I4f9e39da2ac2b65941399bffb6440db8a6ba59d0
BUG: 1449775
Signed-off-by: Sanoj Unnikrishnan &lt;sunnikri@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17240
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: Make reset-brick work correctly if brick-mux is on</title>
<updated>2017-05-16T00:29:48+00:00</updated>
<author>
<name>Samikshan Bairagya</name>
<email>samikshan@gmail.com</email>
</author>
<published>2017-04-24T16:30:17+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=cec4c8fc25e34459c23693f2928dcaefb9a68c69'/>
<id>cec4c8fc25e34459c23693f2928dcaefb9a68c69</id>
<content type='text'>
Reset-brick currently kills off the corresponding brick process.
However, with brick multiplexing enabled, stopping the brick
process would render all bricks attached to it unavailable. To
handle this correctly, we need to make sure that the brick process
is terminated only if brick-multiplexing is disabled. Otherwise,
we should send the GLUSTERD_BRICK_TERMINATE rpc to the respective
brick process to detach the brick that is to be reset.
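
For illustration, a typical reset-brick sequence (volume and brick names are
assumptions) that should now work with brick multiplexing on:

  gluster volume set all cluster.brick-multiplex on
  gluster volume reset-brick vol1 host1:/bricks/b1 start
  gluster volume reset-brick vol1 host1:/bricks/b1 host1:/bricks/b1 commit force
  # only the brick being reset is detached; other bricks in the same process stay online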

&gt; Reviewed-on: https://review.gluster.org/17128
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;

(cherry picked from commit 74383e3ec6f8244b3de9bf14016452498c1ddcf0)

Change-Id: I69002d66ffe6ec36ef48af09b66c522c6d35ac58
BUG: 1449933
Signed-off-by: Samikshan Bairagya &lt;samikshan@gmail.com&gt;
Reviewed-on: https://review.gluster.org/17245
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Reset-brick currently kills off the corresponding brick process.
However, with brick multiplexing enabled, stopping the brick
process would render all bricks attached to it unavailable. To
handle this correctly, we need to make sure that the brick process
is terminated only if brick-multiplexing is disabled. Otherwise,
we should send the GLUSTERD_BRICK_TERMINATE rpc to the respective
brick process to detach the brick that is to be reset.
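
For illustration, a typical reset-brick sequence (volume and brick names are
assumptions) that should now work with brick multiplexing on:

  gluster volume set all cluster.brick-multiplex on
  gluster volume reset-brick vol1 host1:/bricks/b1 start
  gluster volume reset-brick vol1 host1:/bricks/b1 host1:/bricks/b1 commit force
  # only the brick being reset is detached; other bricks in the same process stay online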

&gt; Reviewed-on: https://review.gluster.org/17128
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;

(cherry picked from commit 74383e3ec6f8244b3de9bf14016452498c1ddcf0)

Change-Id: I69002d66ffe6ec36ef48af09b66c522c6d35ac58
BUG: 1449933
Signed-off-by: Samikshan Bairagya &lt;samikshan@gmail.com&gt;
Reviewed-on: https://review.gluster.org/17245
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: include quorum type and count when dumping afr priv</title>
<updated>2017-05-12T13:36:40+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2017-05-05T16:49:36+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=3769fe7bf3c2674cf3ed233e79afd039e1ce6ddc'/>
<id>3769fe7bf3c2674cf3ed233e79afd039e1ce6ddc</id>
<content type='text'>
Squash of  https://review.gluster.org/17196 and
           https://review.gluster.org/17215

Dump the client quorum type ('auto', 'fixed' or 'none'). If it is 'fixed',
also dump the quorum-count. This information will be available in the client
statedump and in
/&lt;fuse_mount&gt;/.meta/graphs/active/testvol-replicate-X/private.

Also added a test-case.
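
A brief usage sketch (the volume name and quorum count are assumptions):

  gluster volume set testvol cluster.quorum-type fixed
  gluster volume set testvol cluster.quorum-count 2
  grep -i quorum /&lt;fuse_mount&gt;/.meta/graphs/active/testvol-replicate-0/private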

Change-Id: I91367c5250b26efb35e5f7d7c397def09cc77cbc
BUG: 1449921
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17243
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Squash of  https://review.gluster.org/17196 and
           https://review.gluster.org/17215

Dump the client quorum type ('auto', 'fixed' or 'none'). If it is 'fixed',
also dump the quorum-count. This information will be available in the client
statedump and in
/&lt;fuse_mount&gt;/.meta/graphs/active/testvol-replicate-X/private.

Also added a test-case.
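
A brief usage sketch (the volume name and quorum count are assumptions):

  gluster volume set testvol cluster.quorum-type fixed
  gluster volume set testvol cluster.quorum-count 2
  grep -i quorum /&lt;fuse_mount&gt;/.meta/graphs/active/testvol-replicate-0/private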

Change-Id: I91367c5250b26efb35e5f7d7c397def09cc77cbc
BUG: 1449921
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17243
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Shyamsundar Ranganathan &lt;srangana@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>glusterd: socketfile &amp; pidfile related fixes for brick multiplexing feature</title>
<updated>2017-05-10T14:05:52+00:00</updated>
<author>
<name>Mohit Agrawal</name>
<email>moagrawa@redhat.com</email>
</author>
<published>2017-05-08T13:59:22+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=7287b46042f805d646d7e117c243a1a4fdc61788'/>
<id>7287b46042f805d646d7e117c243a1a4fdc61788</id>
<content type='text'>
Problem: While brick-multiplexing is on, after restarting glusterd the CLI is
         not showing the pid of all brick processes in all volumes.

Solution: While brick-mux is on, all local brick processes communicate through one
          UNIX socket, but as per the current code (glusterd_brick_start) glusterd
          tries to communicate with a separate UNIX socket for each volume, populated
          based on brick-name and vol-name. Because of the multiplexing design only
          one UNIX socket is opened, so it throws a poller error and is not able to
          fetch the correct status of the brick processes through the cli process.
          To resolve the problem, write a new function glusterd_set_socket_filepath_for_mux
          that is called by glusterd_brick_start to validate the existence of the
          socketpath. To avoid continuous EPOLLERR errors in the logs, update the
          socket_connect code.

Test:     To reproduce the issue, follow the steps below:
          1) Create two distributed volumes (dist1 and dist2)
          2) Set cluster.brick-multiplex to on
          3) Kill glusterd and restart it
          4) Run the command gluster v status
          After applying the patch it shows the correct pid for all volumes

&gt; BUG: 1444596
&gt; Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2
&gt; Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
&gt; Reviewed-on: https://review.gluster.org/17101
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Prashanth Pai &lt;ppai@redhat.com&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
&gt; (cherry picked from commit 21c7f7baccfaf644805e63682e5a7d2a9864a1e6)

Change-Id: Ia95b9d36e50566b293a8d6350f8316dafc27033b
BUG: 1449004
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17212
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
Reviewed-by: Prashanth Pai &lt;ppai@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem: While brick-multiplexing is on, after restarting glusterd the CLI is
         not showing the pid of all brick processes in all volumes.

Solution: While brick-mux is on, all local brick processes communicate through one
          UNIX socket, but as per the current code (glusterd_brick_start) glusterd
          tries to communicate with a separate UNIX socket for each volume, populated
          based on brick-name and vol-name. Because of the multiplexing design only
          one UNIX socket is opened, so it throws a poller error and is not able to
          fetch the correct status of the brick processes through the cli process.
          To resolve the problem, write a new function glusterd_set_socket_filepath_for_mux
          that is called by glusterd_brick_start to validate the existence of the
          socketpath. To avoid continuous EPOLLERR errors in the logs, update the
          socket_connect code.

Test:     To reproduce the issue, follow the steps below:
          1) Create two distributed volumes (dist1 and dist2)
          2) Set cluster.brick-multiplex to on
          3) Kill glusterd and restart it
          4) Run the command gluster v status
          After applying the patch it shows the correct pid for all volumes

&gt; BUG: 1444596
&gt; Change-Id: I5d10af69dea0d0ca19511f43870f34295a54a4d2
&gt; Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
&gt; Reviewed-on: https://review.gluster.org/17101
&gt; Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Prashanth Pai &lt;ppai@redhat.com&gt;
&gt; NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
&gt; CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
&gt; Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
&gt; (cherry picked from commit 21c7f7baccfaf644805e63682e5a7d2a9864a1e6)

Change-Id: Ia95b9d36e50566b293a8d6350f8316dafc27033b
BUG: 1449004
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17212
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
Reviewed-by: Prashanth Pai &lt;ppai@redhat.com&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: don't do a post-op on a brick if op failed</title>
<updated>2017-04-19T02:29:25+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2017-04-02T12:38:04+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=10dad995c989e9d77c341135d7c48817baba966c'/>
<id>10dad995c989e9d77c341135d7c48817baba966c</id>
<content type='text'>
Problem:
In afr-v2, self-blaming xattrs are not there by design. But if the FOP
failed on a brick due to an error other than ENOTCONN (or even due to
ENOTCONN, but we regained the connection before the post-op was wound), we wind
the post-op also on the failed brick, leading to self-blaming
xattrs being set on that brick. This can lead to undesired results, like
healing of files in split-brain, etc.

Fix:
If a fop failed on a brick on which the pre-op was successful, do not
perform the post-op on it. This also produces the desired effect of not
resetting the dirty xattr on the brick, which is how it should be:
if the fop failed on a brick, there is no reason to clear the
dirty bit, which actually serves as an indication of the failure.

Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4
BUG: 1438255
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16976
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In afr-v2, self-blaming xattrs are not there by design. But if the FOP
failed on a brick due to an error other than ENOTCONN (or even due to
ENOTCONN, but we regained the connection before the post-op was wound), we wind
the post-op also on the failed brick, leading to self-blaming
xattrs being set on that brick. This can lead to undesired results, like
healing of files in split-brain, etc.

Fix:
If a fop failed on a brick on which the pre-op was successful, do not
perform the post-op on it. This also produces the desired effect of not
resetting the dirty xattr on the brick, which is how it should be:
if the fop failed on a brick, there is no reason to clear the
dirty bit, which actually serves as an indication of the failure.

Change-Id: I5f1caf4d1b39f36cf8093ccef940118638caa9c4
BUG: 1438255
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
Reviewed-on: https://review.gluster.org/16976
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Pranith Kumar Karampuri &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>dht: Add readdir-ahead in rebalance graph if parallel-readdir is on</title>
<updated>2017-04-18T06:16:11+00:00</updated>
<author>
<name>Poornima G</name>
<email>pgurusid@redhat.com</email>
</author>
<published>2017-04-13T10:50:29+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=94196dee1f1b0e22faab69cd9b1b1c70ba3d2f6f'/>
<id>94196dee1f1b0e22faab69cd9b1b1c70ba3d2f6f</id>
<content type='text'>
Issue:
The value of the linkto xattr is generally the name of dht's
next subvol; this requires that the next subvol of dht is not
changed for the lifetime of the volume. But with parallel
readdir enabled, the readdir-ahead loaded below dht is optional.
The linkto xattr for the first subvol, when:
- parallel readdir is enabled : "&lt;volname&gt;-readdir-head-0"
- plain distribute volume : "&lt;volname&gt;-client-0"
- distribute replicate volume : "&lt;volname&gt;-afr-0"

The value of linkto xattr is "&lt;volname&gt;-readdir-head-0" when
parallel readdir is enabled, and is "&lt;volname&gt;-client-0" if
its disabled. But the dht_lookup takes care of healing if it
cannot identify which linkto subvol, the xattr points to.

In dht_lookup_cbk, if the linkto xattr is found to be "&lt;volname&gt;-client-0"
and parallel readdir is enabled, then it cannot understand the
value "&lt;volname&gt;-client-0", as it expects "&lt;volname&gt;-readdir-head-0".
In that case, dht_lookup_everywhere is issued and then the linkto file
is unlinked and recreated with the right linkto xattr. The issue arises
when parallel readdir is enabled and the mount point accesses a file
that is currently being migrated. Since the rebalance process doesn't
have the parallel-readdir feature, it expects "&lt;volname&gt;-client-0"
whereas the mount expects "&lt;volname&gt;-readdir-head-0". Thus at some point
either the mount or the rebalance will fail.

Solution:
Enable parallel-readdir for rebalance as well and then do not
allow enabling/disabling parallel-readdir if rebalance is in
progress.
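
As a sketch of the resulting behaviour (the volume name is an assumption):

  gluster volume set vol1 performance.parallel-readdir on
  gluster volume rebalance vol1 start
  # the rebalance graph now also loads readdir-ahead, so both the mount and
  # the rebalance process expect the "vol1-readdir-head-0" linkto value
  gluster volume set vol1 performance.parallel-readdir off
  # this is now rejected while the rebalance is still in progress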

Change-Id: I241ab966bdd850e667f7768840540546f5289483
BUG: 1436090
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17056
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Issue:
The value of the linkto xattr is generally the name of dht's
next subvol; this requires that the next subvol of dht is not
changed for the lifetime of the volume. But with parallel
readdir enabled, the readdir-ahead loaded below dht is optional.
The linkto xattr for the first subvol, when:
- parallel readdir is enabled : "&lt;volname&gt;-readdir-head-0"
- plain distribute volume : "&lt;volname&gt;-client-0"
- distribute replicate volume : "&lt;volname&gt;-afr-0"

The value of linkto xattr is "&lt;volname&gt;-readdir-head-0" when
parallel readdir is enabled, and is "&lt;volname&gt;-client-0" if
its disabled. But the dht_lookup takes care of healing if it
cannot identify which linkto subvol, the xattr points to.

In dht_lookup_cbk, if the linkto xattr is found to be "&lt;volname&gt;-client-0"
and parallel readdir is enabled, then it cannot understand the
value "&lt;volname&gt;-client-0", as it expects "&lt;volname&gt;-readdir-head-0".
In that case, dht_lookup_everywhere is issued and then the linkto file
is unlinked and recreated with the right linkto xattr. The issue arises
when parallel readdir is enabled and the mount point accesses a file
that is currently being migrated. Since the rebalance process doesn't
have the parallel-readdir feature, it expects "&lt;volname&gt;-client-0"
whereas the mount expects "&lt;volname&gt;-readdir-head-0". Thus at some point
either the mount or the rebalance will fail.

Solution:
Enable parallel-readdir for rebalance as well and then do not
allow enabling/disabling parallel-readdir if rebalance is in
progress.
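
As a sketch of the resulting behaviour (the volume name is an assumption):

  gluster volume set vol1 performance.parallel-readdir on
  gluster volume rebalance vol1 start
  # the rebalance graph now also loads readdir-ahead, so both the mount and
  # the rebalance process expect the "vol1-readdir-head-0" linkto value
  gluster volume set vol1 performance.parallel-readdir off
  # this is now rejected while the rebalance is still in progress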

Change-Id: I241ab966bdd850e667f7768840540546f5289483
BUG: 1436090
Signed-off-by: Poornima G &lt;pgurusid@redhat.com&gt;
Reviewed-on: https://review.gluster.org/17056
Smoke: Gluster Build System &lt;jenkins@build.gluster.org&gt;
NetBSD-regression: NetBSD Build System &lt;jenkins@build.gluster.org&gt;
CentOS-regression: Gluster Build System &lt;jenkins@build.gluster.org&gt;
Reviewed-by: Atin Mukherjee &lt;amukherj@redhat.com&gt;
Reviewed-by: Raghavendra G &lt;rgowdapp@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
