glusterfs.git/tests/basic/ec, branch v6.2

cluster/ec: fix fd reopen

2019-05-08T13:54:59+00:00

Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error from being propagated to the main
fop.
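
The two rules above can be sketched with brick bitmasks. This is only
an illustration: the type and function names (brick_mask_t,
reopen_candidates, finish_reopen) are hypothetical and are not taken
from the EC sources.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bitmask type: bit i set means brick i. */
typedef uint64_t brick_mask_t;

/* Bricks worth a reopen attempt: up, locked by the current fop, and
 * not already holding an open fd. */
brick_mask_t
reopen_candidates(brick_mask_t up, brick_mask_t locked, brick_mask_t fd_open)
{
    return up & locked & ~fd_open;
}

/* Record the outcome of a reopen sub-fop. A failed reopen (negative
 * reopen_ret) only leaves the brick out of the fd's open mask; the
 * result of the main fop is returned unchanged in every case. */
int
finish_reopen(int main_ret, int reopen_ret, int brick, brick_mask_t *fd_open)
{
    assert(brick >= 0 && brick < 64);

    if (reopen_ret >= 0) {
        *fd_open |= (brick_mask_t)1 << brick;
    }

    return main_ret;
}
```

With six bricks up (0x3F), locks held on bricks 0-4 (0x1F) and the fd
already open on bricks 0-3 (0x0F), only brick 4 is a candidate.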

To implement this behaviour, an argument used to indicate the minimum
number of required answers has been overloaded to also include some
flags. To make the change consistent, it has been necessary to rename
the argument, which means that a lot of files have been changed.
However, there are no functional changes.
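
A minimal sketch of how one integer argument can carry both a minimum
answer count and flags. The bit layout and every name below are
assumptions made for illustration; the actual macros in the EC code
may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the low 16 bits hold the minimum number of
 * required answers, the upper bits hold behaviour flags. */
#define FOP_MINIMUM_MASK            0x0000FFFFu
#define FOP_FLAG_NO_PROPAGATE_ERROR 0x00010000u
#define FOP_FLAG_LOCKED_BRICKS_ONLY 0x00020000u

/* Combine a minimum count and a set of flags into one value. */
uint32_t
fop_flags_make(uint32_t minimum, uint32_t flags)
{
    assert((minimum & ~FOP_MINIMUM_MASK) == 0);
    assert((flags & FOP_MINIMUM_MASK) == 0);

    return minimum | flags;
}

/* Extract the minimum number of required answers. */
uint32_t
fop_flags_minimum(uint32_t fop_flags)
{
    return fop_flags & FOP_MINIMUM_MASK;
}

/* Check whether a given flag is set. */
int
fop_flags_has(uint32_t fop_flags, uint32_t flag)
{
    return (fop_flags & flag) != 0;
}
```

Callers that previously passed a plain minimum keep working with this
layout, because a value with no flag bits set decodes to the same
minimum.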

This change has also uncovered a problem in the discard code, which
didn't correctly process requests of small sizes because no real
discard fop was being processed, only a write of 0's on some region. In
this case some fields of the fop remained uninitialized or with
incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Backport of:
> Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
> BUG: bz#1699866
> Signed-off-by: Xavi Hernandez 

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1699917
Signed-off-by: Xavi Hernandez

tests: run nfs tests only if --enable-gnfs is provided

2019-01-24T15:18:00+00:00

Fixes: bz#1665358
Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d
Signed-off-by: Amar Tumballi

cluster/ec: Fix failure of tests/basic/ec/ec-1468261.t

2018-09-25T13:57:42+00:00

Problem:
In this test we rely on the eager-lock duration of 1 second
to delay the post-op + unlock phase of an entry fop, so that
within this 1 second we can kill 2 bricks and the dirty flag
can be set on the directory.

Solution:
To fix this issue, we should set the others.eager-lock
option to "ON" explicitly in the beginning of this test.

Change-Id: I19bbb9c15d7bdf96a96b20587c618192d0b740ef
fixes bz#1632161
Signed-off-by: Ashish Pandey

Land part 2 of clang-format changes

2018-09-12T12:22:45+00:00

Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu

multiple files: calloc -> malloc

2018-09-04T05:09:09+00:00

xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible

xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible

It doesn't make sense to calloc (allocate and clear) memory
when the code right away fills that memory with data.
The compiler may already optimize this away; otherwise the
change gives a microscopic performance improvement.

In some cases, also changed allocation size to be sizeof some
struct or type instead of a pointer - easier to read.
In some cases, removed redundant strlen() calls by saving the result
into a variable.

1. Only done for the straightforward cases. There's room for improvement.
2. Please review carefully, especially string allocations and their
terminating NUL character.
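
As an illustration of the pattern, here is a minimal sketch that uses
plain malloc() rather than the GF_MALLOC() wrapper, so that it builds
outside the GlusterFS tree. The function name copy_string is
hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy a string into a malloc()ed buffer. Zeroing the buffer first
 * (as calloc() would) is wasted work, because every byte, including
 * the terminator, is overwritten right after the allocation. */
char *
copy_string(const char *src)
{
    size_t len;
    char *dst;

    assert(src != NULL);

    len = strlen(src);     /* computed once and reused below */
    dst = malloc(len + 1); /* +1 for the terminating NUL */
    if (dst == NULL) {
        return NULL;
    }

    memcpy(dst, src, len + 1); /* copies the terminating NUL too */

    return dst;
}
```

The risk to look for in review is a buffer that is only partially
filled: with malloc() the remaining bytes are indeterminate, so a
missing terminator is no longer hidden by zeroed memory.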

Only compile-tested!

updates: bz#1193929
Original-Author: Yaniv Kaul 
Signed-off-by: Yaniv Kaul 
Signed-off-by: Amar Tumballi 

Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2

All: run codespell on the code and fix issues.

2018-07-22T14:40:16+00:00

Please review; it's not always just the comments that were fixed.
Of course, I had to revert all calls to creat() that were changed
to create() ...

Only compile-tested!

Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
updates: bz#1193929
Signed-off-by: Yaniv Kaul

afr,ec: Print if the subvolume is up in statedump

2018-07-03T12:48:56+00:00

fixes bz#1597156
Change-Id: I323eb9190e40b12df216698dcdba74a6d336beeb
Signed-off-by: Pranith Kumar K

cluster/ec: send list-node-uuids request to all subvolumes

2018-03-28T18:19:25+00:00

The xattr trusted.glusterfs.list-node-uuids was only sent to a single
subvolume. This was returning null uuids from the other subvolumes as
if they were down.

This fix forces that xattr to be requested from all subvolumes.

Change-Id: If62eb39a6857258923ba625e153d4ad79018ea2f
fixes: bz#1561406
Signed-off-by: Xavi Hernandez

cluster/ec: Add test cases for stripe-cache option

2018-03-20T19:07:15+00:00

Change-Id: I1508a336a7a927b389a19815ef57001cdf29b109
BUG: 1558074
Signed-off-by: Ashish Pandey

cluster/ec: Change default read policy to gfid-hash

2018-03-14T06:22:05+00:00

Problem:
Whenever we read data from a file over NFS, NFS reads
more data than requested and caches it. Based on the
stat information, it checks whether the cached/pre-read
data is still valid.

Consider a 4 + 2 EC volume where all the bricks are on
different nodes.

In EC, with the round-robin read policy, reads are sent to
different sets of data bricks. This balances the read
fops across all the bricks and avoids overloading the
same set of bricks.

Due to small differences in clock speed, it is possible
to get minor differences in atime, mtime or ctime from
different bricks. That can cause a different stat to be
returned to NFS, based on which NFS will discard
cached/pre-read data that has not actually changed and
could still be used.

Solution:
Change the default read policy for EC to gfid-hash. That
forces all reads of a file to go to the same set of bricks.
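
The difference between the two policies can be sketched as follows.
This is an illustration only: the FNV-1a hash and all function names
are assumptions, not the hash or the code that EC actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative FNV-1a hash over the 16-byte gfid. */
uint32_t
gfid_hash(const unsigned char gfid[16])
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < 16; i++) {
        h ^= gfid[i];
        h *= 16777619u;
    }

    return h;
}

/* gfid-hash policy: the first brick depends only on the gfid, so
 * every read of the same file starts at the same brick. */
unsigned
first_brick_gfid_hash(const unsigned char gfid[16], unsigned nodes)
{
    assert(nodes > 0);

    return gfid_hash(gfid) % nodes;
}

/* round-robin policy: the first brick advances on every read, so
 * consecutive reads of the same file may hit different bricks. */
unsigned
first_brick_round_robin(unsigned *counter, unsigned nodes)
{
    assert(nodes > 0);

    return (*counter)++ % nodes;
}
```

Because the gfid-hash choice is stable per file, the stat returned
with each read comes from the same bricks, which avoids the spurious
timestamp differences described above.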

Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84
BUG: 1554743
Signed-off-by: Ashish Pandey