<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/basic/ec, branch v6.2</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/ec: fix fd reopen</title>
<updated>2019-05-08T13:54:59+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-04-16T08:49:47+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=bf69fa4727f2b4432d7d54312bc0177cbcf44936'/>
<id>bf69fa4727f2b4432d7d54312bc0177cbcf44936</id>
<content type='text'>
Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error to be propagated to the main
fop.

To implement this behaviour an argument used to indicate the minimum
number of required answers has overloaded to also include some flags. To
make the change consistent, it has been necessary to rename the
argument, which means that a lot of files have been changed. However
there are no functional changes.

This change has also uncovered a problem in discard code, which didn't
correctely process requests of small sizes because no real discard fop
was being processed, only a write of 0's on some region. In this case
some fields of the fop remained uninitialized or with incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Backport of:
&gt; Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
&gt; BUG: bz#1699866
&gt; Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1699917
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error to be propagated to the main
fop.

To implement this behaviour an argument used to indicate the minimum
number of required answers has overloaded to also include some flags. To
make the change consistent, it has been necessary to rename the
argument, which means that a lot of files have been changed. However
there are no functional changes.

This change has also uncovered a problem in discard code, which didn't
correctely process requests of small sizes because no real discard fop
was being processed, only a write of 0's on some region. In this case
some fields of the fop remained uninitialized or with incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Backport of:
&gt; Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
&gt; BUG: bz#1699866
&gt; Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1699917
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: run nfs tests only if --enable-gnfs is provided</title>
<updated>2019-01-24T15:18:00+00:00</updated>
<author>
<name>Amar Tumballi</name>
<email>amarts@redhat.com</email>
</author>
<published>2019-01-11T06:27:07+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=99b3ab0cf3d3389a2ff89c29cfff906cd36693a3'/>
<id>99b3ab0cf3d3389a2ff89c29cfff906cd36693a3</id>
<content type='text'>
Fixes: bz#1665358
Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes: bz#1665358
Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Fix failure of tests/basic/ec/ec-1468261.t</title>
<updated>2018-09-25T13:57:42+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2018-09-24T09:20:24+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=5a3d87fd1239500873c31d38838706b64dbb1b91'/>
<id>5a3d87fd1239500873c31d38838706b64dbb1b91</id>
<content type='text'>
Problem:
In this test we are relying on eager-lock time
duration of 1 second to delay the post op + unlock phase
of an entry fop so that in this 1 second we can kill 2
bricks and dirty on directory could be set.

Solution:
To fix this issue, we should set the others.eager-lock
option to "ON" explicitly in the beginning of this test.

Change-Id: I19bbb9c15d7bdf96a96b20587c618192d0b740ef
fixes bz#1632161
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In this test we are relying on eager-lock time
duration of 1 second to delay the post op + unlock phase
of an entry fop so that in this 1 second we can kill 2
bricks and dirty on directory could be set.

Solution:
To fix this issue, we should set the others.eager-lock
option to "ON" explicitly in the beginning of this test.

Change-Id: I19bbb9c15d7bdf96a96b20587c618192d0b740ef
fixes bz#1632161
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Land part 2 of clang-format changes</title>
<updated>2018-09-12T12:22:45+00:00</updated>
<author>
<name>Gluster Ant</name>
<email>bugzilla-bot@gluster.org</email>
</author>
<published>2018-09-12T12:22:45+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e16868dede6455cab644805af6fe1ac312775e13'/>
<id>e16868dede6455cab644805af6fe1ac312775e13</id>
<content type='text'>
Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu &lt;nigelb@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Ia84cc24c8924e6d22d02ac15f611c10e26db99b4
Signed-off-by: Nigel Babu &lt;nigelb@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>multiple files: calloc -&gt; malloc</title>
<updated>2018-09-04T05:09:09+00:00</updated>
<author>
<name>Yaniv Kaul</name>
<email>ykaul@redhat.com</email>
</author>
<published>2018-08-08T18:36:28+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=5276e8f27e3021d5da3d3055caed6f9a1d964c93'/>
<id>5276e8f27e3021d5da3d3055caed6f9a1d964c93</id>
<content type='text'>
xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible

xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible

It doesn't make sense to calloc (allocate and clear) memory
when the code right away fills that memory with data.
It may be optimized by the compiler, or have a microscopic
performance improvement.

In some cases, also changed allocation size to be sizeof some
struct or type instead of a pointer - easier to read.
In some cases, removed redundant strlen() calls by saving the result
into a variable.

1. Only done for the straightforward cases. There's room for improvement.
2. Please review carefully, especially for string allocation, with the
terminating NULL string.

Only compile-tested!

updates: bz#1193929
Original-Author: Yaniv Kaul &lt;ykaul@redhat.com&gt;
Signed-off-by: Yaniv Kaul &lt;ykaul@redhat.com&gt;
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;

Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
xlators/cluster/stripe/src/stripe-helpers.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible

xlators/cluster/dht/src/tier.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-layout.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-helper.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/dht/src/dht-common.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/afr/src/afr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
xlators/cluster/afr/src/afr-inode-read.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/bugs/replicate/bug-1250170-fsync.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/basic/gfapi/gfapi-async-calls-test.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
tests/basic/ec/ec-fast-fgetxattr.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/xdr/src/glusterfs3.h: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/rpc-transport/socket/src/socket.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
rpc/rpc-lib/src/rpc-clnt.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
extras/geo-rep/gsync-sync-gfid.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-xml-output.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-rpc-ops.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-volume.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-system.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-snapshot.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-peer.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible
cli/src/cli-cmd-global.c: Move to GF_MALLOC() instead of GF_CALLOC() when possible

It doesn't make sense to calloc (allocate and clear) memory
when the code right away fills that memory with data.
It may be optimized by the compiler, or have a microscopic
performance improvement.

In some cases, also changed allocation size to be sizeof some
struct or type instead of a pointer - easier to read.
In some cases, removed redundant strlen() calls by saving the result
into a variable.

1. Only done for the straightforward cases. There's room for improvement.
2. Please review carefully, especially for string allocation, with the
terminating NULL string.

Only compile-tested!

updates: bz#1193929
Original-Author: Yaniv Kaul &lt;ykaul@redhat.com&gt;
Signed-off-by: Yaniv Kaul &lt;ykaul@redhat.com&gt;
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;

Change-Id: I16274dca4078a1d06ae09a0daf027d734b631ac2
</pre>
</div>
</content>
</entry>
<entry>
<title>All: run codespell on the code and fix issues.</title>
<updated>2018-07-22T14:40:16+00:00</updated>
<author>
<name>Yaniv Kaul</name>
<email>ykaul@redhat.com</email>
</author>
<published>2018-07-16T14:03:17+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=621138ce763eda8270d0a4f6d7209fd50ada8787'/>
<id>621138ce763eda8270d0a4f6d7209fd50ada8787</id>
<content type='text'>
Please review, it's not always just the comments that were fixed.
I've had to revert of course all calls to creat() that were changed
to create() ...

Only compile-tested!

Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
updates: bz#1193929
Signed-off-by: Yaniv Kaul &lt;ykaul@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Please review, it's not always just the comments that were fixed.
I've had to revert of course all calls to creat() that were changed
to create() ...

Only compile-tested!

Change-Id: I7d02e82d9766e272a7fd9cc68e51901d69e5aab5
updates: bz#1193929
Signed-off-by: Yaniv Kaul &lt;ykaul@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>afr,ec: Print if the subvolume is up in statedump</title>
<updated>2018-07-03T12:48:56+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2018-07-02T05:09:56+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=ce69fb659657a6d7ca9f5b44bab282533e3bfd26'/>
<id>ce69fb659657a6d7ca9f5b44bab282533e3bfd26</id>
<content type='text'>
fixes bz#1597156
Change-Id: I323eb9190e40b12df216698dcdba74a6d336beeb
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fixes bz#1597156
Change-Id: I323eb9190e40b12df216698dcdba74a6d336beeb
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: send list-node-uuids request to all subvolumes</title>
<updated>2018-03-28T18:19:25+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2018-03-28T09:34:49+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=89577d8b0ad7bd1ee2cec2f0e047591b1cd0f7b8'/>
<id>89577d8b0ad7bd1ee2cec2f0e047591b1cd0f7b8</id>
<content type='text'>
The xattr trusted.glusterfs.list-node-uuids was only sent to a single
subvolume. This was returning null uuids from the other subvolumes as
if they were down.

This fix forces that xattr to be requested from all subvolumes.

Change-Id: If62eb39a6857258923ba625e153d4ad79018ea2f
fixes: bz#1561406
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The xattr trusted.glusterfs.list-node-uuids was only sent to a single
subvolume. This was returning null uuids from the other subvolumes as
if they were down.

This fix forces that xattr to be requested from all subvolumes.

Change-Id: If62eb39a6857258923ba625e153d4ad79018ea2f
fixes: bz#1561406
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Add test cases for stripe-cache option</title>
<updated>2018-03-20T19:07:15+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2017-11-28T12:48:12+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=ade6262cb63b96834dce101a4a237c0c8128077e'/>
<id>ade6262cb63b96834dce101a4a237c0c8128077e</id>
<content type='text'>
Change-Id: I1508a336a7a927b389a19815ef57001cdf29b109
BUG: 1558074
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I1508a336a7a927b389a19815ef57001cdf29b109
BUG: 1558074
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Change default read policy to gfid-hash</title>
<updated>2018-03-14T06:22:05+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2018-03-13T08:33:20+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=f32f85c4e6c8128643e1f88fe981a63680e79fe0'/>
<id>f32f85c4e6c8128643e1f88fe981a63680e79fe0</id>
<content type='text'>
Problem:
Whenever we read data from file over NFS, NFS reads
more data then requested and caches it. Based on the
stat information it makes sure that the cached/pre-read
data is valid or not.

Consider 4 + 2 EC volume and all the bricks are on
differnt nodes.

In EC, with round-robin read policy, reads are sent on
different set of data bricks. This way, it balances the
read fops to go on all the bricks and avoid heating UP
(overloading) same set of bricks.

Due to small difference in clock speed, it is possible
that we get minor difference for atime, mtime or ctime
for different bricks. That might cause a different stat
returned to NFS based on which NFS will discard
cached/pre-read data which is actually not changed and
could be used.

Solution:
Change read policy for EC as gfid-hash. That will force
all the read to go to same set of bricks.

Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84
BUG: 1554743
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Whenever we read data from file over NFS, NFS reads
more data then requested and caches it. Based on the
stat information it makes sure that the cached/pre-read
data is valid or not.

Consider 4 + 2 EC volume and all the bricks are on
differnt nodes.

In EC, with round-robin read policy, reads are sent on
different set of data bricks. This way, it balances the
read fops to go on all the bricks and avoid heating UP
(overloading) same set of bricks.

Due to small difference in clock speed, it is possible
that we get minor difference for atime, mtime or ctime
for different bricks. That might cause a different stat
returned to NFS based on which NFS will discard
cached/pre-read data which is actually not changed and
could be used.

Solution:
Change read policy for EC as gfid-hash. That will force
all the read to go to same set of bricks.

Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84
BUG: 1554743
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
