<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/tests/basic/ec, branch release-7</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/ec: fix EIO error for concurrent writes on sparse files</title>
<updated>2019-08-22T05:56:58+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2019-07-17T12:50:22+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=9710a3c5e76b8b7bab4e7680537852e79b62f64e'/>
<id>9710a3c5e76b8b7bab4e7680537852e79b62f64e</id>
<content type='text'>
EC doesn't allow concurrent writes on overlapping areas, they are
serialized. However non-overlapping writes are serviced in parallel.
When a write is not aligned, EC first needs to read the entire chunk
from disk, apply the modified fragment and write it again.

The problem appears on sparse files because a write to an offset
implicitly creates data on offsets below it (so, in some way, they
are overlapping). For example, if a file is empty and we read 10 bytes
from offset 10, read() will return 0 bytes. Now, if we write one byte
at offset 1M and retry the same read, the system call will return 10
bytes (all containing 0's).

So if we have two writes, the first one at offset 10 and the second one
at offset 1M, EC will send both in parallel because they do not overlap.
However, the first one will try to read missing data from the first chunk
(i.e. offsets 0 to 9) to recombine the entire chunk and do the final write.
This read will happen in parallel with the write to 1M. What could happen
is that half of the bricks process the write before the read, and the
half do the read before the write. Some bricks will return 10 bytes of
data while the otherw will return 0 bytes (because the file on the brick
has not been expanded yet).

When EC tries to recombine the answers from the bricks, it can't, because
it needs more than half consistent answers to recover the data. So this
read fails with EIO error. This error is propagated to the parent write,
which is aborted and EIO is returned to the application.

The issue happened because EC assumed that a write to a given offset
implies that offsets below it exist.

This fix prevents the read of the chunk from bricks if the current size
of the file is smaller than the read chunk offset. This size is
correctly tracked, so this fixes the issue.

Also modifying ec-stripe.t file for Test #13 within it.
In this patch, if a file size is less than the offset we are writing, we
fill zeros in head and tail and do not consider it strip cache miss.
That actually make sense as we know what data that part holds and there is
no need of reading it from bricks.

Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6
Fixes: bz#1739427
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
EC doesn't allow concurrent writes on overlapping areas, they are
serialized. However non-overlapping writes are serviced in parallel.
When a write is not aligned, EC first needs to read the entire chunk
from disk, apply the modified fragment and write it again.

The problem appears on sparse files because a write to an offset
implicitly creates data on offsets below it (so, in some way, they
are overlapping). For example, if a file is empty and we read 10 bytes
from offset 10, read() will return 0 bytes. Now, if we write one byte
at offset 1M and retry the same read, the system call will return 10
bytes (all containing 0's).

So if we have two writes, the first one at offset 10 and the second one
at offset 1M, EC will send both in parallel because they do not overlap.
However, the first one will try to read missing data from the first chunk
(i.e. offsets 0 to 9) to recombine the entire chunk and do the final write.
This read will happen in parallel with the write to 1M. What could happen
is that half of the bricks process the write before the read, and the
half do the read before the write. Some bricks will return 10 bytes of
data while the otherw will return 0 bytes (because the file on the brick
has not been expanded yet).

When EC tries to recombine the answers from the bricks, it can't, because
it needs more than half consistent answers to recover the data. So this
read fails with EIO error. This error is propagated to the parent write,
which is aborted and EIO is returned to the application.

The issue happened because EC assumed that a write to a given offset
implies that offsets below it exist.

This fix prevents the read of the chunk from bricks if the current size
of the file is smaller than the read chunk offset. This size is
correctly tracked, so this fixes the issue.

Also modifying ec-stripe.t file for Test #13 within it.
In this patch, if a file size is less than the offset we are writing, we
fill zeros in head and tail and do not consider it strip cache miss.
That actually make sense as we know what data that part holds and there is
no need of reading it from bricks.

Change-Id: Ic342e8c35c555b8534109e9314c9a0710b6225d6
Fixes: bz#1739427
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Prevent double pre-op xattrops</title>
<updated>2019-06-22T05:06:15+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-06-20T11:35:49+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=e2f2f414f855f9a3da269745ae3db2d27baa7f3d'/>
<id>e2f2f414f855f9a3da269745ae3db2d27baa7f3d</id>
<content type='text'>
Problem:
Race:
Thread-1                                    Thread-2
1) Does ec_get_size_version() to perform
pre-op fxattrop as part of write-1
                                           2) Calls ec_set_dirty_flag() in
                                              ec_get_size_version() for write-2.
					      This sets dirty[] to 1
3) Completes executing
ec_prepare_update_cbk leading to
ctx-&gt;dirty[] = '1'
					   4) Takes LOCK(inode-&gt;lock) to check if there are
					      any flags and sets dirty-flag because
				              lock-&gt;waiting_flag is 0 now. This leads to
					      fxattrop to increment on-disk dirty[] to '2'

At the end of the writes the file will be marked for heal even when it doesn't need heal.

Fix:
Perform ec_set_dirty_flag() and other checks inside LOCK() to prevent dirty[] to be marked
as '1' in step 2) above

Updates bz#1593224
Change-Id: Icac2ab39c0b1e7e154387800fbededc561612865
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Race:
Thread-1                                    Thread-2
1) Does ec_get_size_version() to perform
pre-op fxattrop as part of write-1
                                           2) Calls ec_set_dirty_flag() in
                                              ec_get_size_version() for write-2.
					      This sets dirty[] to 1
3) Completes executing
ec_prepare_update_cbk leading to
ctx-&gt;dirty[] = '1'
					   4) Takes LOCK(inode-&gt;lock) to check if there are
					      any flags and sets dirty-flag because
				              lock-&gt;waiting_flag is 0 now. This leads to
					      fxattrop to increment on-disk dirty[] to '2'

At the end of the writes the file will be marked for heal even when it doesn't need heal.

Fix:
Perform ec_set_dirty_flag() and other checks inside LOCK() to prevent dirty[] to be marked
as '1' in step 2) above

Updates bz#1593224
Change-Id: Icac2ab39c0b1e7e154387800fbededc561612865
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>core: improve timer accuracy</title>
<updated>2019-06-17T10:29:50+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>jahernan@redhat.com</email>
</author>
<published>2018-01-19T11:18:13+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=7cc20c1eb09c041b331c4add6c24212fbdcda3b4'/>
<id>7cc20c1eb09c041b331c4add6c24212fbdcda3b4</id>
<content type='text'>
Also fixed some issues on test ec-1468261.t.

Change-Id: If156f86af986d9eed13cdd1f15c5a7214cd11706
Updates: bz#1193929
Signed-off-by: Xavier Hernandez &lt;jahernan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Also fixed some issues on test ec-1468261.t.

Change-Id: If156f86af986d9eed13cdd1f15c5a7214cd11706
Updates: bz#1193929
Signed-off-by: Xavier Hernandez &lt;jahernan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Test openfd heal doesn't truncate files</title>
<updated>2019-05-24T03:45:39+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-05-06T09:57:44+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=0b83b8af60916dcdb850f15da7e7b406809dd1d5'/>
<id>0b83b8af60916dcdb850f15da7e7b406809dd1d5</id>
<content type='text'>
fixes bz#1706603
Change-Id: I0bfd30f787f157b7a54f71088f767ccfd7621208
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
fixes bz#1706603
Change-Id: I0bfd30f787f157b7a54f71088f767ccfd7621208
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: improve and fix some test scripts</title>
<updated>2019-05-09T09:24:10+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>jahernan@redhat.com</email>
</author>
<published>2018-01-19T11:18:13+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=aee9e3d27f56e4c0c2f981f20b15189eb7ffce51'/>
<id>aee9e3d27f56e4c0c2f981f20b15189eb7ffce51</id>
<content type='text'>
Change-Id: Iceefe22af754096c599dc570d4894d14fce4deae
Updates: bz#1193929
Signed-off-by: Xavier Hernandez &lt;xhernandez@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: Iceefe22af754096c599dc570d4894d14fce4deae
Updates: bz#1193929
Signed-off-by: Xavier Hernandez &lt;xhernandez@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: fix fd reopen</title>
<updated>2019-04-23T11:28:58+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2019-04-12T15:54:44+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=4f0db1373be93e05b6fc451d2f07514704d3c4ca'/>
<id>4f0db1373be93e05b6fc451d2f07514704d3c4ca</id>
<content type='text'>
Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error to be propagated to the main
fop.

To implement this behaviour an argument used to indicate the minimum
number of required answers has overloaded to also include some flags. To
make the change consistent, it has been necessary to rename the
argument, which means that a lot of files have been changed. However
there are no functional changes.

This change has also uncovered a problem in discard code, which didn't
correctely process requests of small sizes because no real discard fop
was being processed, only a write of 0's on some region. In this case
some fields of the fop remained uninitialized or with incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1699866
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Currently EC tries to reopen fd's that have been opened while a brick
was down. This is done as part of regular write operations, just after
having acquired the locks, and it's sent as a sub-fop of the main write
fop.

There were two problems:

1. The reopen was attempted on all UP bricks, even if a previous lock
didn't succeed. This is incorrect because most probably the open will
fail.

2. If reopen is sent and fails, the error is propagated to the main
operation, causing it to fail when it shouldn't.

To fix this, we only attempt reopens on bricks where the current fop
owns a lock, and we prevent any error to be propagated to the main
fop.

To implement this behaviour an argument used to indicate the minimum
number of required answers has overloaded to also include some flags. To
make the change consistent, it has been necessary to rename the
argument, which means that a lot of files have been changed. However
there are no functional changes.

This change has also uncovered a problem in discard code, which didn't
correctely process requests of small sizes because no real discard fop
was being processed, only a write of 0's on some region. In this case
some fields of the fop remained uninitialized or with incorrect values.
To fix this, a new function has been created to simulate success on a
fop and it's used in the discard case.

Thanks to Pranith for providing a test script that has also detected an
issue in this patch. This patch includes a small modification of this
script to force data to be written into bricks before stopping them.

Change-Id: If272343873369186c2fb8f43c1d9c52c3ea304ec
Fixes: bz#1699866
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: Heal should fail when read/write fails</title>
<updated>2019-04-16T09:07:23+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-04-16T08:49:47+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=b3a6d4e32ec5826ce4e389030b99d7b08274ed29'/>
<id>b3a6d4e32ec5826ce4e389030b99d7b08274ed29</id>
<content type='text'>
updates: bz#1699866
Change-Id: I7ccd1fc5fc134eeb6d443c755962a20819320d48
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
updates: bz#1699866
Change-Id: I7ccd1fc5fc134eeb6d443c755962a20819320d48
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>ec: increase line coverage of ec</title>
<updated>2019-04-10T04:32:00+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2019-04-04T21:57:20+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=398b53adb281e459281f74a9f96a12ee48da7daa'/>
<id>398b53adb281e459281f74a9f96a12ee48da7daa</id>
<content type='text'>
Test ec-cpu-extensions.t has been modified so that it uses a bigger
matrix. This makes use of more functions from ec-code-c.c. Changing
read-policy to round-robin increases even more the functions used,
reaching 100% of line and function coverage for this file.

Change-Id: I26e4d33269cbd67f5d76d862f4cf1e69285e85e1
updates: bz#1193929
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Test ec-cpu-extensions.t has been modified so that it uses a bigger
matrix. This makes use of more functions from ec-code-c.c. Changing
read-policy to round-robin increases even more the functions used,
reaching 100% of line and function coverage for this file.

Change-Id: I26e4d33269cbd67f5d76d862f4cf1e69285e85e1
updates: bz#1193929
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>tests: run nfs tests only if --enable-gnfs is provided</title>
<updated>2019-01-24T15:18:00+00:00</updated>
<author>
<name>Amar Tumballi</name>
<email>amarts@redhat.com</email>
</author>
<published>2019-01-11T06:27:07+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=99b3ab0cf3d3389a2ff89c29cfff906cd36693a3'/>
<id>99b3ab0cf3d3389a2ff89c29cfff906cd36693a3</id>
<content type='text'>
Fixes: bz#1665358
Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Fixes: bz#1665358
Change-Id: Idbf88ec3ac683733b32c313377eeb72f2819bf0d
Signed-off-by: Amar Tumballi &lt;amarts@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Fix failure of tests/basic/ec/ec-1468261.t</title>
<updated>2018-09-25T13:57:42+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2018-09-24T09:20:24+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=5a3d87fd1239500873c31d38838706b64dbb1b91'/>
<id>5a3d87fd1239500873c31d38838706b64dbb1b91</id>
<content type='text'>
Problem:
In this test we are relying on eager-lock time
duration of 1 second to delay the post op + unlock phase
of an entry fop so that in this 1 second we can kill 2
bricks and dirty on directory could be set.

Solution:
To fix this issue, we should set the others.eager-lock
option to "ON" explicitly in the beginning of this test.

Change-Id: I19bbb9c15d7bdf96a96b20587c618192d0b740ef
fixes bz#1632161
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In this test we are relying on eager-lock time
duration of 1 second to delay the post op + unlock phase
of an entry fop so that in this 1 second we can kill 2
bricks and dirty on directory could be set.

Solution:
To fix this issue, we should set the others.eager-lock
option to "ON" explicitly in the beginning of this test.

Change-Id: I19bbb9c15d7bdf96a96b20587c618192d0b740ef
fixes bz#1632161
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
