<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster/ec/src/ec.c, branch release-4.1</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/ec: Fix pre-op xattrop management</title>
<updated>2018-05-25T02:06:11+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2018-05-15T09:37:16+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=16b3830c337acacc1155c6d463c3ced852603e12'/>
<id>16b3830c337acacc1155c6d463c3ced852603e12</id>
<content type='text'>
Multiple pre-op xattrop can be simultaneously being processed. On the cbk
it was checked if the fop was waiting for some specific data (like size and
version) and, if so, it was assumed that this answer should contain that
data.

This is not true, since a fop can be waiting for some data, but it may come
from the xattrop of another fop.

This patch differentiates between needing some information and providing it.

This is related to parallel writes. Disabling them fixed the problem, but
also prevented concurrent reads. A change has been made so that disabling
parallel writes still allows parallel reads.

Backport of:
&gt; BUG: 1578325

Fixes: bz#1582056
Change-Id: I74772ad6b80b7b37805da93d5ec3ae099e96b041
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Multiple pre-op xattrop can be simultaneously being processed. On the cbk
it was checked if the fop was waiting for some specific data (like size and
version) and, if so, it was assumed that this answer should contain that
data.

This is not true, since a fop can be waiting for some data, but it may come
from the xattrop of another fop.

This patch differentiates between needing some information and providing it.

This is related to parallel writes. Disabling them fixed the problem, but
also prevented concurrent reads. A change has been made so that disabling
parallel writes still allows parallel reads.

Backport of:
&gt; BUG: 1578325

Fixes: bz#1582056
Change-Id: I74772ad6b80b7b37805da93d5ec3ae099e96b041
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Turn ON the stripe-cache option by default</title>
<updated>2018-04-06T07:05:55+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2018-04-05T06:39:56+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=351572de9f08ba378d14552538763acec8917e53'/>
<id>351572de9f08ba378d14552538763acec8917e53</id>
<content type='text'>
Change-Id: I0a290396c30c635b13ee73004d20259efb76a954
fixes: bz#1563945
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Change-Id: I0a290396c30c635b13ee73004d20259efb76a954
fixes: bz#1563945
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: send list-node-uuids request to all subvolumes</title>
<updated>2018-03-28T18:19:25+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>xhernandez@redhat.com</email>
</author>
<published>2018-03-28T09:34:49+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=89577d8b0ad7bd1ee2cec2f0e047591b1cd0f7b8'/>
<id>89577d8b0ad7bd1ee2cec2f0e047591b1cd0f7b8</id>
<content type='text'>
The xattr trusted.glusterfs.list-node-uuids was only sent to a single
subvolume. This was returning null uuids from the other subvolumes as
if they were down.

This fix forces that xattr to be requested from all subvolumes.

Change-Id: If62eb39a6857258923ba625e153d4ad79018ea2f
fixes: bz#1561406
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The xattr trusted.glusterfs.list-node-uuids was only sent to a single
subvolume. This was returning null uuids from the other subvolumes as
if they were down.

This fix forces that xattr to be requested from all subvolumes.

Change-Id: If62eb39a6857258923ba625e153d4ad79018ea2f
fixes: bz#1561406
Signed-off-by: Xavi Hernandez &lt;xhernandez@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Change default read policy to gfid-hash</title>
<updated>2018-03-14T06:22:05+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2018-03-13T08:33:20+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=f32f85c4e6c8128643e1f88fe981a63680e79fe0'/>
<id>f32f85c4e6c8128643e1f88fe981a63680e79fe0</id>
<content type='text'>
Problem:
Whenever we read data from file over NFS, NFS reads
more data then requested and caches it. Based on the
stat information it makes sure that the cached/pre-read
data is valid or not.

Consider 4 + 2 EC volume and all the bricks are on
differnt nodes.

In EC, with round-robin read policy, reads are sent on
different set of data bricks. This way, it balances the
read fops to go on all the bricks and avoid heating UP
(overloading) same set of bricks.

Due to small difference in clock speed, it is possible
that we get minor difference for atime, mtime or ctime
for different bricks. That might cause a different stat
returned to NFS based on which NFS will discard
cached/pre-read data which is actually not changed and
could be used.

Solution:
Change read policy for EC as gfid-hash. That will force
all the read to go to same set of bricks.

Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84
BUG: 1554743
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
Whenever we read data from file over NFS, NFS reads
more data then requested and caches it. Based on the
stat information it makes sure that the cached/pre-read
data is valid or not.

Consider 4 + 2 EC volume and all the bricks are on
differnt nodes.

In EC, with round-robin read policy, reads are sent on
different set of data bricks. This way, it balances the
read fops to go on all the bricks and avoid heating UP
(overloading) same set of bricks.

Due to small difference in clock speed, it is possible
that we get minor difference for atime, mtime or ctime
for different bricks. That might cause a different stat
returned to NFS based on which NFS will discard
cached/pre-read data which is actually not changed and
could be used.

Solution:
Change read policy for EC as gfid-hash. That will force
all the read to go to same set of bricks.

Change-Id: I825441cc519e94bf3dc3aa0bd4cb7c6ae6392c84
BUG: 1554743
Signed-off-by: Ashish Pandey &lt;aspandey@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: avoid delays in self-heal</title>
<updated>2018-03-14T03:12:27+00:00</updated>
<author>
<name>Xavi Hernandez</name>
<email>jahernan@redhat.com</email>
</author>
<published>2018-02-21T16:47:37+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=7f81067f4522f973e98aa5abbb4d2028da2a2e6f'/>
<id>7f81067f4522f973e98aa5abbb4d2028da2a2e6f</id>
<content type='text'>
Self-heal creates a thread per brick to sweep the index looking for
files that need to be healed. These threads are started before the
volume comes online, so nothing is done but waiting for the next
sweep. This happens once per minute.

When a replace brick command is executed, the new graph is loaded and
all index sweeper threads started. When all bricks have reported, a
getxattr request is sent to the root directory of the volume. This
causes a heal on it (because the new brick doesn't have good data),
and marks its contents as pending to be healed. This is done by the
index sweeper thread on the next round, one minute later.

This patch solves this problem by waking all index sweeper threads
after a successful check on the root directory.

Additionally, the index sweep thread scans the index directory
sequentially, but it might happen that after healing a directory entry
more index entries are created but skipped by the current directory
scan. This causes the remaining entries to be processed on the next
round, one minute later. The same can happen in the next round, so
the heal is running in bursts and taking a lot to finish, specially
on volumes with many directory levels.

This patch solves this problem by immediately restarting the index
sweep if a directory has been healed.

Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e
BUG: 1547662
Signed-off-by: Xavi Hernandez &lt;jahernan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Self-heal creates a thread per brick to sweep the index looking for
files that need to be healed. These threads are started before the
volume comes online, so nothing is done but waiting for the next
sweep. This happens once per minute.

When a replace brick command is executed, the new graph is loaded and
all index sweeper threads started. When all bricks have reported, a
getxattr request is sent to the root directory of the volume. This
causes a heal on it (because the new brick doesn't have good data),
and marks its contents as pending to be healed. This is done by the
index sweeper thread on the next round, one minute later.

This patch solves this problem by waking all index sweeper threads
after a successful check on the root directory.

Additionally, the index sweep thread scans the index directory
sequentially, but it might happen that after healing a directory entry
more index entries are created but skipped by the current directory
scan. This causes the remaining entries to be processed on the next
round, one minute later. The same can happen in the next round, so
the heal is running in bursts and taking a lot to finish, specially
on volumes with many directory levels.

This patch solves this problem by immediately restarting the index
sweep if a directory has been healed.

Change-Id: I58d9ab6ef17b30f704dc322e1d3d53b904e5f30e
BUG: 1547662
Signed-off-by: Xavi Hernandez &lt;jahernan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec : EC options for GD2</title>
<updated>2018-01-22T08:53:36+00:00</updated>
<author>
<name>Sunil Kumar Acharya</name>
<email>sheggodu@redhat.com</email>
</author>
<published>2017-09-01T09:34:09+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=10d74166f17fa44c06bd1357e0a4b0b052265425'/>
<id>10d74166f17fa44c06bd1357e0a4b0b052265425</id>
<content type='text'>
Updates #302

Change-Id: I31b4648f7b1a394fceece5cba8120c579c66edd9
Signed-off-by: Sunil Kumar Acharya &lt;sheggodu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Updates #302

Change-Id: I31b4648f7b1a394fceece5cba8120c579c66edd9
Signed-off-by: Sunil Kumar Acharya &lt;sheggodu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>locks: added inodelk/entrylk contention upcall notifications</title>
<updated>2018-01-16T10:37:22+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>xhernandez@datalab.es</email>
</author>
<published>2016-06-15T12:42:19+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=7ba7a4b27d124f4ee16fe4776a4670cd5b0160c4'/>
<id>7ba7a4b27d124f4ee16fe4776a4670cd5b0160c4</id>
<content type='text'>
The locks xlator now is able to send a contention notification to
the current owner of the lock.

This is only a notification that can be used to improve performance
of some client side operations that might benefit from extended
duration of lock ownership. Nothing is done if the lock owner decides
to ignore the message and to not release the lock. For forced
release of acquired resources, leases must be used.

Change-Id: I7f1ad32a0b4b445505b09908a050080ad848f8e0
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The locks xlator now is able to send a contention notification to
the current owner of the lock.

This is only a notification that can be used to improve performance
of some client side operations that might benefit from extended
duration of lock ownership. Nothing is done if the lock owner decides
to ignore the message and to not release the lock. For forced
release of acquired resources, leases must be used.

Change-Id: I7f1ad32a0b4b445505b09908a050080ad848f8e0
Signed-off-by: Xavier Hernandez &lt;xhernandez@datalab.es&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Fix possible shift overflow</title>
<updated>2017-12-22T16:19:53+00:00</updated>
<author>
<name>Xavier Hernandez</name>
<email>jahernan@redhat.com</email>
</author>
<published>2017-12-13T16:27:42+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=41120aa8bab4ca4496bb37b8986434be404ae255'/>
<id>41120aa8bab4ca4496bb37b8986434be404ae255</id>
<content type='text'>
A coverity scan has revelaed a potential shift overflow while scanning
the bitmap of available subvolumes. The actual overflow cannot happen,
but I've changed to test used to control the limit to make it explicit.

Change-Id: Ieb55f010bbca68a1d86a93e47822f7c709a26e83
BUG: 789278
Signed-off-by: Xavier Hernandez &lt;jahernan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
A coverity scan has revelaed a potential shift overflow while scanning
the bitmap of available subvolumes. The actual overflow cannot happen,
but I've changed to test used to control the limit to make it explicit.

Change-Id: Ieb55f010bbca68a1d86a93e47822f7c709a26e83
BUG: 789278
Signed-off-by: Xavier Hernandez &lt;jahernan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Change [f]getxattr to parallel-dispatch-one</title>
<updated>2017-12-22T09:25:29+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2017-12-06T02:29:53+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=c96a1338fe8139d07a0aa1bc40f0843d033f0324'/>
<id>c96a1338fe8139d07a0aa1bc40f0843d033f0324</id>
<content type='text'>
At the moment in EC, [f]getxattr operations wait to acquire a lock
while other operations are in progress even when it is in the same mount with a
lock on the file/directory. This happens because [f]getxattr operations
follow the model where the operation is wound on 'k' of the bricks and are
matched to make sure the data returned is same on all of them. This consistency
check requires that no other operations are on-going while [f]getxattr
operations are wound to the bricks. We can perform [f]getxattr in
another way as well, where we find the good_mask from the lock that is already
granted and wind the operation on any one of the good bricks and unwind the
answer after adjusting size/blocks to the parent xlator. Since we are taking
into account good_mask, the reply we get will either be before or after a
possible on-going operation. Using this method, the operation doesn't need to
depend on completion of on-going operations which could be taking long time (In
case of some slow disks and writes are in progress etc). Thus we reduce the
time to serve [f]getxattr requests.

I changed [f]getxattr to dispatch-one and added extra logic in
ec_link_has_lock_conflict() to not have any conflicts for fops with
EC_MINIMUM_ONE as fop-&gt;minimum to achieve the effect described above.
Modified scripts to make sure READ fop is received in EC to trigger heals.

Updates gluster/glusterfs#368
Change-Id: I3b4ebf89181c336b7b8d5471b0454f016cdaf296
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At the moment in EC, [f]getxattr operations wait to acquire a lock
while other operations are in progress even when it is in the same mount with a
lock on the file/directory. This happens because [f]getxattr operations
follow the model where the operation is wound on 'k' of the bricks and are
matched to make sure the data returned is same on all of them. This consistency
check requires that no other operations are on-going while [f]getxattr
operations are wound to the bricks. We can perform [f]getxattr in
another way as well, where we find the good_mask from the lock that is already
granted and wind the operation on any one of the good bricks and unwind the
answer after adjusting size/blocks to the parent xlator. Since we are taking
into account good_mask, the reply we get will either be before or after a
possible on-going operation. Using this method, the operation doesn't need to
depend on completion of on-going operations which could be taking long time (In
case of some slow disks and writes are in progress etc). Thus we reduce the
time to serve [f]getxattr requests.

I changed [f]getxattr to dispatch-one and added extra logic in
ec_link_has_lock_conflict() to not have any conflicts for fops with
EC_MINIMUM_ONE as fop-&gt;minimum to achieve the effect described above.
Modified scripts to make sure READ fop is received in EC to trigger heals.

Updates gluster/glusterfs#368
Change-Id: I3b4ebf89181c336b7b8d5471b0454f016cdaf296
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Add default value for the redundancy option</title>
<updated>2017-12-21T07:04:32+00:00</updated>
<author>
<name>Kamal Mohanan</name>
<email>kmohanan@redhat.com</email>
</author>
<published>2017-12-12T11:47:37+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=8564311e8bb5971823c6b32aa5c9620d80c7cabd'/>
<id>8564311e8bb5971823c6b32aa5c9620d80c7cabd</id>
<content type='text'>
Updates #302

Change-Id: Ifccc6a64c2b2a3b3a0133e6c8042cdf61a0838ce
Signed-off-by: Kamal Mohanan &lt;kmohanan@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Updates #302

Change-Id: Ifccc6a64c2b2a3b3a0133e6c8042cdf61a0838ce
Signed-off-by: Kamal Mohanan &lt;kmohanan@redhat.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
