<feed xmlns='http://www.w3.org/2005/Atom'>
<title>glusterfs.git/xlators/cluster, branch v6.8</title>
<subtitle></subtitle>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/'/>
<entry>
<title>cluster/ec: Change handling of heal failure to avoid crash</title>
<updated>2020-02-28T06:06:57+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2019-07-11T11:22:49+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=bd37f5350ac9b85c18353069c36a6ae4e489d100'/>
<id>bd37f5350ac9b85c18353069c36a6ae4e489d100</id>
<content type='text'>
Problem:
ec_getxattr_heal_cbk was called with NULL as its second argument
when heal failed.
The function dereferenced this "cookie" argument, which caused a crash.

Solution:
The cookie now carries the value that was supposed to be
stored in fop-&gt;data, so even when fop is NULL in the error
case, there is no NULL dereference.

Thanks to Xavi for the suggestion about the fix.

Change-Id: I0798000d5cadb17c3c2fbfa1baf77033ffc2bb8c
fixes: bz#1806836
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
ec_getxattr_heal_cbk was called with NULL as its second argument
when heal failed.
The function dereferenced this "cookie" argument, which caused a crash.

Solution:
The cookie now carries the value that was supposed to be
stored in fop-&gt;data, so even when fop is NULL in the error
case, there is no NULL dereference.

Thanks to Xavi for the suggestion about the fix.

Change-Id: I0798000d5cadb17c3c2fbfa1baf77033ffc2bb8c
fixes: bz#1806836
</pre>
</div>
</content>
</entry>
<entry>
<title>afr: prevent spurious entry heals leading to gfid split-brain</title>
<updated>2020-02-28T06:06:10+00:00</updated>
<author>
<name>Ravishankar N</name>
<email>ravishankar@redhat.com</email>
</author>
<published>2020-02-11T09:04:48+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=559fd060c59edec69ba66be7e0a447c8e0408d51'/>
<id>559fd060c59edec69ba66be7e0a447c8e0408d51</id>
<content type='text'>
Problem:
In a hyperconverged setup with granular-entry-heal enabled, if a file is
recreated while one of the bricks is down, and an index heal is triggered
(with the brick still down), entry-self heal was doing a spurious heal
with just the 2 good bricks. It was doing a post-op leading to removal
of the filename from .glusterfs/indices/entry-changes as well as
erroneous setting of afr xattrs on the parent. When the brick came up,
the xattrs were cleared, resulting in the renamed file not getting
healed and leading to gfid split-brain and EIO on the mount.

Fix:
Proceed with entry heal only when shd can connect to all bricks of the replica,
just like in data and metadata heal.

fixes: bz#1804594
Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit 06453d77d056fbaa393a137ca277a20e38d2f67e)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
In a hyperconverged setup with granular-entry-heal enabled, if a file is
recreated while one of the bricks is down, and an index heal is triggered
(with the brick still down), entry-self heal was doing a spurious heal
with just the 2 good bricks. It was doing a post-op leading to removal
of the filename from .glusterfs/indices/entry-changes as well as
erroneous setting of afr xattrs on the parent. When the brick came up,
the xattrs were cleared, resulting in the renamed file not getting
healed and leading to gfid split-brain and EIO on the mount.

Fix:
Proceed with entry heal only when shd can connect to all bricks of the replica,
just like in data and metadata heal.

fixes: bz#1804594
Change-Id: I916ae26ad1fabf259bc6362da52d433b7223b17e
Signed-off-by: Ravishankar N &lt;ravishankar@redhat.com&gt;
(cherry picked from commit 06453d77d056fbaa393a137ca277a20e38d2f67e)
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/thin-arbiter: Wait for TA connection before ta-file lookup</title>
<updated>2020-02-26T11:18:54+00:00</updated>
<author>
<name>Ashish Pandey</name>
<email>aspandey@redhat.com</email>
</author>
<published>2020-01-03T11:24:33+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=4989d01be1b3058fa4a32d8bb36e8c9150dc6a8b'/>
<id>4989d01be1b3058fa4a32d8bb36e8c9150dc6a8b</id>
<content type='text'>
Problem:
When we mount a ta volume, we consider the mount done as soon as the
2 data bricks are connected, and then send a lookup/create on the
ta file on the ta node. However, the connection with the ta node might
not have been completed yet.
Because of this delay, the ta replica id file will not be created and we
will see an ENOTCONN error in the log file if we do a lookup.

Solution:
As we know that the ta node could have a higher latency, we should
wait for a reasonable time for the connection to happen before sending
lookup/create on the replica id file.

fixes: bz#1804546
Change-Id: I36f90865afe617e4e84cee57fec832a16f5dd6cc
(cherry picked from commit a7fa54ddea3fe429f143b37e4de06a93b49d776a)
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When we mount a ta volume, we consider the mount done as soon as the
2 data bricks are connected, and then send a lookup/create on the
ta file on the ta node. However, the connection with the ta node might
not have been completed yet.
Because of this delay, the ta replica id file will not be created and we
will see an ENOTCONN error in the log file if we do a lookup.

Solution:
As we know that the ta node could have a higher latency, we should
wait for a reasonable time for the connection to happen before sending
lookup/create on the replica id file.

fixes: bz#1804546
Change-Id: I36f90865afe617e4e84cee57fec832a16f5dd6cc
(cherry picked from commit a7fa54ddea3fe429f143b37e4de06a93b49d776a)
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: skip updating ctx-&gt;loc again when ec_fix_open/opendir</title>
<updated>2020-02-26T11:09:08+00:00</updated>
<author>
<name>Kinglong Mee</name>
<email>kinglongmee@gmail.com</email>
</author>
<published>2019-07-11T10:57:13+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=611961144704672c6a670fc7ad91a6e8000b2c0f'/>
<id>611961144704672c6a670fc7ad91a6e8000b2c0f</id>
<content type='text'>
ec_manager_open/opendir memsets ctx-&gt;loc, which causes a
memory/inode leak, and ec_fheal uses ctx-&gt;loc outside fd-&gt;lock,
so loc_copy may copy bad data while ctx-&gt;loc is being memset.

This patch skips updating ctx-&gt;loc when it is already initialized.
With it, ctx-&gt;loc is filled once and never updated.

Change-Id: I3bf5ffce4caf4c1c667f7acaa14b451d37a3550a
fixes: bz#1806838
Signed-off-by: Kinglong Mee &lt;mijinlong@horiscale.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
ec_manager_open/opendir memsets ctx-&gt;loc, which causes a
memory/inode leak, and ec_fheal uses ctx-&gt;loc outside fd-&gt;lock,
so loc_copy may copy bad data while ctx-&gt;loc is being memset.

This patch skips updating ctx-&gt;loc when it is already initialized.
With it, ctx-&gt;loc is filled once and never updated.

Change-Id: I3bf5ffce4caf4c1c667f7acaa14b451d37a3550a
fixes: bz#1806838
Signed-off-by: Kinglong Mee &lt;mijinlong@horiscale.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Cluster/afr: Don't treat all bricks having metadata pending as split-brain</title>
<updated>2020-02-25T07:06:51+00:00</updated>
<author>
<name>karthik-us</name>
<email>ksubrahm@redhat.com</email>
</author>
<published>2019-06-06T05:29:42+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=96d326cc917baf6ac44f4deacef6d251ebcdf0ea'/>
<id>96d326cc917baf6ac44f4deacef6d251ebcdf0ea</id>
<content type='text'>
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not met.
Though the FOP is still unwound with failure, the xattrs remain on the disk.
Due to these partial post-ops and partial heals (healing only when 2 bricks
are up), we can end up in metadata split-brain purely from the afr xattrs
point of view, i.e., each brick is blamed by at least one of the others for
metadata. These scenarios are hit when there is frequent connect/disconnect
of the client/shd to the bricks.

Fix:
Pick a source based on the xattr values. If 2 bricks blame one, the blamed
one must be treated as sink. If there is no majority, all are sources. Once
we pick a source, self-heal will then do the heal instead of erroring out
due to split-brain.
This patch also requires all the bricks to be up before performing
metadata heal, to avoid any metadata loss.

Removed the test case tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t
as it was doing metadata heal even when only 2 of 3 bricks were up.

Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09
fixes: bz#1805097
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
We currently don't have a roll-back/undoing of post-ops if quorum is not met.
Though the FOP is still unwound with failure, the xattrs remain on the disk.
Due to these partial post-ops and partial heals (healing only when 2 bricks
are up), we can end up in metadata split-brain purely from the afr xattrs
point of view, i.e., each brick is blamed by at least one of the others for
metadata. These scenarios are hit when there is frequent connect/disconnect
of the client/shd to the bricks.

Fix:
Pick a source based on the xattr values. If 2 bricks blame one, the blamed
one must be treated as sink. If there is no majority, all are sources. Once
we pick a source, self-heal will then do the heal instead of erroring out
due to split-brain.
This patch also requires all the bricks to be up before performing
metadata heal, to avoid any metadata loss.

Removed the test case tests/bugs/replicate/bug-1468279-source-not-blaming-sinks.t
as it was doing metadata heal even when only 2 of 3 bricks were up.

Change-Id: I07a9d62f84ceda329dcab1f02a33aeed258dcb09
fixes: bz#1805097
Signed-off-by: karthik-us &lt;ksubrahm@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/dht: Correct fd processing loop</title>
<updated>2019-12-30T07:11:08+00:00</updated>
<author>
<name>N Balachandran</name>
<email>nbalacha@redhat.com</email>
</author>
<published>2019-10-01T12:07:15+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=ff1eae7f882b8f12380e0c35a9a73b672583cd4c'/>
<id>ff1eae7f882b8f12380e0c35a9a73b672583cd4c</id>
<content type='text'>
The fd processing loops in the
dht_migration_complete_check_task and the
dht_rebalance_inprogress_task functions were unsafe
and could cause an open to be sent on an already freed
fd. This has been fixed.

&gt; Change-Id: I0a3c7d2fba314089e03dfd704f9dceb134749540
&gt; Fixes: bz#1757399
&gt; Signed-off-by: N Balachandran &lt;nbalacha@redhat.com&gt;
&gt; (cherry picked from commit 9b15867070b0cc241ab165886292ecffc3bc0aed)

Change-Id: I0a3c7d2fba314089e03dfd704f9dceb134749540
Fixes: bz#1786983
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The fd processing loops in the
dht_migration_complete_check_task and the
dht_rebalance_inprogress_task functions were unsafe
and could cause an open to be sent on an already freed
fd. This has been fixed.

&gt; Change-Id: I0a3c7d2fba314089e03dfd704f9dceb134749540
&gt; Fixes: bz#1757399
&gt; Signed-off-by: N Balachandran &lt;nbalacha@redhat.com&gt;
&gt; (cherry picked from commit 9b15867070b0cc241ab165886292ecffc3bc0aed)

Change-Id: I0a3c7d2fba314089e03dfd704f9dceb134749540
Fixes: bz#1786983
Signed-off-by: Mohit Agrawal &lt;moagrawa@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Update lock-&gt;good_mask on parent fop failure</title>
<updated>2019-10-30T08:18:19+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-08-02T06:35:09+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=b100f7f4dca1971cfdbfb88e0b922bb5885cc7d6'/>
<id>b100f7f4dca1971cfdbfb88e0b922bb5885cc7d6</id>
<content type='text'>
When discard/truncate performs write fop, it should do so
after updating lock-&gt;good_mask to make sure readv happens
on the correct mask

fixes: bz#1739449
Change-Id: Idfef0bbcca8860d53707094722e6ba3f81c583b7
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
When discard/truncate performs write fop, it should do so
after updating lock-&gt;good_mask to make sure readv happens
on the correct mask

fixes: bz#1739449
Change-Id: Idfef0bbcca8860d53707094722e6ba3f81c583b7
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Fix reopen flags to avoid misbehavior</title>
<updated>2019-10-30T08:18:19+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-07-29T08:38:37+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=46e83e48ad37bf5c2a54d3bb5aa6e09d3d0382ff'/>
<id>46e83e48ad37bf5c2a54d3bb5aa6e09d3d0382ff</id>
<content type='text'>
Problem:
When a file needs to be re-opened, the O_APPEND and O_EXCL
flags are not filtered in EC.

- O_APPEND should be filtered because EC doesn't send O_APPEND below EC for
open to make sure writes happen on the individual fragments instead of at the
end of the file.

- O_EXCL should be filtered because shd could have created the file so even
when file exists open should succeed

- O_CREAT should be filtered because open happens with gfid as parameter. So
open fop will create just the gfid which will lead to problems.

Fix:
Filter out these flags in reopen.

Change-Id: Ia280470fcb5188a09caa07bf665a2a94bce23bc4
Fixes: bz#1739450
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Problem:
When a file needs to be re-opened, the O_APPEND and O_EXCL
flags are not filtered in EC.

- O_APPEND should be filtered because EC doesn't send O_APPEND below EC for
open to make sure writes happen on the individual fragments instead of at the
end of the file.

- O_EXCL should be filtered because shd could have created the file so even
when file exists open should succeed

- O_CREAT should be filtered because open happens with gfid as parameter. So
open fop will create just the gfid which will lead to problems.

Fix:
Filter out these flags in reopen.

Change-Id: Ia280470fcb5188a09caa07bf665a2a94bce23bc4
Fixes: bz#1739450
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: Always read from good-mask</title>
<updated>2019-10-30T08:18:18+00:00</updated>
<author>
<name>Pranith Kumar K</name>
<email>pkarampu@redhat.com</email>
</author>
<published>2019-07-18T05:55:31+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=eb1e0b17d292c0f7ceda5256da186d96a364b7f6'/>
<id>eb1e0b17d292c0f7ceda5256da186d96a364b7f6</id>
<content type='text'>
There are cases where fop-&gt;mask may have fop-&gt;healing added,
and readv shouldn't be wound on fop-&gt;healing. To avoid this,
always wind readv to lock-&gt;good_mask.

updates: bz#1739449
Change-Id: I2226ef0229daf5ff315d51e868b980ee48060b87
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
There are cases where fop-&gt;mask may have fop-&gt;healing added,
and readv shouldn't be wound on fop-&gt;healing. To avoid this,
always wind readv to lock-&gt;good_mask.

updates: bz#1739449
Change-Id: I2226ef0229daf5ff315d51e868b980ee48060b87
Signed-off-by: Pranith Kumar K &lt;pkarampu@redhat.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>cluster/ec: inherit healing from lock when it has info</title>
<updated>2019-10-30T08:18:18+00:00</updated>
<author>
<name>Kinglong Mee</name>
<email>kinglongmee@gmail.com</email>
</author>
<published>2019-07-08T13:13:28+00:00</published>
<link rel='alternate' type='text/html' href='http://dev.gluster.org/cgit/glusterfs.git/commit/?id=be7efe61efeb1552c30cb96ece6c11446a913484'/>
<id>be7efe61efeb1552c30cb96ece6c11446a913484</id>
<content type='text'>
If the lock has info, fop should inherit the healing mask from it.
Otherwise, fop cannot inherit the right healing mask when changed_flags is zero.

Change-Id: Ife80c9169d2c555024347a20300b0583f7e8a87f
updates: bz#1739449
Signed-off-by: Kinglong Mee &lt;mijinlong@horiscale.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
If the lock has info, fop should inherit the healing mask from it.
Otherwise, fop cannot inherit the right healing mask when changed_flags is zero.

Change-Id: Ife80c9169d2c555024347a20300b0583f7e8a87f
updates: bz#1739449
Signed-off-by: Kinglong Mee &lt;mijinlong@horiscale.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
